Preferences

A more modern one, smolVLA is similar and uses a VLM but skips a few layers and uses an action adapter for outputs. Both are from HF and run on LeRobot.

https://arxiv.org/abs/2506.01844

Explanation by PhosphoAI: https://www.youtube.com/watch?v=00A6j02v450


This item has no comments currently.