Comment by cgearhart - Hacker Neue

cgearhart 20 hours ago parent

The current-gen VLA architectures include some tricks (like compressed action tokenization and diffusion decoding) to reach action frequencies of 50-200 Hz. I think they’re _more_ efficient this way than regular LLMs trying to do everything through text.
