Comment by gdiamos - Hacker Neue

gdiamos 5 days ago parent

It's surprising to me that the field is willing to invest this much in mega-kernels, but not models that generate multiple tokens in parallel...

liuliu 5 days ago

It is hard to justify a tens-of-millions investment in training just to make a model faster, without any idea how it will score on benchmarks. It is easier to justify keeping the model intact and spending extra millions to make it faster through exotic means (megakernels).

There has been some niche research on parallel token generation as of late, though...
