
> They do it by iteratively predicting the next token.

You don't know that. It's how the LLM presents its output, not necessarily how it does things internally. That's what I mean by it being the interface.

There's only ever one word that comes out of your mouth at a time, but we don't conclude that humans only think one word at a time. Who's to say the machine doesn't plan out the full sentence and then output just the next token?
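To make the interface point concrete, here's a toy sketch. Both functions are made up for illustration and neither is a claim about how any real LLM works: one plans the whole sentence up front, the other genuinely picks each token from only the previous one. Through a one-token-at-a-time interface they look identical.

```python
from typing import Dict, Iterator, List


def planning_generator() -> Iterator[str]:
    """Hypothetical toy: decides the whole sentence first, then emits it piecemeal."""
    plan: List[str] = ["the", "whole", "sentence", "was", "planned"]
    for token in plan:
        # The caller only ever sees one token per step.
        yield token


def stepwise_generator() -> Iterator[str]:
    """Hypothetical toy: picks each token using only the previous token."""
    next_token: Dict[str, str] = {
        "<s>": "the",
        "the": "whole",
        "whole": "sentence",
        "sentence": "was",
        "was": "planned",
    }
    prev = "<s>"
    while prev in next_token:
        prev = next_token[prev]
        yield prev


# From the outside, the two token streams are indistinguishable.
planned = list(planning_generator())
stepwise = list(stepwise_generator())
print(planned == stepwise)
```

The observer gets the same stream either way, so the stream alone can't tell you which kind of process produced it.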

I don't know either, fwiw, and that's my main point. There's a lot to criticize about LLMs and, believe it or not, I am a huge detractor of their use in most contexts. But this is a bad criticism of them. And it bugs me a lot because the really important problems with them get broadly ignored in favor of this low-effort, ill-thought-out, offhand dismissal.