In other words, would it be fair to say that any naturally sequential problem is not very far from autocomplete?

Again, I think you're putting words in my mouth and thoughts in my head that aren't there. A lot of people have reacted to AI hype by swinging the other way and underestimating LLMs; that's not me. I think there are lots of problems they can solve. I just think those problems all boil down to autocomplete, and if you can't boil a problem down to autocomplete, you're not ready to implement it with an LLM yet.


What's the difference between "talk" and "produce output" in your mind? To me, "automatic completion" makes it sound as if the complete result was already supplied to the model, or as if it just picks the most probable next word by simple frequency, rather than capturing the "thinking" the model does to produce each token. "Autocomplete" doesn't adequately describe anything. Imagine a hypothetical machine of infinite ability that still produces tokens one at a time from previous tokens; you would still insist on calling it autocomplete and bothering me.

This is kind of reductionist. It's like saying that a human writing a book is just doing manual word completion starting from the title. It's technically correct, but what insight does it contribute? Would anything about this conversation be different if someone trained a model that did diffusion-style inference, in which every word of the answer is pushed toward the final result simultaneously? Probably not.
People fine-tune LLMs for classification tasks.

This is completely wrong.

You're just going around saying people are completely wrong without reading what they wrote or providing any justification for that claim. I'm not sure how to respond because your comment is a non sequitur.
The justification is that people are fine-tuning LLMs for classification. They take out the last layer and replace it with a layer that maps to n classes instead of vocab_size, and the training data's Y's aren't the next word; they are a class label. (I have a job that does binary classification, for example.)
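A minimal sketch of that setup, assuming a Hugging Face causal-LM backbone; the model name, pooling strategy, and labels here are illustrative, not anyone's production job:

    import torch
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    NUM_CLASSES = 2  # e.g. spam / not spam

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    backbone = AutoModel.from_pretrained("gpt2")  # transformer body, no lm_head

    class LMClassifier(nn.Module):
        def __init__(self, backbone, num_classes):
            super().__init__()
            self.backbone = backbone
            # Replace the vocab_size projection with an n-class head.
            self.head = nn.Linear(backbone.config.hidden_size, num_classes)

        def forward(self, input_ids, attention_mask):
            hidden = self.backbone(
                input_ids, attention_mask=attention_mask
            ).last_hidden_state
            # Pool on the last non-padding token as a summary of the text.
            last_idx = attention_mask.sum(dim=1) - 1
            pooled = hidden[torch.arange(hidden.size(0)), last_idx]
            return self.head(pooled)  # logits over classes, not the vocabulary

    model = LMClassifier(backbone, NUM_CLASSES)

    # Training pairs are (text, class label), not (text, next word):
    batch = tokenizer(["Respond to win $100 now!"], return_tensors="pt")
    logits = model(batch["input_ids"], batch["attention_mask"])
    loss = nn.functional.cross_entropy(logits, torch.tensor([0]))  # 0 = spam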

It's just completely wrong to say everything in LLM land is autocomplete. It's trivial and common to do the above.

That's still autocomplete. You use it by feeding in a context (all the text so far) and asking it to produce the next word (your fine-tuned classification token). The only difference is you don't ask for more tokens once you have one.
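A minimal sketch of that framing: even without a classification head, you can score next-token logits for two label words and stop after one step. The prompt and label tokens below are illustrative:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    prompt = 'Email: "Respond to win $100 now!"\nThis email is'
    # Take the first subword token of each label word.
    label_ids = {
        "spam": tokenizer.encode(" spam")[0],
        "not spam": tokenizer.encode(" legitimate")[0],
    }

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]

    # Score each class by the model's logit for its label token, then stop:
    # one "autocomplete" step, used as a classifier.
    scores = {lab: next_token_logits[tid].item() for lab, tid in label_ids.items()}
    print(max(scores, key=scores.get))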

That's a very clever way of reducing a problem to autocomplete, but it doesn't change the paradigm.

If an email says "Respond to win $100 now!" and a classifier scores it 99%/1% across two classes representing spam/not-spam, "spam" is not a sensible next token; it's a classification. The model is not trying to predict the next token, it's trying to classify the entire body of text. The training data isn't a bunch of samples where y is whatever came after those tokens.

It's a silly way to think about it. Have you seen how people fine-tune for classification? It's not like fine-tuning for instruction following or summarization, which still use next-token prediction and where the last layer still maps to vocab_size outputs.
