* Read the signatures of the functions.
* Use the code correctly.
* Answer questions about the behavior of the underlying API by consulting the code.
Of course they're just guessing if they go beyond what's in their context window, but don't underestimate the context window!
"If you're getting answers, it has seen it elsewhere"
The context window is 'elsewhere'.
It’s silly to say that something LLMs can reliably do is impossible, and that every time it happens it’s “dumb luck”.
As they say, it sounds like you're technically correct, which is the best kind of correct. You're correct within the extremely artificial parameters that you created for yourself, but not in any real world context that matters when it comes to real people using these tools.
To anyone who has used these tools in anger, it’s remarkable that, given they’re only trained on large corpora of language and feedback, they’re able to produce what they do. I don’t claim they exist outside their weights; that’s absurd. But the entire point of non-linear activation functions across many layers and parameters is to learn highly complex non-linear relationships. The fact that they can be trained as much as they are, on as much data as they have, without overfitting or gradient explosions means the very nature of language contains immense information in its encoding and structure, and the network, by definition of how it works and is trained, does -not- just return what it was trained on. It’s able to curve-fit complex functions that interrelate semantic concepts. Those concepts are clearly not understood the way we understand them, but in some ways the result represents an “understanding” that’s sometimes perhaps more complex and nuanced than our own.
Anyway, the “stochastic parrot” metaphor misses the point that parrots are incredibly intelligent animals - which is apt, since those who use that phrase are also missing the point.
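To make the curve-fitting point concrete, here is a minimal sketch (not a claim about how any particular LLM is built; the function, layer sizes, and library are arbitrary choices): even a tiny network with non-linear activations, trained only on sampled points, generalizes to inputs it never saw, because it learns the shape of the relationship rather than a lookup table.

```python
# Toy illustration of non-linear curve fitting, nothing LLM-specific:
# a small MLP with tanh activations approximates y = x*sin(x) from samples
# and then predicts it at points that were never in the training set.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))      # sampled inputs
y = X[:, 0] * np.sin(X[:, 0])               # non-linear target

net = MLPRegressor(hidden_layer_sizes=(64, 64), activation="tanh",
                   max_iter=2000, random_state=0).fit(X, y)

X_new = np.array([[0.5], [1.7], [-2.2]])    # points not seen during training
print(net.predict(X_new))                   # roughly x*sin(x) at each point
print(X_new[:, 0] * np.sin(X_new[:, 0]))    # the true values, for comparison
```

The fitted weights don't contain those test points; they encode a function, which is the much-smaller-scale version of the argument above.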
You want to say this guy's experience isn't reproducible? That's one thing, but that's probably not the case unless you're assuming they're pretty stupid themselves.
You want to say that it *is* reproducible, but that "that doesn't mean AI can think"? Okay, but that's not what the thread was about.
When I built my own programming language and used it to build a unique toy reactivity system, and then asked the LLM "what can I improve in this file", you're essentially saying it could "only" help me because it had learned how to improve arbitrary code in other languages and then generalized those patterns to help me with novel code and my novel reactivity system.
"It just saw that before on Stack Overflow" is a bad trivialization of that.
It saw what on Stack Overflow? Concrete code examples that it generalized into abstract concepts it could apply to novel applications? Because that's the whole damn point.
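For reference, here is a hypothetical sketch in Python (not the commenter's language or actual code) of what a "toy reactivity system" generally looks like: signals record which effects read them and re-run those effects when their value changes. A reviewer, human or LLM, can critique this kind of code by generalizing from patterns, not by having seen this exact file before.

```python
# Hypothetical minimal reactivity system (signals + effects), for illustration only.
_current_effect = None  # the effect currently being registered, if any

class Signal:
    def __init__(self, value):
        self._value = value
        self._subscribers = set()

    def get(self):
        # Record the running effect as a subscriber of this signal.
        if _current_effect is not None:
            self._subscribers.add(_current_effect)
        return self._value

    def set(self, value):
        # Update the value and re-run every effect that read it.
        self._value = value
        for subscriber in list(self._subscribers):
            subscriber()

def effect(fn):
    global _current_effect
    _current_effect = fn
    try:
        fn()  # first run records which signals fn depends on
    finally:
        _current_effect = None
    return fn

# Usage: the print re-runs automatically whenever `count` changes.
count = Signal(0)
effect(lambda: print("count is", count.get()))
count.set(1)
count.set(2)
```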
How would you reconcile this with the fact that SOTA models are only a few TB in size? Trained on exabytes of data, yet only a few TB in the end.
Correct answers couldn't be dumb luck either, because otherwise the models would pretty much only hallucinate (the space of wrong answers is many orders of magnitude larger than the space of correct answers), similar to the early proto-GPT models.
This is false. You are off by ~4 orders of magnitude in claiming these models are trained on exabytes of data. It is closer to 500 TB of more curated data at most. Contrary to popular belief, LLMs are not trained on "all of the data on the internet". I responded to another one of your posts that makes this false claim here:
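For scale, here is a rough back-of-envelope with assumed round numbers (not published figures for any specific model) showing why "weights measured in single-digit TB, training text measured in tens to hundreds of TB" is the plausible ballpark, and why exabytes is not:

```python
# Back-of-envelope only; the parameter and token counts are assumptions,
# not published figures for any particular model.
params = 1.0e12            # assume ~1 trillion parameters
bytes_per_param = 2        # assume 16-bit weights
model_tb = params * bytes_per_param / 1e12
print(f"model weights: ~{model_tb:.0f} TB")

tokens = 15e12             # assume ~15 trillion training tokens
bytes_per_token = 4        # rough average for plain text
data_tb = tokens * bytes_per_token / 1e12
print(f"training text: ~{data_tb:.0f} TB")

# An exabyte is 1,000,000 TB, i.e. orders of magnitude beyond either number.
print(f"data-to-weights ratio: roughly {data_tb / model_tb:.0f}x")
```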
As to 'knows the answer', I don't even know what that means with these tools. All I know is whether it is helpful or not.
The amazing thing about LLMs is that we still don’t know how (or why) they work!
Yes, they’re magic mirrors that regurgitate the corpus of human knowledge.
But as it turns out, most human knowledge is already regurgitation (see: the patent system).
Novelty is rare, and LLMs have an incredible ability to pattern match and see issues in “novel” code, because they’ve seen those same patterns elsewhere.
Do they hallucinate? Absolutely.
Does that mean they’re useless? Or does that mean some bespoke code doesn’t provide the most obvious interface?
Having dealt with humans, the confidence problem isn’t unique to LLMs…
You may want to take a course in machine learning and read a few papers.
LLMs are insanely complex systems and their emergent behavior is not explained by the algorithm alone.
Goodness this is a dim view on the breadth of human knowledge.
But I look down my nose at the notion that human knowledge is packageable as plain text; our lives, experience, and intelligence are so much more than the cognitive strings we assemble in our heads in order to reason. It's like in that movie Contact when Jodie Foster muses that they should have sent a poet. Our empathy and curiosity and desires are not encoded in UTF-8. You might say these are realms other than knowledge, but woe to the engineer who thinks they're building anything superhuman while leaving these dimensions out; they're left with a cold super-rationalist with no impulse to create of its own.
That doesn't mean it knows the answer. That means it guessed or hallucinated correctly. Guessing isn't knowing.
edit: people seem to be missing my point, so let me rephrase. Of course AIs don't think, but that wasn't what I was getting at. There is a vast difference between knowing something and guessing.
Guessing, even in humans, is just the human mind statistically and automatically weighing probabilities and suggesting what may be the answer.
This is akin to what a model might do, without any real information. Yet in both cases, there's zero validation that anything is even remotely correct. It's 100% conjecture.
It therefore doesn't know the answer, it guessed it.
When it comes to being correct about a language or API for which there's zero info, it's just pure happenstance that it got it correct. It's important to know the difference, and not say it "knows" the answer. It doesn't. It guessed.
One of the most massive issues with LLMs is that we don't get a probability back with the response. You ask a human "Do you know how this works", and an honest and helpful human might say "No" or "No, but you should try this. It might work".
That's helpful.
Conversely a human pretending it knows and speaking with deep authority when it doesn't is a liar.
LLMs need more of this type of response, something that indicates certainty or the lack of it. They're useless without this. But of course, an LLM indicating a lack of certainty means that customers might use it less, or not trust it as much, so... profits first! Speak with certainty on all things!
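For what it's worth, the probabilities do exist inside the model; what's missing is surfacing them in a form users can act on. Below is a sketch of reading token-level probabilities from a small open model via Hugging Face transformers. The model name and prompt are arbitrary stand-ins, and the big caveat is that token probability is not the same thing as factual confidence, which is part of why this is a hard product problem and not just a missing field in the API.

```python
# Sketch: surface the probability the model assigned to each token it emitted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # stand-in; any causal LM works the same way here
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

prompt = "The capital of Australia is"
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=3, do_sample=False,
                         output_scores=True, return_dict_in_generate=True)

prompt_len = inputs["input_ids"].shape[1]
for step, step_scores in enumerate(out.scores):
    token_id = out.sequences[0, prompt_len + step]
    prob = torch.softmax(step_scores[0], dim=-1)[token_id].item()
    print(repr(tok.decode(token_id)), f"p={prob:.2f}")
```

A low probability on a key token is at least a hint that the answer is a guess rather than something the model "knows", which is roughly the honest "No, but you should try this" response being asked for above.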