And you are parroting the "LLMs are stochastic parrots" argument.
You are overly confident in your assessment that LLMs are not world models, more sure than researchers in the relevant fields of neuroscience, cognitive science, and machine learning are themselves.
This is an area of active study. Reflect on that. We don't yet know whether LLMs are modeling something more than the next token.
But you seem to know, because of some sensibility that an LLM with such a simple architecture can't be more than a token predictor. Okay.
But...it is. It's an incredibly complex and impressive stochastic parrot - but that's basically what it is.
That doesn't mean it can't be useful. It absolutely can. There are some problems that will likely be greatly improved by throwing an LLM at them.
What I am saying is that people need to temper their expectations and not get caught up in tech fanaticism and anthropomorphize something that isn't there.
To think otherwise invites mysticism.
> anthropomorphize something that isn't there.
With the above as a counterexample: you just don't know this.
I otherwise agree that LLMs in their current form are highly unlikely to give rise to AGI, for many reasons.
But as it stands, your argument lacks rigour and makes assumptions on matters that remain open subjects of experimental and scientific inquiry (the hard problem of consciousness, among others).
I want to close by emphasizing that the epistemic position we ought to take is one of uncertainty. We shouldn't be sure something is there, just as we shouldn't be sure something isn't there.
We don't yet know enough to say one way or the other. That's the point I want to emphasize. Stay open-minded until the relevant fields start making stronger claims.
You may find this leading theory on how our brains work interesting: https://en.m.wikipedia.org/wiki/Predictive_coding
It's almost like the situation with finite automata vs. Turing machines. There are some problems a finite automaton can never solve that a Turing machine can. You can't parse HTML with a regex. Some things cannot be done.
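To make the regex/HTML point concrete, here is a minimal sketch (my own toy example, using Python's standard re and html.parser modules, not anything from the thread): a single regex loses the nested structure, while a real parser, which keeps a stack, recovers it.

```python
import re
from html.parser import HTMLParser

html_doc = "<div><div>inner</div>outer</div>"

# The non-greedy regex stops at the first closing tag, so the nesting is lost.
match = re.search(r"<div>(.*?)</div>", html_doc)
print(match.group(1))  # -> "<div>inner"  (wrong: the inner tag is left unbalanced)

# An actual parser keeps a stack of open tags, i.e. it has the extra
# memory that a finite automaton (and hence a regex) lacks.
class DepthTracker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        self.depth -= 1

tracker = DepthTracker()
tracker.feed(html_doc)
print(tracker.max_depth)  # -> 2  (the nesting is correctly recognized)
```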
In order to make something more powerful, you would need something that isn't an LLM. It would need to be on the next level of artificial learning complexity.
What are the specific characteristics of a hypothetical AI system that you would feel comfortable giving the label of “understanding” to?
GPT can't make those kinds of inferences or extensions. It can only regurgitate what is already known and has been stated somewhere in its training set.
It's very impressive, I just think people over-hype it into something it is not.
I tested GPT-4 with your example of sky color / human eyes. It performed quite well and seemed to have a pretty coherent grasp of the subject and related associations.
Link to the convo: https://chat.openai.com/share/77add48f-abdc-4734-ac55-05b8d1...
However, I could see how one might argue the reasoning is not that complex.
I strongly maintain that there is some form of reasoning going on here, for all meaningful definitions of the word. But it is a complex and tricky thing to analyze.
Lastly, your original comment veered dangerously close to claiming that models trained to predict text cannot — by definition! — ever acquire any form of reasoning or understanding of the world. This is a very strong claim that I don’t think can be substantiated, at least with the tools and knowledge we have today.
They can't, and I stand by that. An LLM is always going to be an LLM, it's never magically going to turn into a gAI. Now, some of the lessons learned from developing all these LLMs could be used to develop a gAI, but that's a long way away and would need to be built from the ground up for that purpose by design.
I argue that the model’s weights could contain a rich representation of the world, even if it looks different from that of humans. And that it can use this capability of world modeling to also make new connections.
These “world modeling” / “reasoning” abilities seem to have some similarities with that of humans. We see echoes of our own abilities in gpt-4 and the like. But they also have their own peculiarities and limitations, which isn’t surprising in hindsight.
If you ask why the sky is blue, it will give you a reasonable answer. But that isn't because it understands it, it's because the sentence it gives you is the most likely thing to follow that question. It has read that question and answer over and over again, enough to spit out a similar response.
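As a rough illustration of what "the most likely thing to follow" means, here is a toy sketch (the probability table and function name are invented for illustration, not GPT's actual internals): at each step the model scores candidate continuations of the context and emits the highest-scoring one.

```python
# Toy next-token prediction: the probabilities below are made up purely to
# illustrate the idea of picking the most likely continuation.
toy_next_token_probs = {
    "why is the sky": {"blue": 0.92, "falling": 0.05, "green": 0.03},
    "why is the sky blue": {"?": 0.60, "at": 0.30, "today": 0.10},
}

def greedy_next_token(context: str) -> str:
    """Return the single most probable continuation for a known context."""
    candidates = toy_next_token_probs[context]
    return max(candidates, key=candidates.get)

print(greedy_next_token("why is the sky"))       # -> "blue"
print(greedy_next_token("why is the sky blue"))  # -> "?"
```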
That is fundamental to it as a tool. You can always try to write extensions to work around this, like something that tries to create inputs for Wolfram. But it is a very limited tool, and there are always going to be problems it just can't handle because of its design.