
You can count me as an AGI sceptic to the extent that I don't think LLMs are the approach that is going to get us there, but I'm equally confident that we will get there, and that predictive neural nets are the core of the right approach.

The article is a bit rambling, but the main claims seem to be:

1) Computers can't emulate brains due to architecture (locality, caching, etc) and power consumption

2) GPUs are maxing out in terms of performance (and, implicitly, AGI has to use GPUs)

3) Scaling is not enough, and per 2) compute scaling is close to maxing out

4) AGI won't happen, because he defines AGI as requiring robotics and sees scaling of robotic experience as a limiting factor

5) Superintelligence (which he associates with self-improving AGI) won't happen because it'll again require more compute

It's a strange set of arguments, most of which don't hold up; it both misses what is actually wrong with the current approach and fails to conceive of what different approach would get us to AGI.

1) Brains don't have some exotic architecture that somehow gives them an advantage over computers in terms of locality, etc. The cortex is in fact basically a 2-D structure - a sheet of cortical columns, with a combination of local and long-distance connections.

Where brains do differ from a von Neumann architecture is that compute and memory are one and the same, but if we're comparing communication speed between different cortical areas versus between TPUs or similar chips, then the speed advantage goes to the computer.

2) Even if AGI had to use matmuls and systolic arrays, and GPUs are maxing out in terms of FLOPs, we could still scale compute, if needed, just by having more GPUs and faster and/or wider interconnect.

3) As above, it seems we can scale compute just by adding more GPUs and faster interconnect if needed, but in any case I don't think inability to scale is why AGI isn't about to emerge from LLMs.

4) Robotics and AGI are two separate things. A person lying in a hospital bed still has a brain and human-level general intelligence. Robots will eventually learn individually on the job, just as non-embodied AGI instances will, so the size of pre-training datasets/experience will become irrelevant.

5) You need to define intelligence before supposing what super-human intelligence is and how it may come about, but Dettmers just talks about superintelligence in a hand-wavy fashion as something that AGI may design, and assumes that whatever it is will require more compute than AGI. In reality intelligence is prediction, limited in domain by your predictive inputs and in quality/degree by the sophistication of your predictive algorithms, neither of which necessarily needs more compute.

What is REALLY wrong with the GPT LLM approach, and why it can't just be scaled to achieve AGI, is that it is missing key architectural and algorithmic components (such as incremental learning, and a half-dozen others), and perhaps more fundamentally that auto-regressive self-prediction is just the wrong approach. AGI needs to learn to act and predict the consequences of its own actions - it needs to predict external inputs, not its own generated sequences.
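To make that distinction concrete, here's a minimal sketch of the two training objectives, assuming PyTorch; the module and function names (TinyLM, TinyWorldModel, lm_loss, world_model_loss) and the tensor shapes are purely illustrative, not anything from the article. The autoregressive loss only ever models the token stream itself, while the world-model loss is conditioned on an action and grounded in the next external observation.

```python
# Illustrative sketch only: contrasting the two objectives discussed above.
import torch
import torch.nn as nn
import torch.nn.functional as F

# --- Auto-regressive self-prediction (the GPT/LLM objective) ---
# The model is trained to predict the next token of its own sequence.
class TinyLM(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens):              # tokens: (B, T) of token ids
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)                 # logits: (B, T, vocab)

def lm_loss(model, tokens):
    # Predict token t+1 from tokens up to t - no actions, no environment.
    logits = model(tokens[:, :-1])
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           tokens[:, 1:].reshape(-1))

# --- Action-conditioned prediction of external inputs (a world-model objective) ---
# The model is trained to predict the next observation from the environment,
# given the current observation and the action the agent took.
class TinyWorldModel(nn.Module):
    def __init__(self, obs_dim=32, act_dim=4, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, dim), nn.ReLU(),
            nn.Linear(dim, obs_dim))

    def forward(self, obs, act):            # obs: (B, obs_dim), act: (B, act_dim)
        return self.net(torch.cat([obs, act], dim=-1))

def world_model_loss(model, obs, act, next_obs):
    # The target is the next external input, i.e. a consequence of the action.
    return F.mse_loss(model(obs, act), next_obs)
```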

