Most likely Google has lied. AI playing video games and board games doesn't translate to real world applications. Many people fail to see that.

In what respect is generating text a better predictor of real world applicability than the ability to achieve goals in a complex simulated environment containing other agents?

It's not one or the other: we need both supervised pre-training and reinforcement learning. The first represents past human experience encoded as language; it can bring a model to human level on most tasks, but not make it smarter.
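
To make "encoded as language" concrete, here is a minimal, hypothetical sketch of the pre-training objective: predict the next token of human-written text. The toy model and random tokens below are stand-ins, not any lab's actual setup.

    import torch
    import torch.nn as nn

    vocab_size, dim = 100, 32
    # Toy stand-in for a language model: embed a token, predict the next one.
    model = nn.Sequential(nn.Embedding(vocab_size, dim),
                          nn.Linear(dim, vocab_size))
    tokens = torch.randint(0, vocab_size, (1, 16))  # stand-in for human text
    logits = model(tokens[:, :-1])                  # predict token t+1 from token t
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1))
    loss.backward()  # the gradient pulls the model toward the human text

The ceiling follows from the objective: the loss is minimized by imitating the training text, so pre-training alone cannot push a model past its sources.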

The second approach, with RL, is based on immediate feedback and could make a model smarter than us; just think of AlphaZero or AlphaTensor. But it requires deploying a wide search over possible solutions and a mechanism to rank or filter out the bad ideas (code execution, running a simulation or a game, optimizing some metric).
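
Here is a hedged sketch of that search-plus-filter loop, with code execution as the filter; sample_solution is a hypothetical stand-in for sampling from a model, not a real API.

    import random

    def sample_solution():
        # Placeholder: a real system would sample candidate programs from a model.
        return random.choice(
            ["lambda x: x * 2", "lambda x: x + 2", "lambda x: x ** 2"])

    def passes_tests(src):
        f = eval(src)                    # "code execution" acts as the verifier
        return f(3) == 6 and f(5) == 10

    candidates = [sample_solution() for _ in range(64)]      # wide search
    survivors = [c for c in candidates if passes_tests(c)]   # filter out bad ideas
    print(f"{len(survivors)}/{len(candidates)} candidates survive the filter")

The same shape works with a simulator or a metric in place of the test harness; the filter only has to be cheaper and more reliable than human judgment.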

So models need both past experience and new experience to advance. They can rely on organic text initially, but later they need to generate their own training examples. The feedback they get will be on topic, grounded in both the human user's intent and the model's own mistakes; that's very valuable. Feedback learning is what could finally let LLMs graduate from mediocre results.
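
One hypothetical way "generating their own training examples" could look, rejection-sampling style: keep the drafts that feedback prefers and feed them back as training data. generate and feedback_score are invented placeholders, not a real pipeline.

    def generate(prompt, n=4):
        # Placeholder: a real system would sample n drafts from the model.
        return [f"{prompt} -> draft {i}" for i in range(n)]

    def feedback_score(draft):
        # Placeholder for on-topic feedback: a user rating, a verifier, a metric.
        return hash(draft) % 10

    new_training_set = []
    for prompt in ["task A", "task B"]:
        drafts = generate(prompt)
        best = max(drafts, key=feedback_score)   # keep what the feedback prefers
        new_training_set.append((prompt, best))  # becomes fresh fine-tuning data
    print(new_training_set)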

DeepMind says it is using both, with feedback learning dialed up.

The context of simulated game environments is far less complex than the real world, and the available interactions are far fewer. It would be different if the agent were exposed to the real world and used multisensory data to predict the next "token", i.e. a thought or an action.

People pay money for it.
