I can't see it being superhuman, that's for sure. Chess AIs are superhuman because they do vast searches, and I can't see that being replicated by an LLM architecture.
The apples-to-apples comparison would be an LLM versus Leela with search turned off (evaluating only a single board state).
According to figure 6b [0], removing MCTS reduces Elo by about 40%; scaling 1800 Elo by 5/3 gives us 3000 Elo, which would be superhuman but not as good as e.g. LeelaZero.
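Spelling out that scaling step, purely as the comment's own arithmetic (it treats Elo as if it were a linear scale, which is questionable; the figures are taken from the comment, not verified independently):

```python
# Back-of-the-envelope from the comment above.
policy_only_elo = 1800                 # assumed Elo with MCTS removed (figure 6b)
reduction = 0.40                       # removing MCTS cuts Elo by ~40%
# Undoing a 40% cut means dividing by 0.6, i.e. multiplying by 5/3:
with_search_elo = policy_only_elo / (1 - reduction)  # 1800 / 0.6 = 3000
```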
[0]: https://gwern.net/doc/reinforcement-learning/model/alphago/2...
Leela's policy network alone is around 2600 Elo, or around the level of a strong grandmaster.
Note that Go is different from chess in that there are no draws, so skill differences are greatly magnified.
Elo is always a relative scale (expected score depends only on the Elo difference), so multiplying a rating shouldn't really make sense anyway.
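That relativity is visible directly in the standard Elo expected-score formula, sketched here; only the rating difference ever enters, so adding a constant to both players changes nothing:

```python
def expected_score(elo_a: float, elo_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))

# A 400-point gap predicts roughly a 10:1 score ratio, regardless of
# the absolute ratings: expected_score(3000, 2600) == expected_score(1800, 1400)
```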
I don’t think 3000 is superhuman though; it’s peak human, as IIRC Magnus had an Elo of 3000 at one point.
Any particular reason why that shouldn't work well with fine-tuning of an LLM using reinforcement learning?
Chess AIs used to dominate through sheer computational power, but to my knowledge that is no longer the case: the engines beat all but the very strongest players even when run on phone CPUs.
Phone CPUs have gotten quite fast in the past decade, too.
Deep Blue analyzed some 200 million positions per second. Modern engines analyze three to four orders of magnitude fewer nodes per second, but have much more refined pruning of the search space.
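For scale, that gap works out as follows (just the arithmetic implied above, not measured engine figures):

```python
deep_blue_nps = 200_000_000               # Deep Blue, positions per second
# Three to four orders of magnitude fewer:
modern_high_nps = deep_blue_nps // 10**3  # ~200,000 nodes per second
modern_low_nps = deep_blue_nps // 10**4   # ~20,000 nodes per second
```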