Comment by kypro - Hacker Neue

kypro Jun 17, 2025 parent

The idea that LLMs can only spew out text they've been trained on is a fundamental misunderstanding of how modern backprop training algorithms work. A lot of work goes into refining training algorithms to prevent overfitting to the training data.

Generalisation is something that neural nets are pretty damn good at. Given the complexity of modern LLMs, the idea that they cannot generalise the fairly basic logical rules and patterns found in code, such that they're able to provide answers to inputs unseen in the training data, is quite an extreme position.

fpoling Jun 17, 2025

Still, the models do not (yet) reason. Try asking them to solve a programming puzzle or exercise from an old paper book that was never scanned. They will produce total garbage.

Models work across programming languages because it turned out that programming languages and APIs are much more similar than one might have expected.
