Yet the models do not (yet) reason. Try to ask them to solve a programming puzzle or exercise from an old paper book that was not scanned. They will produce total garbage.
Models work across programming languages because it turned out programming languages and API are much more similar than one could have expected.
Generalisation is something that neural nets are pretty damn good at, and given the complexity of modern LLMs the idea that they cannot generalise the fairly basic logical rules and patterns found in code such that they're able provide answers to inputs unseen in the training data is quite an extreme position.