Generalization across tasks is clearly still elusive. The only reason we see such success with modern LLMs is because of the heroic amount of parameters used. When you are probing into a space of a billion samples, you will come back with something plausible every time.
The only thing I've seen approximating generalization has appeared in symbolic AI cases with genetic programming. It's arguably dumb luck of the mutation operator, but oftentimes a solution is found that does work for the general case - and it is possible to prove a general solution was found with a symbolic approach.
The only thing I've seen approximating generalization has appeared in symbolic AI cases with genetic programming. It's arguably dumb luck of the mutation operator, but oftentimes a solution is found that does work for the general case - and it is possible to prove a general solution was found with a symbolic approach.