If I had a photographic memory and I used it to replicate parts of GPLed software verbatim while erasing the license, I could not excuse it in court by claiming that I simply "learned from" the examples.
Some companies outright bar their employees from reading GPLed code because they see it as too great a liability. But if a computer does it, then suddenly it is a-ok. Apparently the courts think so too.
If you're going to allow copyright laundering, at least allow it for both humans and computers. It's only fair.
Right, because you would have done more than learn: you would have gone past learning and used what you learned to reproduce the work.
It works exactly the same for an LLM. Training the model on content you have legal access to is fine. Afterwards, someone using that model to produce a replica of that content is engaged in copyright infringement.
You seem set on conflating the act of learning with the act of reproduction. You are allowed to learn from copyrighted works you have legal access to; you just aren't allowed to duplicate those works.
If someone hires me to write some code, and I give them GPLed code (without telling them it is GPLed), I'm the one who broke the license, not them.
It's also an interesting double standard, wherein if I were to steal OpenAI's models, no AI worshippers would have any issue condemning my action, but when a large company clearly violates the license terms of free software, you give them a pass.
If GPT-5 were "open sourced", I don't think the vast majority of AI users would seriously object.
Which is funny since that's a much clearer case of "learning from" than outright compressing all open source code into a giant pile of weights by learning a low-dimensional probability distribution of token sequences.
Information wants to be free.
That is not nearly the extent of AI training data (e.g. OpenAI training its image models on Studio Ghibli art). But if by "gave their work away for free" you mean "allowed others to make [proprietary] derivative works", then that is in many cases simply not true (e.g. GPL software, or artists who publish work protected by copyright).