They're already using synthetic data generated by LLMs to further train LLMs. Of course they won't hesitate to feed in "anonymized" data from user interactions. Who's going to stop them? Who could even prove it's happening? These companies have already been allowed to violate copyright and privacy on a historic, global scale.
How would they distinguish between real and fake data? It would be far too easy to pollute their models with nonsense.
I have no doubt that Microsoft has already classified the nature of my work and the quality of my code. It's probably "anonymized", of course. But make no mistake: they are watching everything you give them access to.
I mean, is it really ignoring copyright when copyright doesn't limit them in any way when it comes to training?
Tell that to all the people suing them for using their copyrighted work. In some cases the data was even pirated.
> Nothing is really preventing this though
The enterprise user agreement is preventing this.
Suggesting that AI companies will uniquely ignore the law or contracts is conspiracy theory thinking.
It already happened.
"Meta Secretly Trained Its AI on a Notorious Piracy Database, Newly Unredacted Court Docs Reveal"
https://www.wired.com/story/new-documents-unredacted-meta-co...
OpenAI has even admitted it can't build its tools without copyrighted material.
"‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says"
https://www.theguardian.com/technology/2024/jan/08/ai-tools-...
Though the porn Meta copied was, they claim, just for personal use, because clearly that's an important perk of being employed there:
https://www.vice.com/en/article/meta-says-the-2400-adult-mov...
Nothing is really preventing this though. AI companies have already proven they will ignore copyright and any other legal nuisance in order to train their models.