paxys
Will be interesting to see how this affects Anthropic's ongoing lawsuit with Reddit, or the various media-publisher lawsuits flying around. Is it okay to train on books but not on online posts and articles? Why the distinction between the two?
The distinction will be whether those online posts were obtained legally, analogous to whether the books in this case were pirated.
It’s not as simple as it sounds. I’m sure scraping is against Reddit’s terms and conditions, but if those posts are publicly available and the scraper never actually agreed to anything, is there a contract to breach in the first place?
Will be interesting to see how that plays out.