As a general class of folks, programmers and technologists have been putting people out of work via automation for as long as we have existed. We justified it in many ways, but generally: "if I can replace you with a small shell script, your job shouldn't exist anyway and you can do something more productive instead". These same programmers would look over the shoulders of "business process" people to see how folks did their jobs - "stealing" the workflows and processes so they could be automated.
Now that programmers' jobs are on the chopping block, all of a sudden automation is bad. It's hard to sort through genuine vs. self-serving concern here.
So far, it reads to me as a case of what goes around comes around.
I don't think LLMs are great or problem-free - or even that the training data set scraped from the Internet was obtained morally. I just find the reaction incredibly hypocritical.
Learn to prompt, I guess?
I don't see the connection to the plainly utilitarian business of implementing business logic. Would anyone find a thank-you email from an LLM to be of any non-negative value, no matter how specific or accurate its acknowledgement was? Isn't it beyond uncanny valley and into absurdism to have your calculator send you a Christmas card?
It was definitely a less-than-useful comment directed towards the tech bro types that came later when the money started getting good.
To GP: not all of us who automate go for the low-hanging fruit, I guess.
To the peer calling this illegitimate [or anyone, really]: without the assistance of an LLM, please break down the foul nature of... let me check my notes, gainful employment.
Yes, even if they don't say it. The other objections largely come from the need to sound more legitimate.
> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.
At the moment, it's just for taking money from gullible investors.
It's eating into business letters, essays, and indie art generation, but programming is a really tough nut to crack.
It's like how "burger flippers" didn't go extinct due to automation. The burger joint simply mechanised and automated the parts that made sense, and now a lunch shift is handled by 5 employees instead of 20.
They will not replace folks of Rob Pike's calibre for quite some time - perhaps (and I'd bet on it) never.
I will grant you that the reality does not live up to the hype. The vast majority of jobs being taken from US developers are simply being offshored, with AI as the excuse - but it is a real phenomenon I've personally witnessed.
That certainly took some programmers' jobs away in the short term. That doesn't mean it pans out that way in the long term.
I think it's more a matter of causing people to do different work. Agriculture used to employ about 75% of the workforce, but tractors and the like reduced that to 2% or so. I'm not sure the people now working as programmers would be better off if that hadn't happened and they were digging potatoes.
But it also automates _everything else_. Art and self-expression, most especially. And it did so in a way that is really fucking disgusting.
The concern is bigger than developer jobs being automated. The stated goal of the tech oligarchs is to create AGI so that most labor is no longer needed, while the CEOs and board members of major companies become unimaginably wealthy - and their digital gods let them carve up nations into fiefdoms for the techno-fascist societies they envision.
I want no part of that.
i.e. people who are not hackers. Many (most?) hackers have been against the idea of copyright and intellectual property from the beginning. "Information wants to be free," after all.
Must be galling for people to find themselves on the same side as Bill Gates and his 1976 Open Letter to Hobbyists, which was also about "theft of people's labor".
If I had a photographic memory and used it to replicate parts of GPLed software verbatim while erasing the license, I could not excuse it in court by saying I had simply "learned from" the examples.
Some companies outright bar their employees from reading GPLed code because they see it as too high a liability. But if a computer does it, then suddenly it's a-ok. Apparently according to the courts, too.
If you're going to allow copyright laundering, at least allow it for both humans and computers. It's only fair.
It's also an interesting double standard, wherein if I were to steal OpenAI's models, no AI worshippers would have any issue condemning my action, but when a large company clearly violates the license terms of free software, you give them a pass.
That is not nearly the extent of AI training data (e.g. OpenAI training its image models on Studio Ghibli art). But if by "gave their work away for free" you mean "allowed others to make [proprietary] derivative works", then that is in many cases simply not true (e.g. GPL software, or artists who publish work protected by copyright).
If you distribute child porn, that is a crime. But if you crawl every image on the web and then train a model that can then synthesize child porn, the current legal model apparently has no concept of this and it is treated completely differently.
Generally, I am more interested in how this affects copyright. These AI companies just have free rein to convert copyrighted works into the public domain through the proxy of over-trained AI models. If you release something as GPL, they can strip the license, but the same is not true of closed-source code, which isn't trained on.
I mean, this is an ideological point. It's not based in reason, won't be changed by reason, and is really only a signal to end the engagement with the other party. There's no way to address the point other than agreeing with them, which doesn't make for much of a debate.
> an 1800s plantation owner saying "can you imagine trying to explain to someone 100 years from now we tried to stop slavery because of civil rights"
I understand this is just an analogy, but for others: people who genuinely compare AI training data to slavery will have their opinions discarded immediately.
> We have evidence of LLMs reproducing code from github that was never ever released with a license that would permit their use. We know this is illegal.
What is illegal about it? You are allowed to read and learn from publicly available unlicensed code. If you use that learning to produce a copy of those works, that is infringement.
Meta clearly engaged in copyright infringement when they torrented books they hadn't purchased. That was infringement before they ever started training on the data. That doesn't make the training itself infringement, though.
What kind of bullshit argument is this? Really? Works created using illegally obtained copyrighted material are themselves considered infringing. It's called derivative infringement. This is both common sense and law. And even if it weren't, you agree that they infringed on the copyright of something close to all copyrighted works on the internet, and this sounds fine to you? The consequences and fines from that would kill any company if they actually had to face them.
That isn't true.
The copyright to a derivative work is owned by the copyright holder of the original work. However, using illegally obtained copies to create a fair-use transformative work does not taint your copyright in that work.
> Even if not, you agree that they infringed on copyright of something close to all copyrighted works on the internet and this sounds fine to you?
I agree that they violated copyright when they torrented books and scholarly articles. I don't think that counts as "close to all copyrighted works on the Internet".
> The consequences and fines from that would kill any company if they actually had to face them.
I don't actually agree that copyright infringement that causes no harm should be met with such steep penalties. I didn't agree when it was the RIAA doing it, and even though I don't like Facebook, I don't like it here either.
>It's a CRYSTAL CLEAR violation of the law
in the court of reddit's public opinion, perhaps.
there is, as far as I can tell, no definitive ruling on whether training is a copyright violation.
and even if there was, US law is not global law. China, notably, doesn't give a flying fuck. kill American AI companies and you will hand the market over to China. that is why "everyone just shrugs it off".
The idea that they are coming up with all this stuff from scratch is public-relations BS. Like Arnold Schwarzenegger claiming he never took steroids: only believable if you know nothing about bodybuilding.
If a person "trains" on other creatives' works, they can produce output at the rate of one person. That puts a natural ceiling on the potential impact, both in the number of competing works and in the number of creatives whose work is affected (since one person can't "train" on the output of all creatives).
That's not the case with AI models. They can be infinitely replicated AND train on the output of all creatives. A comparable situation isn't one human learning from another human; it's millions of humans learning from every human. Only those humans don't even have to get paid; all their payment is funneled upwards.
It's not one artist vs. another artist, it's one artist against an army of infinitely replicable artists.
What is the basis that an LLM should be included as a "creative type"?
LLMs seem to match that description.
To go into details, though: under copyright law there's a "fair use" exception with a "transformative" criterion. This is what allows things like satire and reaction videos to exist. So long as you don't replicate the original 1-to-1 in product and purpose, IMO it qualifies as tasteful use.
I have no interest in the rest of this argument, but I take a bit of issue with this particular point. I don't think the law is fully settled on this in any jurisdiction, and certainly not in the United States.
"Reason" is a more nebulous term; I don't think that training data is inherently "theft", any more than inspiration would be even before generative AI. There's probably not an animator alive that wasn't at least partially inspired by the works of Disney, but I don't think that implies that somehow all animations are "stolen" from Disney just because of that fact.
Where you draw the line on this is obviously subjective, and I've gone back and forth, but I find it really annoying that everyone is acting like this is so clear-cut. Evil corporations like Disney have been trying to use this logic for decades to abuse copyright and outlaw being inspired by anything.
> I don't think that training data is inherently "theft", any more than inspiration would be even before generative AI. There's probably not an animator alive that wasn't at least partially inspired by the works of Disney ...
Sure, but you can reason about it, such as by using analogies.
You can't be serious.
A problem in and of itself.
I'm very glad AI is here and is slowly but surely destroying this terrible idea.