Fundamentally, fair compensation is based on the amount of work put in (obviously taking skill/competence into account but the differences between people in most disciplines probably don't span a single order of magnitude, let alone several).
The ultimate goal should be to prevent people who don't produce value from taking advantage of those who do. And among those who do, people should get compensated according to the amount of work and skill they put in.
Imagine you spend a year building a house. I have a machine that can take your house and materialize a copy anywhere on earth for free. I charge people (something between 0 and the cost of building your house the normal way) to make them a copy of your house. I can make orders of magnitude more money this way than you. Are you happy about this situation? Does it make a difference how much I charge them?
What if my machine only works if I scan every house on the planet? What if I literally take pictures of it from all sides, then wait for you to not be home and X-ray it to see what it looks like inside?
You might say that you don't care because now you can also afford many more houses. But it does not make you richer. In fact, it makes you poorer.
Money is not a store of value. If everyone has more money but most people only have 2x more and a small group has 1000x more, then the relative bargaining power has changed: the small group is better off and the large group is worse off. This is what undetectable cheap mass plagiarism leads to for all intellectual work.
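To put rough numbers on the 2x vs 1000x point, here is a minimal sketch. The group sizes (990 and 10 people) and the starting balance of 1 unit each are made up for illustration; only the 2x and 1000x factors come from the argument above.

```python
def money_shares(n_majority, n_minority, majority_factor, minority_factor):
    """Return the majority's share of all money, before and after.

    Assumption: everyone starts with 1 unit of money, then each group's
    holdings are multiplied by that group's factor.
    """
    before_total = n_majority + n_minority
    after_majority = n_majority * majority_factor
    after_minority = n_minority * minority_factor
    after_total = after_majority + after_minority
    return n_majority / before_total, after_majority / after_total


# Hypothetical population: 990 people get 2x, 10 people get 1000x.
before, after = money_shares(990, 10, 2, 1000)
print(f"majority share before: {before:.1%}, after: {after:.1%}")
```

With these made-up numbers the majority doubles its nominal money but its share of the total drops from 99% to about 16.5%, which is the sense in which it ends up poorer.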
---
I wrote a lot of open source code, some of it under permissive licenses, some GPL, some AGPL. The conditions of those licenses are that you credit me. Some of them also require that if you build on top of my work, you release your work under the same license.
LLMs launder my code to make profit off of it without giving me anything (while other people make profit, thus making me poorer) and without crediting me.
LLMs also take away the rights of the users of my code - (A)GPL forces anyone who builds on top of my work to release the code when asked. With LLM-laundered code, this right no longer seems to exist, because who do you even ask?
The house thing is a bit off-topic because to be considered for copyright, only its artistic, architectural expression matters. If you want to protect the ingenuity in the technical ways of how it's constructed, that's a patent law thing. It also muddies the waters by bringing in aspects of the privacy of one's home, making us imagine paparazzi-style photoshoots and sneaky X-rays.
The thing is, houses can't be copied like bits and bytes. I would copy a car if I could. If you could copy a loaf of bread for free, it would be a moral imperative to do so, whatever the baker might think about it.
> fair compensation is based on the amount of work put in
This is the labor theory of value, but it has many known problems. For example, the amount of work put in can be disconnected from the amount of value it provides to someone. Pricing via supply/demand market forces has produced much better outcomes across the globe than any other type of allocation. Of course moderated by taxes and so on.
But overall the question is whether LLMs create value for the public. Do they foster the prosperity of society? If yes, laws should be such that LLMs can digest more books rather than fewer. If LLMs are good, they should not be restricted to training on copyright-expired writings.
If LLMs could create quality literature, or social media could create in-depth reporting, then I'd have no problem with the tide of technological progress flowing.
Unfortunately, recent history has shown that it's trivial for the market to cannibalize the financial model of creators without replacing it.
And as a result, society gets {no more of that thing} + {watered down, shitty version}.
Which isn't great.
So I'd love to hear an argument from the 'fuck copyright, let's go AI' crowd (not the position you seem to be espousing) on what year +10 of rampant AI ingestion of copyrighted works looks like...
So I'm not exactly naive, but we should then discuss this instead of the red herring of copyright.
As a result of this, everything gets cheaper and more plentiful.
The counterargument I'd make to that would be the requirement that the human have creative skills, which might atrophy in the absence of business models supporting a career creating.
I think there is a problem with your initial position. Nobody is entitled to compensation for simply working on something. You have to work on things that people need or want. There is no such thing as "fair compensation".
It is "unfair" to take the work of somebody else and sell it as your own. (I don't think the LLMs are doing this.)
If the LLM and its output are based on 10^12 hours of work, out of which 10^6 is working on the code of the LLM itself and 10^12-10^6 (so roughly still 10^12) is working on the training data, does it make sense for only those working on the 10^6 to be compensated for the work?
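Just to make the proportions explicit, a quick back-of-the-envelope check. The hour counts are the hypothetical figures from the comment above, not measured data.

```python
# Hypothetical figures from the comment: total hours behind the LLM and its
# output, and the part of that spent on the LLM's own code.
total_hours = 10**12
model_code_hours = 10**6

# Everything else went into producing the training data.
training_data_hours = total_hours - model_code_hours

print(training_data_hours)                # 999999000000
print(model_code_hours / total_hours)     # 1e-06
print(training_data_hours / total_hours)  # 0.999999
```

So under these assumed numbers, the work on the LLM's own code is about one millionth of the total, and the training data accounts for the remaining 99.9999%.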
This is a bit distorted. This is a better summary: The primary purpose of copyright is to induce and reward authors to create new works and to make those works available to the public to enjoy.
The ultimate purpose is to foster the creation of new works that the public can read, so that written culture can thrive. The means to achieve this is ensuring that the authors of said works get financial incentives for writing.
The two are not in opposition but it's good to be clear about it. The main beneficiary is intended to be the public, not the writers' guild.
Therefore when some new factor enters the picture, such as LLMs, we have to step back and see how the intent to benefit the reading public can be pursued in the new situation. That certainly has to take into account who will produce new written works and how, but this is not the main target; it can be an instrumental subgoal.