Don't worry, local models past a certain size will be declared illegal eventually on the grounds of safety. You can already see the major players advocating for legislation that is a step down that road.
And this wouldn't even be very difficult to enforce. Running SOTA models at useful speeds (which you'd need to actually compete; a local setup like the one you describe, where time to first token is measured in minutes for a decent-sized prompt, is not going to cut it) requires a lot of compute. Which is to say, hardware and the energy to power it, both of which can be tracked.