Comment by lcnPylGDnU4H9OF

> broadly capable open models are on track for annihilation

I'm not so sure about this one. Suppose courts do find that a model which can produce infringing material is itself infringing material. Because new models can be distilled from older ones, the older model can then effectively produce the new, infringing model. That should mean all output from the older model is infringing, since any of it can be used to make infringing material (the new model, distilled from the old).

I don't think it's really tenable for courts to treat any one model as copyright-infringing material in itself without treating every generative model that way and thus killing the GPT/diffusion generation business (that could happen, but it seems very unlikely). They will probably stick to scrutinizing what people generate with the models and/or how they distribute what they generate.

ijk 1 day ago

In theory, couldn't you distill a non-infringing model from an infringing one? Just prompt it for continuations and give it a whack every time the output matches something in your dataset of copyrighted works.

You'd need the copyrighted works to compare to, of course, though if you have the permissible training data (as Anthropic apparently does) it should be doable.
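
Roughly what I have in mind, as a minimal sketch. It swaps the "whack" for plain rejection filtering: completions that share any word 5-gram with the protected works are dropped, and the rest become training pairs for the new model. Everything here is hypothetical (`teacher` is just any function from prompt to text, and the names and the 5-gram cutoff are made up), and verbatim overlap is only a crude stand-in for whatever a court would actually care about.

```python
from typing import Callable, Iterable, List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(text: str, n: int) -> Set[NGram]:
    """Return the set of lowercase word n-grams in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_index(works: Iterable[str], n: int) -> Set[NGram]:
    """Collect every n-gram that appears in the protected works."""
    index: Set[NGram] = set()
    for work in works:
        index |= ngrams(work, n)
    return index


def overlap(output: str, index: Set[NGram], n: int) -> float:
    """Fraction of the output's n-grams that also occur in the index."""
    grams = ngrams(output, n)
    if not grams:
        return 0.0
    return len(grams & index) / len(grams)


def collect_distillation_data(
    teacher: Callable[[str], str],   # hypothetical: old model as prompt -> text
    prompts: Iterable[str],
    protected_works: Iterable[str],
    n: int = 5,                      # assumed n-gram size, not a known standard
    max_overlap: float = 0.0,        # assumed: reject on any shared n-gram
) -> List[Tuple[str, str]]:
    """Keep only (prompt, completion) pairs that stay under the overlap cutoff."""
    index = build_index(protected_works, n)
    kept: List[Tuple[str, str]] = []
    for prompt in prompts:
        completion = teacher(prompt)
        if overlap(completion, index, n) <= max_overlap:
            kept.append((prompt, completion))
    return kept
```

You'd then fine-tune the new model on the kept pairs. An actual penalty-based version would feed the overlap score into the training loss instead of just discarding samples.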
