
The author of this tool is even aware of that argument and just dismisses it with no real justification: https://twitter.com/atroyn/status/1622360994579357696


> i spoke to law profs about this - the analogy which kept coming up is the vcr. initially basically a piracy machine, it brought to life an enormous content market. had it been banned, creators would have been worse off in the long run.

It’s called Sony v. Universal (the Betamax case), and the doctrine that came out of it is a test for whether a tool is "capable of commercially significant non-infringing uses." Inpainting to remove power lines, latent-space psychedelic visuals, and painterly photo-booth style transfer all qualify as such uses.

Imagine if Stable Diffusion were made illegal. Suppose someone accuses me of using this now-illegal tool for one of those non-infringing uses, i.e. to produce an image that, as far as a court is concerned, doesn’t resemble anyone else’s copyrighted work. I put the image on my website. If the image itself isn’t infringing at all, what is the evidence that Stable Diffusion was used? Should police be issued a warrant to search my private property for proof that I used Stable Diffusion, without a shred of evidence, or on the say-so of a detection tool that will always produce both false positives and false negatives?


I do want to clarify that I think Stable Diffusion and tools like it can engage in illegal copying. For example, it will happily produce infringing images of logos and even some seemingly random other images (https://arxiv.org/pdf/2212.03860.pdf). It seems to devote an uneven share of its weights to different training images, but I remain unconvinced that copying is all it can do, or at least any more than it is all a human artist can do.


This is also what happens when you overtrain a model. Recent developments allow small, partial sets of additional weights, called LoRAs, to be attached to the diffusion model and fine-tuned independently in under half an hour. If you set the learning rate too high, the model will start reproducing the source material with extremely high fidelity. That's what overfitting does.
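Roughly, a LoRA adds a small low-rank update on top of a frozen weight matrix, and only that update gets trained; that's why fine-tuning is fast, and why a too-high learning rate on a handful of images can push those few weights toward memorization. A minimal sketch of the idea in PyTorch (the layer name, rank, and alpha here are illustrative, not taken from any particular library):

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Frozen base weight plus a trainable low-rank update (alpha/r) * B @ A."""
        def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False          # original model weights stay frozen
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))
            self.scale = alpha / r

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    # Only A and B are trained. With a learning rate set too high (say 1e-3
    # instead of 1e-4) and a small training set, this tiny update can overfit
    # and reproduce the source images with very high fidelity.
    layer = LoRALinear(nn.Linear(768, 768))
    opt = torch.optim.AdamW([layer.A, layer.B], lr=1e-4)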

My conclusion is that there is an argument to be made for infringement in some cases, but it's a matter of degree rather than absolutes. If infringement is defined as "copyrighted works were used in this dataset", then at a certain point (a low enough learning rate) it becomes impossible to tell whether infringing data was used at all. You'd be looking at weight changes so minuscule they could be rounding errors, yet by that definition they would still be infringing.

And since arbitrary data can be trained against arbitrary sets of keywords, the standard for what constitutes "infringing" changes with each model. It would probably be hard to build a benchmark that can definitively state "this model violates copyright." Any number of keywords can be used during training to obfuscate the prompt needed to reproduce the data, assuming the learning rate was even high enough for the data to be reproduced closely in the first place.

I'm unsure there can ever be one standard for when a set of floating-point numbers crosses the threshold into infringement. That's applying an absolute standard to a fuzzy algorithm. It's like JPEG compression: at some point on the quality scale a picture of Mickey Mouse becomes unintelligible. With JPEGs, an unintelligible picture of Mickey Mouse isn't useful for anything. But a LoRA whose weights are underfit just enough that the diffusion model gives novel outputs can be extremely useful.
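To make the analogy concrete, here's a small sketch using Pillow (the input file name is hypothetical) that sweeps JPEG quality downward; somewhere along that sweep the output stops being a recognizable copy of the original, but there's no single agreed-upon cutoff:

    from PIL import Image

    # Illustrative: re-encode the same image at progressively lower quality.
    # At some quality level it's no longer recognizably Mickey Mouse, but
    # there is no obvious bright line where that happens.
    img = Image.open("mickey.png").convert("RGB")  # hypothetical input file
    for quality in (90, 50, 20, 5, 1):
        img.save(f"mickey_q{quality}.jpg", format="JPEG", quality=quality)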


> historically, creatives have been among the first to embrace new technologies. ever since the renaissance, artists have picked up every new tool as it's become available, and used it to make great things.

> these people aren't 'luddites'

This is just total bullshit. I know plenty of artists who are embracing this technology to make all sorts of things that tools like SD were not designed to do, like psychedelic music videos, etc.

What the author means is that a few loud blue check marks on Twitter who claim to be artists have been tweeting, get ready for it, inflammatory claims.



