You should check out this article by Kit Walsh, a senior staff attorney at the EFF. The EFF is a digital rights group that recently won a historic case: border guards now need a warrant to search your phone.
More importantly, Nightshade is anti-open source. Since the only models with open VAEs are Stable Diffusion’s open models, companies like Midjourney and OpenAI, whose closed source models you can’t poke around in, can’t be attacked with this tool. That’s not really something that should be celebrated.
Nightshade is also made by Ben Zhao, the University of Chicago professor who stole open source code for his last data poisoning scheme. He took code licensed under GPLv3, a copyleft license that requires you to share your source code and license your project under the same terms as the code you used. You also can’t distribute your project as binary-only or proprietary software. When pressed, his team only released the code for their front end, remaining in violation of the terms of the GPLv3 license.
elliot_crane@lemmy.world 9 months ago
The tagline is really poorly written IMO. From reading the README, this doesn’t outwardly appear to be a tool for bypassing an artist’s choice to use something like Nightshade, but rather it seems to detect if such a tool has been used.
I’m assuming that the use case would be to avoid training on Nightshade-ed images, which would actually be respecting the original artist’s decision?
tyler@programming.dev 9 months ago
I read the whole thing. I understand it’s for detecting use of Nightshade, not bypassing it. What other even slightly ethical use is there for this besides trying to make sure you don’t train on a poisoned image? The people training these models are clearly not asking for permission first, or you’d never need to do this, so they’re just taking an image, assuming they’re allowed to use it, and then using this tool to detect whether it’s going to poison their model.
elliot_crane@lemmy.world 9 months ago
I don’t think most people are collecting images by hand and saying “ah yes I’m just gonna yoink this and use it in my model”. There are a plethora of sites for sharing repositories of training data, and therefore it’s pretty easy for someone training a model to unknowingly pull down some data they don’t actually have permission to use. It’s completely infeasible to check licensing by hand on what could be millions of images, so this tool makes it easy to simply not train on images that have gone through Nightshade. I fail to see how that’s unethical, as not training on the image is the whole reason the original image was put through Nightshade in the first place.
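The workflow being described, scanning a scraped dataset and dropping anything the detector flags before training, could be sketched roughly like this. Note that `looks_poisoned` is a hypothetical stand-in for whatever check the actual tool provides, not a real API from this project:

```python
from pathlib import Path

def looks_poisoned(image_path: Path) -> bool:
    # HYPOTHETICAL placeholder: a real detector would analyze
    # pixel-level perturbation statistics in the image itself.
    # Here we just flag files by a naming convention so the
    # example stays self-contained and runnable.
    return "nightshade" in image_path.name.lower()

def filter_training_set(image_paths):
    """Keep only images the detector does not flag, so poisoned
    images are excluded from training (respecting the artist's
    decision to opt out)."""
    return [p for p in image_paths if not looks_poisoned(p)]

paths = [Path("cat.png"), Path("dog_nightshade.png"), Path("bird.png")]
print([p.name for p in filter_training_set(paths)])
```

The point is that the filter runs over the whole dataset automatically, which is the only practical option when the set contains millions of images.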
tyler@programming.dev 9 months ago
Then it shouldn’t be done. That’s the unethical part. Continuing to scrape large data sets full of images you shouldn’t be using, and just filtering around the problem, is the entire problem. Either get permission for each image or don’t build your image model. Doing otherwise is unethical.