Comment on Are there any initiatives aimed at training generative AI using 100% public domain works and works authorized by the creator?

General_Effort@lemmy.world 1 day ago

For images, yes. Most notable is probably Adobe. Their AI, which powers Photoshop’s generative fill among other things, is trained on public domain and licensed works.

For text, there’s nothing similar. LLMs get better the more data you have. So, the less training data you use, the less useful they are. I think there are one or a few small models for research purposes, but it really doesn’t get you there.

Of course, there aren’t any such open source projects. When you take these extreme, maximalist views of (intellectual) property, then giving stuff away for free isn’t the obvious first step.
