Comment on Bill proposed to outlaw downloading Chinese AI models.
p03locke@lemmy.dbzer0.com 6 days ago
Nobody releases training data. It’s too large and varied. The best I’ve seen was the LAION-2B set that Stable Diffusion used, and that’s still just a big collection of links. Even that isn’t going to fit on a GitHub repo.
Besides, improving the model means using the model as a base and training it further on new data. Specialize, specialize, specialize.
What about these? Dozens of TB here:
There is also a LAION-5B now, and several other datasets.
Wow, it’s like you didn’t even read my post.
thingsiplay@beehaw.org 6 days ago
That’s why it’s not Open Source. They do not release the source, and it’s impossible to build the model from source.