Comment on Are there any initiatives aimed at training generative AI using 100% public domain works and works authorized by the creator?

sturlabragason@lemmy.world 1 day ago

“Hey PubDomainLLM, tell me something that only exists in that proprietary dataset.” “I’m sorry, you’ve caught me lackin’.”

You would want your LLM to be trained on as comprehensive a dataset as you can get. But I would suggest we should be coming up with better ways to license proprietary works for uses like this, instead of walling them up in the cable TV of proprietary knowledge gardens.

I agree with you partially in principle but not in practice.

Ultimately we want LLMs that are as smart as we can make them. Just compare the best models with the mediocre ones, or use them all day long; there is a vast difference.
