How does this model collapse thing still get spread around? It’s not true. Synthetic data has actually helped bots get smarter, not dumber. And if you think that all Gemini 3 does is recycle, idk what to tell you.
Devial@discuss.online 17 hours ago
If the model collapse theory weren’t true, then why do LLMs need to scrape so much data from the internet for training?
According to you, they should be able to just generate synthetic training data purely with the previous model, and then use that to train the next generation.
So why is there even a need for human input at all, then? Why are all LLM companies fighting tooth and nail against their data scraping being restricted, if real human data is in fact so unnecessary for model training?
You can stop models from deteriorating without new data, and you can even train them with synthetic data, but that still requires the synthetic data to be either modelled or filtered by humans to ensure its quality. If you just take a million random ChatGPT outputs, with no human filtering whatsoever, use them to retrain ChatGPT, and then repeat that over and over again, eventually the model will turn to shit. Each iteration, some of the random tweaks ChatGPT makes to its output are going to produce a bad output, which is now presented to the new model as a target to achieve, so the model learns this bad output is less bad than it previously thought.
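To make the retraining loop above concrete, here is a toy sketch in Python. It is not an LLM and proves nothing about ChatGPT or Gemini. The "model" is just a Gaussian that gets refit to samples drawn from the previous generation's fit. The function name `simulate` and the parameters `generations`, `n`, `real_fraction` and `seed` are all made up for this illustration, and mixing in fresh samples from the original distribution is used as a stand-in for human-curated data, which is an assumption on my part.

```python
import random
import statistics


def simulate(generations=500, n=20, real_fraction=0.0, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    real_fraction is the (hypothetical) share of each training set that
    still comes from the original N(0, 1) "real" distribution.
    Returns the fitted standard deviation after the last generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0          # generation 0 matches the real data
    n_real = int(n * real_fraction)
    for _ in range(generations):
        # Outputs of the current model, used unfiltered as training data.
        synthetic = [rng.gauss(mu, sigma) for _ in range(n - n_real)]
        # Fresh samples from the true distribution (empty if n_real == 0).
        real = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        data = synthetic + real
        # "Train" the next generation: maximum-likelihood Gaussian fit.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
    return sigma


collapsed = simulate(real_fraction=0.0)   # only its own outputs
anchored = simulate(real_fraction=0.5)    # half real data every generation
print(f"std with no real data:   {collapsed:.3g}")
print(f"std with half real data: {anchored:.3g}")
```

In the pure self-training run, the small estimation errors of each fit compound and the fitted spread shrinks toward zero, so the model forgets the tails of the original distribution. In the run that keeps seeing real samples, the spread stays on the order of the original. Whether real LLM pipelines behave like this depends on how the synthetic data is generated and filtered, which the toy does not capture.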
CatsPajamas@lemmy.dbzer0.com 12 hours ago
I stopped reading when you said according to me and then produced a wall of text of shit I never said.
Synthetic data is massively helpful. You can look it up. This is a myth.
Devial@discuss.online 10 hours ago
That is enormously ironic, since I literally never claimed you said anything except for what you did say: namely, that synthetic data is enough to train models.
Everything else in my comment is quite explicitly my own thoughts on the matter, and why I disagree with that statement, so in actual fact, you’re the one making up shit I never said.