The whole thing can be summed up like this: they’re selling you a hammer and telling you to use it on screws. When you hammer a screw in, it trashes the wood really badly. Then they call the trashed wood “hallucination” and promise you better hammers that won’t do this. Except a hammer is not a tool for driving screws, dammit, you should be using a screwdriver.
An AI leaderboard suggests the newest reasoning models used in chatbots are producing less accurate results because of higher hallucination rates.
So he’s suggesting that the models are producing less accurate results… because they have higher rates of less accurate results? This is a tautological pseudo-explanation.
AI chatbots from tech companies such as OpenAI and Google have been getting so-called reasoning upgrades over the past months
When are people going to accept the fact that large “language” models are not general intelligence?
ideally to make them better at giving us answers we can trust
Those models are useful, but only a fool trusts, i.e. is gullible towards, their output.
OpenAI says the reasoning process isn’t to blame.
Just like my dog isn’t to blame for the holes in my garden. Because I don’t have a dog.
This is sounding more and more like model collapse - models perform worse when trained on the output of other models.
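If you want a feel for why that feedback loop is bad, here’s a toy sketch, nothing like actual LLM training: fit a distribution to some data, sample from the fit, fit the next “model” only on those samples, and repeat. The spread of the fitted distribution collapses over generations.

```python
# Toy illustration of model collapse (not real LLM training): each "generation"
# is fit only on samples produced by the previous one, and the fitted spread
# shrinks over time, i.e. diversity dies out.
import numpy as np

rng = np.random.default_rng(42)
mu, sigma = 0.0, 1.0   # generation 0: the "real" data distribution
n = 20                 # tiny training set per generation makes the effect obvious

for gen in range(1, 31):
    samples = rng.normal(mu, sigma, n)   # "train" only on the previous model's output
    mu, sigma = samples.mean(), samples.std()
    if gen % 5 == 0:
        print(f"generation {gen:2d}: mean={mu:+.3f}, std={sigma:.3f}")
```

Real model collapse is messier than a Gaussian toy, but the mechanism is the same: each generation only ever sees the previous generation’s output, so the tails and the variety get lost.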
inb4 sealions asking what’s my definition of reasoning in 3…2…1…
PonyOfWar@pawb.social 2 days ago
Wonder if we’re already starting to see the impact of AI being trained on AI-generated content.
SippyCup@feddit.nl 2 days ago
Absolutely.
AI-generated content was always going to leak into the training data unless they literally stopped training as soon as it started being used to generate content, around 2022.
And once it’s in, it’s like cancer. There’s no getting it out without completely wiping the training data and starting over. And it’s a feedback loop. It will only get worse with time.
The models could have been great, but they rushed the release and made them available too early.
If 60% of the posts on Reddit are from bots, which may be a number I made up but I feel like I read it somewhere, then we can safely assume that roughly half the data these models are being trained on is now AI-generated.
Rejoice, friends, soon the slop will render them useless.
Ulrich@feddit.org 2 days ago
Not before they render the remainder of the internet useless.
vintageballs@feddit.org 1 day ago
In the case of reasoning models, definitely. Reasoning datasets weren’t even a thing a year ago, and from what we know about how the larger models are trained, most task-specific training data is artificial (oftentimes a small amount is human-generated and then synthetically augmented).
However, I think it’s safe to assume that this has been the case for regular chat models as well; the Self-Instruct and Orca papers are quite old already.
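Roughly what “synthetically augmented” means in practice, as a hypothetical sketch (the helper names here are made up, not any specific lab’s pipeline): take a handful of human-written seed examples, prompt a model to write more in the same format, filter out junk and duplicates, and feed the result back into the training set.

```python
# Rough sketch of Self-Instruct-style data augmentation. Helper names are
# hypothetical; a real pipeline would call an actual LLM inside
# generate_with_model() and use much stronger filtering.
import random

def generate_with_model(prompt: str) -> str:
    """Stub standing in for an LLM call that writes a new training example."""
    return f"Q: (model-written task #{random.randint(0, 10**6)}) A: ..."

def looks_valid(example: str, existing: list[str]) -> bool:
    """Cheap filter: non-empty and not an exact duplicate of what we already have."""
    return bool(example.strip()) and example not in existing

# A small amount of human-written data...
seed_tasks = [
    "Q: Summarise this paragraph ... A: ...",
    "Q: Translate this sentence to French ... A: ...",
]

# ...synthetically augmented into a much larger training set.
dataset = list(seed_tasks)
while len(dataset) < 100:
    examples = random.sample(dataset, k=min(2, len(dataset)))
    prompt = "\n".join(examples) + "\nWrite one more task in the same format."
    candidate = generate_with_model(prompt)
    if looks_valid(candidate, dataset):
        dataset.append(candidate)   # model output becomes training data

print(f"{len(seed_tasks)} human seeds -> {len(dataset)} total examples")
```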