I’ll start believing in AI when, and if, it’s able to eliminate error. When will AI be able to work out whether the training material it used is true, false, myth, or other narrative?
AI agents wrong ~70% of time: Carnegie Mellon study
Submitted 1 day ago by MirchiLover@beehaw.org to technology@beehaw.org
https://www.theregister.com/2025/06/29/ai_agents_fail_a_lot/?td=rt-4a
Comments
JackEddyfier@beehaw.org 1 day ago
thebardingreen@lemmy.starlightkel.xyz 1 day ago
We tried to build systems that perform a kind of basic, rudimentary, extremely power intensive and inefficient mimicry of how (we think maybe) brain cells work.
Then that system lies to us, makes epic bumbling mistakes, expresses itself with extreme overconfidence, and constantly creatively misinterprets simple instructions.
Hmmm… Actually, maybe we’re doing a pretty good job…
Kirk@startrek.website 1 day ago
This bit at the end, wow:
Gartner still expects that by 2028 about 15 percent of daily work decisions will be made autonomously by AI agents, up from 0 percent last year.
Agentic AI is wrong 70% of the time, but even assuming a human employee is barely correct most of the time and wrong 49% of the time, is it really still more efficient to replace them?
Cruxifux@feddit.nl 1 day ago
Honestly this whole argument is insane to me and indicative of the clown world we live in. If AI can do human jobs, even if it’s a little shittier, we should HAVE THAT and then have HUMANS WORK LESS, but this thing that should be making our lives awesome is absolutely going to be used to make them worse.
MountingSuspicion@reddthat.com 1 day ago
If you think we should offload to AI even if it’s worse, I have serious questions about your day to day life. What industry do you think could stand to be worse? Doctor’s offices? Lawyers? Mechanics? Accountants?
End users (aka the PEOPLE NEEDING A SERVICE) are the ones getting screwed over when companies offload to AI. You tell AI to schedule an appointment tomorrow, and 80% of the time it does and 20% of the time it just never does or puts it down for next week. That hurts both the office trying to maximize the people seen/helped and the person that needs the help. Working fewer hours due to tech advancement is awesome, but in reality offloading to AI in the current work climate is not going to result in working fewer hours. Additionally, how costly is each task the AI is doing? Are the machines running off of renewables, or is using this going to contribute to worse air quality and worse climate outcomes for the people you’re trying to save from working more? People shouldn’t have to work their lives away, but we have other problems that need to be solved before prematurely switching to AI.
Kirk@startrek.website 1 day ago
Right? It actually makes me feel insane that the topic of “humans working less” is never in the selling points of these products.
Honestly I suspect that rather than some nefarious capitalist plot to enslave humanity, it is just more evidence that the software can’t actually do what the people selling it to big corporations claim it can do.
BlameThePeacock@lemmy.ca 1 day ago
I really hate this headline.
They aren’t wrong 70% of the time; the study found that they only successfully complete multi-step business tasks 30-35% of the time.
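To put rough numbers on that distinction, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that a task is a chain of independent steps that all have to succeed; nothing in the study as quoted here says tasks work that way, and the step count of 10 and the function names are made up for the example.

```python
def task_success_rate(per_step_accuracy: float, steps: int) -> float:
    """Chance of finishing a whole task if every step must succeed.

    Assumes steps are independent, which is a simplification.
    """
    return per_step_accuracy ** steps


def implied_per_step_accuracy(task_success: float, steps: int) -> float:
    """Per-step accuracy that would produce a given whole-task success rate
    under the same independence assumption."""
    return task_success ** (1.0 / steps)


if __name__ == "__main__":
    # Hypothetical: an agent that gets each of 10 steps right 90% of the time.
    print(round(task_success_rate(0.90, 10), 3))  # 0.349

    # Going the other way from the 30-35% task completion range.
    for rate in (0.30, 0.35):
        print(rate, round(implied_per_step_accuracy(rate, 10), 3))
```

Under those assumptions, an agent that is right 90% of the time on each individual step would still only finish a 10-step task about 35% of the time, which is a very different claim from "wrong 70% of the time."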