Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027
Submitted 1 month ago by inclementimmigrant@lemmy.world to games@lemmy.world
Comments
markovs_gun@lemmy.world 1 month ago
A lot of hate in the comments but IMO this is one of the few things that LLMs are actually really good for. It’s a shit job nobody wants to do that LLMs are really good at. Notice that they said 70% and not 100%. Yeah that means they’re probably going to have 30 people doing the work that 100 people used to do but people are still in the picture overseeing things. Automation isn’t, by itself, bad. The bad part is that our whole society is built on the idea that your entire value as a person is based on being able to work and make money and job loss is way worse than it should be.
JeeBaiChow@lemmy.world 1 month ago
Lol. Good luck!
Tronn4@lemmy.world 1 month ago
insert plane crashing.gif
Mikina@programming.dev 1 month ago
Large companies probably do that anyway.
Take Blizzard for example. They just released a new patch, where class campaign quests for 8/12 classes do not work. Sure, it’s a remixed version of older expansion, and with all the phasing stuff I can kind of imagine some of the phasing issues being caused by, I don’t know, the player having a weird combination of completed stuff that’s hard to properly catch in testing, since there’s quite a lot of variables.
But the fact that one of the class quests requires crafted items to be completed, while crafting isn’t available by design in the Remix, there’s just no excuse. They either just don’t give a fuck about an issue that’s literally a progression blocker with 100% repro rate, or no one ever tested it even once.
As someone who worked in QA and gamedev, I can’t imagine how something as obvious as this could ever get approved for release. That’s something you catch immediately. Hell, you don’t even have to play through it to realize that this might be a problem.
Rooster326@programming.dev 1 month ago
I work at a larger company. Customer service in the wild is so bad that we just use our customers as the QA. As they say:
All businesses have a test environment. Some are lucky enough to have a separate production environment.
turdcollector69@lemmy.world 1 month ago
70% by what metric?
Is that going by bugs identified, fixes implemented, headcount?
razzazzika@lemmy.zip 1 month ago
So… I’m a big supporter of Squeenix, buy everything they make… but this tells me the quality of their games is going to go down the toilet. Knowing AI, it’ll come up with fake lists of bugs that didn’t happen, all the real bugs will not be listed, and they’ll release the buggiest shit. One thing I LIKE about Square, being one of the few companies I still do pre-orders from, is that their products are fairly bug-free on launch. FFXVI had some graphics optimization issues, but I’ve been happy with most of what I got the past few years.
GladiusB@lemmy.world 1 month ago
FFXIV saved them and they don’t ever put any money back into the game. It’s their cash cow that pays for all their other bad ideas.
Great graphics and legacy. But some crappy ideas about what the players want.
Look at their payment systems for subs. It’s so confusing for no damn reason.
razzazzika@lemmy.zip 1 month ago
Believe it or not FFXI was even MORE confusing to sub
Katana314@lemmy.world 1 month ago
I’m cautious but a little curious about this one, because QA could actually be a very good target for AIs to work with.
- It might not kill jobs. Right now, engineers finish a task and the limited number of QA engineers can’t possibly test it enough before release. That game-breaking bug you found in a game? I’m sure some QA had it in their plan to test every level for those bugs, and yet they just didn’t have enough time - and the studio couldn’t justify hiring 20 more QA squads. Even if they do scale up AI testing, they’ll need knowledgeable QA workers to guide them.
- This is often extremely rote, repetitive work. It’s exactly the type of work The Oatmeal said is great for AIs. One person is tuning the balance on the Ether Drive attack, and gives it an extra 40% blarf damage. He tries it, sees it works fine, and eagerly skips the part of the test plan that verifies all cutscenes still work so he can push it in. An AI will try it out, and find: Actually, since an NPC uses an Ether Drive in a late-game cutscene, this breaks the whole game!
- Even going past existing plans, QA can likely find MORE work for AIs to do that they normally wouldn’t bother with. Think about the current complexity of game dev that leads to the current trope of releasing games half-finished to eventually get patched. It won’t help patch games, but it’ll at least help give devs an up-to-date list of issues.
That said, those talking about human creativity and player expectations are still correct. An AI can report a problem, and a human can respond with “No, that looks fine. Override that report.” It will also be good to do occasional manual tests, and lament “How did the AI think this was okay??”
Ilixtze@lemmy.ml 1 month ago
more shit
MystValkyrie@lemmy.blahaj.zone 1 month ago
Moment for silence for David “Ribs” Carillo 🪦
DeadDigger@lemmy.zip 1 month ago
Well there goes FFXIV, that will be their end
pineapplelover@lemmy.dbzer0.com 1 month ago
Sure
RinostarGames@mastodon.gamedev.place 1 month ago
@inclementimmigrant I'm so glad I've stopped buying AAA games.
ieatpwns@lemmy.world 1 month ago
Inb4 their games come out even more broken
crmsnbleyd@sopuli.xyz 1 month ago
I will continue ignoring anything they make
BigBananaDealer@lemmy.world 1 month ago
dont they already have dumbbots in playtesting?
frongt@lemmy.zip 1 month ago
The Talos Principle certainly did in 2014.
finitebanjo@piefed.world 1 month ago
I kind of wrote Square Enix off years ago, but I'm definitely not buying anything they make in the future.
wizblizz@lemmy.world 1 month ago
Barf.
dual_sport_dork@lemmy.world 1 month ago
Some AI or central computer going haywire and destroying everything is, like, the third or fourth stock RPG trope just behind the Dark Lord burning down the protagonist’s village in the first act or the mysterious waif girl actually turning out to be a princess.
You’d really think they’d know better.
Devjavu@lemmy.dbzer0.com 1 month ago
In my country we say ajajajajajjjj
iAmTheTot@sh.itjust.works 1 month ago
Grrroooossss, noooo I liked you Square Enix in spite of everything else.
mostlikelyaperson@lemmy.world 1 month ago
Given how much squenix struggles with changing its development practices, I would be very surprised if they actually got there.
themurphy@lemmy.ml 1 month ago
Well, it’s not game development, but bugfixes and quality testing.
I don’t know, but it does make sense when there’s still 30% of the work being done by human eyes. There will still be people checking everything through.
Even if they hit 50-50, they could put more money into the development.
The argument that they will just save the money only works as long as another company doesn’t use it for game devs. Otherwise you naturally fall behind.
ampersandrew@lemmy.world 1 month ago
It also only works as long as the AI can actually competently do the QA work. This is what an AI thinks a video game is. To do QA, it will have to know that something is wrong, flag it, and be able to tell when it’s fixed. The most likely situation I can foresee is that it creates even more work for the remaining humans to do when they’re already operating at a deficit.
riskable@programming.dev 1 month ago
To be fair, that’s what an AI video generator thinks an FPS is. That’s not the same thing as AI-assisted coding. Though it’s still hilarious! “Press F to pay respects” 🤣
For reference, using AI to automate your QA isn’t a bad idea. There’s a bunch of ways to handle such things but one of the more interesting ones is to pit AIs against each other. Not in the game, but in their reports… You tell AI to perform some action and generate a report about it while telling another AI to be extremely skeptical about the first AI’s reports and to reject anything that doesn’t meet some minimum standard.
That’s what they’re doing over at Anthropic (internally) with Claude Code QA tasks and it’s super fascinating! Heard them talk about that setup on a podcast recently and it kinda blew my mind… They have more than just two “Claudes” pitted against each other too: In the example they talked about, they had four: one generating PRs, another reviewing/running tests, another one checking the work of the testing Claude, and finally a Claude set up to perform critical security reviews of the final PRs.
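For anyone wondering what the "one AI reports, another AI is skeptical" idea looks like in practice, here is a minimal sketch in Python. Everything in it is hypothetical: `Report`, `fake_tester`, `skeptical_review`, and the "at least two pieces of evidence" rule are made-up stand-ins, and no real model is called. It only shows the shape of the loop, not how Anthropic or anyone else actually does it.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Report:
    action: str          # what the tester was told to do
    claim: str           # what the tester says happened
    evidence: List[str]  # logs, screenshots, etc. backing the claim


def fake_tester(action: str) -> Report:
    # Hypothetical stand-in for the first AI: it performs an action
    # and writes a report. A real setup would call a model here.
    if action == "open inventory":
        return Report(action, "Inventory crashes when empty",
                      ["crash.log", "screenshot.png"])
    return Report(action, "Something looked odd", [])


def skeptical_review(report: Report, min_evidence: int = 2) -> bool:
    # Hypothetical stand-in for the second AI: reject anything that
    # does not meet a minimum standard (assumed: a non-empty claim
    # plus at least `min_evidence` pieces of evidence).
    return bool(report.claim.strip()) and len(report.evidence) >= min_evidence


def run_qa(
    actions: List[str],
    tester: Callable[[str], Report],
    reviewer: Callable[[Report], bool],
) -> Tuple[List[Report], List[Report]]:
    # Run every action through the tester, then let the reviewer
    # sort the reports into accepted and rejected piles.
    accepted: List[Report] = []
    rejected: List[Report] = []
    for action in actions:
        report = tester(action)
        (accepted if reviewer(report) else rejected).append(report)
    return accepted, rejected


accepted, rejected = run_qa(["open inventory", "jump"],
                            fake_tester, skeptical_review)
```

The point of the split is that the rejected pile is where a human (or a third checker) would look, instead of trusting the first report blindly.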