
Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027

⁨245⁩ ⁨likes⁩

Submitted ⁨⁨19⁩ ⁨hours⁩ ago⁩ by ⁨inclementimmigrant@lemmy.world⁩ to ⁨games@lemmy.world⁩

https://www.videogameschronicle.com/news/square-enix-says-it-wants-generative-ai-to-be-doing-70-of-its-qa-and-debugging-by-the-end-of-2027/


Comments

  • Bakkoda@lemmy.zip ⁨59⁩ ⁨minutes⁩ ago

    Realistic goal considering they already do so little QA.

  • LostWanderer@fedia.io ⁨19⁩ ⁨hours⁩ ago

    Ew, sounds like a great reason to not buy any Square Enix games...

    • Brutticus@midwest.social ⁨18⁩ ⁨hours⁩ ago

      Not even from an ethical standpoint. Color me shocked if these games are, like, playable.

      • LostWanderer@fedia.io ⁨18⁩ ⁨hours⁩ ago

        Exactly, as I don't expect QA done by something that can't think or feel to know what actually needs to be fixed. AI is a hallucination engine that just agrees rather than pointing out issues; in some cases it might call attention to non-issues and let critical bugs slip by. The ethical issues are still significant and play into the reason why I would refuse to buy any more Square Enix games going forward. I don't trust them to walk this back; they are high on the AI lie. Human-made games with humans handling the QA are the only games that I want.

    • UnderpantsWeevil@lemmy.world ⁨18⁩ ⁨hours⁩ ago

      I would initially tap the brakes on this, if for no other reason than “AI doing Q&A” reads more like corporate buzzwords than material policy. Big software developers should already have much of their Q&A automated, at least at the base layer. Further automating Q&A is generally a better business practice, as it helps catch more bugs in the Dev/Test cycle sooner.

      Then consider that Q&A work by end users is historically a miserable and soul-sucking job. Converting those roles to debuggers and active devs does a lot for both the business and the workforce. When compared to “AI is doing the art” this is night-and-day, the very definition of the “Getting rid of the jobs people hate so they can do the work they love” that AI was supposed to deliver.

      Finally, I’m forced to drag out the old “95% of AI implementations fail” statistic. Far more worried that they’re going to implement a model that costs a fortune and delivers mediocre results than that they’ll implement an AI driven round of end-user testing.

      Turning Q&A over to the Roomba AI to find corners of the setting that snag the user would be Gud Aktuly.

      • natecox@programming.dev ⁨17⁩ ⁨hours⁩ ago

        Converting those roles to debuggers and active devs does a lot for both the business and the workforce.

        Hahahahaha… oh wait, you’re serious. Let me laugh even harder.

        They’re just gonna lay them off.

      • binarytobis@lemmy.world ⁨15⁩ ⁨hours⁩ ago

        I was going to say, this is one job that actually makes sense to automate. I don’t know any QA testers personally, but I’ve heard plenty of accounts of them absolutely hating their jobs and getting laid off after the time crunch anyway.

      • NoForwardslashS@sopuli.xyz ⁨17⁩ ⁨hours⁩ ago

        The repetition of “Q&A” reads like this comment was also outsourced to AI.

      • zerofk@lemmy.zip ⁨17⁩ ⁨hours⁩ ago

        What does Q&A stand for?

      • Mikina@programming.dev ⁨13⁩ ⁨hours⁩ ago

        They already have a really cool solution for that, which they talked about in their GDC talk. I don’t think there’s any need to slap a glorified chatbot into this; it already seems to work well and has just the right amount of human input to be reliable, while also leaving the “testcase replay gruntwork” to a script instead of a human.

  • JeeBaiChow@lemmy.world ⁨2⁩ ⁨hours⁩ ago

    Lol. Good luck!

  • VeryInterestingTable@jlai.lu ⁨7⁩ ⁨hours⁩ ago

    QA annnnd Debugging?

    LLMs have a much better chance at successfully replacing whoever said that.

  • ghost9@lemmy.world ⁨17⁩ ⁨hours⁩ ago

    That’s a stupid idea. You’re not supposed to QA or debug games. You just release it, customers report bugs, and then you promise to fix the bugs in the next patch (but don’t).

    • FatVegan@leminal.space ⁨42⁩ ⁨minutes⁩ ago

      Or do the Bethesda thing and let people playtest their slop and fix it for free.

    • Rhaedas@fedia.io ⁨15⁩ ⁨hours⁩ ago

      No better testing than in production.

  • Taldan@lemmy.world ⁨13⁩ ⁨hours⁩ ago

    So Square Enix is demanding OpenAI stop using their content, but is 100% okay using AI built off stolen content to make more money themselves

    As a developer, it bothers me that my code is being used to train AI that Square Enix is using while trying to deny anyone else the ability to use their work

    I could go either way on whether or not AI should be able to train on available data, but no one should get to have it both ways

  • mavu@discuss.tchncs.de ⁨12⁩ ⁨hours⁩ ago

    Well, good luck with that. Software development is a shit show already anyway. You can find me in my Gardening business in 2027.

    • Rooster326@programming.dev ⁨12⁩ ⁨hours⁩ ago

      Good Luck. When the economy finally bottoms out the first budget to go is always the gardening budget.

      • Regrettable_incident@lemmy.world ⁨8⁩ ⁨hours⁩ ago

        Market gardening isn’t so bad, people gotta eat. But yeah, if you’re cutting lawns you’re going to suffer when the economy shits the bed.

  • hoshikarakitaridia@lemmy.world ⁨18⁩ ⁨hours⁩ ago

    Literally not how any of this works. You don’t let AI check your work, at best you use AI and check it’s work, and at worst you have to do everything by hand anyway.

    • UnderpantsWeevil@lemmy.world ⁨18⁩ ⁨hours⁩ ago

      You don’t let AI check your work

      From a game dev perspective, user Q&A is often annoying and repetitive labor. Endlessly criss-crossing terrain, hitting different buttons to make sure you don’t snag a corner or click objects in a sequence that triggers a state freeze. Hooking a PS controller to Roomba logic and having a digital tool rapidly rerun routes and explore button combos over and over, looking for failed states, is significantly better for you than hoping a massive team of dummy players can recreate the failed state by tripping into it manually.
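
For anyone wondering what "Roomba logic" hunting for failed states could look like, here is a minimal Python sketch. It is not anything Square Enix has described: the toy level, the tile symbols (`#` wall, `X` a corner that snags the player) and every function name are made up for illustration. The idea is just a seeded random walk that records its inputs, so any snag it finds can be replayed exactly.

```python
import random

# Hypothetical toy level: '#' wall, '.' floor, 'S' start, 'X' a corner that snags the player.
LEVEL = [
    "#######",
    "#S....#",
    "#.##..#",
    "#...#X#",
    "#######",
]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def find_start(level):
    """Locate the 'S' tile and return it as (row, col)."""
    for r, row in enumerate(level):
        c = row.find("S")
        if c != -1:
            return r, c
    raise ValueError("level has no start tile")


def step(level, pos, move):
    """Apply one input; walls block movement, every other tile is walkable."""
    dr, dc = MOVES[move]
    r, c = pos[0] + dr, pos[1] + dc
    return pos if level[r][c] == "#" else (r, c)


def explore(level, seed, max_steps=200):
    """Seeded random walk. Returns the inputs that reached a snag, or None."""
    rng = random.Random(seed)
    pos = find_start(level)
    inputs = []
    for _ in range(max_steps):
        move = rng.choice(sorted(MOVES))
        inputs.append(move)
        pos = step(level, pos, move)
        if level[pos[0]][pos[1]] == "X":
            return inputs
    return None


def replay(level, inputs):
    """Re-run a recorded input sequence and return the final position."""
    pos = find_start(level)
    for move in inputs:
        pos = step(level, pos, move)
    return pos


def sweep(level, seeds):
    """Run one walk per seed and collect {seed: inputs} for every walk that snagged."""
    failures = {}
    for seed in seeds:
        inputs = explore(level, seed)
        if inputs is not None:
            failures[seed] = inputs
    return failures
```

Calling `sweep(LEVEL, range(50))` returns a recorded input list for each seed that got stuck, and `replay(LEVEL, inputs)` reproduces the same end position every time, which is the part a human tester would otherwise have to stumble into by hand.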

      • subignition@fedia.io ⁨17⁩ ⁨hours⁩ ago

        There's plenty of room for sophisticated automation without any need to involve AI.

    • zerofk@lemmy.zip ⁨17⁩ ⁨hours⁩ ago

      its *

      Ironically, that’s definitely something AI could check for.

      • hoshikarakitaridia@lemmy.world ⁨17⁩ ⁨hours⁩ ago

        Spell check? Yeah fair enough. The misspelling has historical value now though so I have to keep it in :P

  • slaacaa@lemmy.world ⁨15⁩ ⁨hours⁩ ago

    Image

  • Mikina@programming.dev ⁨13⁩ ⁨hours⁩ ago

    Square Enix actually has a pretty sick automated QA already. There’s a cool talk about how they did that for FFVII remake in GDC vault, and I highly recommend watching it, if you’re at all interested in QA.

    It has nothing to do with AI, it’s just plain old automation, but they solve most of the issues you get with making automated tests in non-discrete 3D playspace and they do that in a pretty solid way. It’s definitely something I’d love to have implemented in the games I’m working on, as someone who worked in QA and now works in development. Having a mostly reliable way to smoke-test levels for basic gameplay without having to torture QA to run the test-case again is good, and allows QA to focus on something else - but the tools also need oversight, so it’s not really a job lost. In summary - I think the talk is cool tech and worth the watch.

    However, I don’t think AI will help in this regard, and something as unreliable and random as AI models is not a good fit for this job. You want to have deterministic testcases that you can quantify, and if something doesn’t match, have an actual human look at why. AI also probably won’t be able to find clever corner-cases and bugs that need human ingenuity.
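
To make "deterministic testcases that you can quantify" concrete, here is a rough Python sketch. It assumes a made-up, fully deterministic `simulate` function standing in for the game and a hypothetical `Snapshot` of the state being checked; it is not how the tool from the GDC talk works. A recorded input sequence is replayed, the result is compared field by field against a stored expectation, and any mismatch is flagged for a human to look at.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical slice of game state that a test case checks."""
    x: int
    y: int
    hp: int


def simulate(inputs):
    """Stand-in for the game: fixed steps, no randomness, no wall clock."""
    x, y, hp = 0, 0, 100
    for action in inputs:
        if action == "right":
            x += 1
        elif action == "up":
            y += 1
        elif action == "hit":
            hp -= 10
        else:
            raise ValueError(f"unknown input: {action}")
    return Snapshot(x, y, hp)


def run_case(name, inputs, expected):
    """Replay recorded inputs and diff the result against the stored expectation."""
    actual = simulate(inputs)
    diffs = {
        field: (getattr(expected, field), getattr(actual, field))
        for field in ("x", "y", "hp")
        if getattr(expected, field) != getattr(actual, field)
    }
    return {
        "case": name,
        "passed": not diffs,
        "needs_human_review": bool(diffs),
        "diffs": diffs,  # field -> (expected, actual)
    }
```

Because the same inputs always give the same snapshot, a failing case means something in the game changed, not that the test got unlucky, and the `diffs` entry tells the human reviewer exactly which value moved.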

    Fuck AI, I kind of hope this is just a marketing talk and they are actually just improving the (deterministic) tools they already have, and they are calling it an “AI” to satisfy investors/management without actually slapping a glorified chat-bot into the tech for no reason.

  • Flickerby@lemmy.zip ⁨7⁩ ⁨hours⁩ ago

    And I thought I had no more disappointment left to allocate

    • pirateKaiser@sh.itjust.works ⁨50⁩ ⁨minutes⁩ ago

      It’s just that you’ve reached your free quota, further disappointments will be charged 0.0937 emotional stability per hour

  • MourningDove@lemmy.zip ⁨7⁩ ⁨hours⁩ ago

    So… no more SE games for me. Not a huge loss to be honest.

  • Wildmimic@anarchist.nexus ⁨18⁩ ⁨hours⁩ ago

    I hope they put out the last FF VII remake part before that, so I can finally start playing them all! I don’t care what they want to waste their money on afterwards lol

    • UnderpantsWeevil@lemmy.world ⁨18⁩ ⁨hours⁩ ago

      I wouldn’t be shy about getting into Remake or Rebirth now. They both stand up as their own games (concise start/ending, somewhat distinct mechanics, each one is easily 40+ hours of gameplay). And with Part 3 targeted for 2027 release, I suspect this kind of overhaul would be outside their dev cycle to implement.

  • Gullible@sh.itjust.works ⁨18⁩ ⁨hours⁩ ago

    Frankly, this is good news. Whoever buys the rights to kingdom hearts in 3 years when the company falls apart might manage to create an intelligible storyline.

    • DragonTypeWyvern@midwest.social ⁨17⁩ ⁨hours⁩ ago

      But then it won’t be a Kingdom Hearts game!

      • Gullible@sh.itjust.works ⁨17⁩ ⁨hours⁩ ago

        Kingdom hearts 1 was a coming of age story with some fantastic elements tossed in

        Kingdom hearts 2 was about antipathy and how it destroys the world, with some, uhh, who was that guy? And why’s the bad guy on my side?

        Kingdom hearts 3 was, wait, why was he cloned? When was he cloned? When was she, and him, and him again? And his third clone was a girl? And whose heart was imbued into what? What war? What? Who? What???

    • Dreaming_Novaling@lemmy.zip ⁨14⁩ ⁨hours⁩ ago

      I get that getting all the games as they released was hard, because the series is on so many platforms. But I really don’t get the “KH is hard to understand” argument today, because you can easily find hundreds of letsplays for every game, cutscene compilations, play/watch every game on the PS4 remix disk, and even watch a fandub of the mobile games (Dark Road is a WIP) if you don’t like the KHUX Back Cover recap.

      So like, what’s so hard? If you skip games and only read a wiki (the worst possible way to consume any sort of media, mind you), of course you’re not gonna know the story and characters, and of course it’ll sound confusing.

      • Gullible@sh.itjust.works ⁨13⁩ ⁨hours⁩ ago

        Dude, it’s multi-author-comic level bad. I’ve skipped entire sagas in several book series due to a lack of translations and ended up less confused. It’s green arrow levels of clone shenanigans.

        To be clear, I’ve played most of the games and they’re still ridiculously difficult to keep track of. All besides the mobile, early non-Ventus card mechanic arpg, and the disappearing girl clone sora game. They’d be easier to follow if they stuck to the rules of their own universe. Body and heart separate and the body persists not once but twice? Wut?

  • newthrowaway20@lemmy.world ⁨18⁩ ⁨hours⁩ ago

    Square Enix exec doesn’t know what QA and Debugging entail.

    • ThePowerOfGeek@lemmy.world ⁨17⁩ ⁨hours⁩ ago

      “Well it works for unit testing, so just extend that out to all testing! Problem solved!” -Senior Management, probably

  • Tronn4@lemmy.world ⁨11⁩ ⁨hours⁩ ago

    insert plane crashing.gif

  • henfredemars@infosec.pub ⁨18⁩ ⁨hours⁩ ago

    Considering how the open source community is being inundated with low-quality bug reports filed using AI, I don’t have much faith in the tech reviewing code, let alone writing it correctly.

    Could it be a useful aid? Sure, but 70% of your reviewing is a pie-in-the-sky pipe dream. Not while still delivering a good product.

  • crunchy@lemmy.dbzer0.com ⁨18⁩ ⁨hours⁩ ago

    They jumped on the NFT bandwagon a couple years ago, too. Did they not learn anything from that?

  • pineapplelover@lemmy.dbzer0.com ⁨7⁩ ⁨hours⁩ ago

    Sure

  • Mikina@programming.dev ⁨13⁩ ⁨hours⁩ ago

    Large companies probably do that anyway.

    Take Blizzard for example. They just released a new patch, where class campaign quests for 8/12 classes do not work. Sure, it’s a remixed version of older expansion, and with all the phasing stuff I can kind of imagine some of the phasing issues being caused by, I don’t know, the player having a weird combination of completed stuff that’s hard to properly catch in testing, since there’s quite a lot of variables.

    But the fact that one of the class quests requires crafted items to be completed, while crafting isn’t available by design in the Remix, there’s just no excuse. They either just don’t give a fuck about an issue that’s literally a progression blocker with 100% repro rate, or no one ever tested it even once.

    As someone who worked in QA and gamedev, I can’t imagine how something as obvious as this could ever get approved for release. That’s something you catch immediately. Hell, you don’t even have to play through it to realize that this might be a problem.

    • Rooster326@programming.dev ⁨11⁩ ⁨hours⁩ ago

      Work at a larger company. Customer Service in the wild is so bad that we just use our customers as the QA. As they say

      All businesses have a test environment. Some are lucky enough to have a separate production environment.

  • tal@lemmy.today ⁨18⁩ ⁨hours⁩ ago

    Hmm. While I don’t know what their QA workflow is, my own experience is that working with QA people to design a QA procedure for a given feature tends to require familiarity with the feature and possible problems, and that human-validating a feature isn’t usually something done at massive scale, where you’d get a lot of benefit from heavy automation.

    It’s possible that one might be able to use LLMs to help write test code — reliability and security considerations there are normally less-critical than in front-line code. Worst case is getting a false positive, and if you can get more test cases covered, I imagine that might pay off.

    Square does an MMO, among their other stuff. If they can train a model to produce AI-driven characters that act sufficiently like human players, where they can theoretically log training data from human players, that might be sufficient to populate an MMO “experimental” deployment so that they can see if anything breaks prior to moving code to production.

    • snooggums@piefed.world ⁨18⁩ ⁨hours⁩ ago

      Worst case is getting a false positive, and if you can get more test cases covered, I imagine that might pay off.

      False positives during testing are a huge time sink. QA has to replicate and explain away each false report, and the faster AI 'completes' tasks, the faster the flood of false reports comes in.

      There is plenty of non-AI automation that can be used intentionally to do tedious repetitive tasks already.

  • themurphy@lemmy.ml ⁨18⁩ ⁨hours⁩ ago

    Well, it’s not game development, but bugfixes and quality testing.

    I don’t know, but it does make sense when 30% of the work is still being done by human eyes. There will still be people checking everything through.

    Even if they hit 50-50, they could put more money into the development.

    The argument that they will just save the money only works as long as another company doesn’t use it for game devs. Otherwise you naturally fall behind.

    • ampersandrew@lemmy.world ⁨18⁩ ⁨hours⁩ ago

      It also only works as long as the AI can actually competently do the QA work. This is what an AI thinks a video game is. To do QA, it will have to know that something is wrong, flag it, and be able to tell when it’s fixed. The most likely situation I can foresee is that it creates even more work for the remaining humans to do when they’re already operating at a deficit.

      • riskable@programming.dev ⁨18⁩ ⁨hours⁩ ago

        To be fair, that’s what an AI video generator thinks an FPS is. That’s not the same thing as AI-assisted coding. Though it’s still hilarious! “Press F to pay respects” 🤣

        For reference, using AI to automate your QA isn’t a bad idea. There’s a bunch of ways to handle such things but one of the more interesting ones is to pit AIs against each other. Not in the game, but in their reports… You tell AI to perform some action and generate a report about it while telling another AI to be extremely skeptical about the first AI’s reports and to reject anything that doesn’t meet some minimum standard.

        That’s what they’re doing over at Anthropic (internally) with Claude Code QA tasks and it’s super fascinating! Heard them talk about that setup on a podcast recently and it kinda blew my mind… They have more than just two “Claudes” pitted against each other too: In the example they talked about, they had four: One generating PRs, another reviewing/running tests, another one checking the work of the testing Claude, and finally a Claude setup to perform critical security reviews of the final PRs.

  • Omegamanthethird@lemmy.world ⁨17⁩ ⁨hours⁩ ago

    From a tech POV, that makes a lot of sense. Use AI to find the needle in the haystack. Then let a person validate. That’s probably one of the better uses for it. Although I don’t love AI for any of the broad reasons to not like AI.

    • SharkAttak@kbin.melroy.org ⁨14⁩ ⁨hours⁩ ago

      Wasn't AI decent at writing code, but bad at reviewing and modifying it?

      • Omegamanthethird@lemmy.world ⁨12⁩ ⁨hours⁩ ago

        Maybe before. But it’s gotten pretty damn good at detecting anomalies and issues. And every time a human QA validates the info, it gets better.

        I’d still leave it to a human to fix the code though. I suspect that letting AI write the code would make it unworkable for people in the future. But maybe it can write code in a straightforward way to be managed. I don’t know. It’s advancing pretty fast.

  • Ilixtze@lemmy.ml ⁨18⁩ ⁨hours⁩ ago

    more shit

  • BigBananaDealer@lemmy.world ⁨12⁩ ⁨hours⁩ ago

    don’t they already have dumbbots in playtesting?

    • frongt@lemmy.zip ⁨12⁩ ⁨hours⁩ ago

      The Talos Principle certainly did in 2014.

  • Katana314@lemmy.world ⁨18⁩ ⁨hours⁩ ago

    I’m cautious but a little curious about this one, because QA could actually be a very good target for AIs to work with.

    1. It might not kill jobs. Right now, engineers finish a task and the limited number of QA engineers can’t possibly test it enough before release. That game-breaking bug you found in a game? I’m sure some QA had it in their plan to test every level for those bugs, and yet they just didn’t have enough time - and the studio couldn’t justify hiring 20 more QA squads. Even if they do upscale AI testing, they’ll need knowledgeable QA workers to guide them.
    2. This is often extremely rote, repetitive work. It’s exactly the type of work The Oatmeal said is great for AIs. One person is tuning the balance on the Ether Drive attack, and gives it an extra 40% blarf damage. He tries it, sees it works fine, and eagerly skips past the part of the test plan to verify that all cutscenes are working and unaffected to push it in. An AI will try it out, and find: Actually, since an NPC uses an Ether Drive in a late-game cutscene, this breaks the whole game!
    3. Even going past existing plans, QA can likely find MORE work for AIs to do that they normally wouldn’t bother with. Think about the current complexity of game dev that leads to the current trope of releasing games half-finished to eventually get patched. It won’t help patch games, but it’ll at least help give devs an up-to-date list of issues.

    That said, those talking about human creativity and player expectations are still correct. An AI can report a problem with feedback that a human can say “No, that looks fine. Override that report.” It will also be good to do occasional manual tests, and lament “How did the AI think this was okay??”

  • RinostarGames@mastodon.gamedev.place ⁨18⁩ ⁨hours⁩ ago

    @inclementimmigrant I'm so glad I've stopped buying AAA games.

  • ieatpwns@lemmy.world ⁨18⁩ ⁨hours⁩ ago

    Inb4 their games come out even more broken

  • finitebanjo@piefed.world ⁨16⁩ ⁨hours⁩ ago

    I kind of wrote Square Enix off years ago, but I'm definitely not buying anything they make in the future.

  • wizblizz@lemmy.world ⁨18⁩ ⁨hours⁩ ago

    Barf.
