Comment on Google AI chatbot responds with a threatening message: "Human … Please die."
Bougie_Birdie@lemmy.blahaj.zone 4 weeks ago
With the sheer volume of training data required, I have a hard time believing that the data sanitization is high quality.
If I had to guess, it's largely filtered through scripts and not thoroughly vetted by humans. So data sanitization might focus on removing slurs and profanity, but wouldn't have a way to catch misinformation or a request that the reader stop existing.
Swedneck@discuss.tchncs.de 4 weeks ago
anything containing “die” ought to warrant a human skimming it over at least
Bougie_Birdie@lemmy.blahaj.zone 4 weeks ago
I don’t disagree, but it is a challenging problem. If you’re filtering for “die” then you’re going to find diet, indie, diesel, remedied, and just a whole mess of other words.
I’m in the camp where I believe they really should be reading all their inputs. You’ll never know what you’re feeding the machine otherwise.
However, I have no illusions here: they're cutting corners to save money.
Swedneck@discuss.tchncs.de 4 weeks ago
huh? finding only the literal word “die” is a trivial regex, it’s something vim users do all the time when editing text files lol
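Both commenters have a point, and the difference comes down to how the match is written. A minimal sketch (with made-up example comments) of naive substring matching versus the word-boundary regex a vim user would reach for:

```python
import re

comments = [
    "just die already",         # the case we actually want to flag
    "my new diet plan",         # "die" only as a substring
    "indie games are great",
    "diesel engines",
    "the problem was remedied",
]

# Naive substring check: flags every single comment above
substring_hits = [c for c in comments if "die" in c]

# Word-boundary regex: flags only the standalone word "die"
pattern = re.compile(r"\bdie\b", re.IGNORECASE)
regex_hits = [c for c in comments if pattern.search(c)]

print(len(substring_hits))  # 5
print(regex_hits)           # ['just die already']
```

The `\b` anchors make the literal-word search trivial, as Swedneck says; the harder part is deciding what else belongs on the list ("kill yourself", "stop existing", …), which is where the scope argument comes back in.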
Bougie_Birdie@lemmy.blahaj.zone 4 weeks ago
Sure, but underestimating the scope is how you wind up with a Scunthorpe problem, where legitimate text gets blocked just because it happens to contain a banned substring.