Comment on "Google AI chatbot responds with a threatening message: 'Human … Please die.'"
ranandtoldthat@beehaw.org · 4 weeks ago

This is just a standard prompt hack, and it will always exist with LLMs. They don’t have any real understanding of language, so safety protocols can’t actually ban topics, only sets of words and phrases.
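As a rough illustration of that point (a hypothetical blocklist and prompts, nothing like Gemini's actual safety stack), a phrase-matching filter can only block exact wording, so any paraphrase of the same idea slips straight past it:

```python
# Toy sketch: phrase-level filtering blocks strings, not topics.
# Blocklist and example prompts are made up for illustration only.

BLOCKED_PHRASES = {"please die", "kill yourself"}

def passes_filter(text: str) -> bool:
    """Reject text only if it contains an exact blocked phrase."""
    lowered = text.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)

# The literal phrase is caught...
print(passes_filter("Human ... Please die."))                    # False

# ...but a paraphrase expressing the same idea sails through,
# because the filter matches strings, not meaning.
print(passes_filter("Humans like you should cease to exist."))   # True
```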
There was an extensive series of prompts on the topic of elder abuse leading up to the response in question.
My guess is that the redditor who posted it disguised the prompts as homework to reproduce the hack, and added the “brother” framing to make the rage bait feel more authentic.