Comment on Someone got Gab's AI chatbot to show its instructions

teawrecks@sopuli.xyz 8 months ago

Any input to the 2nd LLM is a prompt, so if it sees the user input, then that input affects the probabilities of its output.

There’s no such thing as “training an AI to follow instructions”. The output is just a probabilistic function of the input. This is why a jailbreak is always possible: the probability of getting it to output something that was given as input is never 0.
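To make that concrete, here is a rough sketch of the idea, not how any real model is wired up. It assumes plain softmax sampling with no top-k/top-p truncation, and the three "tokens" and their logit values are made up for illustration. Softmax turns scores into probabilities that are all strictly positive, and changing the input just shifts the scores around.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities (numerically stable version)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token candidates and made-up logits (assumptions, not real model output).
tokens = ["refuse", "hedge", "leak"]
logits_plain = [5.0, 1.0, -10.0]     # system prompt only: "leak" is strongly disfavored
logits_prompted = [2.0, 1.0, 3.0]    # same model after an adversarial user input

p_plain = softmax(logits_plain)
p_prompted = softmax(logits_prompted)

for name, a, b in zip(tokens, p_plain, p_prompted):
    print(f"{name:>6}: {a:.8f} -> {b:.8f}")
```

Even in the first case the "leak" probability is tiny but not 0, and the second set of scores shows how a different input can move it a lot. In real deployments, sampling tricks like top-k can clip very unlikely tokens to exactly 0, so treat "never 0" as a statement about the raw distribution.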
