Comment on Developer releases ShrimpMoss, a dataset designed to abliterate Chinese censorship and propaganda finetunes from LLMs

ericjmorey@beehaw.org 1 year ago

I’m not sure what abliteration is.

thelucky8@beehaw.org 1 year ago

Abliteration involves fine-tuning a language model to bypass built-in refusal mechanisms that prevent the model from generating responses to potentially harmful or sensitive prompts. Source

- ericjmorey@beehaw.org 1 year ago
  The shared repo doesn’t look like fine tuning. It just looks like prompts.
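
For readers following the thread: abliteration, as it is usually described, does not retrain the model with gradient updates. It estimates a "refusal direction" by comparing the model's internal activations on prompts it refuses with activations on prompts it answers, then removes that direction. That may be why a dataset meant for it can consist only of prompts. The sketch below is a toy illustration of that idea in plain Python, not the ShrimpMoss author's method. The 3-dimensional activations and every function name are hypothetical stand-ins; a real run would read hidden states from an actual model.

```python
# Toy sketch of directional ablation ("abliteration").
# Assumption: each prompt has already been turned into an activation vector.
# The numbers and helper names below are made up for illustration only.

def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def normalize(v):
    """Scale a vector to unit length."""
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def refusal_direction(refused_acts, answered_acts):
    """Unit vector pointing from the 'answered' mean to the 'refused' mean."""
    refused_mean = mean_vector(refused_acts)
    answered_mean = mean_vector(answered_acts)
    diff = [r - a for r, a in zip(refused_mean, answered_mean)]
    return normalize(diff)

def ablate(vector, direction):
    """Remove the component of `vector` that lies along `direction`."""
    scale = dot(vector, direction)
    return [x - scale * d for x, d in zip(vector, direction)]

# Hypothetical activations: refused prompts differ mainly in the first coordinate.
refused_acts = [[2.0, 0.1, 0.0], [2.2, -0.1, 0.1]]
answered_acts = [[0.0, 0.1, 0.0], [0.2, -0.1, 0.1]]

direction = refusal_direction(refused_acts, answered_acts)
cleaned = ablate([2.1, 0.5, -0.3], direction)

print("refusal direction:", direction)
print("activation after ablation:", cleaned)
```

After ablation the vector has essentially no component left along the estimated direction, while its other coordinates are unchanged. In descriptions of the technique applied to real models, the same projection is typically applied to the weights that write into the hidden state, so the change persists without the prompts being present.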