Hackworth@piefed.ca 1 week ago
Meanwhile, Anthropic in the last month:
The assistant axis: situating and stabilizing the character of large language models
Next-generation Constitutional Classifiers: More efficient protection against universal jailbreaks
Introducing Bloom: an open source tool for automated behavioral evaluations