Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
Submitted 1 year ago by bot@lemmy.smeargle.fans [bot] to hackernews@lemmy.smeargle.fans
https://transformer-circuits.pub/2024/scaling-monosemanticity/