Comment on "Do LLM modelers maintain a list of manual corrections fed by humans?"
brucethemoose@lemmy.world 3 days ago
Yes. Absolutely.
The running joke in the research community is that current LLMs are effectively trained on benchmarks and on the common prompts people test in LM-Arena, like the "how many r's are in strawberry?" question.
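For reference, the expected answer to that famous test question is trivial to verify programmatically (a one-liner sketch, nothing model-specific):

```python
# Count occurrences of "r" in "strawberry" — the answer LLMs famously get wrong.
word = "strawberry"
count = word.count("r")
print(count)  # prints 3
```

The joke, of course, is that a model only gets this right reliably once the question itself shows up in its training data.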
I’m not speculating here: Meta literally got caught red-handed doing this. They ran a separate finetune just to look good on LM-Arena. And some benchmarks like MMLU contain errors that many LLMs nonetheless answer ‘correctly’, reproducing the wrong reference answer.