Xerxos@lemmy.ml 1 day ago

There was a paper about this not long ago. The problem is how LLMs get trained: a right answer gets a point, everything else gets no points. This rewards guessing (which sometimes produces a point) over answering “I don’t know / I can’t do this” (which never produces a point).
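The incentive is easy to see as expected values. A toy sketch (the numbers and function are hypothetical, not from the paper): under 0/1 grading, abstaining always scores zero, so even a long-shot guess has a higher expected score.

```python
def expected_score(p_correct: float, abstain: bool) -> float:
    """Expected points under binary grading (right answer = 1, anything else = 0).

    Abstaining ("I don't know") is graded as wrong, so it always scores 0;
    guessing scores p_correct on average.
    """
    return 0.0 if abstain else p_correct

# Even a 10%-confidence guess beats abstaining in expectation:
guess = expected_score(0.10, abstain=False)  # 0.1 expected points
idk = expected_score(0.10, abstain=True)     # 0.0 points
assert guess > idk
```

So as long as the grader never rewards admitting uncertainty, a model that always guesses will look better on the benchmark than one that honestly abstains.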

source