Comment on AI-generated content in Wikipedia - a tale of caution

lvxferre@mander.xyz 1 month ago

I’m still reading the machine-generated transcript of the video, but to keep it short:

The author was messing with ISBNs (International Standard Book Numbers) and noticed that the invalid ones fell into three categories.
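
For anyone wondering how a book number can be “invalid” in the first place: ISBN-13 ends in a check digit, so a typo’d or fabricated number usually fails a simple checksum. A minimal sketch in Python of the standard ISBN-13 check (my illustration, not code from the video):

```python
def isbn13_is_valid(isbn: str) -> bool:
    """Validate an ISBN-13 check digit.

    Digits are weighted alternately 1 and 3; a correct ISBN-13
    sums to a multiple of 10. Hyphens and spaces are ignored.
    """
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    return sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits)) % 10 == 0

print(isbn13_is_valid("978-0-306-40615-7"))  # True: checksum holds
print(isbn13_is_valid("978-0-306-40615-8"))  # False: wrong check digit
```

Only about one in ten random 13-digit strings passes the checksum alone (and a plausible ISBN also needs a valid 978/979 prefix), which is why made-up ISBNs tend to be easy to catch.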

He then uses this to highlight that Wikipedia is already infested with bullshit¹ from large “language” models², and that this creates a bunch of vicious cycles working against Wikipedia’s spirit of reliability, factuality, etc.

Then, if I got this right, he lays out four hypotheses (“theories”) on why people do this³:


Notes (all of these are mine/Lvxferre’s; none of them is said by the author himself)

  1. “Hallucination”: a misleading label for output that was generated in exactly the same way as the rest of the output, but that, once interpreted by humans, amounts to bullshit.
  2. I have a rant about calling those models “language” models, but to keep it short: I think “large token models” would be more accurate.
  3. In my opinion, the author is going the wrong way here. Disregard intentions, focus on effect — don’t assume good faith, don’t assume any faith at all, remove dead weight users who are doing shit against the spirit of the project.
