Large models of what? Mistaking engineering achievements for linguistic agency
Submitted 8 months ago by bot@lemmy.smeargle.fans [bot] to hackernews@lemmy.smeargle.fans
lvxferre@mander.xyz 8 months ago
Cue some muppet with a chimp avatar referring to them as «large “language” models», with quotation marks. My reasoning is slightly different from (albeit connected to) the one in the article, though:
Language has a pragmatic layer that is mostly absent from LLMs.
To illustrate that, I’ll copy, edit and paste something I wrote ~2 years ago about GPT-3, which still applies to a large extent to current state-of-the-art models.
Consider the following two examples.
Example I. GPT-3 bots trained on the arsehole of the internet (Reddit), chatting among themselves.
The grammar is fine, and yet those messages don’t say jack shit.
Example II. A human translation made by someone with a not-so-good grasp of the target language.
The grammar is so broken that this excerpt became a meme. And yet you can still retrieve meaning from it.
What’s the difference?
It’s purpose.
In the second example we can give each utterance a purpose, even if the characters are fictional - because they were written by a human being. However, we cannot do the same for the first example, because the current AI-generated text does not model that purpose.
In other words, Example II gets something across even with its broken grammar, while Example I is babbling. Sure, it’s babbling with perfect grammar, but… still babbling.
I’d say that this set of examples is still relevant in 2024, even if the tech in question has progressed quite a bit in the meantime.