LLM wasn’t made for this
There’s a thought experiment that challenges the concept of cognition, called The Chinese Room. What it essentially postulates is a conversation between two people, one of whom is speaking Chinese and getting responses in Chinese. And the first speaker wonders “Does my conversation partner really understand what I’m saying or am I just getting elaborate stock answers from a big library of pre-defined replies?”
The LLM is literally a Chinese Room. And one way we can know this is through these interactions. The machine isn’t analyzing the fundamental meaning of what I’m saying, it is simply mapping the words I’ve input onto a big catalog of responses and giving me a standard output. In this case, the problem the machine is running into is a legacy meme about people miscounting the number of "r"s in the word Strawberry. So “2” is the stock response it knows via the meme reference, even though a much simpler and dumber machine that was designed to handle this basic input question could have come up with the answer faster and more accurately.
When you hear people complain about how the LLM “wasn’t made for this”, what they’re really complaining about is their own shitty methodology. They build a glorified card catalog. A device that can only take inputs, feed them through a massive library of responses, and sift out the highest probability answer without actually knowing what the inputs or outputs signify cognitively.
Even if you want to argue that having a natural language search engine is useful (damn, wish we had a tool that did exactly this back in August of 1996, amirite?), the implementation of the current iteration of these tools is dogshit because the developers did a dogshit job of sanitizing and rationalizing their library of data.
Imagine asking a librarian “What was happening in Los Angeles in the Summer of 1989?” and that person fetching you back a stack of history textbooks, a stack of Sci-Fi screenplays, a stack of regional newspapers, and a stack of Iron-Man comic books all given equal weight? Imagine hearing the plot of the Terminator and Escape from LA intercut with local elections and the Loma Prieta earthquake.
That’s modern LLMs in a nutshell.
jsomae@lemmy.ml 1 month ago
You’ve missed something about the Chinese Room. The solution to the Chinese Room riddle is that it is not the person in the room but rather the room itself that is communicating with you. The fact that there’s a person there is irrelevant, and they could be replaced with a speaker or computer terminal.
Put differently, it’s not an indictment of LLMs that they are merely Chinese Rooms, but rather one should be impressed that the Chinese Room is so capable despite being a completely deterministic machine.
If one day we discover that the human brain works on much simpler principles than we once thought, would that make humans any less valuable? It should be deeply troubling to us that LLMs can do so much while the mathematics behind them are so simple. Arguments that because LLMs are just scaled-up autocomplete they surely can’t be very good at anything are not comforting to me at all.
kassiopaea@lemmy.blahaj.zone 1 month ago
This. I often see people shitting on AI as “fancy autocomplete” or joking about how they get basic things incorrect like this post but completely discount how incredibly fucking capable they are in every domain that actually matters. That’s what we should be worried about… what does it matter that it doesn’t “work the same” if it still accomplishes the vast majority of the same things? The fact that we can get something that even approximates logic and reasoning ability from a deterministic system is terrifying on implications alone.
Knock_Knock_Lemmy_In@lemmy.world 1 month ago
Why doesn’t the LLM know to write (and run) a program to calculate the number of characters?
I feel like I’m missing something fundamental.
OsrsNeedsF2P@lemmy.ml 1 month ago
You didn’t get good answers so I’ll explain.
First, an LLM can easily write a program to calculate the number of "r"s. If you ask an LLM to do this, you will get the code back.
But the website ChatGPT.com has no way of executing this code, even if it was generated.
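For reference, the kind of program meant here is tiny. This is only an illustrative sketch; the function name `count_letter` is made up for the example and is not something any particular LLM or site produces:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    # Lowercase both sides so "Strawberry" and "strawberry" behave the same.
    return word.lower().count(letter.lower())


print(count_letter("Strawberry", "r"))  # prints 3
```

Writing this is the easy part; the counting only happens if something actually runs the code.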
The second explanation is how LLMs work. They work on the word (technically token, but think word) level. They don’t see letters. The AI behind it literally can only see words. The way it generates output is it starts typing words, and then guesses what word is most likely to come next. So it literally does not know how many "r"s are in strawberry. The impressive part is how good this “guessing what word comes next” is at answering more complex questions.
outhouseperilous@lemmy.dbzer0.com 1 month ago
It doesn’t know things.
It’s a statistical model. It cannot synthesize information or problem solve, only show you a rough average of its library of inputs graphed by proximity to your input.
jsomae@lemmy.ml 1 month ago
The LLM isn’t aware of its own limitations in this regard. The specific problem of getting an LLM to know what characters a token comprises has not been the focus of training. It’s a totally different kind of error than other hallucinations, it’s almost entirely orthogonal, but other hallucinations are much more important to solve, whereas being able to count the number of letters in a word or add numbers together is not very important, since as you point out, there are already programs that can do that.
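To make the token point concrete, here is a toy sketch. Everything in it is an assumption for illustration: the vocabulary `TOY_VOCAB`, the ID numbers, and the greedy longest-match rule are invented, and real tokenizers learn their pieces from data and may split "strawberry" differently.

```python
# Hypothetical toy vocabulary: two multi-letter pieces plus single letters.
TOY_VOCAB = {
    "straw": 1001, "berry": 1002,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}


def tokenize(text: str) -> list[int]:
    """Split text into toy token IDs using greedy longest-match."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink it.
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                i = end
                break
        else:
            raise ValueError(f"no token for {text[i]!r}")
    return ids


print(tokenize("strawberry"))  # prints [1001, 1002]
```

In this toy setup the model's input is the two IDs, not ten letters. Nothing in `[1001, 1002]` says that one piece contains one "r" and the other contains two; that mapping would have to be learned separately.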
UnderpantsWeevil@lemmy.world 1 month ago
I’d be more impressed if the room could tell me how many "r"s are in Strawberry inside five minutes.
Human biology, famous for being simple and straightforward.
outhouseperilous@lemmy.dbzer0.com 1 month ago
Ah! But you can skip all that messy biology and stuff I don’t understand that’s probably not important, and just think of it as a classical computer running an x86 architecture, and checkmate, liberal, my argument owns you now!
jsomae@lemmy.ml 1 month ago
Because LLMs operate at the token level, I think it would be a fairer comparison with humans to ask why humans can’t produce the IPA spelling of words they can say, /nɔr kæn ðeɪ ˈizəli rid θɪŋz ˈrɪtən ˈpjʊrli ɪn aɪ pi ˈeɪ/, despite the fact that it should be simple to; they understand the sounds, after all. I’d be impressed if somebody could do this too! But the fact that most people can’t shouldn’t really move you to think humans must be fundamentally stupid because of this one curious artifact.
UnderpantsWeevil@lemmy.world 1 month ago
That’s just access to the right keyboard interface. Humans can and do produce those spellings with additional effort or advanced tool sets.
Humans turning oatmeal into essays via a curious lump of muscle is an impressive enough trick on its face.
LLMs have 95% of the work of human intelligence handled for them and still stumble on the last bits.
outhouseperilous@lemmy.dbzer0.com 1 month ago
It’s not a fucking riddle, it’s a koan/thought experiment.
It’s questioning what ‘communication’ fundamentally is, and what knowledge fundamentally is.
It’s not even the first thing to do this. Military theory was cracking away at the ‘communication’ thing a century before, and the nature of knowledge has discourse going back thousands of years.
jsomae@lemmy.ml 1 month ago
You’re right, I shouldn’t have called it a riddle. Still, being a fucking thought experiment doesn’t preclude having a solution. Theseus’ ship is another famous fucking thought experiment, which has also been solved.
outhouseperilous@lemmy.dbzer0.com 1 month ago
‘A solution’
That’s not even remotely the point. Yes, there are many valid solutions. The point isn’t to solve it; it’s what your way of solving it says about your ideas, and how it clarifies them.