Comment on [deleted]
scrchngwsl@feddit.uk 5 months ago
The summary is total rubbish and completely misrepresents what it’s actually about. I’m not sure why anyone would bother including that poorly AI-generated summary, if they had already watched the video. Useless AI bullshit.
The video is actually about the movement of AI Safety over the past year from something of fringe academic interest or curiosity into the mainstream of tech discourse, and even into active government policy. He discusses the advancements in AI in the past year in the context of AI Safety, namely, that they are moving faster than expected and that this increases the urgency of AI Safety research.
I’ve followed Robert Miles’ YouTube channel for years and watched his old numberphile videos before “GenAI” was really a thing. He’s a great communicator and a genuinely thoughtful guy. I think he’s overly keen on anthropomorphising what AI is doing, partly because it makes it easier to communicate, but also because I think it suits the field of research he’s dedicated himself to. In this particular video, he ascribes a “theory of mind” based on the LLM’s response to a traditional and well-known theory of mind test. The test is included in the training data, and ChatGPT3.5 successfully recognises it and responds correctly. However, when the details of the test (i.e. specific names, items, etc.) are changed, but the form of the problem is the same, ChatGPT3.5 fails. ChatGPT 4, however, still succeeds – which Miles concludes means that ChatGPT 4 has a stronger theory of mind.
My view is that this is obviously wrong. I mean, just prima facie absurd. ChatGPT3.5 correctly recognises the problem as a classic psychology question, and responds with the standard psychology answer. Miles says that the test is found in the training data. So it’s in ChatGPT4’s training data, too. And ChatGPT 4’s LLM is good enough that, even if you change the nouns used in the problem, it is still able to recognise that the problem is the same one found in its training data. That does not in any way prove it has a theory of mind! It just proves that the problem is in its training set! If 3.5 doesn’t have a theory of mind because a small change can mess up its answer, how can 4.0 have a theory of mind, if 4.0 is doing the same thing that 3.5 is doing, just a bit better?
The most obvious problem is that the theory of mind test is designed for determining whether children have developed a theory of mind yet. That is, they test whether the development of the human brain has reached a stage that is common among other human brains, in which they can correctly understand that other people may have different internal mental states. We know that humans are, generally, capable of doing this, that this understanding is developed during childhood years, and that some children develop it sooner than others. So we have devised a test to distinguish between those children who have developed this capability and those children who have not.
It would be absurd to apply the same test to anything other than a human child. It would be like giving the LLM the “mirror test” for animal self-awareness. Clearly, since the LLM cannot recognise itself in a mirror, it is not self-aware. Is that a reasonable conclusion too? Or do we cherry-pick the existing tests to suit the LLM’s capabilities?
Now, Miles’ substantial point is that the “overton window” for AI Safety has shifted, bringing it into the mainstream of tech and political discourse. To that extent, it doesn’t matter whether ChatGPT has consciousness or not, or a theory of mind, as long as enough people in mainstream tech and political discourse believe it does for it to warrant greater attention on AI Safety. Miles further believes that AI Safety is important in its own right, so perhaps he doesn’t mind whether or not the overton window has shifted on the basis of true AI capability or imagined capability. He hints at, but doesn’t really explore, the ulterior motives for large tech companies to suggest that the tools they are developing are so powerful that they might destroy the world. (He doesn’t even say it as explicitly as I did just then, which I think is a failing.) But maybe that’s ok for him, as long as AI Safety research is being taken seriously.
I disagree. It would be better to base policy on things that are true, and if you have to believe that LLMs have a theory of mind in order to gain mainstream attention on AI Safety, then I think this will lead us to bad policymaking. It will miss the real harms that AI pose – facial recognition used to bar people from shops that have a disproportionately high error rate for black people, resumé scanners and other hiring tools that, again, disproportionately discriminate against black people and other minorities, non-consensual AI porn, etc etc. We may well need policies to regulate this stuff, but focus on hypothetical existential risk of AGI in the future, over the very real and present harms that AI is doing right now, is misguided and dangerous.
It’s a pity, because if AI Safety had just stayed an academic curiosity (as Rob says it was for him), maybe we’d have the policy resources to tackle the real and present problems that AI is causing for people.