37 is well represented. Proof that we’ve taught AI some of our own weird biases.
Ask ChatGPT to pick a number between 1 and 100
Submitted 7 months ago by ElCanut@jlai.lu to technology@beehaw.org
https://jlai.lu/pictrs/image/84494667-202e-470f-a5f4-8217521e54a5.jpeg
Comments
Bishma@discuss.tchncs.de 7 months ago
GenderNeutralBro@lemmy.sdf.org 7 months ago
What’s special about 37? Just that it’s prime or is there a superstition or pop culture reference I don’t know?
Bishma@discuss.tchncs.de 7 months ago
If you discount the pop-culture numbers (for us 7, 42, and 69), it’s the number most often chosen by people if you ask them for a random number between 1 and 100. It just seems the most random one to choose for a lot of people. Veritasium just did a video about it.
Karyoplasma@discuss.tchncs.de 7 months ago
It’s just that humans are terrible at understanding the concept of randomness. A study by Theodore P. Hill showed that when tasked to pick a random number between 1 and 10, almost a third of the subjects (n was over 8500) picked 7. 10 was the least picked number (if you ditch the few idiots that picked 0).
Zorque@kbin.social 7 months ago
gigachad@feddit.de 7 months ago
Johandea@feddit.nu 7 months ago
youtu.be/d6iQrh2TK98?feature=shared
Just a number dumb monkeys believe to be “more random”.
BuboScandiacus@mander.xyz 7 months ago
tooLikeTheNope@lemmy.ml 7 months ago
My art professor wrote a book about famous artists and thinkers dying at 37, www.ibs.it/…/9788804734017
FiniteBanjo@lemmy.today 7 months ago
Why would that need to be proven? We’re the sample data. It’s implied.
jarfil@beehaw.org 7 months ago
The correctness of the sampling process still needs a proof. Like this.
EatATaco@lemm.ee 7 months ago
“we don’t need to prove the 2020 election was stolen, it’s implied because trump had bigger crowds at his rallies!” -90% of trump supporters
Another good example is the Monty Hall “paradox” where 99% of people are going to incorrectly tell you the chance is 50% because they took math and that’s how it works.
Just because something seems obvious to you doesn’t mean it is correct. Always a good idea to test your hypothesis.
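For anyone who wants to test the Monty Hall claim rather than argue about it, here is a minimal simulation sketch. It assumes the standard rules (three doors, the host always opens a goat door the player did not pick); the function names are made up for illustration.

```python
import random

def monty_hall_trial(switch, rng):
    """Play one round; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won with a fixed stay-or-switch strategy."""
    rng = random.Random(seed)
    wins = sum(monty_hall_trial(switch, rng) for _ in range(trials))
    return wins / trials

if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Staying should come out near 1/3 and switching near 2/3, not 50/50.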
olicvb@lemmy.ca 7 months ago
holy crap, the answer to life the universe and everything XD
WarmSoda@lemm.ee 7 months ago
More than likely it’s because of that book and how often it’s quoted
Empricorn@feddit.nl 7 months ago
Yes, but it’s significant because the prompt was to choose a number. I realize computers can’t really be random, but if we needed to just select a popular number…we can already do that!
FiniteBanjo@lemmy.today 7 months ago
No shit, sherlock, its sample data is the internet.
Appoxo@lemmy.dbzer0.com 7 months ago
Where’s 69 then?
Chadus_Maximus@lemm.ee 7 months ago
That’s a naughty number and we don’t allow those.
FiniteBanjo@lemmy.today 7 months ago
nice
ReallyActuallyFrankenstein@lemmynsfw.com 7 months ago
What does “temperature” on the Y-axis refer to?
gerryflap@feddit.nl 7 months ago
I’m not a hundred percent sure, but afaik it has to do with how random the output of the GPT model will be. At 0 it will always pick the most probable next continuation of a piece of text according to its own prediction. The higher the temperature, the more chance there is for less probable outputs to get picked. So it’s most likely to pick 42, but as the temperature increases you see the chance of (according to the model) less likely numbers increase.
This is how temperature works in the softmax function, which is often used in deep learning.
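A rough sketch of what that looks like in code, assuming the usual softmax-with-temperature formulation. The scores below are invented for illustration and are not real model output, and treating temperature 0 as "always pick the top score" is a convention of this sketch.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        # Temperature 0 is treated as greedy: all probability on the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate answers (not real model output).
logits = {"42": 3.0, "37": 2.0, "73": 1.0}
for t in (0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(list(logits.values()), t)
    print(t, {k: round(p, 3) for k, p in zip(logits, probs)})
```

As the temperature goes up, the probabilities flatten out, so the less likely answers get picked more often.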
driving_crooner@lemmy.eco.br 7 months ago
youtu.be/wjZofJX0v4M your answer from the 22:00 mark on.
ReallyActuallyFrankenstein@lemmynsfw.com 7 months ago
Super helpful, thanks!
HarkMahlberg@kbin.social 7 months ago
I mean... they didn't specify it had to be random? But yeah, it's a good showcase of how GPT acquired the same biases as people, from people.
OsrsNeedsF2P@lemmy.ml 7 months ago
uniform
Reminds me of my previous job where our LLM was grading things too high. The AI “engineer” adjusted the prompt to tell the LLM that the average output should be 3. I had a hard time explaining that wouldn’t do anything at all, because all the chats were independent events.
Anyways, I quit that place and the project completely derailed.
lauha@lemmy.one 7 months ago
Ask humans the same and the most common number is 37
Catsrules@lemmy.ml 7 months ago
I saw that YouTube video as well.
Cethin@lemmy.zip 7 months ago
For very different reasons though. 37 is what people think is the most random, because humans are dumb. The LLM here tried to choose the most likely.
lemmyingly@lemm.ee 7 months ago
Hello Veritasium enjoyer
erwan@lemmy.ml 7 months ago
In his video, he shows that the more common answers are actually 42 and 69.
He discards them because they’re picked for a reason rather than by a human genuinely trying to pick a random number, but they’re still way more common than 37.
cypherpunks@lemmy.ml 7 months ago
Corgana@startrek.website 7 months ago
HOW DID THE TRUCK GET INTO SPACE??
Love that episode though.
Crozekiel@lemmy.zip 7 months ago
I always like to throw out 37 because of Dante’s girlfriend.
ForestOrca@kbin.social 7 months ago
WAIT A MINUTE!!! You mean Douglas Adams was actually an LLM?
ElCanut@jlai.lu 7 months ago
I’ve never seen Douglas Adams and an LLM in the same room together 🤷
Naboo_calls_for_aid@sopuli.xyz 7 months ago
So many things are starting to make sense
dudinax@programming.dev 7 months ago
In an interview, Douglas Adams said after lengthy consideration John Cleese picked 42 as the least interesting number.
pipows@lemmy.today 7 months ago
LLMs aren’t thinking, aren’t inventing, they are predicting what is supposed to be answered next, so it’s expected that they will produce the same results every time
xthexder@l.sw0.com 7 months ago
You’re mostly right, but this graph actually shows a little more about what’s happening with the “temperature” of the LLM.
It’s actually predicting the probability of each word (token) it knows coming next, all at once. The temperature then says how random it should be when picking from that list. A temperature of 0 means always pick the most likely next word, which in this case ends up being 42.
As the temperature increases, it gets more random (but you can see it’s still not a perfect random distribution with a temperature of 1).
eluvatar@programming.dev 7 months ago
Except it clearly doesn’t produce the same result every time. You’re not making a good case for whatever you’re trying to say.
Cethin@lemmy.zip 7 months ago
They add some fuzziness to it so it doesn’t give the exact same result. Say one gets a score of 90, another 85, and another 80. The 90 will be picked more often, but they sometimes let it pick the 85, or even the 80. It’s perfectly expected, and you can see that result here with 42 being very common, but then a few others being fairly common, and most being extremely uncommon.
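To make the 90/85/80 example concrete, here is a small sampling sketch. The scores come from the comment; the labels "A", "B", "C", the exp(score / temperature) weighting, and the temperature of 5.0 are all assumptions chosen only to illustrate the idea, not how any particular model is configured.

```python
import math
import random
from collections import Counter

def sample_counts(scores, temperature, draws=10_000, seed=0):
    """Draw repeatedly, weighting each option by exp(score / temperature)."""
    rng = random.Random(seed)
    top = max(scores.values())
    # Subtracting the top score keeps the exponentials in a safe range.
    weights = [math.exp((s - top) / temperature) for s in scores.values()]
    picks = rng.choices(list(scores), weights=weights, k=draws)
    return Counter(picks)

# Scores taken from the comment above; the labels are hypothetical.
scores = {"A": 90, "B": 85, "C": 80}
print(sample_counts(scores, temperature=5.0))
```

The top-scoring option wins most draws, but the other two still show up some of the time.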
FlashMobOfOne@beehaw.org 7 months ago
HA, funny that this comes up. DND Beyond doesn’t have a d100, so I opened my ChatGPT sub and had it roll a d100 for me a few times so I could use my magic beans properly.
terminhell@lemmy.dbzer0.com 7 months ago
I use the percentile die for that.
FlashMobOfOne@beehaw.org 7 months ago
Also an excellent method.
TauriWarrior@aussie.zone 7 months ago
Opened up DND Beyond to check since I remember rolling it before, and it’s there: it’s between D8 and D10, and the picture even shows 2 dice
FlashMobOfOne@beehaw.org 7 months ago
That’s helpful. Thank you.
Urist@lemmy.ml 7 months ago
Roll two d10, once for each digit, and profit?
Matty_r@programming.dev 7 months ago
I guess you’d need 10 to represent 0, and if you got 2x 10 that would be 100?
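A minimal sketch of that two-d10 scheme, assuming each die has faces 1 to 10, a 10 is read as the digit 0, and double 10 counts as 100. The function names are made up for illustration.

```python
import random

def d10_digit(rng):
    """Roll a d10 with faces 1-10 and read the 10 face as the digit 0."""
    face = rng.randint(1, 10)
    return 0 if face == 10 else face

def roll_d100(rng=random):
    """Combine a tens die and a ones die; double 10 (00) counts as 100."""
    tens = d10_digit(rng)
    ones = d10_digit(rng)
    value = tens * 10 + ones
    return 100 if value == 0 else value

if __name__ == "__main__":
    print([roll_d100() for _ in range(5)])
```

Every result from 1 to 100 corresponds to exactly one pair of faces, so the roll is uniform as long as the dice are fair.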
Cube6392@beehaw.org 7 months ago
But why use ChatGPT for that? Why not a DuckDuckGo action? I just don’t understand why we’re asking an LLM whose goal is consistency, not randomness, to do random
DarkFox@pawb.social 7 months ago
Which model?
When I tried on ChatGPT 4, it wrote a short python script and executed it to get a random integer.
import random
# Pick a random number between 1 and 100
random_number = random.randint(1, 100)
random_number
TonyTonyChopper@mander.xyz 7 months ago
does the neural network actually run scripts or is it pretending
amju_wolf@pawb.social 7 months ago
It generates code and then you can use a call to some runtime execution API to run that code, completely separate from the neural network.
Umbrias@beehaw.org 7 months ago
That’s not answering the question though.
“Pick a number between 1 and 100” doesn’t mean “grab two d10” or write a script.
xyguy@startrek.website 7 months ago
Only 1000 times? It’s interesting that there’s such a bias there but it’s a computer. Ask it 100,000 times and make sure it’s not a fluke.
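A back-of-the-envelope check on whether 1000 asks is enough, assuming a truly uniform picker so that each number's count follows a binomial distribution. The function names are hypothetical, and no observed counts from the graph are plugged in here.

```python
import math

def uniform_count_stats(samples, choices=100):
    """Mean and standard deviation of one number's count under uniform picks."""
    p = 1 / choices
    mean = samples * p
    std = math.sqrt(samples * p * (1 - p))
    return mean, std

def z_score(observed, samples, choices=100):
    """How many standard deviations an observed count sits from the mean."""
    mean, std = uniform_count_stats(samples, choices)
    return (observed - mean) / std

mean, std = uniform_count_stats(1000)
print(f"expected count per number: {mean:.1f} +/- {std:.2f}")
```

Under that assumption, a single number showing up far more than roughly ten times in 1000 asks is already hard to put down to chance, so more samples would mostly tighten the estimate.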
Wirlocke@lemmy.blahaj.zone 7 months ago
I’m curious, is there actually so many 42’s in the system (more than 69 sounds unlikely).
What if the LLM is getting tripped up because 42 is always referred to as the answer to “the Ultimate Question of Life, the Universe, and Everything”.
So you ask it a question like give a number between 1-100, it answers 42 because that’s the answer to “Everything”, according to its training data.
Something similar happened to Gemini. Google discouraged Gemini from giving unsafe advice because it’s unethical. Then Gemini refused to answer questions about C++ because it’s considered “unsafe” (referring to memory management). But Gemini thinks C++ is “unsafe” (the normal meaning), therefore it’s unethical. It’s like those jailbreak tricks but from its own training set.
Corgana@startrek.website 7 months ago
I’m curious, is there actually so many 42’s in the system?
Sort of, it’s not actually picking a random number. It does not know what “random” means. It is analyzing the number of times the question “pick a random number” was asked and what the most common responses to that question looked like.
Glasgow@lemmy.ml 7 months ago
I certainly hope that’s what’s happening or maybe it is actually the answer.
exanime@lemmy.today 7 months ago
I’m curious, is there actually so many 42’s in the system? (more than 69 sounds unlikely)
From The Hitchhiker’s Guide to the Galaxy?
thesmokingman@programming.dev 7 months ago
Rekhyt@beehaw.org 7 months ago
There’s a great Veritasium video recently about this exact thing: youtu.be/d6iQrh2TK98
It’s a human thing, though. This is just more evidence of LLM’s problem with garbage in, garbage out: it’s human biases being present in a system that people want to claim doesn’t have them.
Grimpen@lemmy.ca 7 months ago
Veritasium just released a video about people picking 37 when asked to pick a random number.
humbletightband@lemmy.dbzer0.com 7 months ago
People do mention Veritasium, though he doesn’t give any significant explanation of the phenomenon.
I still wonder about 47. In Veritasium plots, all these numbers provide a peak, but not 47. I recall from my childhood that I indeed used to notice that number everywhere, but idk why.
warm@kbin.earth 7 months ago
47 does provide a peak in the plots though? All the numbers ending in 7 do.
thesmokingman@programming.dev 7 months ago
See my link for 47. Its Wikipedia article has more context. If you’re a Star Trek fan, you’ve seen it a ton.
BuboScandiacus@mander.xyz 7 months ago
37
PhreakyByNature@feddit.uk 7 months ago
NEEDS MOAR 69 FELLOW HUMAN
Semi-Hemi-Demigod@kbin.social 7 months ago
So what? It figured out The Answer, big whoop.
Get back to me when it figures out The Question.
lolola@lemmy.blahaj.zone 7 months ago
What’s the y axis?
Blackmist@feddit.uk 7 months ago
I spent an afternoon once playing Infinite Craft, which uses some sort of LLM behind the scenes to do its combinations.
At one point I got 007, and found 007+007 = 0014.
The maths gets wild though, and because it’s been trained on text, it has no idea when it comes to combinations of numbers it hasn’t seen before. I spent ages trying to get it to 69420 and just couldn’t, although I could get 42069.
phorq@lemmy.ml 7 months ago
I petition to rename ChatGPT to DeepThought based on these results.
Kyre@kbin.social 7 months ago
Phroon@beehaw.org 7 months ago
― Douglas Adams, Life, the Universe and Everything
AlexisFR@jlai.lu 7 months ago
The mattress? Like for sleeping?
Asafum@feddit.nl 7 months ago
Yep! The hitchhikers books are so much fun lol