Comment on I notice Indians speaking English tend to speak very fast. Are the Indian languages simply spoken faster?

Ok, so I heard a thing a long time ago about information density in languages, and that there’s a specific amount of information conveyed per second which is pretty consistent across languages, even when the number of sounds is higher or lower. Which means that a single word in English, for instance, would convey more information than a single word in Hindi.

Is there anything to that? Or was that just nonsense?


merc@sh.itjust.works 1 year ago
Someone posted a link to just that topic here. Apparently almost all languages transmit about 39 bits of information per second. Italians speak about 9 syllables per second and Germans only about 5-6, but both convey the same amount of information per second, because not all syllables are equal. Japanese has about 5 bits per syllable, English about 7. The most information-dense language per syllable is apparently Vietnamese, with about 8 bits per syllable.

Apparently, though, the bottleneck is the brain. The end result seems to be that languages with fewer “bits of data” per syllable say those syllables more quickly, and languages with more bits of data per syllable say them more slowly, so that the average is about 39 bits per second no matter what the language.

Having said that, I often listen to podcasts sped up to 1.5x speed, and I listen to podcasts while doing other things, so I guess the bottleneck is probably on the sending side rather than the receiving side.
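The trade-off described above is just a product: bits per syllable times syllables per second. A minimal sketch of that arithmetic, using only the approximate figures quoted in this comment (the function names are made up for illustration, and the 1.5x case assumes speeding up audio scales the information rate linearly):

```python
# Approximate figures quoted in the comment above; not measured here.
TARGET_BITS_PER_SECOND = 39.0
BITS_PER_SYLLABLE = {"Japanese": 5.0, "English": 7.0, "Vietnamese": 8.0}


def implied_syllable_rate(bits_per_syllable, target=TARGET_BITS_PER_SECOND):
    """Syllables per second needed to reach the target information rate."""
    return target / bits_per_syllable


def sped_up_rate(speed, base=TARGET_BITS_PER_SECOND):
    """Information rate of playback at a given speed multiplier.

    Assumption: the rate scales linearly with playback speed.
    """
    return base * speed


for language, bits in BITS_PER_SYLLABLE.items():
    rate = implied_syllable_rate(bits)
    print(f"{language}: about {rate:.1f} syllables per second")

print(f"Podcast at 1.5x: about {sped_up_rate(1.5):.1f} bits per second")
```

Under these numbers Japanese would need roughly 7.8 syllables per second and Vietnamese roughly 4.9 to land on the same 39 bits per second, while a 1.5x podcast would deliver about 58.5.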

- takeheart@lemmy.world 1 year ago
  Podcasts, being prerecorded and edited, don’t really fit this model. It applies more to a back-and-forth conversation where neither interlocutor knows ahead of time what the other will say. Both need to listen and reflect while also coming up with answers and putting effort into being properly understood. That is basically the natural context in which human communication evolved.
  
- ytg@feddit.ch 1 year ago
  Does anyone know how the amount of information is actually derived? The article just says “researchers calculated.”
  
  - merc@sh.itjust.works 1 year ago
    They were vague about it, but they said something about converting it to computer code. I would guess they just wrote it out as ASCII text and counted how many bits of ASCII equivalent they transmitted. (Of course this ignores intonation and emphasis, but I’d guess they did ignore those.)
    
    bleistift2@feddit.de 1 year ago
    If that’s really what they did, it’s stupid. First, you need to find a translation for every language to ASCII, which will wildly skew the results. Second, there are many ways to express the same concept, which all vary wildly in length. Take “Hi”, 2 letters, which means exactly the same as “How are you doing?”, 14 letters.
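To make the objection concrete, here is a rough sketch of the ASCII-counting approach guessed at above. This is only that guess, not a method the researchers are known to have used; the function name and the choice of 8 bits per character are assumptions for illustration.

```python
def naive_ascii_bits(phrase, bits_per_char=8):
    """Count letters only and charge a fixed number of bits per letter.

    Assumption: 8 bits per character, with spaces and punctuation ignored.
    """
    letters = [ch for ch in phrase if ch.isalpha()]
    return len(letters) * bits_per_char


for greeting in ("Hi", "How are you doing?"):
    print(f"{greeting!r}: {naive_ascii_bits(greeting)} bits")
```

With these assumptions “Hi” costs 16 bits and “How are you doing?” costs 112, a sevenfold difference for what the comment treats as the same message.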
    
actual_patience@programming.dev 1 year ago

> Ok, so I heard a thing a long time ago about information density in languages, and that there’s a specific amount of information conveyed per second which is pretty consistent across languages, even when the number of sounds is higher or lower.

This is true.

> Which means that a single word in English, for instance, would convey more information than a single word in Hindi.

I don’t think that’s the right interpretation. There are words in English that would each take a whole sentence to convey in another language. But the same is true vice versa.

Have a look at movie subtitles translated from one language to any other. Translators struggle to convey what can be a paragraph’s worth of context behind a single word in the source language. Do not get me started on doublespeak.

- ilinamorato@lemmy.world 1 year ago
  Oh, interesting. I hadn’t considered that there would be variances in information density within a language, but that makes sense. “Truth” is a very loaded concept that means a lot of different things in context, even though it’s only one syllable. On the other hand, “authenticity” is five syllables but carries a meaning that is a subset of the definition of “truth.”
  
  I guess that’s why subtitling is even possible in different languages; if there were languages with vastly less information density than the source language, they’d need a whole screen just for the captions.
  
LotrOrc@lemmy.world 1 year ago
Fairly nonsense. If anything, I’d say it’s the other way around: there are lots of words in Hindi/Malayalam that you need 5 or 6 English words to describe.

- bionicjoey@lemmy.ca 1 year ago
  It’s not nonsense. Information density isn’t about the number of words. It’s about the duration and complexity of communication, and it is fairly consistent across all languages. Some languages take 3 words to say something another can say in one, but those 3 words probably take a similar amount of brainpower and time to communicate as the one word.
  