Syllable Counting And The Secret To The 'Speed' Of Languages
Recent studies from a French laboratory of linguistics reveal surprising aspects of human language that make it even more mysterious than it sounds.
LYON - We all know somebody who speaks with machine-gun speed, and we know others who speak slowly, dragging a conversation on and on. And on ... Polyglot speakers will likely notice that the flow of speech varies not only among individuals but also among languages. One need not read Murakami or Cervantes to know that Japanese and Spanish are spoken rather quickly.
These observations have led a team of linguists from the France-based Laboratory for Language Dynamics (University Lumière Lyon II) to ask an interesting question: Are languages that are spoken faster than others more efficient in transmitting information? Last month they published their findings in the journal Science Advances, and the results are quite extraordinary.
For their experiment, the Lyon linguists asked 170 speakers of 17 different languages to read a series of texts aloud, then applied Claude Shannon's information theory to the recordings. First observation: The intuition that some languages sound faster than others is perfectly justified. The linguists measured each reader's speech rate as the number of syllables pronounced per second and found that it varies significantly between languages. Japanese is spoken at 8.03 syllables per second, Vietnamese at 5.25 and Thai at 4.70. Note that these variations have nothing to do with geographical distribution, since these Asian languages fall at opposite ends of the spectrum.
But speech rate alone says nothing about how efficiently different languages transmit information. To determine this, the linguists studied another parameter, one which may be a little more difficult to grasp: the information density of syllables. The main author of the study, linguist François Pellegrino, explains: "If a syllable can be easily predicted from those which precede it, it brings little information, in Shannon's sense; if, on the contrary, it is difficult to predict, it brings more information." Consider an example from French, where the commonly used word for "because" is a two-word construction, "parce que." When we hear "parce," we can predict with near certainty what the next syllable will be, which means that the information carried by the "que" in this case is almost nil.
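The "parce que" example can be made concrete with Shannon's measure of information, which is the negative base-2 logarithm of an outcome's probability. The sketch below is only an illustration: the two probabilities are invented for the purpose and do not come from the study, and the function name is our own.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Shannon information, in bits, carried by an outcome of given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be greater than 0 and at most 1")
    return -math.log2(probability)


# Hypothetical probabilities, chosen only for illustration.
p_que_after_parce = 0.99  # "que" is almost certain once "parce" has been heard
p_hard_to_guess = 1 / 200  # a syllable that is one of 200 equally likely options

print(f"'que' after 'parce': {surprisal_bits(p_que_after_parce):.3f} bits")
print(f"hard-to-guess syllable: {surprisal_bits(p_hard_to_guess):.3f} bits")
```

The near-certain syllable carries a small fraction of a bit, while the hard-to-guess one carries several bits.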
Not all languages encode the same average amount of information (measured in bits) in each of their syllables. "Average information densities of different languages vary over an interval ranging from 5.03 bits per syllable for Japanese, to 8.02 bits for Vietnamese," says Pellegrino. To say Japanese has an average information density of about 5 bits per syllable means, roughly, that there is a 1 in 32 chance of predicting the next syllable, since 2 to the power of 5 is 32. For Vietnamese, at about 8 bits, the chance is about 1 in 256. The next syllable is thus about eight times easier to predict in Japanese than in Vietnamese, although measured in bits the density of Japanese is only about 1.6 times lower.
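The arithmetic behind these odds can be checked in a few lines. This sketch assumes, as a simplification, that every candidate syllable is equally likely, so that a density of n bits corresponds to a 1 in 2^n chance. The function name is ours, and only the two density figures come from the article. It also prints the ratio of the two densities in bits, which is much smaller than the ratio of the odds.

```python
def equally_likely_choices(bits_per_syllable: float) -> float:
    """Number of equally likely syllables implied by a density in bits."""
    return 2.0 ** bits_per_syllable


JAPANESE_BITS = 5.03    # bits per syllable, as quoted in the article
VIETNAMESE_BITS = 8.02  # bits per syllable, as quoted in the article

japanese_choices = equally_likely_choices(JAPANESE_BITS)
vietnamese_choices = equally_likely_choices(VIETNAMESE_BITS)

# How much harder the next syllable is to guess in Vietnamese than in Japanese.
odds_ratio = vietnamese_choices / japanese_choices
# How much denser Vietnamese is when the comparison is made in bits.
bits_ratio = VIETNAMESE_BITS / JAPANESE_BITS

print(f"Japanese: about 1 in {japanese_choices:.0f}")
print(f"Vietnamese: about 1 in {vietnamese_choices:.0f}")
print(f"Ratio of odds: {odds_ratio:.1f}, ratio of bits: {bits_ratio:.1f}")
```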
Photo: Nick Fewings
So the linguists observed an inverse relation between these two parameters, speech rate (measured in syllables per second) and information density (measured in bits per syllable): The higher the speech rate, the lower the information density, and vice versa. This led them to a remarkable finding: Multiplied together, speech rate and information density give the information rate of a language, in bits per second. For the languages studied, that rate was calculated at about 39 bits per second. No matter how dissimilar languages appear from one another, how fast or slow they are to our ears, they appear to convey roughly the same amount of information over a given amount of time.
"To be effective in terms of information transmission, languages have the choice between two opposing strategies: Either they favor a high rate of speech at the price of a low density of information, or they do the opposite," Pellegrino explains. French is a "medium" language in this regard, roughly equidistant from the two extremes of speech rate (6.85 syllables per second) and information density (6.68 bits per syllable).
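As a back-of-envelope check, the information rate can be computed from the figures quoted above by multiplying syllables per second by bits per syllable. The sketch below assumes the French density figure is in bits per syllable like the others, and its names are our own. The products of these rounded averages come out in the low to mid 40s rather than at exactly 39. The article does not say how the 39 figure was averaged, so the gap should be read as a limit of this rough check, not as a contradiction of the study.

```python
# (syllables per second, bits per syllable), as quoted in the article.
QUOTED_FIGURES = {
    "Japanese": (8.03, 5.03),
    "French": (6.85, 6.68),
    "Vietnamese": (5.25, 8.02),
}


def information_rate(syllables_per_second: float, bits_per_syllable: float) -> float:
    """Information rate in bits per second: speech rate times density."""
    return syllables_per_second * bits_per_syllable


rates = {
    language: information_rate(speech_rate, density)
    for language, (speech_rate, density) in QUOTED_FIGURES.items()
}

for language, rate in rates.items():
    print(f"{language}: {rate:.1f} bits per second")
```

Japanese is spoken about half as fast again as Vietnamese, yet the two products differ by only a few percent.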
That the information rates of languages settle at about 39 bits per second probably owes nothing to chance, but is bound up with the cognitive capabilities with which brains process language. If a language fell below the threshold of 39 bits per second, it would not allow speakers to cope with the complexity of the world, leading to that language's elimination. If it were above the threshold, cognitive capabilities would be overloaded (we cannot permanently sustain the production or processing of too much information), and the language would die. So the threshold of 39 bits per second corresponds to a niche, both biological and cultural in origin, which defines the zone of viability in which human languages can exist.
Our vision of language is evolving constantly. Previous studies at the Lyon laboratory have shown that new sounds may appear in a language. For example, the addition of nasal vowels (like the French "an", "in", "on" etc.) can double the total number of vowels a language has. More sounds and more syllables increase the information density of a language. But does this mean it's possible to dislodge a language from its limit of 39 bits per second? No, responds François Pellegrino: "Our hypothesis is that a change to the structure of language modifies its syllabic density, which also leads its speakers to modify their speech rates inversely, in order to preserve optimal information rates." The process is not unlike the Darwinian mechanisms that subject living species to the laws of evolution: adapt or die.
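Pellegrino's hypothesis amounts to holding the product of speech rate and density constant, so the compensating speech rate is the target rate divided by the new density. The sketch below illustrates that relation only. The language in it is hypothetical, with densities invented to give round numbers, and only the 39 bits per second target comes from the article.

```python
TARGET_RATE = 39.0  # bits per second, the figure quoted in the article


def compensating_speech_rate(bits_per_syllable: float,
                             target_rate: float = TARGET_RATE) -> float:
    """Syllables per second needed to keep the information rate at target_rate."""
    if bits_per_syllable <= 0:
        raise ValueError("bits_per_syllable must be positive")
    return target_rate / bits_per_syllable


# Hypothetical language whose syllables become denser as new sounds appear.
density_before = 6.5  # bits per syllable (invented)
density_after = 7.8   # bits per syllable (invented)

rate_before = compensating_speech_rate(density_before)
rate_after = compensating_speech_rate(density_after)

print(f"Before: {rate_before:.2f} syllables per second")
print(f"After: {rate_after:.2f} syllables per second")
```

Under these invented figures, speakers would slow from about 6 to about 5 syllables per second to stay at the same information rate.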