How languages recycle parts of words to avoid confusion
by Paul Arnold, Phys.org
edited by Gaby Clark, reviewed by Robert Egan
Many languages recycle words, giving them different meanings. For example, in English, “run” can mean to move quickly but also to manage something, as in “run a company.” In Spanish, “lengua” is the word for both tongue and language, as in “la lengua española.” This type of word reuse is known as colexification.
But there is another type of recycling: partial colexification, where languages reuse only parts of words. A good example is “grand,” which is shared by “grandfather” and “grandmother.” Until now, very little was known about the rules and patterns behind this type of recycling, or about how widespread it is across different languages.
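The difference between full and partial reuse can be illustrated with a small Python sketch. It is a toy comparison of spellings, not the method used in the study: the function names and the `min_shared` cutoff of three characters are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def shared_part(form_a: str, form_b: str) -> str:
    """Return the longest run of characters that both forms contain."""
    matcher = SequenceMatcher(None, form_a, form_b, autojunk=False)
    match = matcher.find_longest_match(0, len(form_a), 0, len(form_b))
    return form_a[match.a:match.a + match.size]


def reuse_type(form_a: str, form_b: str, min_shared: int = 3) -> str:
    """Label a pair of word forms as full, partial, or no reuse.

    min_shared is a hypothetical cutoff: shorter overlaps are treated
    as coincidence rather than as a reused word part.
    """
    if form_a == form_b:
        return "full"
    if len(shared_part(form_a, form_b)) >= min_shared:
        return "partial"
    return "none"


# Two meanings of Spanish "lengua" use the identical form.
print(reuse_type("lengua", "lengua"))
# "grandfather" and "grandmother" differ but share a part.
print(shared_part("grandfather", "grandmother"))
print(reuse_type("grandfather", "grandmother"))
```

Comparing spellings is a simplification. Cross-linguistic work typically compares sound-based transcriptions, and a shared string of letters does not by itself prove that a word part was reused.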
A new study published in the journal Nature Human Behaviour explores how different languages systematically reuse these smaller word parts while balancing efficiency with the need to keep meanings distinct. Barend Beekhuizen at the Department of Language Studies at the University of Toronto Mississauga in Canada has published a News & Views piece on the research in the same journal.
A linguistic tug-of-war
Before setting out on their study, the research team hypothesized that there is a constant tug-of-war between two opposing forces that shape how meanings are mapped to words. These are lexical compression (reusing words to keep things as simple as possible) and lexical differentiation (using different forms to help distinguish meanings). Languages can reuse forms for related meanings, but excessive reuse can make meanings harder to distinguish.
The study authors examined a massive linguistic database called Lexibank, which contains word lists from many languages. They studied data from more than 1,900 languages spanning 192 different language families.
To see how these two forces operate in the real world, the researchers used two tools. First, to measure how closely related two ideas are in human memory, they used data from a word-association game in which thousands of people were given a word and asked to say the first thing that came to mind.
Second, they used AI models to analyze millions of sentences and measure how similar the contexts of different words are. This gave them a way to estimate how easily two meanings might be confused if they shared the same form.
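To give a rough sense of what "similar contexts" means, here is a minimal Python sketch. It is not the researchers' model: it simply counts the words that appear alongside a target word in a handful of invented sentences and compares those counts with cosine similarity. The sentences, the function names and the bag-of-words approach are all assumptions chosen to keep the example small.

```python
import math
from collections import Counter

# Invented toy sentences, not data from the study.
TOY_SENTENCES = [
    "she bought ten apples",
    "she bought fourteen apples",
    "he counted ten coins",
    "he counted fourteen coins",
    "the river flows to the sea",
]


def context_counts(target: str, sentences: list[str]) -> Counter:
    """Count the words that share a sentence with the target word."""
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        if target in tokens:
            counts.update(t for t in tokens if t != target)
    return counts


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


ten = context_counts("ten", TOY_SENTENCES)
fourteen = context_counts("fourteen", TOY_SENTENCES)
river = context_counts("river", TOY_SENTENCES)

# "ten" and "fourteen" occur in the same toy contexts; "river" does not.
print(cosine(ten, fourteen))
print(cosine(ten, river))
```

In this toy data, "ten" and "fourteen" score far higher with each other than either does with "river," which is the kind of contextual overlap that could cause confusion if two meanings shared one form.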
Making life easier
The team discovered that reusing word parts is not random and occurs across many different language families. Full word reuse happens when one word is used for more than one closely related meaning that people can easily tell apart, such as “mouth” (used for a body part and the opening of a river).
Partial word reuse is a middle-ground compromise. It occurs when two ideas are highly related but frequently pop up in similar contexts. In these cases, language reuses parts of words with related meanings to make things easier but keeps the words slightly different to avoid mix-ups. As the researchers note in their paper, “partial colexification appears to arise as a middle-ground strategy when full colexification risks ambiguity in overlapping contexts.”
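The pattern described above can be written as a simple decision rule. The Python sketch below is only a schematic reading of the article's description, not the study's statistical model: the 0-to-1 scores, the function name and both thresholds are hypothetical.

```python
def predicted_strategy(
    relatedness: float,
    context_overlap: float,
    related_min: float = 0.5,   # hypothetical threshold
    overlap_max: float = 0.5,   # hypothetical threshold
) -> str:
    """Map two scores in [0, 1] to the reuse strategy described above."""
    if relatedness < related_min:
        # Unrelated meanings: little reason to share any form.
        return "distinct forms"
    if context_overlap > overlap_max:
        # Related and easily confused: share a part, keep words distinct.
        return "partial colexification"
    # Related but easy to tell apart in context: one word can do both jobs.
    return "full colexification"


print(predicted_strategy(0.9, 0.1))  # related, contexts differ
print(predicted_strategy(0.9, 0.9))  # related, contexts overlap
print(predicted_strategy(0.1, 0.9))  # unrelated
```

Real languages are unlikely to follow hard cutoffs like these. The rule only makes the trade-off between efficiency and clarity explicit.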
One pair predicted by the researchers’ model is “fourteen” and “ten.” They are closely related numbers, but since they are used in similar situations, giving them the exact same name would create confusion. Instead, languages may favor forms that share some material while remaining distinct. The study authors say future studies could explore whether the same balance between efficiency and clarity helps shape other parts of language, such as grammar.
