Survival of the Fittest
The phrase “survival of the fittest” is usually associated with Charles Darwin, but it was actually coined by Herbert Spencer in 1864. Spencer, an English philosopher and sociologist, published his book “Principles of Biology” after reading Darwin’s On the Origin of Species (1859). He used the phrase to describe his interpretation of Darwin’s concept of natural selection, while emphasizing that cultural institutions, laws, and education systems act as selection pressures just as much as climate or predators do. Darwin later borrowed the phrase and used it in the 5th edition (1869) of On the Origin of Species as a synonym for natural selection.
Since Darwin’s time, the idea of survival of the fittest has broadened beyond his original thesis to take in other factors such as accidents, mass extinctions, and pure chance.
“Survival of the Fittest” Is Oversimplified
Though the phrase is still widely quoted, “fitness” originally meant reproductive success: the ability to leave viable offspring. Peter Kropotkin, in Mutual Aid: A Factor of Evolution (1902), highlighted cooperation as a major evolutionary driver. Many organisms, including honey bees and humans, survive through cooperation and altruism, not just competition. These are some of the many biological factors affecting the survival of a species.
Central to human culture, education, technology, and cooperation are the languages that transmit these important non-biological survival factors. Here we consider whether language itself is subject to survival pressures.
The first known application of the concept of “survival of the fittest” to languages is generally credited to August Schleicher (1821–1868), a German linguist. Schleicher did not use that actual phrase; he proposed only that Darwin’s biological theory could be applied to linguistics. His key work was “Die Darwinsche Theorie und die Sprachwissenschaft” (1863) (The Darwinian Theory and Linguistics). Darwin too pondered the survival of languages:
We see variability in every tongue, and new words are continually cropping up; but as there is a limit to the powers of the memory, single words, like whole languages, gradually become extinct. As Max Müller has well remarked: – “A struggle for life is constantly going on amongst the words and grammatical forms in each language. The better, the shorter, the easier forms are constantly gaining the upper hand, and they owe their success to their own inherent virtue.” To these more important causes of the survival of certain words, mere novelty may, I think, be added; for there is in the mind of man a strong love for slight changes in all things. The survival or preservation of certain favoured words in the struggle for existence is natural selection. (Darwin 1871: 60–61)
Survival has been a recurring theme among scientists, and the questions of how to survive and which are the fittest have answers that keep evolving. Language is clearly a primary driver of human survival. Technology passed from one generation to the next has enabled us to reach a remarkable population of more than 8 billion. We achieved that not through reproductive prowess but by harnessing energy to produce an environment that made reproduction and survival almost effortless. Now our profligate use of energy is making survival increasingly difficult for other forms of life. We pollute the environment while using our keen language skills to debate the rationality of what we are doing.
Language as a Tool of Aggression
We seldom consider language a weapon, but history shows how we use it to act on our genetic disposition toward competition for food and resources. Sophisticated warfare would be impossible without language: leaders bark orders to assault and kill opponents who almost universally speak a different language.
A less bloody form of war is class warfare within a population all ostensibly speaking the same language. The divide between an aristocracy speaking the Queen’s English and a peasantry speaking various regional dialects effectively kept wealth and power in the hands of the aristocracy in England for hundreds of years. The same thing is playing out even today. Inner-city young people limit their chances of upward mobility simply by copying the language of their inner-city parents and adding to it slang that is unintelligible to anyone but themselves, thus alienating themselves from the greater society and even from their parents. Alienation is partly a function of language.
Is there a Way to Predict the Survival of a Language?
We propose seven factors that affect whether a language remains relatively stable in a population. There are certainly many other factors, such as the preference for shorter words in Darwin’s quotation from Müller above. He could have identified the umlaut in Müller’s name as problematic, but Darwin never heard of ASCII. We include ASCII in our 7th core factor below, Script Adaptability. The same factors may allow us to predict which languages will not remain stable in a population.
The seven variables (P, T, E, C, G, D, S) are: Population of Speakers (broken down into native speakers and fluent non-native speakers); Inter-generational Transmission Rate; Economic Value; Cultural Prestige; Government Support; Digital Presence; and Script Adaptability.
Language Fitness Model
As with Survival of the Fittest, this Language Fitness Model is admittedly oversimplified. The coefficients (a–g) are weights (0 to 1) based on the empirical impact of each of the seven variables (P, T, E, C, G, D, S). Admittedly some of them are good guesses, even for Grok.
For example, we give Hebrew a 1 for Government Support (G) and Inter-generational Transmission (T). The 0.005 for Cherokee Digital Presence (D) was easy for Grok to find and score. This is true of all the “D” scores, while “C” scores tend to be more subjective, resting on often ambiguous or biased opinion. The relative importance of the factors is also difficult to measure. The regional cultural prestige of Arabic in countries where Islam is the majority religion may produce a high “C” in those regions even where Arabic or its dialects are not spoken, as in Iran. Also missing is any time scale: which languages will disappear in 50, 100, or 1,000 years?
Fitness Function:
F = (aP* + bT + cE + dC + eG + fD + gS) / 7
The population term P* splits speakers into two groups, native speakers (L1) and fluent non-native speakers (L2):
P* = αP_L1 + (1 − α)P_L2
P_L1 = scaled native speaker population (L1)
P_L2 = scaled fluent speaker population (L2)
α = weight for native speakers
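The weighted population term can be sketched in a few lines of Python. This is only an illustration: the text does not state a value for α or the exact scaling used, so the default `alpha=0.5` and the choice to scale both groups against the 1.52 B English total are assumptions, and the function name `weighted_population` is hypothetical. The sketch is not expected to reproduce the P* column of the table.

```python
# Reference population in millions (the 1.52 B English total named under Core Factors).
REFERENCE_MILLIONS = 1520.0


def weighted_population(l1_millions, l2_millions, alpha=0.5,
                        reference_millions=REFERENCE_MILLIONS):
    """Return P* = alpha * P_L1 + (1 - alpha) * P_L2.

    P_L1 and P_L2 are the native and fluent speaker counts scaled to [0, 1].
    alpha=0.5 is an assumed weight, not a value taken from the text.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    p_l1 = min(l1_millions / reference_millions, 1.0)
    p_l2 = min(l2_millions / reference_millions, 1.0)
    return alpha * p_l1 + (1.0 - alpha) * p_l2


# Spanish speaker counts from the table, with the assumed equal weighting.
print(round(weighted_population(486, 100), 3))
```

Raising `alpha` toward 1 rewards a large native base; lowering it rewards languages that are widely learned as a second language.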
Core Factors: (Variables Affecting Language Fitness)
- Population of Speakers (P) – number of fluent speakers, native and non-native, scaled relative to the largest total (English, at 1.52 B).
- Inter-generational Transmission Rate (T) – proportion of children learning the language as a first language. (high = 1, mid = 0.7, low = 0.3)
- Economic Value (E) – relative advantage of knowing the language for employment and commerce. (global: 1; regional: 0.6)
- Cultural Prestige (C) – association with literature, religion, science, or media. (global: 1; regional literary prestige: 0.7)
- Governmental/Legal Support (G) – education, official status, and policy protection. (official global = 1; regional official = 0.7; none = 0.3)
- Digital Presence (D) – availability of online content, software localization, and social media use. (derived from share of web content per language)
- Script Adaptability (S) – ease of writing, availability of keyboards, literacy infrastructure. (alphabetical standard with full Unicode support = 1; complex script with limited support = 0.5; endangered orthography = 0.3)
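The fitness function over these seven factors can be sketched as follows. The text does not list values for the coefficients a–g, so this sketch assumes every coefficient equals 1; the names `FACTORS` and `fitness` are hypothetical. The sample scores are the English row of the table.

```python
FACTORS = ("P", "T", "E", "C", "G", "D", "S")


def fitness(scores, weights=None):
    """Return F = (aP* + bT + cE + dC + eG + fD + gS) / 7.

    scores maps each factor letter to a value in [0, 1], with "P" holding P*.
    weights maps each factor letter to its coefficient; when omitted, every
    coefficient is assumed to be 1 because the text gives no values for a-g.
    """
    if weights is None:
        weights = {name: 1.0 for name in FACTORS}
    total = sum(weights[name] * scores[name] for name in FACTORS)
    return total / len(FACTORS)


english = {"P": 0.82, "T": 1.00, "E": 1.00, "C": 1.00,
           "G": 1.00, "D": 0.521, "S": 1.00}
print(round(fitness(english), 2))  # 0.91
```

With unit coefficients this matches the F value listed for English. Other rows may differ from a plain average, since their F values depend on coefficients that are not stated.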
| Language | Native Speakers (L1) | Fluent Speakers (L2) | L2 Reach | P* (Weighted) | T | E | C | G | D | S | F (Fitness) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| English | 370M | 1,150M | 3.1:1 | 0.82 | 1.00 | 1.00 | 1.00 | 1.00 | 0.521 | 1.00 | 0.91 |
| Spanish | 486M | 100M | 0.21:1 | 0.41 | 1.00 | 0.90 | 0.90 | 1.00 | 0.055 | 1.00 | 0.77 |
| French | 80M | 220M | 2.7:1 | 0.40 | 1.00 | 0.85 | 1.00 | 1.00 | 0.039 | 1.00 | 0.74 |
| Portuguese | 258M | 25M | 0.10:1 | 0.25 | 1.00 | 0.70 | 0.70 | 1.00 | 0.031 | 1.00 | 0.69 |
| German | 76M | 15M | 0.20:1 | 0.09 | 1.00 | 0.80 | 0.90 | 1.00 | 0.055 | 1.00 | 0.69 |
| Arabic | 310M | 110M | 0.35:1 | 0.39 | 1.00 | 0.60 | 0.80 | 1.00 | 0.006 | 0.70 | 0.67 |
| Mandarin | 1,120M | 30M | 0.03:1 | 0.76 | 1.00 | 0.90 | 0.90 | 1.00 | 0.013 | 0.80 | 0.65 |
| Russian | 154M | 110M | 0.71:1 | 0.081 | 1.00 | 0.60 | 0.80 | 1.00 | 0.045 | 1.00 | 0.65 |
| Hebrew | 9M | 0.2M | 0.02:1 | 0.02 | 1.00 | 0.50 | 0.90 | 1.00 | 0.043 | 1.00 | 0.63 |
| Japanese | 125M | 3M | 0.02:1 | 0.13 | 1.00 | 0.60 | 0.80 | 1.00 | 0.043 | 0.50 | 0.61 |
| Persian | 70M | 4M | 0.06:1 | 0.07 | 1.00 | 0.40 | 0.60 | 0.70 | 0.043 | 0.70 | 0.58 |
| Welsh | 0.56M | 0.35M | 0.63:1 | 0.01 | 0.70 | 0.40 | 0.70 | 1.00 | 0.020 | 0.90 | 0.57 |
| Yucatec Maya | 0.80M | 0.05M | 0.06:1 | 0.53 | 0.60 | 0.35 | 0.60 | 0.70 | 0.08 | 0.90 | 0.54 |
| Hindi | 602M | 260M | 0.43:1 | 0.46 | 1.00 | 0.70 | 0.80 | 0.70 | 0.005 | 0.70 | 0.49 |
| Māori | 0.05M | 0.30M | 6:1 | 0.06 | 0.70 | 0.30 | 0.60 | 0.70 | 0.010 | 0.80 | 0.48 |
| Basque | 0.75M | 0.10M | 0.13:1 | 0.01 | 0.50 | 0.20 | 0.50 | 0.70 | 0.015 | 0.60 | 0.36 |
| Cherokee | 0.002M | 0.002M | 1:1 | 0.002 | 0.20 | 0.10 | 0.30 | 0.40 | 0.005 | 0.50 | 0.24 |
| Cornish | <0.001M | 0.0006M | Revival | 0.0001 | 0.30 | 0.05 | 0.20 | 0.30 | 0.005 | 0.30 | 0.22 |
Color codes for L2 reach: Light Blue → Very high L2 reach (>5:1); Yellow → Moderate L2 reach (1:1 – 5:1); Grey → Low L2 reach (<1:1); Orange → Revival (L2 >> L1 but almost no native base)
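The bands in this legend can be expressed as a small classifier. This is a sketch under stated assumptions: the legend gives no numeric definition of “almost no native base”, so the `revival_l1_cutoff` of 0.001 million (1,000 native speakers) is an assumption, as are the function name `l2_reach_band` and the choice to count a ratio of exactly 1:1 or 5:1 as moderate.

```python
def l2_reach_band(l1_millions, l2_millions, revival_l1_cutoff=0.001):
    """Map native (L1) and fluent (L2) speaker counts, in millions, to a band.

    revival_l1_cutoff is an assumed threshold for "almost no native base";
    the legend itself does not give a number.
    """
    if l1_millions < revival_l1_cutoff:
        return "revival"
    ratio = l2_millions / l1_millions
    if ratio > 5:
        return "very high"
    if ratio >= 1:
        return "moderate"
    return "low"


# Speaker counts taken from the table rows for these four languages.
for name, l1, l2 in [("English", 370, 1150), ("Spanish", 486, 100),
                     ("Maori", 0.05, 0.30), ("Cherokee", 0.002, 0.002)]:
    print(name, l2_reach_band(l1, l2))
```

Checking the native base before computing the ratio also avoids dividing by zero for a language with no remaining native speakers.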
Sources:
- Ethnologue, 25th Edition (2022)
  - Provides the most widely used estimates of L1 and L2 speakers for world languages.
  - Used particularly for English, Spanish, French, Portuguese, Russian, Arabic, Hindi, Japanese, Persian, and German.
  - Data compiled from government censuses, linguistic fieldwork, and academic studies.
- UNESCO Atlas of the World’s Languages in Danger (2022)
  - Used for endangered and revival cases (Welsh, Cornish, Māori, Yucatec Maya, Cherokee, Basque).
  - Includes vitality indicators and transmission data.
- Regional or Specialized Data (2021–2023)
  - Māori, Cornish, Cherokee, and Welsh revival figures cross-checked against government education statistics (e.g., New Zealand Ministry of Education, Welsh Government).
Why There Is a Range for Some Languages
- English L2 counts vary widely depending on how “fluent” is defined:
  - British Council: ~1.5 billion with some English ability (2021).
  - Cambridge English: ~750–1,100 million with functional fluency (2022).
- Similar uncertainty exists for French (200–300M L2) and Arabic (100–130M L2).
UNESCO Language Extinction Data
UNESCO (United Nations Educational, Scientific and Cultural Organization) estimates there are 6,700–7,000 languages spoken worldwide today, and roughly 40% of them are classified as threatened with extinction. Since 1950, approximately 244 languages have already gone completely extinct. UNESCO and linguistic researchers project that by the end of the 21st century, up to 3,000 languages (close to half of those currently spoken) could be lost if current trends continue.
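The figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the paragraph; the helper names `count_range` and `share_range` are hypothetical.

```python
def count_range(share, total_low, total_high):
    """Return (low, high) counts for a share of an uncertain total."""
    return share * total_low, share * total_high


def share_range(count, total_low, total_high):
    """Return (low, high) shares that a count represents of an uncertain total."""
    return count / total_high, count / total_low


# Roughly 40% of 6,700-7,000 languages are classified as threatened.
threatened = count_range(0.40, 6700, 7000)
# Up to 3,000 languages could be lost by the end of the century.
lost = share_range(3000, 6700, 7000)

print(f"threatened: {threatened[0]:.0f}-{threatened[1]:.0f} languages")
print(f"projected loss: {lost[0]:.0%}-{lost[1]:.0%} of languages")
```

On these numbers, roughly 2,700 to 2,800 languages are threatened today, and the projected loss of 3,000 corresponds to about 43% to 45% of all languages, which is consistent with “close to half”.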
What are the Current Trends?
Globalization and Urbanization: Migration to cities leads people to adopt dominant languages for economic and social integration, reducing native language use. Witness the large migration into the United States during the ten years since 2015. The oldest of these migrants may not switch quickly to English as their primary language, but their children and grandchildren almost certainly will.
Cultural Assimilation: Policies and colonial legacies suppress minority languages, favoring national or global ones.
Economic Pressures: Languages tied to low socioeconomic status are abandoned for those offering better opportunities.
Education and Media: Schooling and media in dominant languages diminish fluency in native tongues.
Digital Exclusion: Lack of digital support (e.g., fonts, AI) for minority languages accelerates their obsolescence.