The tree of languages

Phylogeny (or phylogenesis) is the origin and evolution of a set of organisms, usually of a species. A major task of phylogenists is to determine the ancestor-descendant relationships among known species (both living and extinct).

Philology is the study of ancient texts and languages. Historical linguistics (or comparative linguistics) is the branch of philology that studies languages and their interrelationships. A philological tree (or tree of languages) traces the evolutionary interrelationships among languages believed to have originated from a common ancestor.

This issue of our blog compares the tree of life to the tree of languages.

They are similar in that a mutation or change in one individual can spread through the entire population and eventually replace the ancestral type. The following image, a replica of a diagram published in the November 2003 edition of Nature magazine, illustrates the resemblances between the reconstruction of the evolutionary history of life and that of languages.

Figure: the tree of life (left) and the tree of languages (right)

The chart on the left is based on the notion that every living creature has parents, and each of the parents has parents, and so on. Therefore, if we go back far enough, we will find that their phylogenetic tree includes three domains:

Eukarya (animals, plants, fungi),
Bacteria, and
Archaea (single-celled organisms, many of which live in extreme environments).

All these cell types are rooted at a hypothetical Cenancestor (the most recent common ancestor), but it is not certain how the Cenancestor branched into these different domains. One theory is that Bacteria and Archaea branched off the Cenancestor first, and then Eukarya branched off from Bacteria, but it is also possible that Eukarya derived some characteristics from Bacteria through a horizontal transfer of genes (indicated by the red arrow).

The tree chart on the right has the Proto-Indo-European language at its root; i.e., Proto-Indo-European is believed to have been the Cenancestor of the language group that branches out from it.


English is considered a Germanic language of the Indo-European family. This means that its core vocabulary descended from the Germanic branch, but there has been extensive borrowing from other languages, like French (as indicated by the red arrow). In this chart we see that word classes serve different functions; i.e., the words of Germanic descent refer to animals, while the words borrowed from French refer to their flesh. It also demonstrates how words might be expected to change over time.

There are also significant differences between the tree of life and the tree of languages. In the tree of life, genetic change generally spreads only from parent to child, so the process of mutation or change is much slower and can take many generations. Genes can remain unchanged for millions, even billions of years. Languages, on the other hand, change more rapidly, and linguistic changes can spread much faster among unrelated individuals. The historical examples that follow show how such language changes occurred in several regions.
Please refer to the following chart, which illustrates how languages are rooted, how they evolved and how they are interrelated.
(For better visibility, please download the PDF: The Proto-Indo-European Language Tree)

Language Tree


Hungary

Hungary has had a tumultuous geographical, political and linguistic history. For instance, the Hungarian language, which is often associated with Proto-Indo-European language branches, really belongs to the Finno-Ugric branch of the Ural-Altaic language tree.

The region known to us today as Hungary was ruled by the Romans from 15 BC to circa 378 CE. Next, it was dominated by the European Huns until 427 CE. In 434 CE, Attila the Hun took over the region and became leader of the Hunnic Empire, which stretched from the Netherlands to the Ural River and from the Danube River to the Baltic Sea. The Romans managed to reclaim the region, if only briefly. They retained it until 445 CE, when Attila recovered it and ruled it until his death in 453 CE. A fierce leader, Attila the Hun was feared by the Roman Empire and, to this day, he is considered the personification of cruelty and greed. The Hunnic Empire did not long survive the death of its leader, and in 460 CE it was conquered by the Ostrogoths, whose domination was short-lived. From 488 to 558 CE the territory became tribal.

The Huns who survived remained in settlements nearby, giving their name to the region, which thus became known as Hungary. In 558 CE, Hungary was conquered by the Avars, who remained in power until 803 CE (although there was a break in their rule in the 7th century, from 625 to 660 CE, when the local Slavs dominated). The Avars were a heterogeneous group. Avar is a collective term (Avar-Andi-Dido, or Tsez, peoples) describing more than 15 different ethnic groups occupying the foothills and mountain slopes of Russia's Dagestan Republic.

Toward the end of the 9th century, the Magyars, a nomadic tribe, descended on Hungary, perhaps from the West Siberian steppes, and conquered it, thus establishing a Magyar monarchy in the Kingdom of Hungary.

The Magyars imposed their own language on the Romance-speaking population. This was a very significant linguistic change, because the Hungarian language is not related to any of the Indo-European languages. It actually belongs to the Ural-Altaic language tree, which includes Uralic languages, such as Hungarian, Finnish and Estonian, and Altaic languages, such as Turkish, Mongolian, Kazakh, Uzbek, Tatar, Manchu, plus perhaps Korean and Japanese. Genetically, the Magyar influence was not very significant. The Magyar conquerors amounted to a small percentage of the population (only thirty percent) and their influence was further diluted by interaction with neighboring countries. Today, only ten percent of the genes in Hungary can be traced to their Uralic conquerors.

British Isles

This region has also had a turbulent linguistic history, undergoing dramatic changes within a relatively short time. The native population of the British Isles spoke pre-Indo-European languages unknown to us today. Circa 1500 BC, the Celts, who originated from southwestern Germany, spread throughout France, to the north of Spain and to the British Isles. Celtic invasions also reached northern Italy, Bohemia, Hungary, Illyria (a region of the western Balkan Peninsula) and Asia Minor (Anatolia). Eventually the Celts would be absorbed by the Romans and the barbarians, and only Brittany and the west of the British Isles would remain Celtic.

When the Romans conquered the British Isles, most of the population spoke Celtic languages, but the Romans imposed Latin, their own language. In approximately 450 CE, when the Germanic peoples migrated to England, Latin was replaced by Anglo-Saxon (Old English), which assimilated the linguistic characteristics of the pre-Celtic and Celtic languages, and was used for about 700 years. Old English would not remain static. In 793, the Norsemen invaded. Norsemen is the term used to designate the Vikings of Denmark, Norway and Sweden, and perhaps other Nordic tribes in the Scandinavian part of Europe. They made linguistic contributions to our Anglo-Saxon language. Then, in 1066, William the Conqueror, Duke of Normandy, defeated King Harold II of England in the Battle of Hastings, and the Normans introduced many French words into the language.

Old English gave way to Middle English and then, by about the time of William Shakespeare (the late 16th and early 17th centuries), to Modern English. Some linguists subdivide Modern English into Early and Late Modern English, using the 1800s as the dividing line: the time when the British Empire encompassed a large portion of the world and English was also significantly influenced by native languages.

Turkey

In the 11th century, the Turks began attacking the Byzantine Empire, centered on Constantinople. The city of Byzantium had been named by the Greeks from Megara (an ancient city in Attica) who settled there around 660 BC. In 330 CE, Constantine the Great declared Byzantium the new eastern capital of the Roman Empire and renamed it Constantinople.

In 1453, under the command of Ottoman Sultan Mehmed II, the Ottoman army conquered Constantinople. The Ottomans became one of the most powerful empires and the city became known as Istanbul (İstanbul in Turkish), a name that has remained until today.

Genetically, the impact of the Turkish invasion was not very significant, but the linguistic impact was huge, because the Greek and Turkish languages belong to entirely different family groups:

Greek belongs to the Hellenic branch of the Proto-Indo-European language tree, and
Turkish belongs to the Altaic family tree that includes Turkish, Mongolian, Kazakh, Uzbek, Tatar, Manchu, and other Asian languages, including perhaps Korean and Japanese, as stated previously.

There are many more examples of linguistic replacement and genetic change. (If you are interested in this topic, you will enjoy Luigi Luca Cavalli-Sforza’s book “Genes, Peoples, and Languages”.) What is remarkable is that, notwithstanding all the changes that have taken place, it is still possible to reconstruct trees for the two evolutionary tracks.


Wikipedia

Nature magazine, Vol. 426, 27 November 2003

Genes, Peoples, and Languages by Luigi Luca Cavalli-Sforza

Classification: The Three Domain System

HW2-Kanchanawarin.pdf