Sunday, December 19, 2010

Fine-scale admixture in Europe (Dagestan/Basque/Sardinian components)

Wanting to see whether the Dagestan mystery would extend into Europe, I carried out an ADMIXTURE analysis including all my European populations. Once again, as this is done on only ~30k markers, a little noise on the low-level components is expected.

Notably, there is now both a Sardinian-centered and a Basque-centered cluster; the latter was formerly (in the standard K=10 analysis) split between "Southern European" and "Northern European". The Urkarah, Lezgin, and Stalskoe samples show the highest presence of the "blue" component, which I label, once again, Dagestan. Note, however, that you should not compare admixture proportions across ADMIXTURE runs for components that happen to be labeled the same (homonymous). Certainly "this" Dagestan is related to the "previous" Dagestan component, but do not assume they are identical.

Here is the Fst distance matrix between the 7 components:


The most notable thing about this figure is the relative absence of the West Asian component in the periphery of Europe. The lowest values are seen in Basque, Sardinian, Orcadian, White Utahns, Lithuanians, Finns, and Scandinavians (in no particular order).

It is worthwhile to order the European populations in terms of their Dagestan component. Excluding the populations of the Caucasus, these are, in ascending order: Basque (0.7%), Sardinian, Cypriot, Belorussian, South Italian/Sicilian, Lithuanian, Tuscans, Portuguese, Greek (3.8%), Vologda Russian, Romanian, Finnish, Spaniards, North Italian, Dodecad Spaniards, Dodecad Russian, Chuvash, Hungarian, French (7.9%), German, Scandinavian, White Utahn, Orcadian (12.6%).
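As a rough illustration of this ordering, here is a minimal Python sketch that sorts populations by their Dagestan proportion. It uses only the four percentages quoted above (the other populations are listed without values, so they are left out), and the dictionary and function names are hypothetical; this is not the procedure used to produce the analysis itself.

```python
# Only the four percentages quoted in the post are used here; the other
# populations are listed without values, so they are deliberately omitted.
dagestan_pct = {
    "French": 7.9,
    "Basque": 0.7,
    "Orcadian": 12.6,
    "Greek": 3.8,
}


def ascending_by_component(proportions):
    """Return (population, percent) pairs sorted from lowest to highest."""
    return sorted(proportions.items(), key=lambda item: item[1])


ranked = ascending_by_component(dagestan_pct)
for population, pct in ranked:
    print(f"{population}: {pct}%")
```

In practice the per-population values would presumably be averages of the individual proportions that ADMIXTURE writes to its .Q output file, which is an assumption about the workflow rather than something stated here.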

Interpreting this pattern is not easy, but this component does seem to have a V-like distribution, achieving its maximum in the Caucasus and its environs, then diminishing, and reaching a secondary (lower) frequency mode in NW Europe.

The surprising appearance of the homonymous Dagestan component in India suggests a widespread presence of a common ancestry element. The West Asian element, by comparison, seems to have a more normal /\-like distribution around its center in the Anatolia-Caucasus-Iran region. It does reach the Atlantic coast, but is lacking in Scandinavia and Finland, and also in India itself.

This is just a piece of a broader puzzle, and the picture is not yet clear. However, we can tentatively say that whatever brought the "Dagestan" component to India was not a unidirectional process, but also brought a similar population element to western Europe.


  1. Very interesting indeed. Now, my question is if you can show also some intranational figures for the Dagestan component.

  2. So maybe the Dagestan component is Indo-European speakers - if their source is in the North Caucasus - Maikop or Kemi-Oba Cultural areas??

    This could make sense of the South Asian Dagestani results and most of the European ones; the slightly elevated NW European Dagestani results would be the 5,500 Alano-Sarmatians, plus their horses and families. By one estimate 20,000 people settled in the borders of Northern England/Southern Scotland??

  3. two suggestions

    1) a 'leap frog' migration rather than demic diffusion

    2) the impact of the leap frog might vary as a function of local population density

  4. Wow. Someone call Mallory and John V. Day! Seriously.

  5. pconroy said... "So maybe the Dagestan component is Indo-European speakers - if their source is in the North Caucasus - Maikop or Kemi-Oba Cultural areas?"

    This is very plausible. The spread of the component also seems to correlate well with the spread of R1a/R1b, probably more with the latter in Europe (German, Scandinavian, White Utahn, Orcadian – Germanic & Celtic subclades of R1b1?). The Maikop population itself could have shared the component with, or received it from, the neighboring Kura-Araxes Culture, to which Dagestan belonged. That would explain the higher % of the component in the East Caucasian Dagestani compared to the West Caucasian Adygei.

    Anyway, very fascinating! Thank you, Dienekes!

  6. It would be interesting to gather samples from Polish, Czech, Austrian, Slovakian and Ukrainian people to see how low the V distribution pattern falls between the Caucasus and Northwestern Europe... And where there could be a continuity (if there is one), whether it is along the Danube or the Northern European plain.

  7. Sorry for the double post, Dienekes. I'll be careful not to do that again.

    I am getting the impression, more and more, that the Proto-Indo-European language might have been a creole language incorporating elements of early Northeastern Caucasian and Proto-Uralic languages (the latter probably spoken by R1a1 bearers) used as a lingua franca in the steppes north of Daghestan and probably even in Daghestan itself.
    There are already some proven similarities between Proto-Uralic and Proto-Indo-European. We could also hypothesize that the richness in laryngeal phonemes (according to the laryngeal theory) comes from a Northeast Caucasian substratum/superstratum.

  8. Dienekes said... "Maikop is in the NW Caucasus."

    Maikop was the center, but the culture stretched as far as the northern border of Dagestan. It was also apparently "influenced by the Kuro-Araxes culture (3500—2200 BC) which straddle[d] the Caucasus and extend[ed] into eastern Anatolia" (Wikipedia).

  9. BTW, wife's maternal grandmother DOD097 (Sicilian) has mtDNA H13, whose center of diversity is Dagestan?!

    With a minor element spreading along the coast of the Mediterranean Westward - another possible expansion route?!