Friday, April 8, 2011

Structure in West Asian Indo-European groups (part 2)

I will occasionally revisit old posts such as Structure in West Asian Indo-European groups to take advantage of new population samples from project submitters. This time around, I included our first Kurdish and Azeri participants, and limited the analysis to populations for which I had large numbers of markers (the final pruned set contains ~132k markers).

I also included Greeks, Caucasian populations (Georgians and Lezgins), and Levantine Arabs (Syrians-Lebanese) who frame this region from the West, North, and South respectively, as well as Assyrians who are interspersed in West Asia as an ethno-religious minority.

Here are dimensions 1 and 2 of the multidimensional scaling plot:

The two Caucasian groups, Georgians (South Caucasian) and Lezgins (Northeast Caucasian), form their own clusters. So do the Iranians and the Syro-Lebanese.

(As always, population labels are placed on population averages, and _D denotes Dodecad Project populations.)
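For readers who want to experiment with this kind of plot, here is a minimal sketch of how MDS coordinates and population-average label positions could be computed. It is not necessarily the procedure used for the plots in this post: the toy genotypes, the labels `PopA` and `PopB`, the mean absolute allele-count difference used as a distance, and the helper names `classical_mds` and `population_averages` are all assumptions made for illustration.

```python
import numpy as np

def classical_mds(dist, n_dims=3):
    """Classical (Torgerson) MDS on a symmetric distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_dims]
    # Scale eigenvectors by sqrt(eigenvalue); clip tiny negatives to zero.
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

def population_averages(coords, labels):
    """Mean coordinate per population, i.e. where a plot label would go."""
    labels = np.asarray(labels)
    return {pop: coords[labels == pop].mean(axis=0)
            for pop in sorted(set(labels.tolist()))}

# Toy data (hypothetical): 10 individuals, 200 markers coded as 0/1/2.
rng = np.random.default_rng(0)
freqs_a = rng.uniform(0.1, 0.9, size=200)
freqs_b = np.clip(freqs_a + rng.normal(0.0, 0.2, size=200), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, freqs_a, size=(5, 200)),
    rng.binomial(2, freqs_b, size=(5, 200)),
])
labels = ["PopA"] * 5 + ["PopB"] * 5

# Assumed distance: mean absolute difference in allele counts per marker.
dist = np.abs(geno[:, None, :] - geno[None, :, :]).mean(axis=2)

coords = classical_mds(dist, n_dims=3)
averages = population_averages(coords, labels)
for pop, centre in averages.items():
    print(pop, np.round(centre, 3))
```

Plotting `coords[:, 0]` against `coords[:, 1]` and writing each population name at its entry in `averages` would give a figure laid out like the one described above.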

Curiously, many linguists assert a close relationship of Greek and Armenian within the Indo-European language family. Turks speak an Altaic language due to migration of a numerically small population element, but their pre-Turkish genetic ancestors were Anatolian speakers, Greeks, Armenians, and Iranians, i.e., primarily Indo-Europeans.

Dimensions 1 and 3:

Dimension 3 contrasts Greeks with West Asian groups. Notice also the presence of 3 Armenians at the bottom of the plot; these are outliers of the Behar et al. Armenian sample.

Notice that the Azeri_D sample clusters with Iranians in dimension 2 and with Turks in dimension 3. This is not very surprising, as Azeris speak a Turkic language but also have clear Iranian antecedents. The Kurd_D sample, on the other hand, clusters consistently with Iranians.

The variability of the Greek_D population sample along dimension 3 is also quite interesting. This could reflect variable levels of influence of extra-Greek European/Anatolian population elements on the basic Greek stock. Greeks who possess 23andMe or FTDNA Illumina population data are strongly encouraged to join the Project to help us better determine regional variation within the ethnic Greek population.

The Clusters Galore analysis results are as follows (9 clusters/3 MDS dimensions retained):

In brief, the modal populations in each cluster are:
  1. Greeks
  2. Iranians
  3. Turks
  4. Syrians and Lebanese
  5. Armenians and Georgians
  6. Armenians and Assyrians
  7. Syrians and Iranians
  8. Lezgins
  9. Georgians
I will be happy to provide all Dodecad Project members from the _D populations with their individual results. If you send me e-mail, I will send you a line with your probabilities of assignment in each of the 9 clusters, as well as your 3 co-ordinates in the first 3 MDS dimensions plotted above.
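To give a feel for what a line of assignment probabilities looks like, here is a small sketch. It assumes, purely for illustration, that each cluster is a spherical Gaussian in the retained MDS dimensions; the cluster means, variances, and weights below are made-up numbers, and `assignment_probabilities` is a hypothetical helper, not the code behind the results above.

```python
import numpy as np

def assignment_probabilities(coords, means, variances, weights):
    """Posterior probability of each cluster for each individual.

    Assumes spherical Gaussian clusters: one mean vector, one scalar
    variance, and one mixing weight per cluster.
    """
    coords = np.asarray(coords, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n_dims = coords.shape[1]

    # Squared distance from every individual to every cluster mean.
    sq_dist = ((coords[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    log_density = (-0.5 * sq_dist / variances
                   - 0.5 * n_dims * np.log(2.0 * np.pi * variances))
    log_post = np.log(weights) + log_density

    # Subtract the row maximum before exponentiating, for numerical stability.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

# Hypothetical example: 2 clusters in 3 MDS dimensions, 3 individuals.
means = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
variances = [0.01, 0.01]
weights = [0.5, 0.5]
individuals = [[0.0, 0.0, 0.0],   # sits on the first cluster mean
               [0.5, 0.0, 0.0],   # exactly halfway between the two means
               [0.9, 0.1, 0.0]]   # close to the second cluster mean

probs = assignment_probabilities(individuals, means, variances, weights)
print(np.round(probs, 3))
```

Each row sums to 1. An individual lying between two cluster centres gets split probabilities, which is how a sample can end up spread across clusters with different modal populations.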

I strongly encourage individuals from West Asia, the Balkans, and Italy to contact me for inclusion in the Project (send e-mail first, not data). While submission to the Project is currently closed, I usually accept data from these regions.
