Saturday, May 21, 2011

The Turkic cline

(Last Update: May 22)

I have included all Turkic populations available to me, together with all Turks of the Project (Turkish_D), as well as some other individuals who indicated other Turkic ancestry (e.g., Azeri, Kazakh, Tatar) in a new analysis. I have also included Greeks, Armenians, Georgians, Syrians, Poles, and Russians to represent non-Turkic West Eurasian populations; Koryak and Han to represent non-Turkic East Eurasian populations; Mongol and Evenk as Altaic but non-Turkic populations (Mongolic and Tungusic, respectively); and Yoruba as a Sub-Saharan outgroup.

Here is the PCA plot:
The line on the plot is a regression of all individuals except Han, Syrians, and Yoruba, all of which appear substantially off the main cline. The line's equation is:

y = 0.4044x+0.0147

with R² = 0.9738.
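For readers who want to reproduce this kind of fit on their own coordinates, here is a minimal sketch of an ordinary least-squares line with its R², leaving out the three off-cline populations. It assumes x is the first PCA coordinate and y the second, which the post does not state; the function names and the toy coordinates are made up for illustration and are not the actual data.

```python
from statistics import mean

# Populations left out of the regression, as described in the post.
EXCLUDED = {"Han", "Syrians", "Yoruba"}


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept.

    points: list of (x, y) pairs with at least two distinct x and y values.
    Returns (slope, intercept, r_squared).
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # For a simple linear regression, R^2 is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def cline_fit(samples):
    """samples: iterable of (population, x, y); excluded populations are skipped."""
    kept = [(x, y) for pop, x, y in samples if pop not in EXCLUDED]
    return fit_line(kept)


# Hypothetical coordinates, for illustration only.
toy = [
    ("Greeks", -0.040, -0.002),
    ("Turkish_D", -0.030, 0.003),
    ("Uzbeks", 0.020, 0.022),
    ("Mongol", 0.060, 0.039),
    ("Yoruba", -0.020, 0.250),  # ignored by cline_fit
]
slope, intercept, r_squared = cline_fit(toy)
print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {r_squared:.4f}")
```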

It appears that a simple East-West Eurasian cline captures the main features of variation in present-day Turkic people, as could be expected for a people of an East Eurasian origin that spread across Central Asian and West Eurasian lands that were previously occupied by people of West Eurasian origin.

I have also carried out an ADMIXTURE analysis (K=3) which is shown below:

All the above analyses were performed on a set of 16,278 SNPs common to all populations. These were sufficient to capture these continent-wide contrasts, although the results are probably a bit noisier than usual.
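Restricting the analysis to SNPs shared by every dataset amounts to a set intersection. The sketch below assumes each dataset is available as a collection of SNP identifiers; the function name and the toy identifiers are hypothetical, and this is not necessarily how the marker set was actually assembled.

```python
def common_snps(snp_sets):
    """Return the SNP identifiers present in every dataset.

    snp_sets: iterable of collections of SNP identifiers, one per dataset.
    """
    sets = [set(s) for s in snp_sets]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical identifiers, for illustration only.
dataset_a = {"rs1", "rs2", "rs3"}
dataset_b = {"rs2", "rs3"}
dataset_c = {"rs3", "rs4"}
shared = common_snps([dataset_a, dataset_b, dataset_c])
print(sorted(shared))
```

The more datasets are combined, the smaller the shared set tends to become, which is one reason a merged analysis like this one ends up with fewer markers than a single-platform one.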

The average positions of populations in the PCA plot and their average ADMIXTURE proportions can be seen in the table below:
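Population averages of this kind can be computed by grouping individuals by population label and averaging each column. This is a minimal sketch under the assumption that each individual is a label plus a list of numbers (two PCA coordinates followed by three ADMIXTURE proportions); the function name and the toy values are hypothetical.

```python
def population_means(rows):
    """Average each column within each population.

    rows: iterable of (population, values), where values is a list of numbers
    of the same length for every individual.
    Returns {population: [mean of each column]}.
    """
    sums = {}
    counts = {}
    for pop, values in rows:
        if pop not in sums:
            sums[pop] = [0.0] * len(values)
            counts[pop] = 0
        for i, v in enumerate(values):
            sums[pop][i] += v
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in total] for pop, total in sums.items()}


# Hypothetical individuals: [pc1, pc2, k1, k2, k3], for illustration only.
individuals = [
    ("PopA", [-0.03, 0.00, 0.90, 0.08, 0.02]),
    ("PopA", [-0.01, 0.01, 0.80, 0.18, 0.02]),
    ("PopB", [0.05, 0.03, 0.20, 0.79, 0.01]),
]
for pop, averages in population_means(individuals).items():
    print(pop, [round(a, 3) for a in averages])
```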

Project participants who wish to know their own numbers will get a line of five numbers: the 2 PCA co-ordinates, followed by their 3 ADMIXTURE proportions, as in the table above. Write to me (dodecad@gmail.com) to get those results.
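A participant who receives such a line could split it into its two parts as sketched below. The exact layout of the line is not specified here, so the sketch assumes five numbers separated by spaces or commas; the function name and the example values are hypothetical.

```python
def parse_result_line(line):
    """Split a five-number result line into PCA coordinates and ADMIXTURE proportions.

    Assumes the numbers are separated by whitespace and/or commas, in the order
    described in the post: 2 PCA coordinates, then 3 ADMIXTURE proportions.
    """
    parts = line.replace(",", " ").split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 numbers, got {len(parts)}")
    values = [float(p) for p in parts]
    return {"pca": (values[0], values[1]), "admixture": tuple(values[2:])}


# Hypothetical result line, for illustration only.
result = parse_result_line("-0.031 0.002 0.85 0.13 0.02")
print(result["pca"], result["admixture"])
```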


UPDATE (May 22): A reader asks whether my observations on sample sizes are relevant in this case.

First of all, in the post Beware of sample sizes: why Ancestral North Indians came from West Asia, not Eastern Europe I showed that the projection of populations on principal components created by populations of unequal size tended to be biased toward the larger-sample population. In this case this is not a problem, as all samples were used to calculate the first two principal components.

Second, as a precaution, I used 98 West Eurasian non-Turkic samples and 105 East Eurasian non-Turkic samples, hence the two poles of the experiment were of approximately equal sample size, and the effect described in the previous post would not be observed even if this was a projection along that axis.

Third, the Yoruba (N=21) were used in this experiment as an outgroup to tease out the African shift of Syrians who, as can be seen, do not fall simply on the East-West Eurasian cline. The smaller sample size is not a problem here, as the objective of this analysis is not to measure shift on the Eurasian-African axis, but rather to see whether Turkic groups fall on a West-East Eurasian cline, which they do.

13 comments:

  1. What are the ethnic groups of the "Other Turkic"s? I guess they are all Azeris except one or two of them, who may be Crimean Tatars.

  2. Well, maybe some have posted in the ancestry thread, everyone is encouraged to do so.

  3. Is there a particular reason why non-Greek Balkans was not included?

  4. I used the same 4 populations to border Turks as in the previous experiment.

    http://dienekes.blogspot.com/2011/05/central-asian-element-in-turks-part-3.html

    I added Russians and Poles this time around because some non-Anatolian Turkic populations (Chuvash and others) seem to have a northern Caucasoid element.

    http://dienekes.blogspot.com/2011/05/on-northernsouthern-caucasoid.html

  5. BTW, I am almost sure that Kazakhs would show up as genetically the most East Asian of all Central Asian Turkic groups if more Kazakhs were included in this analysis, as uniparental genetic tests done so far strongly suggest that Kazakhs are genetically the most East Asian of all Central Asian Turkic groups.

  6. What I don't understand is why the 1st dimension captures the Caucasoid-Mongoloid divergence while the 2nd dimension captures the Negroid-non-Negroid divergence. Shouldn't it be the opposite?

    Also, I think it would be more interesting for the PCA plot if you didn't include Yorubans, as they take up one PCA dimension. I wonder how the East-West Eurasian cline would be affected from the exclusion of Yorubans (you can also exclude Syrians for more restriction if you want).

    Another interesting issue is how Uralic speaking populations would show up on the PCA plot with or without Yorubans (+ Syrians if wanted), if you included Uralic speakers. If even without Yorubans (+ Syrians if wanted) Uralic speakers position on the same East-West Eurasian cline with all other Eurasian populations, I wonder what that would imply for the third race theory, which proposes that the original Uralics are not a simple admixture of Caucasoids and Mongoloids but a third Eurasian race (fourth if we include ASI) distinct from them.

  7. What I don't understand is why the 1st dimension captures the Caucasoid-Mongoloid divergence while the 2nd dimension captures the Negroid-non-Negroid divergence. Shouldn't it be the opposite?

    It "shouldn't" be anything, it "is" what it is. As I never grow tired of repeating, the ordering of dimensions in PCA and the pattern of splits in ADMIXTURE are NOT a phylogeny.

    The cline can also be seen here without any Africans.

    http://dienekes.blogspot.com/2010/11/multidimensional-scaling-and-admixture.html

  8. Sure, there is a cline between Asia and Western Eurasia, and also between Africa and the Near East. But I cannot believe the admixture analysis shown in your article. When you use 217 related Asian samples, it is obvious that the result is a huge amount of Eastern components, some intra-Asian, some intra-Eurasian. So many samples are simply able to gather almost everything onto that axis. 21 Yorubas cannot reach the same level. The Yorubas used gain the Asian westward cline, because Yorubas are distant enough at a right angle to Asians.

  9. It "shouldn't" be anything, it "is" what it is. As I never grow tired of repeating, the ordering of dimensions in PCA and the pattern of splits in ADMIXTURE are NOT a phylogeny.

    The pattern of splits in ADMIXTURE in general seems to be more random and less important than the ordering of dimensions in PCA, though. Anyway, what are the eigenvalues of the 1st and 2nd dimensions of the above PCA plot?

    The cline can also be seen here without any Africans.

    http://dienekes.blogspot.com/2010/11/multidimensional-scaling-and-admixture.html


    I am well aware of that MDS plot. The problem with it is that it includes Chukchi and Koryak, two populations that are genetically highly divergent from the rest of Siberians and East Asians.

    BTW, I think Koryak are redundant also in this analysis.

  10. There are two separate clusters of Uzbeks according to the above analysis. The cluster that includes the bulk of the Uzbek samples groups with the other Central Asian Turkic populations (especially Uyghurs). The other Uzbek cluster, which includes only 4 samples, is much closer to Chuvash than to the rest of the Uzbek samples (as well as all of the Central Asian Turkic samples). I suspect that all 4 samples of that minor cluster are ethnic Tajiks from Uzbekistan, because uniparental genetic tests done so far indicate that Tajiks of Uzbekistan are generally less East Asian and more Caucasoid genetically than ethnic Uzbeks (the dominant Turkic group in Uzbekistan).

  11. what is stalskoe? are they kumuk/kumyks?

  12. what is the difference between Turkish_D and Turks?

  13. what is stalskoe? are they kumuk/kumyks?

    Yes, they are all Kumuks from Stalskoe (a city in Dagestan).

    what is the difference between Turkish_D and Turks?

    "Turks" is the collection of 19 Turkish samples, all of them native Turks from the region of Turkey historically known as Cappadocia. They were originally used in the Behar et al. 2010 study and later in the Yunusbayev et al. 2011 and Rasmussen et al. 2011 studies.

    "Turkish_D" is the collection of Turkish samples gathered by Dienekes specifically for the Dodecad Project. They are Turks from all over Anatolia and the Balkans (mostly from Anatolia). As far as I know, there are currently 26 samples. Their number will increase as new Turkish participants join the Dodecad Project.

    This will be my last reply to you if you don't reply again.
