Tuesday, May 31, 2011

ANI/ASI analysis of HGDP Pakistan groups

Until recently, it has been difficult to study the Ancestral North Indian/Ancestral South Indian (ANI/ASI) composition of Pakistan groups, as these fell outside the "Indian Cline" of Reich et al. (2009). My recent experimental reconstruction of ANI/ASI "zombies", as well as West Eurasian ones, allows me to do a supervised run on these groups and see how they fare.
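
As a rough illustration of what a supervised run needs as input, here is a minimal sketch, assuming the run is done with ADMIXTURE's supervised mode (the `--supervised` flag), which reads a `.pop` file with one line per individual: a population label for reference individuals and `-` for individuals whose ancestry is to be estimated. The function name, the sample IDs, and the convention that the first `.fam` column holds the population name are all assumptions made for the example, not details from this analysis.

```python
def pop_file_lines(fam_lines, reference_labels):
    """Build .pop labels, one per individual, in .fam order.

    Assumes (hypothetically) that the first .fam column holds the
    population name. Reference populations get their component label;
    every other individual gets "-" so its ancestry is estimated.
    """
    labels = []
    for line in fam_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        labels.append(reference_labels.get(fields[0], "-"))
    return labels


# Toy .fam-style lines with made-up individual IDs.
fam = [
    "ANI_zombie ind1 0 0 0 -9",
    "ASI_zombie ind2 0 0 0 -9",
    "Kalash ind3 0 0 0 -9",
    "Pathan ind4 0 0 0 -9",
]
# Hypothetical mapping from reference population to component label.
refs = {"ANI_zombie": "ANI", "ASI_zombie": "ASI"}

print("\n".join(pop_file_lines(fam, refs)))
```

The "zombies" play the role of labelled reference individuals, while the HGDP Pakistan samples are left unlabelled so that their proportions are inferred against them.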

(One caveat: this analysis is based on only ~30k SNPs. The two kinds of populations I am using have ~120k and ~150k SNPs respectively, but the two marker sets overlap only partially.)
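
The drop to ~30k SNPs follows from keeping only markers present in both sets. A minimal sketch of that intersection, assuming the markers are listed in PLINK `.bim`-style lines (SNP identifier in the second column); the function names and the rsIDs are made up for illustration:

```python
def snp_ids(bim_lines):
    """Collect SNP identifiers (2nd column) from PLINK .bim-style lines."""
    ids = set()
    for line in bim_lines:
        fields = line.split()
        if len(fields) >= 2:
            ids.add(fields[1])
    return ids


def shared_snps(bim_a, bim_b):
    """Return the sorted SNP identifiers present in both marker sets."""
    return sorted(snp_ids(bim_a) & snp_ids(bim_b))


# Toy marker lists standing in for the ~120k and ~150k SNP sets.
panel_a = ["1 rs0001 0 1000 A G", "1 rs0002 0 2000 C T", "2 rs0003 0 3000 A C"]
panel_b = ["1 rs0002 0 2000 C T", "2 rs0003 0 3000 A C", "2 rs0004 0 4000 G T"]

common = shared_snps(panel_a, panel_b)
print(len(common), common)
```

Only the markers in `common` can be used when both kinds of populations are analysed together, which is why the usable set is much smaller than either original one.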

Overall, the results make sense (they can be seen on the left, as well as on this spreadsheet):
  • The components of the ANI and West Asian "zombies" dominate most populations; I suspect that, as the two are related, it may be difficult to distinguish between them.
  • Intriguingly, Kalash continue to be dominated by West Asian, now that the composite "South Asian" has been resolved, and their ASI levels are similar to those in Iranians.
  • Conversely, the higher ANI levels are found in Pathans and Sindhi, i.e., precisely the populations used by Reich et al. (2009). Hence, I suspect that ANI in the sense of Reich et al. (2009), as reconstructed by myself, may be biased towards these two populations. Also note that my ANI reconstruction used the same Pathans (15) and Sindhi (10) as Reich et al. (2009), whereas this run includes all HGDP individuals.
  • The East Asian component turns up in the Hazara and the Burusho, in agreement with previous experiments.
  • The Southwest Asian component turns up in Balochistan (Balochi, Brahui, Makrani), which also makes sense, linking that Iranic-speaking region to nearby Iran, where that component is also important.
  • The North European component comes up in Hazara, Burusho, and Pathans, which again makes sense, as these populations may have been influenced by people from further north in historical times.
In conclusion, I would say that while the ANI/ASI "zombies" do capture real South Asian signals, as evidenced by my Gypsy experiment, the reconstructed ANI does not capture the entirety of West Eurasian admixture in South Asia: a lot of it continues to be associated with West Asia, and a little with Northern Europe in some populations.


  1. It is very strange that Kalash lack the ANI component in this analysis while having the ASI component, albeit in small proportion. The genetic affinity of the ANI and West Asian components, and ADMIXTURE's difficulty in distinguishing between closely related components, especially in supervised runs, might explain the absence of the ANI component in Kalash in this run to some extent. There is also the well-known fact that Kalash are extremely inbred and isolated, which might further confuse ADMIXTURE.

  2. I find it rather interesting that, despite being equally ASI, the Pashtuns and Baloch differ strikingly in terms of West Eurasian ancestry. The Baloch seem to have a clinal relationship to Iranians, representing a highly South Asian-shifted Iranian plateau population. The Pashtuns, by contrast, seem to have a clinal relationship to Indo-Aryans: much less ASI, but very similar in terms of West Eurasian ancestry. It would be interesting to see this experiment tried on some new populations which have appeared in the interim, especially the Di Cristofaro et al. Afghan samples.