This calculator was made with 196 different populations and 2,659 individuals, including 518 project participants. The following Dodecad populations do not have 5 individuals yet, so they are included in the OTHERS_D generic category:
Algerian_D, North_African_Jews_D, Slovenian_D, Mixed_Scandinavian_D, Danish_D, Moroccan_D, Tunisian_D, Serb_D, Austrian_D, Saudi_D, Pakistani_D, Tatar_Various_D, Palestinian_D, Greek_Italian_D, Romanian_D, Swiss_German_D, Szekler_D, Mandaean_D, Azeri_D, Czech_D, Georgian_D, Belgian_D, Latvian_D, Estonian_D, Bangladesh_D, Yemenese_D, Sri_Lanka_D, Hungarian_D, Basque_D, Udmurt_D, Egyptian_D
As always, I encourage people with 4 grandparents from the same country or ethnic group of Eurasia, North or East Africa to contact me (do not send data!) for possible inclusion in the Project. If I have overlooked any such individuals, drop me a line (my e-mail address is at the bottom of the blog). I usually start a new _D population whenever individuals with 4 grandparents from the same group are submitted, but I may have missed some.
Note that all individuals from the reference populations have been included, outliers among them; keep this in mind when reading the population averages, and consult the Outliers tab in the v3 spreadsheet for some examples of outliers.
Due to image size restrictions in Picasa, the labels are not clearly legible. A large version of the above plot can be found in the download bundle.
The seven ancestral populations inferred at this level of resolution are:
As usual, you should take these names as useful labels, and interpret them in conjunction with the components' distribution in different populations, and their Fst distances, both of which can be found in the spreadsheet.
The table of Fst distances:
Below you can see a neighbor-joining tree based on inter-population Fst distances:
The first six dimensions of a multi-dimensional scaling of the same:
- The spreadsheet contains population averages, the table of Fst distances, and individual results for included Project participants.
- The download RAR file (Google Docs or Sendspace) contains all the files needed to run the calculator. You must download and install DIYDodecad 2.1 first. To run the calculator, follow the instructions in the README file, but type 'eurasia7' instead of 'dv3'.
The calculator is built using allele frequencies of K=7 ancestral components inferred by ADMIXTURE 1.21 analysis of 2,659 individuals. Markers present in the source datasets, as well as in the Family Finder and 23andMe (as of Oct 21) platforms, were included. Markers with a genotype rate below 99.5% or a minor allele frequency below 0.5% were removed. Linkage-disequilibrium based pruning was then carried out with a window size of 250 SNPs, advanced by 25 SNPs at a time, removing markers with R-squared greater than 0.4. A total of 164,990 SNPs remained after these filtering steps.
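For readers who want to see what the first two filters amount to, here is a minimal Python sketch of the genotype-rate and minor-allele-frequency thresholds described above. It is an illustration only, not the pipeline actually used to build the calculator: the genotype encoding (0, 1 or 2 copies of one allele, `None` for a no-call), the function names and the toy marker names are all assumptions made for the example, and the linkage-disequilibrium pruning step is not covered.

```python
from typing import Dict, List, Optional, Sequence

# Thresholds taken from the text above.
MIN_CALL_RATE = 0.995  # keep markers genotyped in at least 99.5% of individuals
MIN_MAF = 0.005        # keep markers with minor allele frequency of at least 0.5%

# Assumed encoding: 0, 1 or 2 copies of one allele; None means no call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of individuals with a non-missing genotype at this marker."""
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Frequency of the rarer allele, computed over called genotypes only."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))  # two alleles per individual
    return min(freq, 1.0 - freq)


def passes_filters(genotypes: Sequence[Genotype]) -> bool:
    """True if the marker meets both the call-rate and the MAF threshold."""
    return (call_rate(genotypes) >= MIN_CALL_RATE
            and minor_allele_frequency(genotypes) >= MIN_MAF)


def filter_markers(markers: Dict[str, Sequence[Genotype]]) -> List[str]:
    """Return the names of markers that survive both filters, in input order."""
    return [name for name, g in markers.items() if passes_filters(g)]


# Toy data: three hypothetical markers typed in 1,000 individuals.
demo_markers: Dict[str, List[Genotype]] = {
    "snp_ok": [0] * 900 + [1] * 90 + [2] * 10,     # fully called, common variant
    "snp_low_call": [0] * 890 + [1] * 100 + [None] * 10,  # 99.0% call rate
    "snp_rare": [0] * 999 + [1],                   # one minor allele in 2,000
}

kept = filter_markers(demo_markers)
print(kept)
```

In the toy data only the first marker survives: the second has a call rate of 99.0%, below the 99.5% threshold, and the third has a minor allele frequency of 0.05%, below the 0.5% threshold.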
All relevant populations that were available to me and genotyped at a sufficient number of markers were included. Inclusion of the Kalash population resulted in a population-specific component at K=7, and hence their admixture components were inferred a posteriori. Their proportions are consistent with previous results, showing them to be a "West Asian" population (62.4%) with substantial "South Asian" admixture (37.1%), and near-complete absence of any other genetic components.