I have consistently received requests for an assessment of Amerindian ancestry. While the focus of the Project is, and will remain, the region of Eurasia, I thought it was a good idea to release a tool that could be used by persons of partial Amerindian ancestry.
I have also included the two Australasian populations currently available, namely Bougainville Melanesians (NAN_Melanesian) and Papuans from the HGDP.
The inferred components at K=9 are quite similar to those of 'eurasia7', with the addition of the Australasian and Amerindian components. I have also included the Kalash in this experiment, which caused the 'West_Asian' component to be modal in them, although the Kalash do not differ from other populations in this component so much as to render it strongly population-specific. I have called this component 'Caucasus_Gedrosia'; like the 'eurasia7' West Asian component, it ought to be quite similar to the k5 component inferred by Metspalu et al. (2011).
It is unfortunate that there are only two Australasian populations currently available as public data. There are many more Amerindian and Mestizo ones, but it should be noted that the Amazonian populations on which the 'Amerindian' component is modal are some of the most lacking in genetic diversity in my entire database. As a result, Eurasians who lack any Amerindian or Australasian ancestry can expect to see a little of it in their results as noise.
This is a very important caveat for Americans who suspect that they may have an Amerindian ancestor. Small levels of this component may be noise, and this component is also found in Siberia, and may represent either backflow from the Americas or the common ancestry of Siberian and Amerindian populations. If you are interested in the detection of Amerindian ancestry, I recommend that you use DIYDodecad's 'byseg', 'bychr', and 'target' modes to drill down deeper in your genomes.
Download Files
- The spreadsheet contains admixture proportions, the table of Fst distances, and individual results in the Individual Results tab.
- The RAR file contains files for use with DIYDodecad. Extract its contents to the working directory of DIYDodecad. To run the calculator, follow the instructions in the README file, but type 'world9' instead of 'dv3'.
Terms of use:
'world9', including all files in the downloaded RAR file, is free for non-commercial personal use. Commercial uses are forbidden. Contact me for non-personal uses of the calculator.
Information
Admixture proportions barplot:
The nine ancestral components are:
- Amerindian
- East_Asian
- African
- Atlantic_Baltic
- Australasian
- Siberian
- Caucasus_Gedrosia
- Southern
- South_Asian
Table of Fst divergences:
Neighbor-joining tree of Fst distances; the long branch lengths of the Australasian (and to a lesser degree the Amerindian) branches are due to the high level of inbreeding in the populations for which these components are modal.
First 8 dimensions of multi-dimensional scaling (MDS):
Technical Details
A dataset of 3,548 individuals/265,519 SNPs/284 populations was assembled. Pruning for distantly related individuals was performed by iterative pruning of a single individual from each pair showing IBD RATIO greater than the mean plus 2 standard deviations, or greater than 2.5. 3,026 individuals remained. An additional 14 individuals were removed because they had less than 97% genotype rate. The marker set was thinned to remove SNPs with less than 97% genotype rate or 1% minor allele frequency. Linkage-disequilibrium based pruning with a window of 200 SNPs, advanced by 25 SNPs, and an R-squared of 0.4 was performed. A total of 3,012 individuals and 170,822 SNPs survived these filtering steps. PLINK 1.07 and ADMIXTURE 1.21 were used in the analyses.
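The relatedness pruning described above can be sketched roughly as follows. This is only an illustration in Python, not the PLINK-based procedure actually used: the input format (a dict of pairwise IBD ratio values), the function name `prune_related`, the choice to compute the mean plus 2 SD threshold once on the starting set, and the rule for deciding which individual of a flagged pair to drop are all assumptions.

```python
from statistics import mean, pstdev

def prune_related(pair_ratios, hard_cutoff=2.5, n_sd=2.0):
    """Return the set of individuals to remove.

    pair_ratios maps (id_a, id_b) tuples to an IBD ratio value
    (hypothetical input format, for illustration only).
    """
    values = list(pair_ratios.values())
    # A pair is flagged if its ratio exceeds mean + 2 SD OR exceeds 2.5,
    # which is the same as exceeding the smaller of the two cutoffs.
    # Assumption: the mean and SD are computed once, before any removal.
    threshold = min(mean(values) + n_sd * pstdev(values), hard_cutoff)
    removed = set()
    while True:
        flagged = [pair for pair, ratio in pair_ratios.items()
                   if ratio > threshold and not (set(pair) & removed)]
        if not flagged:
            return removed
        # Assumption: drop the individual that appears in the most flagged
        # pairs (ties broken alphabetically), one individual per iteration.
        counts = {}
        for pair in flagged:
            for individual in pair:
                counts[individual] = counts.get(individual, 0) + 1
        removed.add(max(sorted(counts), key=counts.get))

# Toy values, not real data.
toy_pairs = {("A", "B"): 3.0, ("A", "C"): 2.8, ("B", "C"): 1.0,
             ("C", "D"): 1.1, ("D", "E"): 0.9}
print(prune_related(toy_pairs))
```

In the toy input only individual A is involved in the flagged pairs, so removing A clears both of them and the loop stops.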
Dienekes, I have a question.
In this run, my Amerindian and Siberian components, combined, were significantly larger than the Asian component(s) I used to get on other calculators (v3, k12a, euro7, eurasia7, and some of Eurogenes' calculators). Is it possible that the Amerindian component has now been overestimated (say, because some of the Amerindian control samples were admixed), or is it more likely that it was previously underestimated due to a lack of Amerindian control samples in previous works?
If you have actual Amerindian ancestry, then it is natural for the sum to increase, because the Amerindian component is a better fit for that aspect of your ancestry.
East Asian, Siberian, and Amerindian all pick up a common ancestral component, and each two of them carry some (but not all) information in the third one.
Is there any way to set apart father-inherited from mother-inherited admix so I can at least approximate results for father and mother, etc.? Or is it SF for now?
Thank you
Curious when you say "if you have actual Amerindian ancestry, then it is natural for the sum to increase", what kind of an increase is worth noting?
I'm having a bit of trouble doing the segment by segment runs (I'm not the most technologically competent individual) so I've not looked further than the basic test.
On most runs I score 0% Amerindian; this time I was 1.3%, which is higher than other folks that I typically have much in common with genetically.
I'm unsure how to interpret the increase in that I have 0% Siberian, I would think that if it was real admix the Siberian would increase as well, but the increase in general is a lingering thought.
Can someone tell me what this means: Is this a fatal error? Can anyone tell me what I might have done wrong?
Warning message:
running command 'DIYDodecadWin word9.par' had status 2
>
Can anyone help me understand how to load the results of my population finder from ftDNA onto this calculator? My results are from ftDNA, I will gladly share it, since my mother is Oceania and my Father Pakistan.
Thanks,
See the README included in the download here on how to run DIYDodecad
http://dodecad.blogspot.com/2011/09/do-it-yourself-dodecad-v-21.html
Can anyone tell me what the "Southern" distinction stands for?
Umm, I have read your article about the caution of admixture estimates. Are pure Amerindian reference samples used in the world9 calculator? I think it is very important to choose pure 100% Amerindian samples in order to get an accurate percentage.
The information is right here, so read it.
I'm 1st generation Mexican-American and I'm surprisingly pleased to find out I'm predominantly indigenous (50+)! Plus some Asian that probably came from the out-of-Asia migration.
Amerindian 55.53%
East_Asian 0.98%
African 6.46%
Atlantic_Baltic 20.27%
Australasian 0.51%
Siberian 2.01%
Caucasus_Gedrosia 3.07%
Southern 10.59%
South_Asian 0.57%
I say surprising because people think only the poor/southern Mexican populace have higher Amerindian %, but I'm neither poor nor from southern Mexico.
Interesting, I am from South America and here are my results:
Amerindian 26.57
East_Asian 0.58
African 9.93
Atlantic_Baltic 35.02
Australasian -
Siberian 1.37
Caucasus_Gedrosia 6.61
Southern 19.92
South_Asian -
Do you know above what level an Amerindian component is no longer considered "noise" with world9? Thank you very much.
The admixture proportions barplot isn't legible when you zoom in on it.
I came out as:
Population
Amerindian -
East_Asian -
African -
Atlantic_Baltic 54.86%
Australasian -
Siberian 0.73%
Caucasus_Gedrosia 19.40%
Southern 24.96%
South_Asian -
I think this is the best calculator for Europeans out there.
It's the only calculator that could possibly identify me as half Dutch and not Scottish, Irish, German or Kent.
Using 2 populations approximation:
1 50% S_Italian_Sicilian +50% Dutch @ 1.083
2 50% German +50% S_Italian @ 1.126
3 50% Dutch +50% S_Italian @ 1.148
4 50% German +50% S_Italian_Sicilian @ 1.223
What is CEU30? My results for the World9 Oracle-x Population Fitting are as follows...
# Population Percent
1 Amerindian 0.58
2 East_Asian 0.00
3 African 0.00
4 Atlantic_Baltic 75.43
5 Australasian 0.00
6 Siberian 0.09
7 Caucasus_Gedrosia 10.98
8 Southern 11.50
9 South_Asian 1.41
Pct. Calc. Option 2
1 CEU30 99.55%
2 Sardinian 0.21%
3 Malayan 0.21%
4 CLM30 0.03%
5 Colombian 0.00%
6 MALAYAN 0.00%
7 MEX30 0.00%
8 Brazilian 0.00%
9 AthabaskHD4 0.00%
10 Castilla_La_Mancha 0.00%
Total RMSD: 0.249511
And my results for the World9 4-Ancestors Oracle are below, but I don't understand how to read this and what it means. Could someone please explain? Thanks.
# Population Percent
1 Atlantic_Baltic 75.94
2 Southern 11.58
3 Caucasus_Gedrosia 11.06
4 South_Asian 1.42
--------------------------------
Least-squares method.
Using 1 population approximation:
1 British @ 1.073
2 CEU30 @ 1.092
3 Cornwall @ 1.115
4 Kent @ 1.229
5 British_Isles @ 1.589
6 German @ 1.973
7 Dutch @ 2.204
8 Irish @ 2.620
9 Orcadian @ 2.719
10 Argyll @ 2.876
250 iterations.
Using 2 populations approximation:
1 50% Mixed_Germanic +50% Orkney @ 0.859
2 50% British_Isles +50% Dutch @ 0.876
3 50% CEU30 +50% Cornwall @ 0.924
4 50% Mixed_Germanic +50% Orcadian @ 0.954
5 50% British_Isles +50% CEU30 @ 0.955
6 50% British +50% CEU30 @ 0.968
7 50% CEU30 +50% Kent @ 0.986
8 50% British +50% Cornwall @ 1.026
9 50% Irish +50% Mixed_Germanic @ 1.071
10 50% British +50% British @ 1.073
31375 iterations.
Using 3 populations approximation:
1 50% British_Isles +25% Dutch +25% CEU30 @ 0.809
2 50% British_Isles +25% Dutch +25% Cornwall @ 0.833
3 50% Cornwall +25% Dutch +25% Orkney @ 0.843
4 50% Cornwall +25% Dutch +25% Orcadian @ 0.850
5 50% British_Isles +25% Mixed_Germanic +25% CEU30 @ 0.853
6 50% Mixed_Germanic +25% Orkney +25% Orkney @ 0.859
7 50% Orkney +25% Mixed_Germanic +25% Mixed_Germanic @ 0.859
8 50% CEU30 +25% British_Isles +25% Cornwall @ 0.860
9 50% Cornwall +25% Irish +25% Dutch @ 0.869
10 50% British_Isles +25% British +25% Dutch @ 0.870
526205 iterations.
Using 4 populations approximation:
1 British_Isles + British_Isles + Dutch + CEU30 @ 0.809
2 British_Isles + British_Isles + Dutch + Cornwall @ 0.833
3 Mixed_Germanic + CEU30 + Orkney + Cornwall @ 0.839
4 Dutch + Orkney + Cornwall + Cornwall @ 0.843
5 Mixed_Germanic + British_Isles + Dutch + Orkney @ 0.847
6 Dutch + Orcadian + Cornwall + Cornwall @ 0.850
7 British_Isles + Dutch + CEU30 + Cornwall @ 0.850
8 Irish + Mixed_Germanic + British_Isles + Cornwall @ 0.852
9 Mixed_Germanic + British_Isles + Cornwall + Argyll @ 0.852
10 Mixed_Germanic + British_Isles + British_Isles + CEU30 @ 0.853
11 British + British_Isles + Dutch + Cornwall @ 0.856
12 Mixed_Germanic + British_Isles + Orcadian + CEU30 @ 0.857
13 Mixed_Germanic + Mixed_Germanic + Orkney + Orkney @ 0.859
14 British_Isles + CEU30 + CEU30 + Cornwall @ 0.860
15 Irish + Mixed_Germanic + British_Isles + CEU30 @ 0.862
16 Mixed_Germanic + British_Isles + CEU30 + Orkney @ 0.862
17 Dutch + CEU30 + Orkney + Cornwall @ 0.864
18 Mixed_Germanic + British_Isles + Orcadian + Cornwall @ 0.868
19 Irish + Dutch + Cornwall + Cornwall @ 0.869
20 British + British_Isles + British_Isles + Dutch @ 0.870
2346773 iterations.
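One way to read the listings above: each row is a candidate mix of reference populations, and the number after `@` is how far that mix is from the individual's admixture proportions (smaller is closer). The listings are consistent with every ancestor slot getting an equal share: '50% British +50% British @ 1.073' matches the single-population 'British @ 1.073', and 'British_Isles + British_Isles + Dutch + CEU30 @ 0.809' matches '50% British_Isles +25% Dutch +25% CEU30 @ 0.809'. The 31375 iterations of the two-population mode also equal the number of unordered pairs, repeats allowed, that can be drawn from 250 populations (251 × 250 / 2). Below is a minimal Python sketch of that idea, assuming the distance is a plain Euclidean distance between percentage vectors (the metric actually used by the Oracle is not stated here). The function names and the three-component population vectors are made-up toy values, not real reference data.

```python
from itertools import combinations_with_replacement
from math import sqrt

def distance(a, b):
    """Euclidean distance between two vectors of component percentages."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def equal_mix(vectors):
    """Average several population vectors, giving each slot an equal share."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def oracle(target, populations, n_ancestors, top=3):
    """Rank every multiset of n_ancestors populations by distance to target."""
    results = []
    for combo in combinations_with_replacement(sorted(populations), n_ancestors):
        mix = equal_mix([populations[name] for name in combo])
        results.append((distance(target, mix), combo))
    return sorted(results)[:top]

# Hypothetical toy populations with three components each.
toy_populations = {
    "PopA": [60.0, 30.0, 10.0],
    "PopB": [40.0, 30.0, 30.0],
    "PopC": [10.0, 10.0, 80.0],
}
toy_target = [50.0, 30.0, 20.0]

for dist, combo in oracle(toy_target, toy_populations, n_ancestors=2):
    print(" + ".join(combo), "@", round(dist, 3))
```

With `n_ancestors=4` the same function treats each slot as a 25% share, which is why a population listed twice in a four-way row corresponds to 50% in the three-population listing.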
I am African American and Eastern Indonesian (Maluku). My results are:
Population
Amerindian -
East_Asian 30.02%
African 31.13%
Atlantic_Baltic 12.54%
Australasian 18.16%
Siberian 0.57%
Caucasus_Gedrosia 1.37%
Southern 3.86%
South_Asian 2.34%
Population
Amerindian 1.07%
East_Asian 0.93%
African 77.84%
Atlantic_Baltic 13.06%
Australasian 0.51%
Siberian -
Caucasus_Gedrosia 1.27%
Southern 5.22%
South_Asian 0.09%
This is my genetics, I believe. I don't think there is any statistical noise, as people call it, to say what want to be and don't want to be.
My dad (who's mostly British/Scottish/Irish, German, and Scandinavian) gets these Admix Results (sorted):
# Population Percent
1 Atlantic_Baltic 72.95
2 Caucasus_Gedrosia 12.19
3 Southern 11.6
4 Siberian 1.35
5 Amerindian 1.33
6 South_Asian 0.57
7 African 0.01
Single Population Sharing:
# Population (source) Distance
1 Dutch (Dodecad) 2.06
2 German (Dodecad) 2.15
3 Mixed_Germanic (Dodecad) 2.32
4 CEU30 (1000Genomes) 2.74
5 Kent (1000 Genomes) 3.31
6 Cornwall (1000 Genomes) 3.35
7 British (Dodecad) 3.4
8 British_Isles (Dodecad) 4.19
9 Argyll (1000 Genomes) 4.38
10 Irish (Dodecad) 4.68
11 Orcadian (HGDP) 4.93
12 Ukranians (Yunusbayev) 5.28
13 Orkney (1000 Genomes) 5.33
14 Hungarians (Behar) 5.4
15 Polish (Dodecad) 6.22
16 French (HGDP) 7.58
17 Belorussian (Behar) 8.02
18 French (Dodecad) 8.57
19 Norwegian (Dodecad) 8.71
20 Swedish (Dodecad) 9
Mixed Mode Population Sharing:
# Primary Population (source) Secondary Population (source) Distance
1 97.6% Dutch (Dodecad) + 2.4% EastGreenland @ 0.38
2 96.9% Dutch (Dodecad) + 3.1% WestGreenland @ 0.4
3 93.6% Dutch (Dodecad) + 6.4% Aleut @ 0.67
4 97.9% Dutch (Dodecad) + 2.1% Athabask @ 0.7
5 97.6% German (Dodecad) + 2.4% EastGreenland @ 0.79
6 96.9% German (Dodecad) + 3.1% WestGreenland @ 0.83
7 97.9% Dutch (Dodecad) + 2.1% Chukchi @ 0.83
8 97.9% German (Dodecad) + 2.1% Chukchi @ 0.98
9 97.9% Dutch (Dodecad) + 2.1% Koryak @ 1.06
10 78.1% Swedish (Dodecad) + 21.9% O_Italian (Dodecad) @ 1.08
11 79.6% Swedish (Dodecad) + 20.4% C_Italian (Dodecad) @ 1.08
12 93.9% German (Dodecad) + 6.1% Aleut @ 1.08
13 97.2% Dutch (Dodecad) + 2.8% MEX30 @ 1.08
14 76.7% Swedish (Dodecad) + 23.3% Tuscan (HGDP) @ 1.09
15 98% German (Dodecad) + 2% Athabask @ 1.09
16 97.6% Dutch (Dodecad) + 2.4% Ecuadorian @ 1.09
17 93.3% Mixed_Germanic (Dodecad) + 6.7% Aleut @ 1.1
18 98.6% Dutch (Dodecad) + 1.4% Pima @ 1.1
19 98.5% Dutch (Dodecad) + 1.5% Maya @ 1.11
20 98.3% Dutch (Dodecad) + 1.7% PEL30 @ 1.11
I'm not American at all. All my ancestors have lived in East Africa for centuries. Why do I get an Amerindian signal in every calculator?
Amerindian 1.02%
East_Asian 1.74%
African 34.35%
Atlantic_Baltic 1.65%
Australasian 0.83%
Siberian 0.77%
Caucasus_Gedrosia 16.61%
Southern 29.47%
South_Asian 13.56%
If you have Malagasy ancestry, that could be a reason, since Native Americans are found to have Polynesian markers in their DNA and come from an East Asian background, like Polynesian/Austronesian people.
Population
Amerindian 0.79
East_Asian 0.55
African 84.35
Atlantic_Baltic 9.77
Australasian -
Siberian -
Caucasus_Gedrosia 0.99
Southern 3.04
South_Asian 0.51
Oracle
Oracle-4
Spreadsheet
Using 3 populations approximation:
1 50% MKK30 +25% Cochin_Jews +25% Samaritians @ 3.258934
What does MKK30 stand for?
Amerindian 0.76%
East_Asian -
African 0.15%
Atlantic_Baltic 73.22%
Australasian -
Siberian 0.90%
Caucasus_Gedrosia 11.27%
Southern 13.07%
South_Asian 0.59%
What area is Southern specific to?
What does southern mean? Is it Southern Europe or Southwest Asia?
I'm just really confused and trying to find my ethnicity...it's all still confusing
1 African 63.16
2 Atlantic_Baltic 27.04
3 Southern 6.80
4 Caucasus_Gedrosia 1.29
5 East_Asian 1.06
I have 1.19% Amerindian according to the data from this tool. Is that percentage considered noise?
It may well be noise, but not necessarily.
Many people have asked what SOUTHERN represents. Can this question NOT be answered?
I'm an Indonesian of mostly Javanese descent. I have this result on Dodecad World9:
Amerindian -
East_Asian 82.80
African 0.25
Atlantic_Baltic 1.52
Australasian 4.20
Siberian -
Caucasus_Gedrosia -
Southern -
South_Asian 11.23
Is it typical among South East Asians or does the African/Baltic/South Asian signify something?
Hello,
I am wondering if anyone can explain the bottom portion of the oracle to me. I am new to genealogy. What do the percentages mean? Thanks in advance.
World9 Oracle results:
Admix Results (sorted):
# Population Percent
1 African 74.23
2 Atlantic_Baltic 15.13
3 Southern 5.04
4 South_Asian 3.12
5 Caucasus_Gedrosia 1.38
6 Australasian 0.75
7 Amerindian 0.37
Single Population Sharing:
# Population (source) Distance
1 ASW30 (HapMap3) 8.71
2 San_He 17.05
3 ACB30 18.37
4 Hadza_He 19.39
5 Sandawe_He 20.01
6 MKK30 (Dodecad) 22.36
7 Bantu_N.E. (HGDP) 25.63
8 LWK30 (Behar) 26.39
9 Mandenka 29.05
10 Bantu_S.W._Herero (HGDP) 31.64
11 YRI30 (HGDP) 32.05
12 San 32.05
13 Yoruba (HGDP) 32.51
14 Bantu_S.E._Tswana (HGDP) 32.53
15 Biaka_Pygmies 33.21
16 Mbuti_Pygmies 33.21
17 Dominican 42.82
18 Somali (Dodecad) 48.15
19 Ethiopians (Behar) 52.18
20 Ethiopian_Jews (Behar) 54.96
Mixed Mode Population Sharing:
# Primary Population (source) Secondary Population (source) Distance
1 85.8% San_He + 14.2% French_Basque @ 2.24
2 85.8% San_He + 14.2% Pais_Vasco (1000 Genomes) @ 2.26
3 91.9% ASW30 (HapMap3) + 8.1% Romanians (Behar) @ 2.43
4 91.4% ASW30 (HapMap3) + 8.6% Brazilian (Dodecad) @ 2.59
5 92.1% ASW30 (HapMap3) + 7.9% N_Italian (Dodecad) @ 2.6
6 92.1% ASW30 (HapMap3) + 7.9% North_Italian (HGDP) @ 2.61
7 92.1% ASW30 (HapMap3) + 7.9% Baleares (1000 Genomes) @ 2.62
8 92% ASW30 (HapMap3) + 8% Extremadura (1000 Genomes) @ 2.63
9 92.1% ASW30 (HapMap3) + 7.9% Galicia (1000 Genomes) @ 2.63
10 92% ASW30 (HapMap3) + 8% Portuguese (Dodecad) @ 2.63
11 92.1% ASW30 (HapMap3) + 7.9% Castilla_La_Mancha (1000 Genomes) @ 2.64
12 92% ASW30 (HapMap3) + 8% Murcia (1000 Genomes) @ 2.64
13 84.6% ACB30 + 15.4% French (Dodecad) @ 2.65
14 92% ASW30 (HapMap3) + 8% Bulgarians (Yunusbayev) @ 2.65
15 92% ASW30 (HapMap3) + 8% Bulgarian (Dodecad) @ 2.66
16 92.1% ASW30 (HapMap3) + 7.9% Andalucia (1000 Genomes) @ 2.66
17 92.1% ASW30 (HapMap3) + 7.9% Castilla_Y_Leon (1000 Genomes) @ 2.66
18 92.2% ASW30 (HapMap3) + 7.8% Spaniards (Behar) @ 2.67
19 92.3% ASW30 (HapMap3) + 7.7% Cataluna (1000 Genomes) @ 2.68
20 91.8% ASW30 (HapMap3) + 8.2% Canarias (1000 Genomes) @ 2.68
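For the bottom (Mixed Mode) portion, each row pairs a primary and a secondary population with unequal weights, and the number after `@` is again a distance, so '85.8% San_He + 14.2% French_Basque @ 2.24' is a much closer fit than the best single population, 'ASW30 (HapMap3) 8.71'. Here is a hedged Python sketch of how such a weight could be found, assuming a Euclidean least-squares fit (the Oracle's actual procedure is not documented here). The function name `best_two_way_mix` and the two-component vectors are hypothetical toy values.

```python
from math import sqrt

def best_two_way_mix(target, primary, secondary):
    """Return (weight_of_primary, distance) for the closest mix of two vectors.

    The weight minimising the squared distance has a closed form: project
    (target - secondary) onto (primary - secondary), then clamp to [0, 1].
    """
    diff = [p - s for p, s in zip(primary, secondary)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        weight = 1.0  # identical populations: any weight gives the same mix
    else:
        weight = sum((t - s) * d
                     for t, s, d in zip(target, secondary, diff)) / denom
        weight = min(1.0, max(0.0, weight))
    mix = [weight * p + (1 - weight) * s for p, s in zip(primary, secondary)]
    dist = sqrt(sum((t - m) ** 2 for t, m in zip(target, mix)))
    return weight, dist

# Toy two-component vectors, not real reference data.
w, dist = best_two_way_mix([70.0, 30.0], [80.0, 20.0], [40.0, 60.0])
print(f"{w:.1%} primary + {1 - w:.1%} secondary @ {dist:.2f}")
```

A small secondary share with a low distance only says that the mix fits the nine percentages well. Several quite different secondary populations can fit almost equally well, as the near-identical distances in the list above show.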
Southern? "Southern" what? What does this cluster mean?