Tuesday, July 19, 2011

The Dodecad Oracle v1

Here is a fun little tool that compares the Dodecad v3 admixture proportions of an individual against all the reference populations, and also against the best pairwise combinations of these populations.

To use it, install R, download the program, and double-click the file DodecadOracleV1.RData inside the rar file. You will then see a command prompt where you can enter the following commands:

Examining which populations are available

Just enter

X[,1]

You will see a list of 227 populations. You can use these population IDs in the next section.
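With 227 IDs, it can help to search the list rather than read it. The lines below are a minimal sketch on a toy stand-in for X; they assume only that the first column of X holds the population IDs, as the command above implies. The toy values are made up.

```r
# Toy stand-in for the Oracle's X (hypothetical values); the real X is
# already present in the workspace once the .RData file is opened.
X <- cbind(c("British_D", "Irish_D", "Dutch_D"),
           c("10.1", "9.5", "11.0"))

ids <- X[, 1]
length(ids)                      # number of populations
grep("ish", ids, value = TRUE)   # IDs containing "ish"
"British_D" %in% ids             # check an exact ID before using it
```

In the real workspace you would skip the first assignment and run only the last four lines.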

Which populations are closest to a particular population?

Enter:

DodecadOracle("British_D")
[,1] [,2]
[1,] "British_D" "0"
[2,] "British_Isles_D" "0.9798"
[3,] "Cornwall_1KG" "1.1533"
[4,] "Kent_1KG" "2.265"
[5,] "Irish_D" "3.7643"
[6,] "Dutch_D" "4.5354"
[7,] "Mixed_Germanic_D" "6.8971"
[8,] "Norwegian_D" "11.3111"
[9,] "Orkney_1KG" "12.4652"
[10,] "Orcadian" "12.8195"

If you want to find e.g., the top-30 populations, rather than just the top-10, enter:

DodecadOracle("British_D", k=30)
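The post does not say which distance measure produces these numbers. As an illustration only, the sketch below ranks reference populations by Euclidean distance between admixture vectors, which is an assumption. The function nearest and the table ref are hypothetical and are not part of the Oracle.

```r
# Hypothetical reference table: rows are populations, columns are
# admixture percentages (only three components, for brevity).
ref <- rbind(PopA = c(60, 30, 10),
             PopB = c(20, 50, 30),
             PopC = c(58, 33, 9))

# Rank reference populations by Euclidean distance to a target vector
# (assumed metric) and keep the k closest.
nearest <- function(target, ref, k = 10) {
  d <- sqrt(rowSums(sweep(ref, 2, target)^2))
  head(round(sort(d), 4), k)
}

res <- nearest(c(60, 30, 10), ref, k = 2)
res
```

Here the target equals PopA, so PopA comes first with distance 0, mirroring how a population is its own closest match in the listing above.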

Which populations are closest to a particular individual?

Enter the admixture proportions of the individual (from the "Individual results" tab of the spreadsheet) as follows:

DodecadOracle(c(4.6, 16.7, 33.6, 0, 23.2, 0.4, 0.6, 1.6, 0.7, 14.1, 4.5, 0.2))
[,1] [,2]
[1,] "Ashkenazi_D" "3.7908"
[2,] "Ashkenazy_Jews" "4.1473"
[3,] "Morocco_Jews" "6.338"
[4,] "S_Italian_Sicilian_D" "12.5443"
[5,] "Sephardic_Jews" "13.5067"
[6,] "C_Italian_D" "14.4554"
[7,] "Sicilian_D" "14.7469"
[8,] "S_Italian_D" "15.748"
[9,] "Tuscan_X" "15.9981"
[10,] "O_Italian_D" "16.1474"

Once again, you can specify k=30, if you desire the 30 top matching populations instead of the default 10.
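Since the twelve numbers must be entered in a fixed order, a quick sanity check before calling the Oracle may save some confusion. The sketch below assumes the component order that DIYDodecad reports for Dodecad v3; check_proportions is a hypothetical helper, not part of the Oracle.

```r
# Assumed component order for Dodecad v3.
components <- c("East_European", "West_European", "Mediterranean", "Neo_African",
                "West_Asian", "South_Asian", "Northeast_Asian", "Southeast_Asian",
                "East_African", "Southwest_Asian", "Northwest_African", "Palaeo_African")

# Hypothetical helper: checks length and sign, warns if the total is far
# from 100, and returns the values labelled by component.
check_proportions <- function(p, tol = 1) {
  stopifnot(is.numeric(p), length(p) == length(components), all(p >= 0))
  if (abs(sum(p) - 100) > tol) warning("proportions sum to ", sum(p), ", not 100")
  setNames(p, components)
}

p <- check_proportions(c(4.6, 16.7, 33.6, 0, 23.2, 0.4, 0.6, 1.6, 0.7, 14.1, 4.5, 0.2))
p
```

The labelled vector makes it easy to spot a swapped or missing value; the plain numbers can then be passed on with DodecadOracle(unname(p)).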

Mixed Mode

You use mixed mode by adding mixedmode=T to any of the commands. The program then considers all pairs of populations; for each pair it calculates the minimum distance to the sample under consideration and the admixture proportions that produce it. Pairs in which one of the two populations alone is closer to the sample than any admixture of the two are ignored.

Example:

DodecadOracle("Pathan",mixedmode=T)
[,1] [,2]
[1,] "Pathan" "0"
[2,] "84.8% Pakistani + 15.2% Urkarah" "1.075"
[3,] "84% Pakistani + 16% Stalskoe" "1.1555"
[4,] "63.9% TN_Brahmin + 36.1% Urkarah" "1.6669"
[5,] "32.4% Urkarah + 67.6% Meghawal" "2.3516"
[6,] "56.3% INS + 43.7% Urkarah" "2.4901"
[7,] "11.5% Adygei + 88.5% Pakistani" "2.6245"
[8,] "82.4% Sindhi + 17.6% Stalskoe" "2.6318"
[9,] "62.9% AP_Brahmin + 37.1% Urkarah" "2.7322"
[10,] "11.2% Lezgins + 88.8% Pakistani" "2.7749"

The mixed mode should be used with caution; more than anything else, it shows how similar apparent "mixes" can be achieved by different combinations of ancestry. Nonetheless, it may prove somewhat useful. For example, the above results suggest that Pathans can be viewed as a mix of other South Asian populations and populations from the eastern Caucasus, a suggestion that the Project arrived at independently using different methods.
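To make the pairwise search concrete, here is a minimal sketch of one way to find the best two-way mix for a single pair: project the sample onto the line through the two populations and reject the pair if the projection falls outside the segment between them. It assumes Euclidean distance and is not the Oracle's actual code; best_mix and the three-component vectors are hypothetical.

```r
# Best two-way mix of populations a and b for target t (numeric vectors
# of equal length). Returns NULL when no proper mix beats the closer of
# the two populations on its own.
best_mix <- function(t, a, b) {
  ab <- a - b
  w <- sum((t - b) * ab) / sum(ab^2)   # weight on a at the orthogonal projection
  if (!is.finite(w) || w <= 0 || w >= 1) return(NULL)
  mix <- w * a + (1 - w) * b
  list(weight_a = w, distance = sqrt(sum((t - mix)^2)))
}

a <- c(80, 20, 0)
b <- c(20, 60, 20)
r1 <- best_mix(c(65, 30, 5), a, b)   # target lies on the segment: 75% a + 25% b
r2 <- best_mix(c(90, 10, 0), a, b)   # NULL: a alone is closer than any mix
```

Running this over every pair and sorting by distance would give a ranking of the same general shape as the mixed-mode listings in this post.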

Here is another example:

DodecadOracle("Assyrian_D",mixedmode=T)
[,1] [,2]
[1,] "Assyrian_D" "0"
[2,] "83.9% Armenians_16 + 16.1% Yemen_Jews" "1.7829"
[3,] "89.1% Armenian_D + 10.9% Saudis" "2.1624"
[4,] "84.3% Armenians_16 + 15.7% Saudis" "2.2884"
[5,] "88.9% Armenian_D + 11.1% Yemen_Jews" "2.2983"
[6,] "83.8% Armenian_D + 16.2% Bedouin" "4.1579"
[7,] "72.2% Armenian_D + 27.8% Syrians" "4.1841"
[8,] "23.4% Georgians + 76.6% Iraq_Jews" "4.2418"
[9,] "76.2% Armenians_16 + 23.8% Bedouin" "4.332"
[10,] "61.5% Armenians_16 + 38.5% Syrians" "4.4019"

This reaffirms the close relationship of Assyrians to Armenians that has been noticed in the project and by others, and it also shows that Assyrians differ from Armenians in a Southwestern Asian direction, consistent with their Semitic language.

Or, African Americans:

DodecadOracle("ASW",mixedmode=T)
[,1] [,2]
[1,] "ASW" "0"
[2,] "81.3% Hausa + 18.7% N._European" "2.3891"
[3,] "18.4% Orkney_1KG + 81.6% Hausa" "2.4031"
[4,] "18.5% Argyll_1KG + 81.5% Hausa" "2.4268"
[5,] "18.4% Orcadian + 81.6% Hausa" "2.4657"
[6,] "80.5% Igbo + 19.5% N._European" "2.5031"
[7,] "80.6% Brong + 19.4% N._European" "2.523"
[8,] "18.6% CEU + 81.4% Hausa" "2.5938"
[9,] "19.1% Argyll_1KG + 80.9% Brong" "2.6197"
[10,] "19% Orkney_1KG + 81% Brong" "2.6274"

I don't know that much about the slave trade, but I believe that Ghana was an important part of it?

Another thing to watch is that some populations have more than one sample available, so they appear to be mixtures of themselves, which is not very informative, e.g., Spanish_D:

DodecadOracle("Spanish_D",mixedmode=T)
[,1] [,2]
[1,] "Spanish_D" "0"
[2,] "7.9% French_Basque + 92.1% IBS" "0.8713"
[3,] "68.9% IBS + 31.1% Spaniards" "1.0377"
[4,] "98.8% IBS + 1.2% Irish_D" "1.2959"
[5,] "1.2% British_Isles_D + 98.8% IBS" "1.3018"
[6,] "1.2% British_D + 98.8% IBS" "1.3019"
[7,] "99% IBS + 1% Norwegian_D" "1.3046"
[8,] "1.2% Cornwall_1KG + 98.8% IBS" "1.3048"
[9,] "98.8% IBS + 1.2% Kent_1KG" "1.3142"
[10,] "2.2% French_D + 97.8% IBS" "1.3179"

To deal with this problem, you must "edit" the X matrix to exclude some populations. For example, to exclude "Spaniards" and "IBS", enter:

X <- X[setdiff(1:227,which(X[,1]=="IBS" | X[,1]=="Spaniards")),]

Note that you must relaunch the program if you want to get the original matrix back; alternatively, save a copy before editing it, like this:

Z<-X

and then retrieve it like this:

X<-Z
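The exclusion command above hard-codes the number of rows, so it only works on the untouched matrix. A sketch of an alternative that does not depend on the row count, shown on a toy stand-in for X with made-up values (the name X_full is hypothetical):

```r
# Toy stand-in for X (hypothetical values); in the real workspace X is
# already loaded, so skip this assignment there.
X <- cbind(c("IBS", "Spaniards", "French_D", "Irish_D"),
           c("1", "2", "3", "4"))

X_full <- X                                     # backup, like Z above
exclude <- c("IBS", "Spaniards")
X <- X[!(X[, 1] %in% exclude), , drop = FALSE]  # keep rows whose ID is not excluded
n_after <- nrow(X)                              # 2

X <- X[X[, 1] != "French_D", , drop = FALSE]    # a second exclusion still works
X <- X_full                                     # restore everything
```

Because the filter looks only at the IDs that are currently present, it can be repeated any number of times without relaunching.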

47 comments:

  1. Very interesting. Greek_D mixed mode results strongly suggest Balto-Slavic admixture.

  2. Once I have excluded a group, how can I exclude more groups? It gives an error the second time, and I have to re-open every time to exclude groups. Thanks,

  3. I think you need to change the 1:227 to the new number of populations. You can substitute dim(X)[1], which should give the number of populations every time.

  4. Doug McDonald is now giving people three-way admixtures, though from perhaps a more compact set of population references. Might you alter the program to maximize the likelihood on 3-way or N-way admixtures by adjusting a parameter?

  5. Brandon, this is not an autosomal set of fractions (like the ones of Doug McDonald) but rather the fractions that make your most similar numbers on the Dodecad K=12 run

  6. Might you alter the program to maximize the likelihood on 3-way or N-way admixtures by adjusting a parameter?

    I'm not sure what Dr. McDonald does; this program uses a "geometric" approach to find the optimum admixture ratio for pairs of populations, and it ignores the population pairs where admixture isn't as good as one of the two populations.

    The approach used can be extended to 3 populations, although the geometry becomes more complex, and I will have to double-check it before I release a new version. Interestingly, in my experiments so far it seems that many individuals are better described as 2-way mixes than as 3-way mixes, although I've seen a few (such as a Mexican) that are best described as 3-way mixes.

  7. DodecadOracle(c(8.9, 17.4, 32.2, 0, 21.8, 0.5, 0.4, 1.4, 1.1, 11.3, 5.1, 0), k=30)

    Gives me only 10 populations, not 30.

    Dienekes, can you give an example of the correct syntax, and for additional parameters like mixedmode=T?

    If you edit your original post with these instructions then go ahead and delete this one, thanks.

  8. I get 30 here, not sure what's the problem.

  9. I am having a problem typing commands and getting good results. If I copy the commands from the post above it works fine, but if I type the same commands in the R Console it doesn't work; I just get a "+" and a blinking cursor. What am I doing wrong?

    > DodecadOracle("Assyrian_D",mixedmode=T)
    [,1] [,2]
    [1,] "Assyrian_D" "0"
    [2,] "83.9% Armenians_16 + 16.1% Yemen_Jews" "1.7829"
    [3,] "89.1% Armenian_D + 10.9% Saudis" "2.1624"
    [4,] "84.3% Armenians_16 + 15.7% Saudis" "2.2884"
    [5,] "88.9% Armenian_D + 11.1% Yemen_Jews" "2.2983"
    [6,] "83.8% Armenian_D + 16.2% Bedouin" "4.1579"
    [7,] "72.2% Armenian_D + 27.8% Syrians" "4.1841"
    [8,] "23.4% Georgians + 76.6% Iraq_Jews" "4.2418"
    [9,] "76.2% Armenians_16 + 23.8% Bedouin" "4.332"
    [10,] "61.5% Armenians_16 + 38.5% Syrians" "4.4019"
    > DodecadOracle("Assyrian_D,mixedmode=T)
    + |

  10. Where do I find an explanation of the commands available for DodecadOracle?

  11. In this one:

    > DodecadOracle("Assyrian_D,mixedmode=T)
    + |

    You forgot the ending quote after Assyrian_D

    The + sign means that R is expecting you to finish the command; in the above case, it saw a beginning quote ", but not an end quote. So, you need to be careful when you type in commands.

  12. Thank you for this, Dienekes.
    This is what I have been looking for, for a long time.
    Something that tells one, how much of what is needed to turn the results into the way they are.

    From Family lore I know that I have a Great Great Grandmother and indeed, one of the possibilities that Oracle comes up with is 95% German + 5% Lithuanian.

  13. Ah, forgot to ask: What people does "N._European" contain btw?

  14. @Fanty and Dienekes

    I was wondering what people were encompassed in the N. European category also...

  15. Running DodecadOracle("ASW",mixedmode=T) gives me exactly what you have listed above; Is there a way to "tailor" it to an individual?

  16. You tailor it to an individual by using the admixture proportions of the individual, rather than "ASW", as shown in the post.

    N. European is a population sample from Xing et al. (2010)

    http://jorde-lab.genetics.utah.edu/?page_id=8

  17. This is indeed fun, but now the interpretation...
    All my ancestors back to 1800 are Dutch, while prior to 1800 some may have come from Germany or maybe further East.

    My data result, however, in:
    [1,] "Argyll_1KG" "1.353"
    [2,] "N._European" "2.2317"
    [3,] "Orcadian" "2.3712"
    [4,] "Orkney_1KG" "2.4925"
    [5,] "CEU" "2.4945"
    [6,] "German_D" "6.7632"
    [7,] "Mixed_Germanic_D" "8.0505"
    [8,] "Dutch_D" "9.4793"
    [9,] "Kent_1KG" "11.7887"
    [10,] "French" "12.8289"

    What does it mean? Can anybody help me out?

  18. Well, this is ... interesting. From what research I've been able to do in the past year, all I can say for certain is I'm very American as I have both paternal and maternal ancestors that trace back to revolutionary times.

    My paternal grandfather is believed to have been mostly Irish.
    My paternal grandmother's father was Azorean and German.
    My maternal grandfather seems to be mostly a mix of English and either Scottish or Irish, and maybe Native American.
    My maternal grandmother is a complete mystery. My maternal haplotype is H4a1a which appears most commonly in Polish/Irish.

    23andMe says I'm genetically most similar to Northern Europeans.

    My DIYDodecad results are:
    East_European, 12.29%
    West_European, 50.36%
    Mediterranean, 26.37%
    Neo_African, 0.0%
    West_Asian, 7.81%
    South_Asian, 1.05%
    Northeast_Asian, 0.00%
    Southeast_Asian, 0.30%
    East_African, 0.0%
    Southwest_Asian, 1.67%
    Northwest_African, 0.01%
    Palaeo_African, 0.16%

    I’ve only been reading your blog for a short time and I’m still trying to make sense of a lot of it. If I’ve done everything correctly, it seems to me the Dodecad Oracle results lean, rather unexpectedly, in a French direction rather than Irish. I have yet to encounter any French ancestors in my searches.

    DodecadOracle(c(12.29,50.36,26.37,0.0,7.81,1.05,0.0,0.3,0.0,1.67,0.01,0.16))
    [,1] [,2]
    [1,] "CEU" "3.9671"
    [2,] "N._European" "5.5219"
    [3,] "Argyll_1KG" "6.2269"
    [4,] "Orcadian" "6.3383"
    [5,] "Orkney_1KG" "6.7261"
    [6,] "German_D" "6.8843"
    [7,] "French_D" "11.4239"
    [8,] "French" "11.4542"
    [9,] "Mixed_Germanic_D" "12.7139"
    [10,] "Dutch_D" "13.8534"

    Then there is the mixed mode:
    DodecadOracle(c(12.29,50.36,26.37,0.0,7.81,1.05,0.0,0.3,0.0,1.67,0.01,0.16),mixedmode=T)
    [,1] [,2]
    [1,] "89.4% CEU + 10.6% Romanians_14" "1.1168"
    [2,] "10.8% Balkans_D + 89.2% CEU" "1.3412"
    [3,] "37.2% Balkans_D + 62.8% British_D" "1.5368"
    [4,] "83% Orkney_1KG + 17% Romanians_14" "1.5842"
    [5,] "37.1% Balkans_D + 62.9% British_Isles_D" "1.6562"
    [6,] "83.9% Orcadian + 16.1% Romanians_14" "1.6801"
    [7,] "30.9% Balkans_D + 69.1% Dutch_D" "1.7519"
    [8,] "92.7% CEU + 7.3% Greek_D" "1.8079"
    [9,] "34.4% Balkans_D + 65.6% Kent_1KG" "1.8544"
    [10,] "81.2% CEU + 18.8% Slovenian" "1.9907"

    I’m not sure about that one, but your instructions did say that you could use the mixed mode with any command.

    I only have one question: What is CEU?

    I did find the instructions reasonably easy to follow.

  19. CEU are White Americans from Utah

  20. Apparently, my results from DodecadOracle, as posted earlier, are indeed baffling.

    Although I would love to have roots in one or other poetic place in Scotland (as long as there is a pub nearby), I suppose that the Dutch_D sample just is not large enough as yet?

    More generally formulated: given present sample sizes, what type of interpretations are allowed for?

    Thank you for reacting.

  21. When you run Dodecad Oracle on the admixture proportions of an individual, what do the numbers mean that follow each population in the resulting ranking of populations? In the results that others have posted here, the numbers usually start out quite low. For me, even the top ranked option seems to have a relatively high number following it (the top result is 12.8833). Do these numbers represent some sort of measure of how well each population fits the individual's admixture proportions?

    I am curious about this because my mother is from Brazil and my father is from the United States. I likely have very admixed ancestry, and was wondering if the higher relative numbers indicate that none of the populations are a very good match for me.

    Here are my results:
    [1,] "CEU" "12.8833"
    [2,] "Slovenian" "13.236"
    [3,] "French_D" "13.559"
    [4,] "N._European" "13.9869"
    [5,] "French" "13.9996"
    [6,] "German_D" "14.3945"
    [7,] "Argyll_1KG" "15.1435"
    [8,] "Orcadian" "15.24"
    [9,] "Orkney_1KG" "15.5916"
    [10,] "Hungarians" "16.8057"

  22. Yes, the lower the number, the closer the fit. You can compare your results to the top few populations or to the weighted averages of pairs of populations to see in which components you are different from them.

    Individuals with more than 2 ancestries or people from very variable groups can usually expect not to see a tight fit.

  23. Hello Dienekes,

    I appreciate your wide knowledge on DNA based anthropology and certainly also your Dodecad initiatives.

    Still, I am truly at a loss on how to interpret my results, as posted earlier.

    As you indicate, the smaller the number at the end of the line, the better the fit. I suppose therefore that "Argyll_1KG" "1.353" would mean a good fit.

    My question is: could such a good fit be caused by just one or two (thus far unknown) ancestors from the West of Scotland of (well) before 1800?

    What would be your take on my results?

    Thanks.

  24. Can you give some information about the "Slovenian" population? I thought my parents were the only 2 Slovenians in the Dodecad project, but the "Slovenian" population comes second for both of them in Oracle, while "Hungarians" is first. I thought my parents were typical Slovenes, and they even come from 2 regions that were politically separated for centuries and part of 2 different countries. Interestingly, they are extremely similar genetically, and much closer to Hungarians than to other Slovenians. But it may be possible that the "Slovenian" population in Oracle is small or not representative of the actual Slovenian population. I would be curious to know more.

  25. The Slovenian population's percentages are based on a smaller number of markers, so potentially they are not as accurate as the Hungarians. The number of SNPs is listed in the Dodecad spreadsheet.

    Another possibility is that the distribution of Slovenians and Hungarians overlap so that particular Slovenians may be closer to the Hungarian average and vice versa.

    My question is: could such a good fit be caused by just one or two (thus far unknown) ancestors from the West of Scotland of (well) before 1800?

    A couple of ancestors before 1800 would not determine your average. Also, the Argyll sample is small, and, moreover, we should all bear in mind that not all populations are distinguishable from each other on either the basis of this analysis, or in general. If you run Oracle with "Argyll_1KG" you will see that there are several populations quite close to it.

  26. @Teresa Your results look very similar to mine:
    DodecadOracle(c(10.5,49.6,26.3,0.1,10.3,0.4,0.0,0.0,0.0,1.7,1.0,0.0))
    [,1] [,2]
    [1,] "CEU" "5.4635"
    [2,] "N._European" "6.4288"
    [3,] "Argyll_1KG" "7.3457"
    [4,] "Orcadian" "7.6851"
    [5,] "Orkney_1KG" "8.0759"
    [6,] "German_D" "9.1499"
    [7,] "French" "10.8453"
    [8,] "French_D" "11.3824"
    [9,] "Mixed_Germanic_D" "12.3325"
    [10,] "Dutch_D" "13.8798"

    And on my mother's side, I'm the same pre-Revolutionary US mix of British Isles and some Swiss Germans. However, my Oracle results are far different:
    > DodecadOracle(c(10.5,49.6,26.3,0.1,10.3,0.4,0.0,0.0,0.0,1.7,1.0,0.0),mixedmode=T)
    [,1] [,2]
    [1,] "12.9% Sephardic_Jews + 87.1% Argyll_1KG" "0.8293"
    [2,] "14.5% S_Italian_Sicilian_D + 85.5% Argyll_1KG" "0.8452"
    [3,] "77.7% Argyll_1KG + 22.3% Tuscan_X" "0.888"
    [4,] "14% Sicilian_D + 86% Argyll_1KG" "0.9205"
    [5,] "13.5% S_Italian_D + 86.5% Argyll_1KG" "0.9258"
    [6,] "63.1% British_Isles_D + 36.9% Romanians_14" "1.0156"
    [7,] "77.5% Argyll_1KG + 22.5% Tuscan_H" "1.061"
    [8,] "17.6% Ashkenazy_Jews + 82.4% Orcadian" "1.0748"
    [9,] "91.8% CEU + 8.2% Druze" "1.0755"
    [10,] "90.9% CEU + 9.1% Turkish_D" "1.082"

    There must be something on my father's side that skews me towards the Middle East.

  27. Hi Dienekes I wrote the following into the command line: DodecadOracle("Sicilian_D", mixedmode=T, k=30, X<-X[setdiff(1:dim(X)[1],which(X[,1]=="S_Italian_D"))])

    It gives me 30 results but S_Italian_D is still present.. What am I doing wrong?

  28. You shouldn't have put the X<-... part inside the parentheses. It's a separate command that should be run before you invoke DodecadOracle.

  29. Thanks very much, I got this and DIY to function well.

  30. Okay, so I got my DIYDodecad results, but what is the "spreadsheet" referred to above? How do I compare my results - all I got from DIYDodecad was a list...

  31. I *think* I figured it out: I used the DIYDodecad results in DodecadOracle and got the following:

    DIYDodecad
    East European 14.79
    West European 51.91
    Mediterranean 24.09
    Neo-African 0.00
    West Asian 7.03
    South Asian 0.00
    Northeast Asian 0.00
    Southeast Asian 0.17
    East African 0.23
    Southwest Asian 1.74
    Northwest African 0.03
    Palaeo African 0.00

    Oracle
    DodecadOracle(c(14.79,51.91,24.09,0.00,7.03,0.00,0.00,0.17,0.23,1.74,0.03,0.00))
    [,1] [,2]
    [1,] "German_D" "3.0915"
    [2,] "CEU" "4.5501"
    [3,] "N._European" "4.9448"
    [4,] "Argyll_1KG" "5.2943"
    [5,] "Orcadian" "5.8573"
    [6,] "Orkney_1KG" "6.1"
    [7,] "Mixed_Germanic_D" "12.6902"
    [8,] "Dutch_D" "14.037"
    [9,] "Slovenian" "14.5208"
    [10,] "French_D" "14.5552"

    On my father's side I'm English/Irish and on my mother's Norwegian/English so these results appear to make sense although I'm not too sure about the Mediterranean or Asian parts.
    If anyone has additional insight as to what these mean (or if I've even done it right!), I'd appreciate it.

  32. My results would appear to not mean much since my grandparents are from very different places. I am a little surprised that the "asian" component of my result didn't pull me further east. I estimate I am 10-15% Native American




    [,1] [,2]
    [1,] "Slovenian" "13.3824"
    [2,] "German_D" "16.4946"
    [3,] "Hungarians" "16.7991"
    [4,] "N._European" "17.5768"
    [5,] "CEU" "18.2647"
    [6,] "Argyll_1KG" "18.9445"
    [7,] "Orcadian" "19.824"
    [8,] "Orkney_1KG" "19.9309"
    [9,] "French_D" "23.247"
    [10,] "French" "23.4213"

  33. Hi - Thanks so much for this. I was able to run DIYDodecad v. 2.0 from your ReadME. Very helpful.

    Is there a key to the nomenclature of the reference populations?

    What is the difference between Ashkenazy_Jews and Ashkenazi_D? S_Italian_Sicilian_D and C_Italian_D? Thanks

  34. Are you planning to add new Bulgarian sample to the program?

  35. Am I missing something here? Where is the DodecadOracle linux program, the link only gives you the R data stored on googledocs? I have the DIYDodecad admixtures ready to put into it.

    More generally, appreciate all the work you've put on the web!

    Thanks in advance. Jon.

  36. There is no linux program, the Oracle runs on R.

  37. Jon -> To run this under Linux simply launch R then use the load() command to load the Oracle.

    For example:
    load('DodecadOracleV1.RData')

    Use the full path for DodecadOracleV1.RData if R was not launched from the directory containing that file.

    Then execute whatever commands you wish from the instructions.

    Note also that the DodecadOracle will work on Mac OSX as well (by installing the Mac version of R).

  38. Has the order of the admixtures for the program changed? Although my admixture is this:

    14.66% East_European
    50.17% West_European
    22.56% Mediterranean
    0.41% Neo_African
    7.75% West_Asian
    0.01% South_Asian
    0.03% Northeast_Asian
    0.00% Southeast_Asian
    2.26% East_African
    1.86% Southwest_Asian
    0.00% Northwest_African
    0.30% Palaeo_African

    ... when I enter them in that order into the oracle, I get:
    > DodecadOracle(c(14.66, 50.17, 22.56, 0.41, 7.75, 0.01, 0.03, 0.00, 2.26, 1.86, 0.00, 0.30))
    [,1] [,2]
    [1,] "Mongol" "21.4325"
    [2,] "Mongola" "22.1336"
    [3,] "Daur" "23.3353"
    [4,] "Hezhen" "24.1801"
    [5,] "Xibo" "26.5628"
    [6,] "Oroqen" "27.3458"
    [7,] "Buryat" "30.8749"
    [8,] "Tu" "32.4573"
    [9,] "Altai" "32.6115"
    [10,] "Uygur" "34.4787"

    I also see these Asian populations when I put in the example from the blog:

    > DodecadOracle(c(4.6, 16.7, 33.6, 0, 23.2, 0.4, 0.6, 1.6, 0.7, 14.1, 4.5, 0.2))
    [,1] [,2]
    [1,] "Altai" "35.4034"
    [2,] "Tuva" "37.4562"
    [3,] "Hazara" "38.201"
    [4,] "Buryat" "39.3191"
    [5,] "Mongol" "39.7467"
    [6,] "Uzbeks" "39.9885"
    [7,] "Uygur" "40.3521"
    [8,] "Yukagir" "42.0916"
    [9,] "Brahmins_from_Uttaranchal_M" "44.429"
    [10,] "Oroqen" "46.1119"

    I think this is a mistake somewhere... any suggestions?

  39. @Kelly, you are probably not using the correct Oracle. The values you entered give:


    DodecadOracle(c(14.66, 50.17, 22.56, 0.41, 7.75, 0.01, 0.03, 0.00, 2.26, 1.86, 0.00, 0.30))
    [,1] [,2]
    [1,] "German_D" "4.3332"
    [2,] "N._European" "6.0212"
    [3,] "CEU" "6.5226"
    [4,] "Argyll_1KG" "6.9663"
    [5,] "Orcadian" "7.8569"
    [6,] "Orkney_1KG" "8.0432"
    [7,] "Slovenian" "13.2511"
    [8,] "Mixed_Germanic_D" "13.7513"
    [9,] "Dutch_D" "15.3921"
    [10,] "French" "15.9273"


    I suspect you downloaded the K12a or K12b Oracle and are entering the Dodecad v3 values into it.

  40. Thank you, that was exactly the issue. I'm still learning here! Thank you so much for your quick response!

  41. I wish the instructions were easier to understand. :/ I feel like a chimpanzee trying to interpret a legal document sometimes with this Dodecad stuff. I've got R running, managed to get Oracle and K12 in the folder (or working directory). What do I type in to see what percent of the samples I am? Hopefully my question made sense. Also when it says "DodecadOracle(c(4.6, 16.7, 33.6, 0, 23.2, 0.4, 0.6, 1.6, 0.7, 14.1, 4.5, 0.2))" Where are all those numbers coming from? I hate to seem daft, but I really don't understand this stuff. I try to read it, and the readme text files, but my brain seems to refuse to grasp it. (Oh, the joys of ADD -___-)
    Also am I correct in understanding that in the example below that I am more closely related to the Kent_1KG sample than the Orkney_1KG because the number is lower?
    [,1] [,2]
    [1,] "British_D" "0"
    [2,] "Kent_1KG" "0.4"
    [3,] "Cornwall_1KG" "1.631"
    [4,] "British_Isles_D" "1.7117"
    [5,] "CEU25" "2.6665"
    [6,] "Irish_D" "2.7331"
    [7,] "Dutch_D" "3.1016"
    [8,] "Argyll_1KG" "3.6579"
    [9,] "Orcadian" "3.755"
    [10,] "Orkney_1KG" "4.279"

  42. What do I type in to see what percent of the samples I am? Hopefully my question made sense.

    First you need to get your admixture percentages. To do that you must first run standardize (see README) and then you must run DIYDodecadWin:

    http://dodecad.blogspot.com/2011/09/do-it-yourself-dodecad-v-21.html

    Also when it says "DodecadOracle(c(4.6, 16.7, 33.6, 0, 23.2, 0.4, 0.6, 1.6, 0.7, 14.1, 4.5, 0.2))" Where are all those numbers coming from?

    These numbers will be reported by DIYDodecadWin when you run it on your sample (see above).

    Once you get these numbers, you input them in the same order in DodecadOracle as above. Note, however, that this post is about Dodecad Oracle v1 which works with the "calculator" dv3 that is bundled with the DIYDodecad program. If you use the more recent K12b calculator (http://dodecad.blogspot.com/2012/01/k12b-and-k7b-calculators.html), you must use the Oracle designed for that, and which can be downloaded from that page (http://dodecad.blogspot.com/2012/01/k12b-and-k7b-calculators.html). All the most recent calculators/Oracles of the Project will always be available from the bottom right of the blog under "Project Links".

    Also am I correct in understanding that in the example below that I am more closely related to the Kent_1KG sample than the Orkney_1KG because the number is lower?

    It would appear so, yes; lower numbers = closer match.

  43. I have been reading the posts for a few days now and trying to make sense of my results. I cannot find instructions on how to read my basic results. I apologize for my genealogy ignorance. Dienekes, you must be a genius!

    Anyhow, I will use Africa 9 Oracle. I understand Admix Results. I sort of understand Single Population sharing-that the smaller distance means more of a match? But....Mixed Mode Population Sharing- Which percentage do I use since they are all so close? Also, am I supposed to compare the distance with the Secondary Population and the closest is the probable match? Such as 64.2% Morocco_Jews to 35.8% French_Basque with distance of 1.67. Or do I consider the Tuscan with 93.2% to Mozabite 6.5%. Or since I'm showing large percentage matches with mostly North_Italian do I go with that?

    I'm sorry but I am truly completely green at this. Thanks for your patience.
    Maiysa
    PS I am a member of a Native American Tribe with a lot of Scandinavian and French, so the Italian is really confusing me, but very exciting, since I'm in love with Italy!! But not sure if it means anything.

    Admix Results (sorted):

    # Population Percent
    1 Europe 66.93
    2 SW_Asia 24.48
    3 NW_Africa 6.06
    4 Mbuti 1.06
    5 San 0.94
    6 Biaka 0.37
    7 W_Africa 0.15

    Single Population Sharing:

    # Population (source) Distance
    1 North_Italian 7.86
    2 Tuscan 8.31
    3 Morocco_Jews 24.16
    4 North_African_Jews (Dodecad) 31.99
    5 Druze 41.58
    6 French_Basque 43.22
    7 Jordanians 47.46
    8 Egypt 53.14
    9 Egyptans 55.82
    10 North_African (Dodecad) 59.57
    11 Yemenese 61.08
    12 Libya 62.61
    13 Algeria 68.11
    14 Bedouin 69.75
    15 Morocco_N 70.26
    16 Yemen_Jews 73.23
    17 Saudis 78.76
    18 Moroccans 79.3
    19 Ethiopian_Jews 82.65
    20 Ethiopians 82.65

    Mixed Mode Population Sharing:

    # Primary Population (source) Secondary Population (source) Distance
    1 64.2% Morocco_Jews + 35.8% French_Basque @ 1.67
    2 89.9% North_Italian + 10.1% Algeria @ 1.8
    3 88.6% North_Italian + 11.4% North_African (Dodecad) @ 1.8
    4 92.3% North_Italian + 7.7% Sahara_OCC @ 1.99
    5 91.3% North_Italian + 8.7% Moroccans @ 2.01
    6 90.3% North_Italian + 9.7% Morocco_N @ 2.3
    7 91.8% North_Italian + 8.2% Morocco_S @ 2.4
    8 89.4% North_Italian + 10.6% Libya @ 2.64
    9 57.5% North_African_Jews (Dodecad) + 42.5% French_Basque @ 2.67
    10 93.1% North_Italian + 6.9% Mozabite @ 2.89
    11 92.9% North_Italian + 7.1% TUNISIA @ 3.12
    12 55.2% French_Basque + 44.8% Egypt @ 4.39
    13 89.1% North_Italian + 10.9% Egypt @ 4.45
    14 78.6% North_Italian + 21.4% Morocco_Jews @ 4.47
    15 93.6% North_Italian + 6.4% Fulani @ 4.48
    16 89.7% North_Italian + 10.3% Egyptans @ 4.56
    17 93.2% Tuscan + 6.8% TUNISIA @ 4.66
    18 93.5% Tuscan + 6.5% Mozabite @ 4.7
    19 92.9% North_Italian + 7.1% Ethiopians @ 4.72
    20 92.9% North_Italian + 7.1% Ethiopian_Jews @ 4.72

  44. Hi,

    I've also put in my results for the Oracle and I get the following results for single population (which is absolutely correct - I'm 100% AJ)
    [,1] [,2]
    [1,] "Ashkenazy_Jews" "3.4103"
    [2,] "Ashkenazi_D" "4.2778"
    [3,] "Morocco_Jews" "8.9073"
    [4,] "S_Italian_Sicilian_D" "13.0442"
    [5,] "C_Italian_D" "13.4503"
    [6,] "Tuscan_X" "13.4533"
    [7,] "Tuscan_H" "14.1806"
    [8,] "O_Italian_D" "14.1926"
    [9,] "TSI" "14.7916"
    [10,] "Sicilian_D" "15.0964"

    and for mixture:

    [1,] "20.3% Hungarians + 79.7% Morocco_Jews" "2.3257"
    [2,] "93.8% Ashkenazy_Jews + 6.2% North_African_D" "2.5027"
    [3,] "79.4% Morocco_Jews + 20.6% Slovenian" "2.5566"
    [4,] "93.8% Ashkenazy_Jews + 6.2% Morocco_N" "2.6222"
    [5,] "95.3% Ashkenazy_Jews + 4.7% Sahara_OCC" "2.6254"
    [6,] "95.1% Ashkenazy_Jews + 4.9% TUNISIA" "2.6363"
    [7,] "95.6% Ashkenazy_Jews + 4.4% Moroccans" "2.6739"
    [8,] "94% Ashkenazy_Jews + 6% Algeria" "2.6818"
    [9,] "93.5% Ashkenazy_Jews + 6.5% Libya" "2.7011"
    [10,] "96% Ashkenazy_Jews + 4% Morocco_S" "2.8802"

    Discounting # 1 and #3 (I have no Slovenian / Hungarian / Moroccan Jewish ancestry at such levels) If I understand what this means is that compared to the AJs who provided samples I have a higher amount of North African admixture indicating a small North African component together with my Ashkenazi Jewish ancestry?

  45. What is German_V? It seems there are a few groups with the _V tag at the end of their names.

  46. Hello,
    my Dodecad V3 results don't show any South-West Asian or South Asian component, but using your spreadsheet for V3, I see that most ethnicities have it; even Selkup has a value of 0.1.

    Dodecad World9 has the smallest distances in mixed mode and it shows 90% Russian, which I mostly am, but I've heard that this test was created only to discover Native American ancestry. I got results like
    86.8% Russian (Dodecad) + 13.2% TSI30 (Metspalu) @ 0.96
    87.1% Russian (Dodecad) + 12.9% Tuscan (HGDP) @ 1

    The K12b test shows me
    87.1% Russian (Dodecad) + 12.9% Sardinian (HGDP) @ 1.61
    2 80.5% Russian_B (Behar) + 19.5% Castilla_La_Mancha (1000Genomes) @ 1.85
    3 79.8% Russian_B (Behar) + 20.2% Spaniards (Behar) @ 1.94
    4 79.1% Russian_B (Behar) + 20.9% Cataluna (1000Genomes) @ 1.95
    and oracle X
    Polish 68.49%
    2 Mordovians 9.50%
    3 Yakut 4.94%
    4 Sephardic_Jews 4.85%
    5 Swedish 3.78%
    6 C_Italian 3.47%
    7 O_Italian 3.32%
    8 Norwegian 1.09%
    9 Uzbeks 0.54%
    10 Hazara 0.01%
    which I find close to reality, if you replace Polish with Russian/Polish/Lithuanian admixture.
    It is definitely some southern component. My high Caucasus component doesn't match with my low values of Gedrosia and West-Asian.
    My Gedmatch is F294441

  47. I am so lost. I can't make sense of what these results are showing. I'm 67 and smart, but certainly challenged by the lack of plain English interpretation and explanations here. I feel as though the answers to my lifelong genealogy questions hang behind a curtain of confusion. I show a higher percentage of Middle Eastern, French and German in my results than I would expect. I have no known people of those ethnic populations back to ggg grandparents, so ... was there infidelity, adoption, something else to explain this? I am overwhelmed by trying to understand the computer-language answers here. I have great appreciation for the person who has taken all the time and effort to develop this program, but please ... hire someone who can make reading the results easy.... I figure I only have 20 years left to find the answers to this lifelong passionate journey. And it won't get any easier from here.
    Known ethnicities: Father's father: Northern Italian. Father's mother: From Italy, but varied as she came from a family who probably married the women off to husbands from other countries for political and financial reasons. her father was reputed to have some Spanish (also Jewish?) as his name was Catalano. (Catalan origin?)
    Mother's father: Portuguese/old New England. Mother's mother: Scots/Irish/English and maybe German. My ID number is: M632230 if anyone would like to have a go at interpretation. A better email for me is: mamasi@comcast.net Thank you for your help and understanding. Melody
