Sunday, October 24, 2010

Introducing the Dodecad ancestry project

Welcome to the Dodecad ancestry project!

The Dodecad ancestry project, named for the Greek word for "group of twelve", aims to provide detailed ancestry analysis, primarily for Eurasian individuals. Please read on, if you want to participate; if not, please subscribe to the feed to keep up with the project's progress.

The project was started by Dienekes' Anthropology Blog.

1) Project goals

The Dodecad ancestry project has two goals:
  1. To provide detailed ancestry analysis to individuals who have tested with 23andMe; other testing companies may be included in the future.
  2. To build samples of individuals for regions of the world (e.g. Greeks, Finns, Albanians, Southern Italians, etc.) currently under-represented in publicly available datasets.
I neither endorse nor am affiliated with any genetic testing company. I have chosen to base the project on 23andMe results because (i) I perceive that quite a few people have used the service, and (ii) the Illumina genotyping platform it uses has substantial overlap with the publicly available datasets on which my analysis depends.

2) Data privacy statement

Your raw genetic data will not be shared with anyone.

It will not be analyzed for anything other than ancestry or admixture. No analysis of physical or medical traits will be performed.

Individual-level results will be revealed with only a unique ID, without any further information about the identity or origin of each participant.

3) Who is eligible to participate

Due to my inability to process a large number of samples, at present, only the following groups are eligible to participate in the project's current pilot phase:
  1. Greeks (not necessarily from Greece: Cypriots, Pontic Greeks from the former USSR, North Epirotes, Griko speakers from Italy, Muslim Rumca speakers from Turkey, etc. are all accepted)
  2. People from the Balkans
  3. People from Anatolia
  4. People from the Caucasus
  5. Italians
  6. Non-Indo-European speakers from Europe (e.g., Finns, Hungarians, Basques)
  7. Scandinavians and Icelanders
  8. Iranians
  9. Armenians
  10. Jews from Italy, the Balkans, or Anatolia
  11. Assyrians
  12. Arabs
Samples should be received by the end of October 2010. There may be a new opportunity to submit your data after that, which will be announced in this blog.

If you are uncertain whether your sample can be included in the project, please write to me at dodecad@gmail.com to inquire.

Close relatives should not submit all their samples. If you and your relatives have tested, please submit independent samples. For example, if you have data for you, your father, and your mother, it is ok to submit either (i) your own data, or (ii) your father and mother's data -- provided that it is not a consanguineous (e.g., cousin or uncle-niece) marriage.

4) What to send

Your compressed genotype file from 23andMe and as much information about your ancestry as you wish to reveal. It is necessary, however, that you tell me at least the country of origin of most of your ancestors and their ethnicity. Information about spoken language and religion may also be useful.

5) Where to send it

Send it to dodecad@gmail.com. I will respond with a unique identifying code of the form DOD001, DOD002, and so on. The results will then be posted in the blog with that ID.

6) What you will receive

You will be included in an ADMIXTURE analysis together with other project participants and publicly available populations, and the results will be published in this blog. Your sample will only be identified by your ID. If a group (e.g., Greeks) has at least 5 participants, I may also post the average admixture proportions for that group.
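
As a rough illustration of the group averaging described above, here is a minimal sketch. It is not the project's actual pipeline: the function name `group_averages`, the data layout, the component labels `K1`/`K2`, and all sample values are assumptions made up for this example. Only the 5-participant threshold comes from the text.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 5  # threshold mentioned in the post


def group_averages(results, min_size=MIN_GROUP_SIZE):
    """Average admixture proportions per group.

    results maps a participant ID to (group label, {component: proportion}).
    Groups with fewer than min_size participants are left out.
    """
    by_group = defaultdict(list)
    for _sample_id, (group, proportions) in results.items():
        by_group[group].append(proportions)

    averages = {}
    for group, members in by_group.items():
        if len(members) < min_size:
            continue  # too few participants to report a group average
        components = sorted({c for m in members for c in m})
        averages[group] = {
            c: sum(m.get(c, 0.0) for m in members) / len(members)
            for c in components
        }
    return averages


# Hypothetical data: five participants in one group, one in another.
example = {
    f"DOD{i:03d}": ("Greeks", {"K1": 0.6, "K2": 0.4}) for i in range(1, 6)
}
example["DOD006"] = ("Finns", {"K1": 0.1, "K2": 0.9})

for group, avg in group_averages(example).items():
    print(group, avg)
```

In this sketch only the group with five members gets an average; the single participant from the other group would still appear individually under their ID.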

The following figure and table give an idea of what to expect. Project members can expect to get a bar of their own, and a list of their ancestral proportions.



The number of components and population samples may vary over the course of the project.

7) Project updates

All project updates will be announced and presented in this blog. Additional commentary on the project may be posted in Dienekes' Anthropology Blog. I may also post occasionally on twitter about the project's progress.

8) Feedback

I encourage feedback about the project from participants or prospective participants. Please address it to dodecad@gmail.com.

22 comments:

  1. What about European descendents from other regions of the globe? Are we able to participate?

    Antonio.

    :)

  2. It seems that by Anatolian you exclusively mean Turks who speak a language descended from the Ottoman Turkish. But as you know, not all speakers of Ottoman-descended Turkish from Turkey are from Anatolia: there are also a substantial number of Balkan Turks (full or partial, like me) in both Turkey and various Balkan countries, who speak the same language as Anatolian Turks, albeit in their specific Balkan dialect. Cypriot Turks also speak basically the same language as Anatolian and Balkan Turks, again in their specific dialect. The same is true for the Ahiska Turks of Georgia and the Turks of Syria (living around Aleppo). I think you should put all these speakers of Ottoman-descended Turkish, all of whom speak basically the same language even if in specific dialects (Anatolian and Balkan Turks themselves speak various dialects, all of which are specific to various locations of Anatolia and the Balkans), in the same category, just as you put the Greeks of the Balkans, Anatolia, Cyprus, Italy, Crimea, the Caucasus and elsewhere in the same category because they basically speak or spoke (I say "spoke" as some of them switched to Turkish or Arabic under Muslim rule) the same Byzantine-descended Greek language (with few and negligible exceptions like Tsakonian, which is Koine-influenced anyway). Speakers of other Turkic languages, like Azeris, Iraqi Turkmens, Tatars, Central Asian Turkics, Nogais, Siberian Turkics, Bashkirs, Gagauzes, Chuvashs, Qashqais, Karachai/Balkars, Kumyks, Khalajs, etc., OTOH, should be considered as different groups from Balkan and Anatolian Turks, even individuals among them who were born and/or are resident in Turkey, and even those with Turkish citizenship. So you should make a clarification here.

    Also what about Azeris? There are Azeris in Iran, the Republic of Azerbaijan and in various parts of Russia (mainly in the Russian Caucasus), and as a small but distinctive minority in easternmost Turkey and the metropolises of Turkey and elsewhere throughout the world (including Georgia, BTW). In which group do you put them? In Iran, the Caucasus, Anatolia (because of their Turkic language) or somewhere else (I assume you don't put them in Anatolia as I sense that you want to examine them separately from Turks of Turkey)?

    Also what about Kurds? There are Kurds in Turkey, Iraq, Iran, Syria, Caucasus and elsewhere. What about Zazas, who are a population that primarily lives in Turkey?

    Laz people in NE Turkey speak a South Caucasian language that is a relative of Georgian and can be put in the Caucasus group. The same of course goes for Georgian speakers, and for North Caucasian speakers in Turkey and in Arab countries. Muslim Hamshenis can be put in the Armenian group irrespective of the fact that they are Muslims.

  3. It seems that by Anatolian you exclusively mean Turks who speak a language descended from the Ottoman Turkish.

    I don't know how you've arrived at this conclusion. By Anatolian I mean all Anatolians, not only Turks of Anatolia. And, I am perfectly willing to include Turkic people from the Balkans, the Caucasus, Iran, as they all fall in the included categories of the pilot phase.


    What about European descendents from other regions of the globe? Are we able to participate?

    Of course, as long as your European ancestry is from one of the included groups. There may be other opportunities in the future for people who are in different groups.

  4. I don't know how you've arrived at this conclusion. By Anatolian I mean all Anatolians, not only Turks of Anatolia. And, I am perfectly willing to include Turkic people from the Balkans, the Caucasus, Iran, as they all fall in the included categories of the pilot phase.

    It seems to me that the twelve groups you mention aren't distinct categories, but loose and sometimes intersecting categories that you've defined for the sole purpose of clarifying who is eligible to participate in the project's current pilot phase. So the results will not appear under these twelve categories, but as always, only under ethnic categories, whose number will depend on the ethnicity of the participants and the specific analysis you are making, am I right?

  5. The results will appear as individual IDs + results. Groups that have at least 5 individuals will also be posted as groups. How these groups will be defined remains to be seen, as ethnic, linguistic, and national categories are not always congruent. I will try to use the most meaningful categories.

  6. Dienekes,

    Obviously, you are mostly interested in the eastern Mediterranean region, and your selected groups will grant high resolution, there - which is all fine and understandable for personal interests. However, why don't you include neighboring groups that have had relatively much higher population densities and historically known high impact? If you are interested in resolving bidirectional flows, I would think that is a must.

    Excluding basically anything between Hungary, Czech Republic, Austria, Switzerland, and Germany (and Belgium, and the Netherlands) in my mind means that you exclude the majority of Europe - especially its center, which certainly heavily interacted with almost any part of Europe. But the fringe that you do include did not.

  7. I exclude the regions that you mentioned for two reasons, a pragmatic and a scientific one.

    1. I don't want to be flooded with samples I can't process
    2. I already have scientific samples from populations of Europe outside the Balkans and Southern Italy, including all linguistic groups, but I only have Romanians for the Balkans and no southern Italian/Sicilian sample.

    Excluding basically anything between Hungary, Czech Republic, Austria, Switzerland, and Germany (and Belgium, and the Netherlands) in my mind means that you exclude the majority of Europe - especially its center, which certainly heavily interacted with almost any part of Europe. But the fringe that you do include did not.

    That is a contestable claim. The European Neolithic, arguably the most important development in European prehistory, originated in Anatolia and spread via the Balkans. The Roman Empire has had the single most important influence in facilitating population movement and it originated in Italy.

    I don't shun people from the regions you mention, and there will hopefully be opportunities for them to contribute, but right now my priorities are to sample the big holes of the map (e.g., Southern Slavs, Greeks, Albanians have no samples, while Germanics, East and West Slavs do).

  8. Dienekes,

    I did not want to sound too critical - I certainly very much appreciate your effort here, and understand your (current) pragmatic/logistic limitations.

    On the other hand, it is always good if preconceptions do not enter too much into the selection criteria. It is also quite possible that after the 500-year hiatus of agriculture south of the Danube, LBK emerged with people who were largely local (thus only 20% of western Asian in central Europe, today). Those people initially multiplied enormously, covering everything from France to the Ukraine (perhaps explaining why Ukraine and Germany end up rather similar), before (it seems, also based on mtDNA) the fringe got a stronger foothold, again. So, it could well be that within these ~250+ million central European people, there is a second (~Danubian) component different from the "Nordic/Germanic" one - something that has already shown up when Scandinavia and surrounding countries are studied. You'll miss any such component if you largely only evaluate "the fringe."

  9. eurologist, I don't share your optimism about the existence of a Danubian component primarily because of the very close relationship between all central European populations. Indeed, the only significant contrast in that region is between Germanic and Slavic speakers. ADMIXTURE works by exploiting contrasts, and it's doubtful that the Danubian component will appear.

    The more diversity you encounter in a region, the more likely it is that there is latent population structure driving this diversity. But, in Central Europe you see populations at very close geographical and genetic distances from each other, and no real hope of what you are suggesting.

    In the Balkans, on the other hand, there are strong historical, linguistic, and genetic reasons for suspecting that a latent component may lurk: two native extant IE languages, ample evidence for the existence of many other extinct ones, both IE and non-IE, region-specific genetic signals of expansion (e.g., E-V13, some subclades of J2a, I2a2, and so on).

    So, pragmatically it is a better hope for a new latent component.

  10. Contrast between Germanic and Slavic speakers could just hide/paraphrase such an early Hungarian/Danubian/LBK component. Again, overall, Ukrainians tend to align unexpectedly closely with Germans.

    Again,

    it is always good if preconceptions do not enter too much into the selection criteria

  11. Preconceptions always enter into the selection criteria, and they should, because preconceptions lead one to sample populations that may yield new knowledge. Sampling yet another Germanic or Slavic population of central Europe has much lower probability of yielding new knowledge than sampling under-represented regions such as the Balkans.

  12. Will Americans of Northwest European descent, eventually be included in the project? Germany and the British Isles are the extent of my ancestry. The patrilineal line is from North Germany, but is in Y haplogroup E1b1b1, which is only 2.5% in that region. My highest match frequency is in Greece, followed by Bulgaria. I used DNA Tribes for a 15 STR autosomal marker test, which gave the strongest affiliation in Bulgaria, with Marmara, Turkey and Morocco each about half Bulgaria's strength. I later did their 27 marker upgrade, which gave completely different results - Iceland was strongest, with Croatia, and Bedouin (Negev, Israel), each about 20% of Iceland. The "World Region" strongest result was Arabia, followed by Eastern Europe 91% of Arabia, Mediterranean 78%, Northwest European 61%. Such disparate results greatly concerned me. I am currently waiting on FamilyTree autosomal results for their "Family Finder/Population Finder" test, using hundreds of thousands of SNPs. With so many data points I'm hoping it will more faithfully reflect my actual ancestry.

    Regards,
    Dave Schroeder, USA

  13. Will Americans of Northwest European descent, eventually be included in the project?

    Thank you for your interest. I do not include Northwest Europeans (except Scandinavians) primarily because they form the bulk of genetic testees, and there are already population samples (HGDP-CEPH and HapMap) that include people from NW Europe.

    However, I am thinking that I may issue some "windows of opportunity" for such people to submit their data, say for 1-2 days, so please subscribe to the blog feed or visit sometimes to see if that happens.

  14. Pontian Greeks from the former USSR are the same race as the Pontians who live in Greece. Why do you put them in a special category?

    And when you say Rumca speakers, do you mean Pontic Muslims who speak Pontic Greek? Because Pontian Muslims are not the only Greek-speaking race in Turkey.

  15. As a Pontian myself, I know that the Pontians of the former USSR are the same as those who live in Greece. And I do not put them in a special category; I include them in the category of Greeks. The meaning of my sentence is that everyone who is Greek can participate, without necessarily being from Greece.

    And when you say Rumca speakers, do you mean Pontic Muslims who speak Pontic Greek? Because Pontian Muslims are not the only Greek-speaking race in Turkey.


    Since I accept samples from the Balkans and Anatolia, I think I have covered everyone who is from this wider region, regardless of language or religion.

  16. It would be interesting to do a genetic test on the Christian Pontians and the Greek-speaking Muslim Pontians of Turkey. I believe the results will come out almost the same.

  17. All are welcome.

  18. There is one thing I did not understand. Will you put all Greeks in one category, or separately by tribe?

  19. There is one thing I did not understand. Will you put all Greeks in one category, or separately by tribe?

    So far I do not have enough samples for this to be a concern. If in the future I have enough samples from various subgroups, so that conclusions can be drawn about the differences between them, then this can be done as well.

  20. Pontiake Istoria, if you are looking for people in Turkey who are 100% descended from Byzantine Greeks, speaking the Greek language is an indicator, but Greek speakers (whether Christian or Muslim) are not the only ones who are 100% descended from Byzantine Greeks. It is historically known that there were many mass switches to Turkish (a great majority of which are almost certainly undocumented) by Greek-speaking communities (whether Muslim or Christian) in both Asia Minor and the Balkans, probably beginning from the 13th or 14th century, and even in late Turkish-occupied regions like eastern Pontus. Of course, only genetics can give us conclusive answers about these issues.

  21. Where are the publicly available datasets you mention? Does this mean that there are certain SNPs more informative of ancestry than others, and where might one find such a thing? thanks....

  22. Check "Data sources" at the bottom left of the blog.
    There are indeed ancestry-informative SNPs; however, I do not make any a priori selections of SNPs known to harbor differences between populations.
