Sunday, February 13, 2011

A note on the eligibility criteria of the Dodecad Project

1. Why I do not accept relatives

Relatives are much more similar to each other than they are to other people. When an ADMIXTURE run includes a pair of close relatives, there is a high risk that the two will form their own component, to which they belong at nearly 100%. Thus, (a) they learn nothing about themselves other than the fact that they are related, and (b) other people receive some membership in that component, which has no real anthropological interest.

The same is true of dimensionality reduction techniques such as multidimensional scaling (MDS) or Principal Components Analysis (PCA). It is highly likely that a pair of relatives will be distinguished from everyone else along a dimension. That dimension is "wasted": it provides no real anthropological information other than the fact that a pair of people are related to each other. If this is combined with a clustering algorithm such as the Galore approach, then the pair of relatives will form their own anthropologically meaningless cluster.
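The premise of this section, that relatives are much more similar to each other than they are to other people, can be illustrated with a small screening sketch. This is not the procedure used by the Project; it is a toy example that assumes genotypes coded as 0/1/2 allele counts, and the function names, sample labels, and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations


def ibs_similarity(a, b):
    """Mean identity-by-state between two genotype vectors.

    Genotypes are assumed to be coded as 0/1/2 allele counts, with None
    for a missing call. Returns a value between 0.0 and 1.0.
    """
    shared = 0
    compared = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip markers missing in either sample
        shared += 2 - abs(x - y)  # 2, 1 or 0 alleles shared at this marker
        compared += 2
    return shared / compared if compared else 0.0


def flag_similar_pairs(genotypes, threshold):
    """Return (id1, id2, similarity) for every pair at or above threshold."""
    flagged = []
    for (id1, g1), (id2, g2) in combinations(sorted(genotypes.items()), 2):
        similarity = ibs_similarity(g1, g2)
        if similarity >= threshold:
            flagged.append((id1, id2, similarity))
    return flagged


# Toy data with hypothetical labels: A and B differ at one marker only.
toy_genotypes = {
    "A": [0, 1, 2, 1, 0, 2, 1, 1],
    "B": [0, 1, 2, 1, 0, 2, 1, 2],
    "C": [2, 0, 0, 1, 2, 0, 1, 0],
}

# The threshold is an arbitrary illustration value, not a recommended cutoff.
suspicious = flag_similar_pairs(toy_genotypes, threshold=0.9)
print(suspicious)
```

On real data, a sensible cutoff would depend on the marker set and on the populations involved, so any pair flagged this way would only be a candidate for closer inspection.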

The Dodecad Project is an anthropology project; it is not a genealogy project. There are other projects and experts out there who can help you interpret your data from a genealogy perspective.

2. Why I accept people with 4 grandparents from the same European, Asian, or North African ethnic group or country

I do so for two reasons, a scientific one, and a practical one:

First, such individuals allow me to gradually build a database of genetic variation for my region of interest, namely Eurasia. As of this writing, there are 387 participants in the project, from whom I have managed to form 19 populations, each consisting of at least 5 individuals from a particular ethnic/linguistic/national/geographical group (e.g., Greek, Arab, British, or Scandinavian), for a total of 209 individuals.

There are many participants who belong to a group for which there are not yet 5 participants. There are also many who have a mixed background and whom I have included in the Project either during time-limited submission opportunities or because they asked me to, making a good case for why they should be included.
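The grouping rule described above (a population is formed once a group has at least 5 individuals) can be sketched as follows. This is only an illustration: the participant IDs and the data layout are hypothetical, and participants of mixed background are represented here with a label of None.

```python
from collections import defaultdict

MIN_POPULATION_SIZE = 5  # the "at least 5 individuals" rule from the text


def form_populations(participants, min_size=MIN_POPULATION_SIZE):
    """Group (participant_id, label) pairs and keep the large-enough groups.

    A label of None marks a participant with no single group (for example,
    someone of mixed background); such participants join no population.
    """
    groups = defaultdict(list)
    for participant_id, label in participants:
        if label is not None:
            groups[label].append(participant_id)
    return {label: ids for label, ids in groups.items() if len(ids) >= min_size}


# Hypothetical participants: five Greeks, three Parsees, one mixed individual.
participants = (
    [(f"P{i}", "Greek") for i in range(1, 6)]
    + [(f"P{i}", "Parsee") for i in range(6, 9)]
    + [("P9", None)]
)

populations = form_populations(participants)
print(sorted(populations))                          # labels that became populations
print(sum(len(ids) for ids in populations.values()))  # individuals inside them
```

In this toy input only the Greek group reaches the minimum size, so the three Parsees and the mixed individual remain participants without belonging to a population.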

The second reason is practical: the customers of 23andMe are probably largely American, and many non-first-generation Americans are descended from multiple ethnic groups. Even though a large part of the analysis process is automated, I still need to spend some time on every submitted sample to download, record, and convert it, to allocate processing time on my computing systems for the analysis, and then to create the results barplots and materials.

I simply don't have the time to do so for the large pool of 23andMe customers, but I can, occasionally, accept individuals of mixed heritage during short submission opportunities.
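The basic rule of this section (4 grandparents from the same European, Asian, or North African ethnic group or country) can be written as a simple check. This is a sketch under stated assumptions: the Grandparent record, its fields, and the region names are hypothetical, and the special cases that I may consider on request are deliberately not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Regions named in the eligibility rule; the exact strings are an assumption.
ACCEPTED_REGIONS = {"Europe", "Asia", "North Africa"}


@dataclass(frozen=True)
class Grandparent:
    """Hypothetical record of what a participant knows about one grandparent."""
    ethnic_group: Optional[str]
    country: Optional[str]
    region: Optional[str]


def meets_basic_criteria(grandparents: Sequence[Grandparent]) -> bool:
    """True if all 4 grandparents share an ethnic group or a country
    and all come from one of the accepted regions."""
    if len(grandparents) != 4:
        return False
    if any(g.region not in ACCEPTED_REGIONS for g in grandparents):
        return False
    groups = {g.ethnic_group for g in grandparents}
    countries = {g.country for g in grandparents}
    same_group = len(groups) == 1 and None not in groups
    same_country = len(countries) == 1 and None not in countries
    return same_group or same_country


greek = Grandparent("Greek", "Greece", "Europe")
english = Grandparent("English", "United Kingdom", "Europe")
parsee = Grandparent("Parsee", "India", "Asia")
unknown = Grandparent(None, None, None)

print(meets_basic_criteria([greek] * 4))                        # True
print(meets_basic_criteria([english, english, parsee, parsee]))  # False
print(meets_basic_criteria([unknown] * 4))                       # False
```

A False result here only means that the basic rule is not met; it says nothing about the exceptional cases, which are decided individually.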

What to do if you are not eligible

If you do not meet the eligibility criteria, and think that I should include you in the Project, the thing to do is to write to me, laying out the reasons why. Here are some scenarios that I might very well consider:
  1. Adoptees with no knowledge of their origins.
  2. People of mixed heritage who belong partly to an unsampled group. If, say, you are 50% English+50% Parsee, I might consider your sample, as I have no data on Parsees.
  3. People of mixed origins who have data for both of their unadmixed parents (and thus provide two samples to the project), or for a series of relatives of theirs who are unrelated to each other. In such cases I may very well analyze the admixed individuals as well.
  4. People of mixed heritage from Greece and its environs (i.e., Italy, the Balkans, and Anatolia). For example, someone who is 50% Gypsy+50% Bulgarian, 50% Croat+50% Serb, or 75% Turk+25% Albanian would probably be accepted if they contacted me.


  1. 75% Turk+25% Albanian

    Of course you wouldn't include such a person among the Dodecad Turks, would you?

  2. Why I accept people with 4 grandparents from the same European, Asian, or North African ethnic group or country

    I think you should only accept people with all known ancestors from the same ethnic group. Many countries have more than one ethnic group and many ethnic groups are dispersed in more than one country, so you should remove being from the same country from your eligibility criteria.

  3. You already have my data. Would you be willing to pull me out and add my parents if I sent their raw data? Both my parents are from the same region of Spain. I figure two Galician Spaniards are better than one.

  4. Hello, I have a question...
    I have my maternal grandmother's DNA, my mother's, my son's, and mine from 23andMe.

    Bio father is unknown.

    My 23 and me results show me as 93% euro and 7% asian... my grandmother as 99% euro, my mother at 99% euro and my son is showing at 96% euro & 4% asian. His father is 100% euro. Would I qualify to be in the project?

    Thanks in advance...

  5. The eligibility criteria have nothing to do with the 23andMe results, but with the origin of your four grandparents. If they are all from the same European, North African, or Asian ethnic group or country, you qualify; otherwise, probably not.