1. Finding the origin of shared segments
Until now, when you had a segment match with another customer of your testing company, you had no way of knowing the origin of the shared segment. Suppose, for example, that a Russian and a German share a sequence in some region X. This could reflect:
- Russian-like ancestry in the German individual
- German-like ancestry in the Russian individual
- Third party ancestry in both individuals
This is extremely important, as there is a noticeable confirmation bias in some individuals toward interpreting the unusual as evidence of exotic ancestry. For example, an individual in search of Jewish ancestry may interpret segment matches with Jews as evidence for that ancestry. If he sees high Southwest_Asian ancestry in such segments, then that is a reasonable interpretation; but if the shared segments happen to be, e.g., East_European, they could very well reflect non-Jewish ancestry in the Jewish individual instead.
2. With parents' DNA
It is important to remember that each region includes both paternal and maternal DNA, and that each parent passed on to you a random draw of the segments they inherited from their own parents (your grandparents).
So, if you try to figure out where your region X came from, remember that it came from two places. If you see an unusual combination (e.g., Northeast_Asian + Northwest_African) that doesn't correspond well to any known population, this may mean that you got half of it from one parent and the other half from the other.
Note also that while on genomewide analysis a child's results will often (but not necessarily) be intermediate between his parents in their ancestral components, this is not the case when looking at small segments. Suppose parent A is 50% West_Asian and 50% Mediterranean in a particular region, and parent B is 50% West_Asian and 50% West_European in the same region.
Then the child may end up with West_Asian near 100% in that region (if he happens to inherit the West_Asian segments from both parents) or near 0% (if he happens to inherit the Mediterranean/West_European ones).
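The parent A / parent B example can be spelled out with a small enumeration. This is only a sketch under simplifying assumptions that are mine, not part of any Dodecad tool: each parent carries exactly two segment copies in the region, each copy is labelled with a single component, each is passed on with equal probability, and there is no recombination inside the region. The names `parent_a`, `parent_b` and `child_outcomes` are hypothetical.

```python
from itertools import product

# Hypothetical labels for the two segment copies each parent carries
# in the region, following the example in the text.
parent_a = ("West_Asian", "Mediterranean")
parent_b = ("West_Asian", "West_European")

def child_outcomes(parent_a, parent_b):
    """List the four equally likely results for the child in this region.

    Assumes one whole copy is inherited from each parent
    (no recombination inside the region).
    """
    outcomes = []
    for copy_a, copy_b in product(parent_a, parent_b):
        pair = (copy_a, copy_b)
        # Each inherited copy contributes half of the child's region.
        fractions = {comp: pair.count(comp) / 2 for comp in set(pair)}
        outcomes.append(fractions)
    return outcomes

outcomes = child_outcomes(parent_a, parent_b)
for fractions in outcomes:
    print(fractions)

# West_Asian share of the child's region in each of the four cases.
west_asian = sorted(f.get("West_Asian", 0.0) for f in outcomes)
```

Under these assumptions the child's West_Asian share in the region is 100% in one case, 50% in two cases, and 0% in one case, even though both parents are 50% West_Asian there.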
3. With Dodecad Oracle
In general, I discourage the use of Dodecad Oracle with chromosome or segment results, for two reasons:
- Small segments may appear more mixed than they are, because there may not be any informative SNPs in a particular region to distinguish between some of the ancestral components. So, the scale of the noise may be higher. As an experiment, you can average your segments, weighted by either the number of SNPs or their physical length, and you will come up with something close to your "genomewide" average, but not identical to it, because of this factor.
- From a different perspective, segments may appear less mixed, because it is less likely that you got genetic material from all ancestral populations in a small section of your DNA. Your genomewide admixture may have several non-zero components, but you are unlikely to have many non-zero components in a small region (barring the aforementioned noise), and you could very well see percentages above 80% for a single component in some regions, values that are very typical of that ancestral component.
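The averaging experiment mentioned in the first point can be sketched as follows. The segment records, their field names (`n_snps`, `length_mb`, `admixture`) and the numbers in them are made up for illustration; they are not the output format of any particular tool, and `weighted_average` is a hypothetical helper.

```python
def weighted_average(segments, weight_key):
    """Average per-segment admixture fractions, weighting each segment
    by the field named in weight_key (e.g. SNP count or physical length)."""
    total_weight = sum(seg[weight_key] for seg in segments)
    components = set()
    for seg in segments:
        components.update(seg["admixture"])
    return {
        comp: sum(seg[weight_key] * seg["admixture"].get(comp, 0.0)
                  for seg in segments) / total_weight
        for comp in sorted(components)
    }

# Made-up example data: two segments of equal length but unequal SNP counts.
segments = [
    {"n_snps": 1000, "length_mb": 10.0,
     "admixture": {"West_Asian": 0.8, "Mediterranean": 0.2}},
    {"n_snps": 3000, "length_mb": 10.0,
     "admixture": {"West_Asian": 0.4, "Mediterranean": 0.6}},
]

by_snps = weighted_average(segments, "n_snps")
by_length = weighted_average(segments, "length_mb")
print(by_snps)
print(by_length)
```

The two weightings need not agree with each other, and neither is expected to reproduce the genomewide figure exactly, for the reason given in the first point.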