An outlier is someone who is not very close to any other individual and hence does not really "cluster" with anyone. Thus, it is recommended to remove outliers prior to clustering, as otherwise they will form makeshift clusters with no real meaning.
To make a long story short, here are the IDs identified as outliers:
"DOD157" "DOD168" "DOD169" "DOD036" "DOD048" "DOD088" "DOD034" "DOD030" "DOD060" "DOD132" "DOD128" "DOD175"
Looking at the individual spreadsheet reveals that many of these outliers have very unusual ancestry. The unusual cases fall into two categories:
- Recent admixture between geographically separated populations
- Being the only member of an otherwise unsampled population
In the second case, no other members of the individual's group have been sampled. Sometimes, if a group is close enough to another, this is not a problem, but for many distinctive population groups that is not the case.
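To make the idea of "not very close to any other individual" concrete, here is a minimal sketch in Python. It is not the method used to produce the list above, which the post does not describe. It assumes each individual is represented by a short coordinate vector, and it flags anyone whose nearest neighbor is farther away than a chosen threshold. The function names, the sample IDs, the coordinates, and the threshold are all made up for illustration.

```python
import math


def nearest_neighbor_distances(points):
    """For each ID, return the Euclidean distance to the closest other individual."""
    distances = {}
    for id_a, vec_a in points.items():
        distances[id_a] = min(
            math.dist(vec_a, vec_b)
            for id_b, vec_b in points.items()
            if id_b != id_a
        )
    return distances


def flag_outliers(points, threshold):
    """Return the sorted IDs whose nearest neighbor is farther than `threshold`."""
    nearest = nearest_neighbor_distances(points)
    return sorted(id_ for id_, dist in nearest.items() if dist > threshold)


# Hypothetical coordinates (made up, not Dodecad data): two tight groups
# and one individual far from everyone else.
samples = {
    "S1": (0.0, 0.0),
    "S2": (0.1, 0.0),
    "S3": (0.0, 0.1),
    "S4": (1.0, 1.0),
    "S5": (1.1, 1.0),
    "S6": (5.0, 5.0),
}

outliers = flag_outliers(samples, threshold=1.0)
print(outliers)
```

In this toy data only S6 has no close neighbor, so it is the only ID flagged. If a second individual from the same hypothetical population were added near S6, neither would be flagged, which mirrors the point that outlier status can change as new participants are added.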
While outliers will be removed from some analyses, their outlier status will continue to be evaluated as new reference populations or Dodecad Project members are added.
Samples DOD168 and DOD169 belong to me and my wife and are from Tunisia. They are among the samples listed as outliers; so far they are the only samples from Tunisia, and they have the highest level of admixture:
South European 29.8%, Northwest African 22.1%, Southwest Asian 17.9%, West Asian 11%, North European 7.9%, East African 7.3%, West African 3.5%, South Asian 0.4%
kamel, thanks for the info. Could you also add this in the ancestry thread? I like to look at that to know which participants have identified their ancestry.
I posted the info in that link.
If you find any other data from Tunisia, even in research papers, please add them.