Monday, December 6, 2010

Clusters galore, K=50 for Dodecad Project members (up to FFD055)

I am now repeating the Clusters galore analysis for Family Finder data (for more info, see the previous post on 23andMe data).

With 14 MDS dimensions retained, there were 50 clusters inferred in the optimal solution by MCLUST.

The first rows of the results spreadsheet are the 54 project participants: each row gives the probabilities that the participant belongs to each cluster. These are followed by the reference populations, where each row gives the number of individuals from that population assigned to each cluster.
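As a rough illustration of how the two kinds of rows can be read, the sketch below picks the most likely cluster from a participant's probability row and turns a reference population's count row into shares per cluster. The IDs, population names, and numbers are made up for illustration (three clusters instead of 50) and are not taken from the actual spreadsheet.

```python
# Hypothetical rows; the real spreadsheet has one column per cluster (50 here).
participant_probs = {
    "FFD_A": [0.02, 0.95, 0.03],   # clearly belongs to one cluster
    "FFD_B": [0.40, 0.35, 0.25],   # ambiguous membership
}
reference_counts = {
    "PopulationX": [0, 9, 1],
    "PopulationY": [7, 2, 0],
}


def most_likely_cluster(probs):
    """Return (1-based cluster number, probability) of the largest entry."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best + 1, probs[best]


def cluster_shares(counts):
    """Convert per-cluster counts into fractions of the population."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]


for pid, probs in participant_probs.items():
    cluster, p = most_likely_cluster(probs)
    print(f"{pid}: cluster #{cluster} (p = {p:.2f})")

for pop, counts in reference_counts.items():
    shares = cluster_shares(counts)
    cluster, share = most_likely_cluster(shares)
    print(f"{pop}: mostly cluster #{cluster} ({share:.0%} of individuals)")
```

A participant whose largest probability is well below 1, like the hypothetical FFD_B, is best read as sitting between clusters rather than as a firm member of any one of them.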

There are also some outliers in this analysis:
FFD002 FFD004 FFD007 FFD012 FFD015 FFD016 FFD021 FFD022 FFD023 FFD038 FFD046
Check what an outlier is in the context of this analysis, and what it means.

Interestingly, because of the smaller number of Family Finder participants, some clusters previously defined for 23andMe data, such as the "Finnish" cluster, do not appear here. This is not surprising, because a cluster can only be defined if several individuals from that population are present in the data.

Many continental Europeans whose populations lack a cluster of their own ended up in cluster #2. Some others, like FFD048, who is Lithuanian, were assigned to the proper cluster #9, centered on Lithuanians.

This underscores the importance of having more people join the Project at the next available opportunity. This will not only create new clusters for individuals who are currently the only representatives of their populations, but it may also split already existing clusters if regional sub-populations are detected.

It is also important for project participants to drop a note at the ancestry thread, to help others make better sense of their results.


  1. I see that cluster 12, with the Romanians, also has 3 Armenians and 1 Georgian. Are these the ones who look admixed in the previous 23andMe analysis?
    In this case Anatolian/Caucasian + Slav = Balkanian South Slav?

  2. "Are these the ones who look admixed in the previous 23andMe analysis? In this case Anatolian/Caucasian + Slav = Balkanian South Slav?"

    That is their closest match, but the 3 Armenians are also detected as outliers in the analysis, so the relationship with Romanians is not especially close.

    Also, Romanians are not Slavs.

  3. Romanians may have more actual Slav ancestry than the Bulgarians. It was only a matter of chance that Romanian (Vulgar Latin) prevailed there and Slavic in Bulgaria, which was largely due to its being introduced as the official and church language.
    I am talking, of course, about the Vlachs, who used to live south and east of the Carpathian mountains, not areas acquired after WWI like Transylvania.

  4. Latin was spoken in what is now Romania long before the Slavic speakers moved south into the Balkans. Slavic speech in the Balkans dates to the 6th century. That may seem a long time, but given that that part of Europe has been inhabited since about 45,000 years before present, it is nothing. Slavic speech is less native than Latin, and more native than Altaic Turkic. The true languages of the native populace, the Thracians, are extinct.

    The bottom line is that the Romanians, and probably the Bulgarians, descend from people indigenous to the Balkans for many thousands of years before any I.E.-speaking person set foot in Europe from the Eurasian steppes. The Slav speakers are just another bunch of immigrants assimilated by the locals, who went on in Bulgaria's case to assimilate the Turkic Bolgars but retained the language of the previous Slavic-speaking migrants.

  5. Hm.
    It clustered me with Belorussians.
    Hm. I have some eastern connection, but the difference from the "Admixture" result is quite striking.

    The "admixture" profile looks (as an overall impression) more similar to CEU than to Belarus. Well, there is an unusually high Northeast Asian component and low African components (both typical for Northeastern Europe), but the dominant North and South European components matched CEU better than Belorussia.

    Still, "Mclust" thinks I am closer to the Belorussian cluster than to the CEU one.


    This is an MDS based on my 23andMe data, which just arrived (I am D8 in there):

    1&2: Quite in the middle of the Germans

    Side view:
    Shows that I am actually floating above the German cluster.


  6. As I said, the fact that #2 is represented among Belorussians in the reference population does not mean that it is a Slavic cluster, as all sorts of continental people are assigned to it. It's more of a "continental cluster" which awaits further refinement if more people's co-ethnics are added in order to define finer-scale clusters.