As a reminder to new readers, the Clusters Galore technique consists of applying multidimensional scaling (MDS) to genomic data to convert ~152,000 SNPs into a number of continuous dimensions that capture most of the variation, and then using MCLUST to cluster individuals along these dimensions.
Assyrian, Scandinavian, Greek, Finnish, S_Italian_Sicilian, Ashkenazi, German, Indian, Portuguese, Armenian, Russian, Spanish, British, Irish, Turkish, N_Italian, Balkans, Iranian, North_African, East_African, French, Chinese, Japanese, Polish
In total, 65 clusters were obtained when 10 MDS dimensions were retained.
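For readers who want to see the MDS step concretely, here is a minimal sketch in plain Python. It is not the code used for the Project: the toy genotypes are invented, the Euclidean distance on 0/1/2 allele counts is an assumed metric, and the function names (`distance_matrix`, `classical_mds`) are hypothetical. It only illustrates classical MDS; the model-based clustering done by MCLUST (an R package) is not reproduced here.

```python
import math
import random

# Toy genotypes: rows are individuals, columns are SNPs coded as 0, 1 or 2
# copies of an allele. These values are invented for illustration only.
genotypes = [
    [0, 0, 1, 0, 2, 2, 1, 2],
    [0, 1, 0, 0, 2, 2, 2, 2],
    [0, 0, 0, 1, 2, 1, 2, 2],
    [2, 2, 1, 2, 0, 0, 1, 0],
    [2, 1, 2, 2, 0, 0, 0, 0],
    [2, 2, 2, 1, 0, 1, 0, 0],
]


def distance_matrix(rows):
    """Pairwise Euclidean distances between genotype vectors (assumed metric)."""
    return [[math.dist(a, b) for b in rows] for a in rows]


def matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def classical_mds(dist, n_dims, n_iter=200, seed=0):
    """Classical (Torgerson) MDS using power iteration with deflation."""
    n = len(dist)
    # Double-centre the squared distances to get the Gram matrix B.
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * n_dims for _ in range(n)]
    for k in range(n_dims):
        # Random unit start vector, then repeated multiplication by B.
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(n_iter):
            w = matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        eigenvalue = sum(x * y for x, y in zip(v, matvec(b, v)))
        # Coordinates on this dimension are eigenvector * sqrt(eigenvalue).
        scale = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][k] = v[i] * scale
        # Deflate B so the next pass finds the next-largest eigenvalue.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return coords


coords = classical_mds(distance_matrix(genotypes), n_dims=2)
for row in coords:
    print([round(x, 2) for x in row])
```

In this toy case the first dimension separates the first three individuals from the last three. In a real run, the retained dimensions would then be passed to a model-based clustering routine to assign individuals to clusters.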
- Most Greeks and all South Italians/Sicilians continue to fall in the same cluster #4. That the latter population, despite being one of the largest (20 individuals), remains unsplit and distinctive suggests that it is probably homogeneous and lacks substantial regional inbreeding within it.
- Cluster #2 includes most Germanic individuals and also the Irish.
- Cluster #5 is made up mostly of Central/North Italians.
- Non-Greek Balkan participants fall mostly in cluster #6, which also includes the non-Gypsy admixed reference Romanians.
- Project and reference Iberians (Spaniards and Portuguese) continue to be undifferentiated and distinctive, falling in cluster #14; my comments on South Italians/Sicilians probably apply to them as well.
- There is a trace of structure in the Ashkenazi population, which is split into two clusters. This probably underscores the benefits of large samples in the inference of structure, as 25 Ashkenazi Jews have submitted their results to the Project.
- Project Russians have split affiliations between a circum-Baltic cluster #3 and the Finnish cluster #7.
- North Africans form two new clusters that do not overlap with either reference Mozabites or Egyptians. There is great variety in North Africa, and the 8 people who have submitted their samples are a good start to learning about this region of the world.
- The Chinese are split into two, one part aligning with the "southern" Miaozu and one part aligning with the "northern" Japanese.
Thanks, Dienekes. Is there any way we can tell which clusters are most similar to each other? For example, if we look at a specific cluster (such as #4), which clusters did it share a parent cluster with before breaking up into these more specific clusters? Would a lower K value give us this answer?
A Euclidean distance matrix of the clusters could be very interesting too.
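One way such a distance matrix could be built, as a rough sketch only: average each cluster's members in MDS space to get a centroid, then take Euclidean distances between centroids. The coordinates and the cluster labels "A", "B" and "C" below are made up, and the function names are hypothetical; nothing here comes from the actual Project results.

```python
import math

# Made-up 2-D MDS coordinates and cluster labels, for illustration only.
points = [(0.0, 0.0), (0.0, 2.0), (3.0, 1.0), (3.0, 3.0), (10.0, 0.0), (10.0, 2.0)]
labels = ["A", "A", "B", "B", "C", "C"]


def cluster_centroids(points, labels):
    """Mean position of each cluster's members along every dimension."""
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        if label not in sums:
            sums[label] = [0.0] * len(point)
            counts[label] = 0
        for d, value in enumerate(point):
            sums[label][d] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def centroid_distance_matrix(centroids):
    """Euclidean distances between all pairs of cluster centroids."""
    names = sorted(centroids)
    matrix = [[math.dist(centroids[a], centroids[b]) for b in names]
              for a in names]
    return names, matrix


def nearest_cluster(centroids, label):
    """Label of the cluster whose centroid is closest to the given one."""
    others = [name for name in centroids if name != label]
    return min(others, key=lambda name: math.dist(centroids[label], centroids[name]))


centroids = cluster_centroids(points, labels)
names, matrix = centroid_distance_matrix(centroids)
for name, row in zip(names, matrix):
    print(name, [round(d, 2) for d in row])
print("closest to A:", nearest_cluster(centroids, "A"))
```

Repeatedly merging the two closest centroids would give a rough tree of which clusters are nearest to each other, although that is not necessarily the same grouping a rerun with fewer clusters would produce.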
I am quite surprised to see a definite French cluster appear (it seems to be more common in the reference sample than in the Dodecad one).
One can fairly conclude that, despite the absolute lack of data on that French sample from Lyons, the geneticists somehow selected people who are mostly of the same ethnic background (the Rhône valley, the Alps, Burgundy, ..., which is roughly the autochthonous peopling of the town).
Nevertheless, as I stated earlier on, Lyons being situated on a remarkable ethnic border (a very ancient one as Ligurian placenames end some miles South of Lyons around Annonay and still today that's the border between half Oïlic Arpitan dialects and Oc Vivaro-Alpine ones), it'd be much more interesting to know whether or not people in Drôme or Ardèche rather cluster with Provençal people than with Lyonnais people.
The North African group has 8 members, who are split into two clusters:
1) dod168 and dod169 from Msaken, Tunisia
These samples are among the most admixed populations analysed so far.