Dodecad Ancestry Project: fastIBD analysis of Balkans/West Asia

Saturday, January 14, 2012

fastIBD analysis of Balkans/West Asia

Now that I've discovered a way to boost Clusters Galore analysis even further by using fastIBD, I will start experimenting with different regional populations. This analysis took about 5 hours to complete, so it appears to be quite practical.

For my first experiment, I carry out an analysis of various populations from the Balkans and West Asia.

Clusters Galore

27 different clusters were inferred with 17 MDS dimensions. Some interesting findings:

For the first time there emerge a couple of clusters that appear to be quite specific to Armenians (#2 and #3).
Similarly, Assyrians are broken into a few clusters that appear fairly specific to them (#9-11).
Georgians are split into three clusters, one of which (#14) is linked with the neighboring Abkhasians, who in turn have their own exclusive cluster (#25)
The cluster modal in Greeks (#6) includes 14 of 19 Greek participants, and a few Greeks are also in the Balkan cluster (#8) and an Iranian-Turkish cluster (#4)
The Behar Cypriot sample also splits into two, and the few Turkish Cypriot participants link to one of them (#13)
The Ossetian project participant links to one of the three North_Ossetian clusters
The major Balkan cluster (#8) still defies resolution. MCLUST adapts the size and shape of each cluster, and for now a single "big", inclusive cluster spanning the Balkans appears more parsimonious than smaller clusters centered on the different groups. I am certain, however, that regional structure within this cluster will be uncovered with more participation.
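Statements such as "the cluster modal in Greeks includes 14 of 19 participants" come down to a population-by-cluster tabulation. The following is a minimal Python sketch of that tabulation, not the Project's own code; the function name `modal_clusters` and the toy labels are hypothetical, and the input is assumed to be one (population, cluster) pair per individual.

```python
from collections import Counter, defaultdict

def modal_clusters(assignments):
    """Map each population to (modal cluster, members in it, population size).

    `assignments` is an iterable of (population, cluster) pairs,
    one pair per individual.
    """
    counts = defaultdict(Counter)
    for population, cluster in assignments:
        counts[population][cluster] += 1

    summary = {}
    for population, by_cluster in counts.items():
        cluster, n_in_cluster = by_cluster.most_common(1)[0]
        summary[population] = (cluster, n_in_cluster, sum(by_cluster.values()))
    return summary

# Made-up toy labels for illustration, not Project data.
toy = [("PopA", 1), ("PopA", 1), ("PopA", 2), ("PopB", 2), ("PopB", 2)]
summary = modal_clusters(toy)
print(summary)
```

A population whose modal cluster holds nearly all of its members, and few outsiders, is what is meant above by a cluster being "specific" to a group.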

I cannot stress the importance of participation strongly enough. When groups have more participants, it is possible to both:

Discover group-specific clusters, by identifying what is common between members of groups
Discover within-group clusters, by identifying what is different between members of groups

For example, the great participation of Armenians in the Project has now allowed me to discover structure within the Armenian population. It appears that cluster #2 corresponds to a more "western" Armenian group, and #3 to a more "eastern" one, with some overlap between the two.

Inter-population IBD

You can also see a visual representation of inter-population IBD:

I have only included populations with 5+ participants in this representation. Reddish shades express high IBD sharing; bluish ones, low sharing. The heatmap has been scaled by row.

As you might expect, values along the diagonal are "reddish", since individuals within populations tend to have high IBD sharing with each other.
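The construction of such a heatmap can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually used: it assumes pairwise IBD sharing is averaged over all pairs of individuals from the two populations, and that "scaled by row" means centering each row and dividing by its sample standard deviation (as R's heatmap function does with scale="row"). The function names and toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def population_ibd_matrix(pop_of, pair_ibd, min_size=5):
    """Mean pairwise IBD between populations with at least `min_size` members.

    pop_of:   dict individual -> population label
    pair_ibd: dict (individual_a, individual_b) -> shared IBD,
              each unordered pair listed once
    Returns (sorted population labels, dict (row_pop, col_pop) -> mean sharing).
    """
    sizes = defaultdict(int)
    for pop in pop_of.values():
        sizes[pop] += 1
    kept = sorted(p for p, n in sizes.items() if n >= min_size)

    values = defaultdict(list)
    for (a, b), shared in pair_ibd.items():
        pa, pb = pop_of[a], pop_of[b]
        if pa in kept and pb in kept:
            values[(pa, pb)].append(shared)
            if pa != pb:
                values[(pb, pa)].append(shared)  # keep the matrix symmetric
    return kept, {key: mean(v) for key, v in values.items()}

def scale_rows(pops, matrix):
    """Center each row on its mean and divide by its sample standard deviation."""
    scaled = {}
    for row in pops:
        vals = [matrix[(row, col)] for col in pops]
        m, s = mean(vals), stdev(vals)
        for col, v in zip(pops, vals):
            scaled[(row, col)] = (v - m) / s
    return scaled

# Hypothetical toy data: two populations of two, plus a singleton that is filtered out.
pop_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
pair_ibd = {
    ("a1", "a2"): 10.0, ("b1", "b2"): 8.0,
    ("a1", "b1"): 2.0, ("a1", "b2"): 4.0, ("a2", "b1"): 2.0, ("a2", "b2"): 4.0,
    ("a1", "c1"): 1.0,
}
pops, matrix = population_ibd_matrix(pop_of, pair_ibd, min_size=2)
scaled = scale_rows(pops, matrix)
print(pops, matrix[("A", "B")], scaled[("A", "A")] > scaled[("A", "B")])
```

In the toy data the within-population cells end up above the row mean and the between-population cells below it, which is the "reddish diagonal" pattern described above.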

A few features "pop out" of the screen. Going from top to bottom:

Intra-Iranic sharing
Intra-Armenian sharing
Intra-Balkan sharing
Georgian-Abkhaz sharing

You can probably get more out of the figure, but these appear to be the most salient features.

Results for Project Participants

The results can be found in the spreadsheet, and include:

Probabilities of assignment in each of the 27 clusters of the Clusters Galore analysis
Z-scores of IBD between each individual and each of the 20 populations with 5+ participants. Higher values mean more IBD sharing. Note that Z-scores have been calculated separately for each row, so each participant should scan his or her own row to find populations with an excess (+) or deficiency (-) of IBD sharing; values should not be compared across different rows.
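A small Python sketch shows why per-row Z-scores cannot be compared between participants. It assumes each row is standardized with its own mean and sample standard deviation; the exact formula used for the spreadsheet is not stated, and the participants and values below are made up.

```python
from statistics import mean, stdev

def row_z_scores(row):
    """Z-score one participant's IBD sharing across populations."""
    vals = list(row.values())
    m, s = mean(vals), stdev(vals)
    return {pop: (v - m) / s for pop, v in row.items()}

# Hypothetical participants and values, for illustration only.
p1 = {"PopA": 1.0, "PopB": 2.0, "PopC": 3.0}
p2 = {"PopA": 10.0, "PopB": 20.0, "PopC": 30.0}

z1, z2 = row_z_scores(p1), row_z_scores(p2)
print(z1)
print(z2)
```

The second participant shares ten times as much with every population in absolute terms, yet both rows yield the same Z-scores. A Z-score only says which populations stand out relative to that participant's own average.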

Last but not least, I want to remind new project participants to leave a message in the Information about Project samples thread. Your comment will not appear immediately, since comment moderation is on, and also note that there are multiple pages of comments.

If you haven't joined the Project yet, I encourage you to do so if you are eligible.

11 comments:

Alareiks Gadrauhts, January 14, 2012 at 2:27 PM
Very, very nice, thank you for this. However, I would have expected to be included; I am DOD772, Romanian. Did you include me in Romanians_D?
Dodecad Project, January 14, 2012 at 4:25 PM
Hi, I've responded via e-mail, since I don't want to discuss participants' ancestry except via the e-mail address used to submit the sample.
Kartveli, January 14, 2012 at 6:24 PM
Thank you for the wonderful work. I was curious about the 18th cluster, consisting of two Georgians. Could you indicate if those are admixed individuals? (There's at least one obvious case, №19 https://fbcdn-sphotos-a.akamaihd.net/hphotos-ak-snc6/271075_2017531491856_1650935365_31935511_755865_n.jpg)
Onur Dincer, January 15, 2012 at 12:01 AM
Gosh! When did the Dodecad Armenians reach 44 participants?
alexttt, January 15, 2012 at 12:55 AM
You should have also included Ashkenazi Jews and Sephardi Jews.
Onur Dincer, January 15, 2012 at 11:17 PM
Why are the Yunusbayev Kurds so remote from everyone else in the heat map?
Kurti, January 18, 2012 at 4:09 PM
@Onur might have something to do with the fact that the samples are taken from Kurds in Kazakhstan, otherwise I don't know.
Onur Dincer, January 18, 2012 at 8:49 PM
"@Onur might have something to do with the fact that the samples are taken from Kurds in Kazakhstan, otherwise I don't know."

Yeah, that was the explanation I had in mind too (Mr. Metspalu had already informed me that the Yunusbayev Kurdish samples are all from the Kurdish minority in Kazakhstan). Thanks for the explanation anyway.

For those who don't know, Kurds in Kazakhstan are a small and isolated minority and arrived there during the Soviet times from Transcaucasia mostly as a result of Stalin's mass deportations, so they are genetically a subset of the Transcaucasian Kurds with a probable recent population bottleneck added due to their small number and isolation from outside.
Anonymous, January 19, 2012 at 11:12 PM
Dienekes, in view of all this, what is your opinion on Armenian origins in the Balkans?
Dodecad Project, January 19, 2012 at 11:26 PM
I am sure that Armenians originated in the Balkans because:
1. of Herodotus' account
2. of the close linguistic relationship with Greek
3. of the lack of a close linguistic relationship with the Anatolian languages

To what extent that involved a substantial movement of people is a different issue that is difficult to resolve in the absence of ancient DNA.
Anonymous, January 19, 2012 at 11:50 PM
Of course, but more specifically, are you seeing any IBD sharing that could point to this movement of people?
