Monday, March 5, 2012

fastIBD analysis of Italy/Balkans/Anatolia

I have included the new Turkish data from Hodoğlugil & Mahley (2012) in this analysis. Additionally, there are now 5 participants each in the Serb_D and Turkish_Cypriot_D sub-populations, as well as a Bosnian Muslim. There are now project participants from many Balkan countries, although Albania, the fYROM, and Croatia remain "black holes" on the map.

Still, I am hopeful that there will be more project participants from currently under-represented populations. I have already started processing the same dataset with ChromoPainter (which takes much longer), and hopefully that analysis will be posted at the end of this week or the beginning of the next one.

First, the heatmap of inter-population IBD:

Remember that the tree groups similar populations together, and that for each row in the matrix, the red end of the spectrum indicates high IBD sharing and the blue end low IBD sharing. Additionally, I have now calculated the median IBD sharing, which is more robust to the presence of potential relatives in the data.
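To illustrate why the median helps, here is a minimal sketch with made-up numbers (the values and the `sharing_cm` name are hypothetical, not taken from the dataset): a single pair of close relatives inflates the mean but barely moves the median.

```python
from statistics import mean, median

# Hypothetical pairwise IBD sharing (in cM) between individuals of two populations.
# The last value stands in for one pair of undetected close relatives.
sharing_cm = [4.0, 5.0, 6.0, 7.0, 8.0, 900.0]

mean_sharing = mean(sharing_cm)      # pulled far upward by the single relative pair
median_sharing = median(sharing_cm)  # stays near the bulk of the distribution

print(f"mean = {mean_sharing:.1f} cM, median = {median_sharing:.1f} cM")
```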

The results appear fairly reasonable, with the Balkan, Anatolian, and Italian populations of the title forming separate branches, and the mainland Greek sample joining with Central/South Italians and Sicilians.

The Clusters Galore results can be seen below; 28 clusters were inferred with 21 dimensions:



Results for Project participants can be found in the spreadsheet, and include the probabilities that each ID is assigned to each of the 28 clusters, as well as the Z-scores comparing each individual against all populations with 5+ individuals. The Z-score should be read as follows: for each row, high values indicate a high degree of IBD sharing, while low values indicate a low degree of IBD sharing.
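The post does not spell out how the Z-scores are computed. One plausible reading, and purely an assumption here, is that each participant's sharing with the reference populations is standardized across that participant's row. The sketch below uses hypothetical population names and values to show how such a row would be read.

```python
from statistics import mean, pstdev

# Hypothetical IBD sharing (cM) of one participant with each reference population.
row = {"Pop_A": 12.0, "Pop_B": 8.0, "Pop_C": 6.0, "Pop_D": 6.0}

def row_z_scores(sharing):
    """Standardize one individual's sharing values across populations (assumed method)."""
    values = list(sharing.values())
    mu = mean(values)
    sigma = pstdev(values)
    return {pop: (value - mu) / sigma for pop, value in sharing.items()}

z = row_z_scores(row)
# Print populations from highest to lowest relative sharing.
for pop, score in sorted(z.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pop}: {score:+.2f}")
```

Under this assumed definition, a negative value would simply mean sharing below that individual's own average across populations, so the sign matters and not just the magnitude.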

Of course, I encourage Project participants to leave a message in the Information about Project samples thread.

27 comments:

  1. DOD232 here. I'm 100% assigned to the Greek cluster, as expected. However, like a lot of Greeks, I have higher Z-Scores with Romanians, Serbs and Bulgarians than I do with the Greek-D population. Is that itself an expected result?

  2. Dear Dienekes, thank you for the interesting analysis, looking forward to the ChromoPainter results too. I don't know enough about some of the populations here, so I wonder if there is a particular reason why the Aydin group has a higher degree of sharing with the Balkans than with other Turkish groups. Are they recent Balkan immigrants?

  3. aspromavro, some of the Aydin samples are obviously partially or fully descended from Balkan immigrants. They are the ones with significantly elevated European and significantly reduced West Asian components.

  4. Replies
    1. That place:
      http://www.buildturkey.com/maps/de/MAZEDONIEN.png

      It's a region of Yugoslavia which was called "Macedonia", even in Yugoslav times. When that region declared itself an independent country, it kept the name "Macedonia", and the people call themselves "Macedonians".

      This caused the Greeks to go crazy, claiming that those "Macedonians" are not real Macedonians but Slavs from Russia or wherever Slavs came from originally (hehe), and that real Macedonians are Greeks by blood and DNA and nobody else has any right to steal the name "Macedonians".

      That's why the Greeks refuse to call that country "Macedonia" and instead call it "fYROM", which means "Former Yugoslav Republic of Macedonia".


      ;-)

  5. What's fYROM?

    The Former Yugoslav Republic of Macedonia

    aspromavro, some of the Aydin samples are obviously partially or fully descended from Balkan immigrants. They are the ones with significantly elevated European and significantly reduced West Asian components.

    BTW, the Aydin Turkish samples with probable recent Balkan origins are among the Aydin Turkish samples with the lowest Mongoloid component percentages. OTOH, the probably fully indigenous Aydin Turkish samples are the most Mongoloid admixed Turkish samples of all Turkish samples I have seen so far (probably due to the presence of Yoruks in that region).

  6. Aydin samples are the most Turkic of all Turkish samples (the Eurasian percentage in Aydin is above 10%, about half of Turkmenistan levels).

    Turks settled in the Balkans were mostly moved from the Aydin area. It is not clear at all what would make Aydin more Balkan-like when almost all Aydin samples fall into POP 10 along with the Behar Central Anatolian samples and the majority of all other Turks.

  7. What would be interesting is the following exercise:

    Exclude Aydin Turks with less than 5% East Eurasian and keep the rest (this would be more than 90% of the Aydin samples)

    Re-run the Heat Map.

    If it turns out that Aydin samples correlate closely with Romanians/Bulgarians this means something beyond mere back migration of Turks from the Balkans. My mom is a half Aydin Turk with 6-7% East Eurasian and Romanian and Bulgarian IBD sharing is very significant for her.

    We know Turkic people settled the Balkans before the Ottomans. I suspect this sharing with Romanians and Bulgarians is due to a much more ancient connection.

  8. According to the Greek geographer Strabo, Aydin was founded by Thracian tribes.

    Maybe this is that ancient connection?

  9. Mr Anatolian Turkmen, Turkmenia and the Turkmens are culturally, phenotypically, and haplotypically Iranians who shifted to Turkish, the official language of the invaders, and slowly adopted it as their mother tongue, because many different Iranic dialects were used in Turkmenia (i.e. Dahistan) while the recent Turkmen language was uniform.

    The Gauls shifted from Celtic to Latin without any Latin colonisation, and the Iranic Dahae of present-day Turkmenia shifted to Turkmen without Turkic input; the East Asian component in them is geographical, comparable to the East Asian component in Iranians, and even lower than that in Iranic Tajiks.

    Anatolia was even Turkified (i.e. a simple language shift of the local Indo-European Armenians and Greeks to Turkish) before Turkmenia/Dahistan.

    As for the Oghuz/Kipchak matter, it's an artefact: by 1300 there was still no Oghuz/Kipchak ramification. Indeed, Oghuz is simply Kipchak Turkic with an Iranic superstratum, and until 1300 there was a single Turkic, from which the Oghuz dialect arose by the 1400s.

    To look for your probable Turkic input (or, more accurately, its introgression by rape), you should look to the Yakuts or, if you are so opinionated about the Oghuz thing, to the Oghuz-speaking Muslim Salars of China who, contrary to the Turkmen-speaking Iranians of Turkmenia (who are Iranians, or Turks heavily mixed with Iranians at a ratio of at least 80% Iranian to 20% Turk), did not mix with the local indigenous Buddhist Chinese people.

  10. Thank you, Mr Dienekes, for your highbrow blog, as well as all the folk here, who seem to be real connoisseurs. I think that this whole story of Ötzi confirms again that:
    1/ It's the West Asian-North European bicomponent that traces the legacy of the folks who brought Indo-European languages to Europe

    2/ The pull of Sardinians and Ötzi toward Near Easterners is due to the lack in both of the West Asian-derived North European component (most likely that differentiation occurred northward, beyond the Caucasus), as well as the presence in both of the Mediterranean component
    So we could argue that the Mediterranean component in Libyans is a legacy of "Iberianoic"-speaking Mesolithic Ibero-Maurusians, whilst the West Asian + Southwest Asian components in Libyans are a legacy of the Afrasian speakers (a branch of the Kartvelian-Indo-European-Afrasian macrophylum), be they Imazighen or Arabs

    Do you agree on this matter, dear friends? I am awaiting backing on these 2 matters before communicating them to my people.

    Thanks

  11. It is not clear at all what would make Aydin more Balkan-like when almost all Aydin samples fall into POP 10 along with the Behar Central Anatolian samples and the majority of all other Turks.

    13 out of the 20 Aydin Turkish samples fall into Pop 10. In contrast, the majority of the samples of all the non-Aydin Turkish populations (including the Behar Central Anatolian Turkish population) fall into clusters other than Pop 10; in fact, no cluster is in the majority among them. A minority of the Aydin Turkish samples obviously have recent Balkan origins, however partial; this is clear from their European and West Asian component percentages in the K12b calculator.

    Exclude Aydin Turks with less than 5% East Eurasian and keep the rest (this would be more than 90% of the Aydin samples)

    4 out of the 20 Aydin Turkish samples have less than 5% East Eurasian admixture; excluding them would leave us with 80% of the Aydin Turkish samples.

    If it turns out that Aydin samples correlate closely with Romanians/Bulgarians this means something beyond mere back migration of Turks from the Balkans.

    Even a few of the Aydin samples with more than 5% East Eurasian admixture might have some partial origin from recent Balkan immigrants. I am saying this based on the percentages of the European and West Asian components.

    My mom is a half Aydin Turk with 6-7% East Eurasian and Romanian and Bulgarian IBD sharing is very significant for her.

    May I ask you to which populations your mother's other half of ancestors belong?

    We know Turkic people settled the Balkans before the Ottomans. I suspect this sharing with Romanians and Bulgarians is due to a much more ancient connection.

    Unlikely because of the very low East Eurasian admixture percentages of the Balkan populations (including Romanians and Bulgarians). The relative (=compared to the other Turkish populations) abundance of samples with significant levels of recent Balkan immigrant origins explains the elevated levels of genetic sharing with Balkan populations in the Aydin Turkish population much better.

  12. Hi Dienekes,

    I am DOD846. In the previous fastIBD analysis (I think it was in Feb or maybe Jan this year, I am not too sure of the date), my highest score was with Bulgarians; I think it was about 2.5.

    Anyway, since I am half Kayserili, half Mersinli, I didn't take the previous project that seriously, since my family doesn't have any sort of immigration background.

    Now according to the newest fastIBD, my top matches are:

    Serb_D 3.08
    Turkish_Kayseri_Ho 1.14
    S_Italian_D 1.05
    Turks 0.91
    North Italian 0.63
    Bulgarians_Y 0.63

    And I am clustering with the population 10.

    A. It makes sense that I am clustering in population 10, since everybody in that population is Turkish like me.

    B. It doesn't make any sense that my best match is again non-Turkish (it is Serb_D, 3.08) and extremely high, since I don't have any Balkan background, I cluster in population 10 with other Turks, and my admixture results from other projects are generally similar to the Turkish average.

    I think there has been a miscalculation regarding my results. Do you agree? If not, what's the reason that I again got a strange result?

  13. The results have not been miscalculated. Of course, the Serb_D group is small, but you seem to have high IBD sharing with a few Serbs:

    IBD sharing with Serb_D (in cM):

    7.385 9.204 14.673 18.491 18.547
    Median: 14.673
    Mean: 13.66

    IBD sharing with Turkish_D:

    2.001 2.899 3.595 4.05 4.256 5.616 5.782 6.17 6.863 7.933 8.022 8.441 8.539 8.617 8.699 10.164 10.205 10.387 10.578 10.905 11.898 12.052 12.253 12.797 13.853

    Median: 8.539
    Mean: 8.263

    As you can see, you appear to share particularly long IBD tracts with 3 of the 5 Serbs.

    IBD sharing corresponds to recent ancestors; admixture to the overall genomic ancestry. So, for example if someone shares a few recent ancestors with a particular group A, and more, but more distant ancestors with a different group B, he may have higher IBD sharing with A, and higher overall genetic similarity with B.
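The summary figures quoted above can be re-derived from the listed values with a few lines of Python. This is a minimal sketch; the variable names are illustrative, and the comparison against the largest Turkish_D value is just one simple way of making the "3 of the 5 Serbs" observation concrete.

```python
from statistics import mean, median

# IBD sharing values (cM) as listed in the comment above.
serb_d = [7.385, 9.204, 14.673, 18.491, 18.547]
turkish_d = [
    2.001, 2.899, 3.595, 4.05, 4.256, 5.616, 5.782, 6.17, 6.863, 7.933,
    8.022, 8.441, 8.539, 8.617, 8.699, 10.164, 10.205, 10.387, 10.578,
    10.905, 11.898, 12.052, 12.253, 12.797, 13.853,
]

for name, values in (("Serb_D", serb_d), ("Turkish_D", turkish_d)):
    print(f"{name}: median = {median(values):.3f}, mean = {mean(values):.3f}")

# Serbs whose sharing exceeds the largest value seen with any Turkish_D individual.
above_turkish_max = [v for v in serb_d if v > max(turkish_d)]
print(f"{len(above_turkish_max)} of {len(serb_d)} Serb_D values exceed the Turkish_D maximum")
```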

  14. Okay, now in plain English: does it mean I have some recent mixed Balkan ancestry that I don't know about?

  15. The simplest explanation is that you have some recent Balkan ancestors. It's also possible that some Serbs and Bulgarians have recent Anatolian ancestors similar to yourself.

    You should run DIYDodecad in byseg mode to find segments with high "North European" or "Atlantic Med" levels, as these are more likely to come from the Balkans. You can also look at your segment matches in 23andMe to see if you share with people from the Balkans, and then look at those regions with DIYDodecad.

  16. I tried to run DIYDodecad in byseg mode, but I was not successful; I just didn't manage it.
    Anyway, maybe you are right; maybe these tested Serbs and Bulgarians are not ethnic Balkan people but more mixed with Turks, so that's why I got such a high score each time :p

  17. Are you saying that those Serbs are really ethnic Turks and not Balkan people?

  18. The Serbs cluster with other Balkan people, and there is no reason to think that they are ethnic Turks. For whatever reason, 5aday shares relatively high IBD segments with 3 of them, and it is impossible to determine what these segments correspond to (e.g., whether they are of Balkan or Anatolian origin).

    On the whole, I'd say that Balkan origin is more likely, since the Turkish_D sample appears to share substantially with Balkan populations (see heatmap, Turkish_D), whereas the Balkan populations do not appear to share substantially with Turkish_D, but, as I've said, one would have to look at each segment individually.

  19. Also, the IBD segments make a small part of the genome, hence whether they are Balkan or Anatolian in origin has little bearing on the overall genetic ancestry of these individuals. When looking at the overall pattern of sharing, as summarized by the Clusters Galore analysis, 5aday falls in a cluster consisting of mostly Turks, and the Serbs fall in a cluster consisting mostly of Balkan populations.

  20. On the whole, I'd say that Balkan origin is more likely, since the Turkish_D sample appears to share substantially with Balkan populations (see heatmap, Turkish_D), whereas the Balkan populations do not appear to share substantially with Turkish_D

    As high IBD sharing is strongly correlated with recent ancestry and relatedness, this is entirely expected. In the Eastern Mediterranean it is the Muslims, as the ruling power, who have been assimilating, converting, and enslaving non-Muslims and admixing with them since the spread of Islam, not the non-Muslims, who were under the yoke of the Muslims. So the gene flow between Muslims and non-Muslims in the Eastern Mediterranean has historically always been one way (there might have been some exceptions we don't know of): from non-Muslims to Muslims and not the other way around. This is also consistent with the ADMIXTURE results.

  21. A user from ABF ran DIYDodecad in byseg mode for me, and this is my result, but I don't know how to interpret it. Do you mind helping me?


    Chromosome #SNPS Gedrosia Siberian Northwest_African Southeast_Asian Atlantic_Med North_European South_Asian East_African Southwest_Asian East_Asian Caucasus Sub_Saharan
    1 13311 8.03 4.86 3.89 5.35 11.52 6.99 0.00 0.16 11.34 2.13 44.37 1.37
    2 13107 17.85 3.26 9.63 2.74 4.20 10.28 0.04 0.00 8.00 0.95 43.05 0.00
    3 11325 9.55 11.50 0.62 0.00 7.60 10.05 4.82 3.91 15.43 0.00 36.52 0.00
    4 9969 16.46 7.05 4.48 0.00 4.07 9.39 2.05 0.01 6.08 6.13 44.27 0.00
    5 10276 11.12 6.04 0.00 0.00 5.47 14.45 0.59 0.00 3.69 7.21 51.43 0.00
    6 10659 15.06 3.91 2.70 0.00 16.68 4.97 0.64 0.02 9.89 6.28 39.86 0.00
    7 9070 10.42 7.22 5.80 0.53 9.81 7.23 0.32 0.02 7.17 2.02 49.44 0.00
    8 9168 16.24 8.71 4.72 5.79 17.52 7.16 0.00 1.23 14.79 0.85 22.90 0.08
    9 8079 21.19 6.22 8.77 1.36 1.15 5.99 0.00 0.00 4.94 3.61 46.77 0.00
    10 8980 8.94 7.71 2.96 2.54 10.54 5.74 0.00 0.00 15.07 0.01 46.07 0.43
    11 8075 7.66 2.59 0.00 5.78 11.41 15.86 0.36 0.00 13.42 0.00 42.92 0.00
    12 8343 12.31 0.00 3.81 0.56 15.06 8.93 1.10 1.00 15.32 0.00 41.90 0.00
    13 6308 7.32 0.00 3.60 0.30 7.73 10.34 0.00 0.00 10.46 3.67 56.58 0.00
    14 5733 3.81 6.17 2.69 4.14 13.20 0.00 0.00 0.34 10.73 0.22 58.70 0.00
    15 5299 14.19 5.17 16.00 1.91 3.31 7.51 5.46 1.03 11.08 0.01 34.32 0.00
    16 5432 13.58 0.47 8.49 0.00 24.86 14.60 0.00 2.73 8.66 0.06 26.52 0.04
    17 4897 13.76 0.00 5.01 0.09 6.11 16.44 0.05 0.00 12.42 9.03 37.02 0.07
    18 5402 10.39 0.10 0.00 0.00 17.29 10.34 0.00 0.78 15.38 0.00 45.72 0.00
    19 3469 13.67 0.85 2.82 0.00 8.02 2.10 0.95 2.74 20.16 1.36 44.50 2.83
    20 4634 2.69 0.00 5.91 0.00 6.62 6.09 10.33 0.00 18.84 0.01 49.51 0.00
    21 2621 10.77 0.00 0.10 7.25 13.19 7.01 11.27 0.00 10.76 0.69 38.96 0.00
    22 2613 16.14 1.56 0.01 11.81 20.67 8.54 0.00 0.00 8.87 0.00 32.40 0.00
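As a rough first pass at reading a table like this, one could rank the chromosomes by their combined Atlantic_Med and North_European percentages, following the suggestion earlier in the thread that these two components are more likely to come from the Balkans. The sketch below is only that: summing the two components is an assumption, per-chromosome figures are noisy, and a proper look would go segment by segment.

```python
# (chromosome, Atlantic_Med %, North_European %) copied from the table above.
rows = [
    (1, 11.52, 6.99), (2, 4.20, 10.28), (3, 7.60, 10.05), (4, 4.07, 9.39),
    (5, 5.47, 14.45), (6, 16.68, 4.97), (7, 9.81, 7.23), (8, 17.52, 7.16),
    (9, 1.15, 5.99), (10, 10.54, 5.74), (11, 11.41, 15.86), (12, 15.06, 8.93),
    (13, 7.73, 10.34), (14, 13.20, 0.00), (15, 3.31, 7.51), (16, 24.86, 14.60),
    (17, 6.11, 16.44), (18, 17.29, 10.34), (19, 8.02, 2.10), (20, 6.62, 6.09),
    (21, 13.19, 7.01), (22, 20.67, 8.54),
]

# Rank chromosomes by the combined Atlantic_Med + North_European percentage.
ranked = sorted(rows, key=lambda r: r[1] + r[2], reverse=True)

# Show the five chromosomes with the highest combined percentage.
for chrom, atl_med, north_eur in ranked[:5]:
    print(f"chr{chrom}: Atlantic_Med + North_European = {atl_med + north_eur:.2f}%")
```

Chromosomes near the top of this ranking would be the natural places to compare against segment matches with people from the Balkans.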

  22. Could Cluster 5 be a Caucasian tribe? Many Circassians were settled in Central Anatolia (incl. Kayseri) after the Russo-Turkish War.

    Replies
    1. No need to invoke any Circassian or other Caucasian connection, as cluster 5 is pan-Anatolian-Caucasian.

  23. To expand on my previous comment, I'm Turkish and in Cluster 5 (99%), with my highest Z-values being Romanian and Georgian. I'm a bit confused about how this latest analysis compares with prior ones, as I seem to recall I was in the "Anatolian" cluster of Turks in a prior analysis. Some of my great-grandparents were immigrants from the Balkans and the Caucasus, and most of my family is from the Black Sea, so it's interesting to find out I'm in the same cluster as Turkish_Kayseri. Finally, forgive the newbie question, but when looking at Z values, do you look at the absolute value, or do the negatives actually imply greater dissimilarity? Thanks!

  24. To be more precise, cluster 5 seems to be centered on Anatolian and South Caucasian populations. Circassians, OTOH, are North Caucasian. In fact, I don't think any Circassian or other North Caucasian would fall in cluster 5 if they were included in the analysis.

  25. Hi Dienekes,

    Great analysis, and also great to see the Cypriot sample expanding. Based on this sample, we can conclude that on average 3 out of 5 Turkish Cypriots are genetically indistinguishable from Greek Cypriots! I can share the raw autosomal data (FTDNA) of 4 additional (unrelated) Greek Cypriots if you are interested in using them in future calculations. By the way, I have noticed a big discrepancy between my 4 Greek Cypriot samples and the 12 'Behar et al' Cypriot samples used in the Dodecad project. The main difference is in the dv3 k=12 calculator, where the Dodecad raw averages spreadsheet shows a mean Western European admixture of 1.3% among Cypriots. My 4 Greek Cypriot samples have the following Western European admixtures in the DIY Dodecad dv3 calculator: 7.75%, 8.07%, 8.18%, and 10.64% respectively. This difference is really substantial, and I am really puzzled by it. One explanation is that I am referring to an outdated spreadsheet with outdated data (results look more comparable in the k12b calculator). Another explanation is that I am doing something wrong in my calculations. Alternatively, if we assume that the data are correct and up to date and that I am doing my calculations correctly, there appear to be 2 distinct Greek Cypriot clusters, one with a much larger Western European admixture than the other. To be honest, I doubt that the explanation is the latter.
