Tuesday, May 31, 2011

ANI/ASI analysis of HGDP Pakistan groups

Until recently, it has been difficult to study the Ancestral North Indian/Ancestral South Indian (ANI/ASI) composition of Pakistan groups, as these fell outside the "Indian Cline" of Reich et al. (2009). My recent experimental reconstruction of ANI/ASI zombies, as well as West Eurasian ones, allows me to do a supervised run on them and see how they fare.

(One caveat is that this is based on ~30k SNPs: the two kinds of populations I am using include ~120k and ~150k SNPs respectively, but not the same ones.)

Overall, the results make sense (they can be seen on the left, as well as on this spreadsheet):
  • The components of the ANI and West Asian "zombies" dominate most populations; I suspect that as the two are related it may be difficult to distinguish between them
  • Intriguingly, Kalash continue to be dominated by West Asian, now that the composite "South Asian" has been resolved, and their ASI levels are similar to those in Iranians.
  • Conversely, the highest ANI levels are found in Pathans and Sindhi, i.e., precisely the populations used by Reich et al. (2009). Hence, I suspect that ANI in the sense of Reich et al. (2009), as reconstructed by myself, may be biased towards these two populations. Also note that my ANI reconstruction used the same Pathans (15) and Sindhi (10) used by Reich et al. (2009), whereas this run includes all HGDP individuals.
  • The East Asian component turns up in the Hazara and the Burusho, in agreement with previous experiments
  • The Southwest Asian component turns up in Balochistan (Balochi, Brahui, Makrani), which also makes sense, linking that Iranic speaking region to nearby Iran where that component is also important
  • The North European component comes up in Hazara, Burusho, and Pathans, which again makes sense, as these populations may have been influenced by people from further north in historical times.
In conclusion, I would say that the ANI/ASI "zombies" do capture real South Asian signals, as evidenced by my Gypsy experiment, but the reconstructed ANI does not capture the entirety of West Eurasian admixture in South Asia: a lot of it continues to be associated with West Asia, and a little with Northern Europe in some populations.

Monday, May 30, 2011

More Zombies: Ancestral North Indians and Ancestral South Indians reborn

In my previous post I showed how synthetic individuals corresponding to ADMIXTURE ancestral components can be created and used. This was made possible by the fact that ADMIXTURE outputs allele frequencies for its components, which can be utilized to create a population of random genotypes with the same allele frequencies.

A more difficult task is to create such "zombie" individuals when there are no allele frequencies at hand. A prime example of this is the paper by Reich et al. (2009) on the two ancestral components in Indians: Ancestral South Indians (ASI) and Ancestral North Indians (ANI). The paper provides admixture estimates for these two components in present-day "Indian Cline" groups, but no allele frequencies for these components: we only knew that ANI was closely related to West Eurasians, and ASI formed a clade with the Onge from the Indian Ocean.

Both ANI and ASI are extinct (in pure form) populations, and they are blended (in varying proportions) in modern day Indians, with highest ANI occurring in the Northwest and among upper caste groups, and highest ASI among South Indian tribal and low caste populations.

As I was thinking of ways to extend the "zombie" approach, it occurred to me that there is a fairly involved way to extract the ANI/ASI allele frequencies from the available evidence:

If f(ANI) and f(ASI) are the allele frequencies at a locus for ANI and ASI, and an admixed population P has x fraction of ancestry from ANI and 1-x from ASI, then its allele frequency is expected to be:

x*f(ANI)+(1-x)*f(ASI) = f(P)

The known variables are x and f(P); the unknowns are f(ANI) and f(ASI). Obviously, this equation does not hold exactly in practice, because of sampling error, uncertainty in the estimation of x, as well as genetic drift that may affect the allele frequencies of the admixed population.

Nonetheless, we do not only have one equation of this sort, but 18, since Reich et al. (2009) provides ANI/ASI estimates for 18 different Indian Cline populations. We can thus fit a linear regression to recover f(ANI) and f(ASI).

This is exactly what I did; there are two important caveats:
  • because most of the Reich et al. (2009) populations are very small, f(P) is expected to be very noisy. I thus grouped the Indian Cline populations into five groups (based on increasing ANI, and making sure that each one had >15 individuals), and calculated admixture proportions (x's) and allele frequencies (f(P)'s) on these groups.
  • linear regression coefficients (the f(ANI) and f(ASI) estimates) may be less than 0 or more than 1, which makes no biological sense, so such estimates were set to 0 or 1 respectively whenever this occurred (~5% of markers)
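The regression step above can be sketched for a single SNP as follows. This is only an illustration, not the code actually used: it rewrites the mixing equation as f(P) = f(ASI) + x*(f(ANI) - f(ASI)), fits an ordinary least-squares line of f(P) on x, reads f(ASI) off at x=0 and f(ANI) at x=1, and clamps both to [0, 1]. The function name and the numbers are hypothetical.

```python
def fit_ancestral_freqs(x, f_p):
    """Least-squares fit of f_p = f_asi + x * (f_ani - f_asi) at one SNP.

    x   : ANI fraction of each admixed group
    f_p : observed allele frequency of each group at this SNP
    Returns (f_ani, f_asi), each clamped to the interval [0, 1].
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_f = sum(f_p) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (fi - mean_f) for xi, fi in zip(x, f_p))
    slope = sxy / sxx
    intercept = mean_f - slope * mean_x

    def clamp(value):
        # Frequencies outside [0, 1] make no biological sense.
        return min(1.0, max(0.0, value))

    f_asi = clamp(intercept)          # fitted frequency at x = 0 (pure ASI)
    f_ani = clamp(intercept + slope)  # fitted frequency at x = 1 (pure ANI)
    return f_ani, f_asi


# Hypothetical ANI fractions for five groups, and frequencies at one SNP
# constructed from f_ani = 0.5 and f_asi = 0.2 without noise.
x = [0.40, 0.50, 0.60, 0.70, 0.80]
f_p = [0.32, 0.35, 0.38, 0.41, 0.44]
f_ani, f_asi = fit_ancestral_freqs(x, f_p)

# A second made-up SNP whose fitted line dips below 0 at x = 0.
f_ani_low, f_asi_low = fit_ancestral_freqs(x, [0.0, 0.0, 0.0, 0.1, 0.2])
```

Because none of the groups is anywhere near x=0 or x=1, both estimates are extrapolations, which is one reason why some of them fall outside [0, 1] and need clamping.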
All of this required a bit of thinking and work, so I was very skeptical that it would work; given sampling/admixture estimation errors/limitations of regression/random creation of individuals, the whole process from input data to output "zombies" passed through so many layers, that it could very well lead to nonsense.

Nonetheless, there is power in numbers, and I was hopeful that this might work. If it did, I could have synthesized ANI and ASI populations to play with and use pretty much like regular populations in a variety of experiments.

Validation of synthetic ANI/ASI populations

I generated 25 ANI and 25 ASI individuals using the above-described method. There are 119,588 SNPs in these populations.

To validate them, I ran supervised ADMIXTURE using these ANI/ASI individuals as ancestral populations, and all the Indian Cline populations as test data. The results can be seen below:
Although the estimates for some populations (e.g., Chenchu: 31 vs. 40.7%) are substantially off, the median error is 1%, and the average error is 2.4%. Overall, it does appear that the synthetic ANI/ASI individuals are fairly good stand-ins for their (extinct) populations.
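For readers who want to run this kind of check themselves, here is a minimal sketch of the error summary, assuming the two sets of ANI percentages are held in parallel lists. Only the first pair (the Chenchu figures quoted above) comes from this post; the remaining values are made up purely to make the example run, and the function name is hypothetical.

```python
from statistics import mean, median


def error_summary(reference, supervised):
    """Median, mean and maximum absolute difference, in percentage points."""
    errors = [abs(r - s) for r, s in zip(reference, supervised)]
    return median(errors), mean(errors), max(errors)


# First pair: the Chenchu figures quoted in the post (order within a pair
# does not affect the absolute error). All other values are invented.
reference = [40.7, 50.0, 60.0, 70.0]
supervised = [31.0, 51.0, 60.5, 69.0]

median_err, mean_err, max_err = error_summary(reference, supervised)
```

The median is the more forgiving summary here, since a single outlier such as the Chenchu pulls the mean up but leaves the median almost untouched.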

Ancestral North Indians

I included ANI together with the 4 West Eurasian components of the Dodecad Project in an MDS plot:
Also, a neighbor-joining tree:
Putting ANI/ASI to work: Romanian Gypsies

I have previously detected 2 individuals in the Behar et al. (2010) Romanian sample that are likely to be of Roma (Gypsy) heritage. Here is a supervised admixture of the Romanian sample using the ANI/ASI components:
The previously detected individuals do possess both ANI and ASI components. Indeed, these are:

18.1, 15.3
16.9, 16.4

in the two individuals, which might be useful in constraining geographically the origin of European Gypsies along the Indian Cline.

Putting ANI/ASI to work: Iranians

Iranians generally show affinity to South Asians. Is this affinity related to the common Indo-Iranian background of Iranians and Indo-Aryans, or, is it, perhaps, due to the absorption of South Asian population elements during Iran's long imperial past?

The ANI/ASI components in the Iranians and Iranian_D samples are:

11.7, 7.5
12.0, 6.9

Compared to the previously described Romanian Gypsies, the South Asian component in Iranians tends to be clearly tilted towards ANI.

How to create Zombies from ADMIXTURE etc.

ADMIXTURE infers K ancestral populations, and estimates the admixture proportions of individuals from these K populations, as well as the allele frequencies for all SNPs for each ancestral population.

An interesting use of the allele frequencies is to generate synthetic "zombies" from the ancestral populations. These are artificial individuals whose genotypes are drawn randomly based on the allele frequencies. For example, there is a "West Asian" component in the Dodecad Project, but no individuals who have 100% membership in the "West Asian" component. A "West Asian" zombie is a synthetic individual who appears to be drawn from that "West Asian" component only, without any other (e.g., "South European", or "Southwest Asian") admixture at all.
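A minimal sketch of the idea, assuming that each SNP is drawn independently (no linkage disequilibrium) and that the two alleles of an individual are independent draws (Hardy-Weinberg proportions). These are my simplifying assumptions for illustration; the function name and the frequencies below are hypothetical.

```python
import random


def make_zombies(allele_freqs, n_individuals, seed=0):
    """Draw diploid genotypes (0, 1 or 2 copies of the counted allele).

    allele_freqs  : frequency of the counted allele at each SNP for one
                    ancestral component
    n_individuals : number of synthetic individuals to create
    """
    rng = random.Random(seed)
    zombies = []
    for _ in range(n_individuals):
        genotype = [
            # Two independent allele draws; True + True == 2 in Python.
            (rng.random() < p) + (rng.random() < p)
            for p in allele_freqs
        ]
        zombies.append(genotype)
    return zombies


# Hypothetical frequencies for five SNPs in a single component.
component_freqs = [0.0, 0.1, 0.5, 0.9, 1.0]
zombies = make_zombies(component_freqs, n_individuals=25)
```

With only 25 individuals per component the sampled frequencies will differ somewhat from the input ones, so the zombies are a noisy copy of the component rather than an exact one.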

"Zombies" may be viewed as either useful theoretical abstractions, or as reconstructed hypothetical ancient-like individuals, purged of centuries or millennia of admixture. Irrespective of how one views them, they are very useful as a tool.

Zombies of K=10 components

I generate 25 zombies for each of the 10 ancestral components of the Dodecad Project. Below, you can see an MDS plot of these 250 individuals, which is quite similar to the MDS plot generated using only the Fst divergences between the ancestral components.
Including real and "zombie" populations

I include the "West African", "North European", and "South European" zombie populations, together with 25 African Americans (ASW) from HapMap-3:
Notice the direction of the African American cline: slightly tilted towards North Europeans. This makes sense, as the European ancestry of African Americans is derived mainly from Northwestern Europe, and not exclusively from either the Mediterranean or Northern Europe, where the "South European" and "North European" components peak.

Convert unsupervised ADMIXTURE runs to supervised ADMIXTURE

The most exciting use of "zombies" is to convert unsupervised ADMIXTURE runs into supervised ones. In unsupervised mode, ADMIXTURE treats all individuals alike, and tries to infer their ancestral proportions. In supervised mode, some individuals are treated as "fixed" (belonging 100% in one of K ancestral components), and the ancestry of the rest is inferred.

The idea is fairly simple: run an unsupervised ADMIXTURE analysis once to generate allele frequencies for your K ancestral components; then generate zombie populations using these allele frequencies; whenever you want to estimate admixture proportions in new samples run supervised ADMIXTURE analysis using the zombie populations.
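As a sketch of the bookkeeping involved: to my understanding of the ADMIXTURE manual, supervised mode is requested with the --supervised flag and reads a .pop file with one line per individual, in the same order as the genotype file, holding a population label for reference individuals and "-" for individuals whose ancestry is to be estimated. Please check this against the manual for your version. The helper below only builds those lines; the sample IDs and labels are hypothetical.

```python
def build_pop_lines(sample_ids, zombie_labels):
    """Return one .pop line per sample, in genotype-file order.

    zombie_labels maps each zombie's ID to its ancestral component;
    any sample not in the mapping gets "-" (ancestry to be estimated).
    """
    return [zombie_labels.get(sample_id, "-") for sample_id in sample_ids]


# Hypothetical sample order: three zombies followed by one real individual.
sample_ids = ["WestAsian_z1", "WestAsian_z2", "SouthAsian_z1", "DOD695"]
zombie_labels = {
    "WestAsian_z1": "West_Asian",
    "WestAsian_z2": "West_Asian",
    "SouthAsian_z1": "South_Asian",
}

pop_lines = build_pop_lines(sample_ids, zombie_labels)
pop_text = "\n".join(pop_lines) + "\n"  # contents to save as the .pop file
```

The value of K passed to the supervised run should equal the number of distinct zombie populations, since each one stands in for exactly one ancestral component.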

You can thus use the zombie populations to mimic a regular (unsupervised) ADMIXTURE run. This is useful for two reasons:
  1. It can be much faster: the initial set (of the unsupervised run) can be huge, but the zombie populations need only be large enough to capture the allele frequencies of the inferred components.
  2. It avoids the generation of spurious clusters, especially if you include individuals from highly-inbred populations, or a large number of test individuals
I re-estimated admixture proportions for the 9 individuals of the last run, using the "zombie" populations in a supervised ADMIXTURE run. This took less than 1/10 of the time, and achieved results that were highly concordant with the ones previously reported: correlation was +0.999729; the average difference in ancestral proportions was 0.3%, the maximum difference 2.1%.

The speedup is due to two reasons: first, I'm running ADMIXTURE on 250 "zombie"+9 real individuals, as opposed to 692+9 real individuals using the unsupervised method. Second, admixture proportions are only estimated for the 9 real individuals and are fixed for the 250 "zombie" ones. This idea seems to work like a charm.

More average K=10 results

I was also able to calculate admixture proportions for the 10 Dodecad components in Druze, Kalash, and Palestinians. These populations have a tendency of forming their own population-specific clusters, so they are very difficult to compare against other populations: you just can't get their breakup into ancestral components easily, because they become their own ancestral components at fairly low K.

Using the trick of "zombie" populations, we can determine their ancestral components and compare them with other Dodecad populations.

I have labored long to be able to compare these to the ones in the standard Dodecad set, and I am very pleased that I was finally able to achieve it:
  • Both Druze and Palestinians have a substantial "Southwest Asian" component, as do most Semitic (Arab, Jewish, Ethio-Semitic) populations in my database
  • Druze have more "West Asian" than "Southwest Asian", and the reverse is true for Palestinians
  • Palestinians have more African admixture than Druze

By far the most exciting thing about this analysis is the set of results for the Kalash, a population that speaks a language of the Dardic group of Indo-Iranian. Some linguists place Dardic languages in the Indo-Aryan subgroup (of which Sanskrit and Hindi are the most famous representatives), whereas others view Dardic as a third branch of Indo-Iranian together with Iranian (like Kurdish, Persian, or Pashto) and Indo-Aryan. In any case, the study of these mountaineers is crucial to the study of Indo-Iranians in general.

The Kalash have been much mythologized as either long-lost Aryans or the descendants of Alexander the Great's soldiers.

The absence of the South European component among them agrees with Y-chromosome research about the absence of a Mediterranean or Greek influence in that population. The Kalash are completely split between the West Asian component (56%) and the South Asian one (43.5%). Indeed, their West Asian admixture is very high compared to my South Asian populations, exceeding even that of the Pathans (~40%) and reaching levels found only in West Asia proper. It is also perfectly consistent with my theory of Indo-Aryan origins in West Asia.

The way forward

I initially considered the idea of zombies as a way to include more Project participants in my detailed ADMIXTURE runs, such as the recent K=12 and K=11 ones. There are two problems with these runs:
  • Each one takes 24+ hours to complete, so it is not exactly possible to replace the standard K=10 analysis with them just yet
  • Including all project participants, especially those of mixed background, makes them completely impractical, in addition to making them very capricious: at high K different components begin to appear depending on sample composition, and the solution is not as robust as in the standard K=10 analysis.
With the use of zombie populations, these problems can be largely solved. I can spend many hours or even days in a very detailed ADMIXTURE run with a large sample, create "zombie" populations from the inferred results, and then run project participants fairly fast using these "zombie" populations and supervised ADMIXTURE mode. In fact, I am working on exactly this type of test at the moment, so project members of all backgrounds should expect good things to come in the next days or weeks.

Friday, May 27, 2011

List of populations on the brink of achieving 5+ participants

I have just posted updated K=10 ADMIXTURE results for 27 different Dodecad populations.

However, there are several other populations that have 3-4 participants so far, so they are on the brink of reaching the 5-person mark that will allow me to use them in more experiments, and to post average proportions for.

So, if you belong to one of these populations, feel free to send me your data during the current submission opportunity:
  • Algerian_D
  • N_Italian_D
  • Dutch_D
  • Korean_D
  • North_African_Jews_D
  • Danish_D
  • Moroccan_D
  • Slovenian_D
  • Tunisian_D
  • Bulgarian_D
  • Mixed_Scandinavian_D
There are also 15 more populations with 1-2 individuals, as well as many individuals that are on my to-do list for population assignment.

So, if you meet the eligibility criteria, you are encouraged to join.

In particular, I would love more samples from the Balkans, North Africa, the Near East, the Caucasus, Central Asia, or Siberia, as these are some of the most underrepresented regions, but all eligible submitters are welcome.

Thursday, May 26, 2011

K=10 admixture proportions of Dodecad populations

You can find new K=10 admixture proportions for 27 different Dodecad Project populations (with sample size of at least 5) in this spreadsheet

Results for DOD695 to DOD703 posted

Admixture proportions can be found in the spreadsheet.

Don't forget to leave a message in the ancestry thread to help yourselves and others make better sense of these results.

Read about the eligibility criteria of the current submission opportunity. Please subscribe to the feed to be alerted of new ones.

All populations:

Individual bars:

Tuesday, May 24, 2011

Italy, the Balkans, and Anatolia

Here is a PCA plot of Italian, Balkan, and Anatolian samples, together with some reference populations. I've removed 2 Roma-admixed Romanians and 3 Northern European-admixed Armenians from the Behar et al. (2010) set.
Below are MCLUST results:
Below you can see the shape of the 7 clusters:
With increasing sample sizes, I expect the validity and distinctiveness of the various clusters to improve; as you can see, there are several samples bordering different clusters or far away from most of them.

If you write to me with your ID, I will send you your results: cluster assignment followed by the two co-ordinates, so that you can locate yourself on the plot.

Monday, May 23, 2011

Clusters Galore: African edition

I gathered about 900 individuals from the Dodecad Project and the literature and ran Clusters Galore on them. Dodecad Project members were included if any of the following conditions held:
  1. They were in the North_African_D, North_African_Jews_D, or East_African_D population
  2. They had at least 10% "Northwest African" in the standard K=10 analysis
  3. They had at least 10% "West African" in the standard K=10 analysis
  4. They had at least 10% "East African" in the standard K=10 analysis
Integrating multiple datasets meant that the analysis was done on only 1,871 SNPs, which were nevertheless sufficient to infer 28 clusters with 9 PCA dimensions retained.

PCA plots

Below are PCA plots of the first nine dimensions; each dimension is paired with the 1st one to create 8 different scatter plots:








Galore/PCA results

All the results can be found in the spreadsheet, which contains:
  • Population Galore results: how many individuals from each population are assigned to each cluster
  • Individual Galore results: what is the probability that each ID belongs in each cluster (in %)
  • Population PCA results: mean positions of populations in the first 20 principal components
  • Individual PCA results: position of individuals in the first 20 principal components.
Some observations

It does appear that within Africa two processes explain most of genetic variation:
  1. Variable affiliation with West Eurasians, with North Africans being most West Eurasian-like, followed by some East Africans such as Ethiopians, and then Maasai
  2. Contrast between farmers and hunter groups such as the San and Pygmies
With respect to Project participants:
  1. All members of East_African_D align with Ethiopians and Ethiopian Jews (cluster #2)
  2. All members of North_African_Jews_D align with a major cluster of the Morocco_Jews of Behar et al. (2010) (cluster #3)
  3. Members of North_African_D are split over three clusters: most are in the Algeria/Libya/North Morocco cluster #4; one is in the Egyptian cluster #5; three are in the Mozabite/Saharan/"Berber" cluster #7
  4. The Other_D participants include many African Americans, as well as some Arabs with some African admixture, etc. Clusters #8 and #10 seem to include many African Americans.
Hopefully this will be useful to project participants with an African background.

Results for DOD684 to DOD694 posted

Admixture proportions can be found in the spreadsheet.

Don't forget to leave a message in the ancestry thread to help yourselves and others make better sense of these results.

Read about the eligibility criteria of the current submission opportunity. Please subscribe to the feed to be alerted of new ones.

All populations:

Individual bars:

Sunday, May 22, 2011

Results for DOD674 to DOD683 posted

Admixture proportions can be found in the spreadsheet.

Don't forget to leave a message in the ancestry thread to help yourselves and others make better sense of these results.

Read about the eligibility criteria of the current submission opportunity. Please subscribe to the feed to be alerted of new ones.

All populations:

Individual bars:

Saturday, May 21, 2011

The Turkic cline

(Last Update: May 22)

I have included all Turkic populations available to me, together with all Turks of the Project (Turkish_D), as well as some other individuals that indicated other Turkic ancestry (e.g., Azeri, Kazak, Tatar) in a new analysis. I have also included Greeks, Armenians, Georgians, Syrians, Poles, and Russians to represent non-Turkic West Eurasian populations, Koryak and Han to represent non-Turkic East Eurasian populations, Mongol and Evenk as Altaic but non-Turkic populations (Mongolic, Tungusic), and Yoruba as a Sub-Saharan outgroup.

Here is the PCA plot:
The line on the plot is a regression of all individuals except Han, Syrians, and Yoruba, all of which appear substantially off the main cline. The line's equation is:

y = 0.4044x+0.0147

with R²=0.9738.

It appears that a simple East-West Eurasian cline captures the main features of variation in present-day Turkic people, as could be expected for a people of an East Eurasian origin that spread across Central Asian and West Eurasian lands that were previously occupied by people of West Eurasian origin.

I have also carried out an ADMIXTURE analysis (K=3) which is shown below:

All the above analyses were performed on a set of 16,278 SNPs that were common to all populations. These were sufficient to capture these continent-wide contrasts, although the results are probably a bit noisier than usual.

The average positions of populations in the PCA plot and their average ADMIXTURE proportions can be seen in the table below:

Project participants who wish to know their own numbers will get a line of five numbers: the 2 PCA co-ordinates, followed by their 3 ADMIXTURE proportions, as in the table above. Write to me (dodecad@gmail.com) to get those results.

UPDATE (May 22): A reader asks whether my observations on sample sizes are relevant in this case.

First of all, in the post Beware of sample sizes: why Ancestral North Indians came from West Asia, not Eastern Europe I showed that the projection of populations on principal components created by populations of unequal size tended to be biased toward the larger-sample population. In this case this is not a problem, as all samples were used to calculate the first two principal components.

Second, as a precaution, I used 98 West Eurasian non-Turkic samples and 105 East Eurasian non-Turkic samples, hence the two poles of the experiment were of approximately equal sample size, and the effect described in the previous post would not be observed even if this was a projection along that axis.

Third, the Yoruba (N=21) were used in this experiment as an outgroup to tease out the African shift of Syrians who, as can be seen, do not fall simply on the East-West Eurasian cline. The smaller sample size is not a problem here, as the objective of this analysis is not to measure shift on the Eurasian-African axis, but rather to see whether Turkic groups fall on a West-East Eurasian cline, which they do.

Results for DOD665 to DOD673 posted

Admixture proportions can be found in the spreadsheet.

Don't forget to leave a message in the ancestry thread to help yourselves and others make better sense of these results.

Read about the eligibility criteria of the current submission opportunity. Please subscribe to the feed to be alerted of new ones.

All populations:

Individual bars:

Friday, May 20, 2011

Results for DOD656 to DOD664 posted

Admixture proportions can be found in the spreadsheet.

Don't forget to leave a message in the ancestry thread to help yourselves and others make better sense of these results.

Read about the eligibility criteria of the current submission opportunity. Please subscribe to the feed to be alerted of new ones.

All populations:

Individual bars:

Results for DOD643 to DOD655 posted

Admixture proportions can be found in the spreadsheet.

Don't forget to leave a message in the ancestry thread to help yourselves and others make better sense of these results.

Read about the eligibility criteria of the current submission opportunity. Please subscribe to the feed to be alerted of new ones.

All populations:

Individual bars:

Thursday, May 19, 2011

Results for DOD634 to DOD642 posted

Admixture proportions can be found in the spreadsheet.

Don't forget to leave a message in the ancestry thread to help yourselves and others make better sense of these results.

Read about the eligibility criteria of the current submission opportunity. Please subscribe to the feed to be alerted of new ones.

All populations:

Individual bars:


Wednesday, May 18, 2011

Open-ended submission opportunity for 23andMe/Family Finder data (#3)

Who is eligible

Everyone who is of European, Asian, or North African ancestry and whose four grandparents are all from the same European, Asian, or North African ethnic group or the same European, Asian, or North African country.

Please do not submit samples of relatives, as these make analysis difficult. I consider 2nd cousins and closer relatives to be related.

See the note on eligibility criteria for possible exceptions, but send an e-mail to me first (not data) to confirm that I can accept your sample.

I am sorry that I can't process everyone's data, so if you don't fit the above criteria and you feel you should be included, feel free to write to me and I will keep it in mind. Also, you can subscribe to the feed for future opportunities to submit your data.

Note that I take data quality very seriously, and participants who misrepresent their ancestry or submit samples of relatives without consultation will be excluded from the Project.

What you will receive

You will receive the standard K=10 analysis results such as this, and you will also be eligible for other types of tests such as Galore analysis, IBS comparisons, experimental tests or regional studies.

Your raw data or genealogical information will not be shared or distributed in any manner, and it will not be analyzed for any other purpose than assessment of ancestry (i.e., not for any physical or health-related traits). It will be identified by a unique ID, known to you and me, and results will be posted in the blog using that ID. I will continue to analyze your data for ancestry, and new results will be posted using that same ID. Also, I will report aggregate results for populations with at least 5 participants.

What to send

Send your zipped autosomal data to dodecad@gmail.com. I can accept data in either 23andMe or FTDNA Family Finder format. Also let me know something about your ancestry (e.g., ethnic group, country of origin of grandparents, or anything else that might be useful).

Sunday, May 15, 2011

600 members

After DOD633, there are now 600 active and unrelated members in the Project. I have included all these members in a few experiments that were aimed to:
  1. Test the bootstrap-based standard error reporting in ADMIXTURE 1.12
  2. Create a simple global test that would allow me to get a quick feel of where a sample comes from
  3. Test a few of my observations about minute shifts towards distant populations that I have been writing about in my other blog of late.
  4. Compare supervised vs. unsupervised ADMIXTURE modes
So, please take the data for what they are: an experiment to mark a milestone in the project, and a comparative data dump, rather than a definitive breakup of your ancestry.

I have included five ancestral groups in addition to the Project participants: Papuans, Karitiana Amerindians, Lithuanian/Tuscan Europeans, Mbuti/San Palaeoafricans, and She/Tujia East Asians. The analysis is based on 138,839 SNPs after quality-control and LD-based pruning.

There are four different experiments:
  1. Supervised ADMIXTURE analysis, with five ancestral groups (K=5)
  2. Unsupervised ADMIXTURE analysis (K=6). Asian and European Caucasoids split at K=5, so I upped K to 6 in order for all five ancestral groups to be recreated.
  3. Principal Components analysis (all samples)
  4. Principal Components analysis (Dodecad samples projected on 5 ancestral groups)
The two PCA plots can be seen here:

All samples (PC1: 4.05%, PC2: 2.70% of variance):
Projected (PC1: 12.73%, PC2: 7.85% of the variance):
The raw numbers for all participants can be found in the spreadsheet.

Friday, May 13, 2011

Results for DOD623 to DOD633 posted

I am reposting this, as I don't know whether the original post survived the recent major #BloggerFail.

Admixture proportions can be found in the spreadsheet.

Don't forget to leave a message in the ancestry thread to help yourselves and others make better sense of these results.

Current submission opportunity is over. Please subscribe to the feed to be alerted of new ones.

All populations:
Individual bars:

Friday, May 6, 2011

Ancestral North Indian for South Asian members (with PCA)

I have previously estimated the Ancestral North Indian component in South Asian project members by exploiting the correlation between ADMIXTURE results and the published figures of Reich et al. (2009).

A different method of achieving the same is to project individuals onto the CEU-Onge first principal component, and exploit the correlation between PC1 scores and the published ANI figures. This correlation is +0.99, so it is possible to regress ANI on PC1 and come up with ANI estimates from PCA scores.
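The calibration step can be sketched as a simple least-squares line: fit ANI against PC1 for the populations with published figures, then apply the line to the PC1 score of a new individual. The function name and all numbers below are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical calibration set: mean PC1 score and published ANI (%) for
# five reference populations.
pc1_scores = [-0.04, -0.02, 0.00, 0.02, 0.04]
published_ani = [40.0, 50.0, 60.0, 70.0, 80.0]
slope, intercept = fit_line(pc1_scores, published_ani)

# ANI estimate for a project member with a hypothetical PC1 score of 0.01.
member_ani = slope * 0.01 + intercept
```

Estimates for individuals whose PC1 score falls outside the range of the calibration populations are extrapolations and should be treated with more caution.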

Results for all Indian, Bangladeshi, and Pakistani project members can be found in this spreadsheet, ordered by ANI, and interspersed with population averages.

Tuesday, May 3, 2011

Results for DOD613 to DOD622 posted

Admixture proportions can be found in the spreadsheet.

Don't forget to leave a message in the ancestry thread to help yourselves and others make better sense of these results.

Current submission opportunity is over. Please subscribe to the feed to be alerted of new ones.

All populations:
Individual bars: