
Tuesday, May 6, 2014

Neanderthal in America, data from an interesting genetics paper


A recent paper (G. Povysil and S. Hochreiter, 2014) [1] analyses "very short identity by descent (IBD)" genetic segments of chromosome 1, comparing those in humans, Denisovans and Neanderthals.


Their findings are contradictory and disclose some oddities among "Admixed Americans". By the way, it is a pity that they did not include real "South American Indian", South Asian and Oceanian genomes (the Puerto Rican, Colombian and Mexican populations that were included in their study are not pure Amerindian; they are "Admixed Americans", a mixture of European, African and Native American peoples; more on this below). Nevertheless the paper is interesting: it shows very close links between Africans and Americans that cannot be attributed to post-discovery mixture (slave trade), along with some incongruent findings.


IBD Segments


When individuals have the same -identical- nucleotide sequences in a given segment, this segment is said to be identical by state (IBS). When these individuals inherited this segment from a common ancestor, it is said to be identical by descent (IBD). The ancestor of those individuals carried this IBD segment and passed it on to them.


How can you tell IBS apart from IBD? Two or more persons could be IBS, sharing an identical sequence, yet the match may have arisen purely by chance and not by inheritance, in which case they would not be IBD.


The authors looked for "rare variants" which, as their name indicates, are not very common; therefore, if individuals share a segment carrying a rare variant (IBS), it is highly probable that the segment is IBD too. Which makes sense.


They also found some IBD shared with Denisovans and Neanderthals, so these are also very ancient.


Their findings...


So what did they discover? Some interesting facts. Below are the highlights:

  • Denisovans and Asians. The IBD segments of Asians were the ones with the closest match to the Denisovan ones. Some segments were exclusive to Asians. They were longer than other segments shared by other populations confirming an "In Asia admixture".
  • Neanderthals and Asians. The same as with Denisovans: Asians have the highest sharing with Neanderthals, their segments are the longest and many are exclusively Asian. The Europeans also share a high frequency of Neanderthal segments but... theirs are also found in other populations.
  • Africans. Unexpectedly, they have many Neanderthal and Denisovan IBD segments, and some are exclusive to Africans. These must be ancient, and originating in the common African ancestors of H. sapiens, Neanderthals and Denisovans.

Let's dig deeper into these generalisations:


On Pearson Correlations


First, some theory. To measure the strength of a linear association between two variables, a statistical correlation is used: the "Pearson product-moment correlation coefficient" (symbolized by "r"). In effect, it draws a line of best fit through the data of the two variables, and "r" indicates how closely the data points cluster around that line.


"r" can range from -1 to +1. The closer to 0, the weaker the correlation; the closer to 1 or -1, the stronger the correlation.


For negative values, the association is negative (one variable grows, the other drops), and for positive values both grow.
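To make the definition concrete, here is a minimal sketch of how "r" can be computed by hand. The function name pearson_r and the three small data sets are made up for illustration; none of them come from the paper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)

# Made-up illustrative data (not taken from the paper).
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # grows with x: r is +1
falling = [10, 8, 6, 4, 2]   # drops as x grows: r is -1
flat_ish = [2, 1, 3, 1, 2]   # no linear trend: r is (practically) 0

print(pearson_r(x, rising))
print(pearson_r(x, falling))
print(pearson_r(x, flat_ish))
```

Real data will almost never sit exactly on +1, -1 or 0; values such as the 0.1 to 0.4 discussed in this post fall in the weak-to-moderate range.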


Below are some correlations as an example. In the case where r=0, it is clear that there is no correlation between the variables (e.g. "eye color" and "criminality" would have a similar correlation: none). However, "poverty" or "unemployment" and "criminality" will have 0 < r < 1 (there IS a positive correlation between those variables).


Below are some examples of Pearson correlations (I prepared them with my Excel sheet):


Pearson correlation
Some examples of Pearson correlation. Copyright © 2014 by Austin Whittall

Having said this, let's see the Pearson correlation coefficient found for IBD and populations:


Denisovans


correlation of Denisovan alleles in humans
Figure 1, Adapted from [1]. Pearson correlation between
populations and the Denisova genome

ASW: AFR Americans with African ancestry from SW US, YRI: AFR Yoruba in Ibadan, Nigeria, LWK: AFR Luhya from Webuye, Kenya, CLM: AMR Colombians in Medellin, Colombia, MXL: AMR Mexicans from Los Angeles, California, PUR: AMR Puerto Ricans, CEU: EUR Utah residents with ancestry from northern and western Europe, FIN: EUR Finnish, GBR: EUR British from England and Scotland, IBS: EUR Iberians from Spain, TSI: EUR Toscani in Italy, CHB: ASN Han Chinese from Beijing, CHS: ASN Han Chinese from South, JPT: ASN Japanese from Tokyo, Japan


The figure above shows four regions with different population groups (Africa, America, Europe and Asia). The Africans show a negative correlation; Asians, Europeans and Americans a positive one.


The correlations are not all that "strong": Asians average: 0.25, Europeans: 0.1, "Admixed Americans": 0.025 and Africans: -0.075. Which in my opinion are quite weak (very close to zero).


The Asians have the "highest correlation" to the Denisovan genome. Among Admixed Americans, the Mexicans (MXL) have a high correlation too; the authors point out that: "Mexicans (MXL) have also a surprisingly high correlation to the Denisova genome while Iberians (IBS) have a low correlation compared to other Europeans" [1]


This MXL - IBS anomaly is explained as follows: "of all European populations Iberians show the highest rates of African gene flow whereas Mexicans show a high proportion of Native American ancestry which in turn might also reflect gene flow from Asia..." [1].


Allow me to disagree (see my argument further down).


They believe their findings corroborate other studies where Denisovans share more genes with "modern East Asians and South Americans (called Admixed Americans here) than with Europeans". [1]


Now, let's see what they have to say about Neanderthals....


Neanderthals


correlation of Neanderthal genes in humans
Figure 2, Adapted from [1]. Pearson correlation between
populations and the Neanderthal genome

Again we see the same trends as with Denisovans, but notice that the correlations are even stronger (the values are higher in absolute value but still < 0.5). Again too, the Mexicans have a better correlation than the Iberians, whose correlation is (again) lower than that of the rest of the Europeans.


Asians average: 0.4, Europeans: 0.2, "Admixed Americans": 0.1 and Africans: -0.15. Less weak than the Denisovan correlations.


According to the authors, "As expected, Asians again show the highest odds for IBD segments matching the Neandertal genome... while Africans have the lowest odds .... Europeans show clearly more matching with the Neandertal genome than Admixed Americans." [1].


Why should Admixed Americans have less than Europeans? They are a mix of roughly 50/50 European and Asian (via Beringia) ancestry; the touch of African (via slave trade) is negligible, as we will see further down. So Admixed Americans should have IBDs somewhere in between those of Europeans and those of Asians... but they don't.


The authors [1] attribute the North-South decreasing cline in Europe (with the lowest values in Italy and, for IBS, in Spain) to "reduced rates of shared ancestry compared to the rest of Europe... [and]... higher IBD sharing between North Africans and individuals from Southern Europe which would decrease the amount of DNA sharing with Neandertals." [1]


In other words, Africans (with fewer Neanderthal genes) mixed more intensely with southern Europeans, thus watering down any Neanderthal IBDs in their genomes (even though Spain and Italy were peopled by Neanderthals and are places where admixture with modern humans should have taken place)...


The Odd "figure 10"


hominin alleles in human populations
Figure 10, from [1]

Figure 10, shown above, perplexes me. The caption in the paper corresponding to this figure is the following (and I highlight the "baffling" text):


"Figure 10: For each genome and each IBD segment, the color indicates whether a population contains this segment (“With”) or not (“Without”). For the human genome, 4,000 random IBD segments were chosen. IBD segments that match the Neandertal or the Archaic genome are found more often in Asians and Europeans than all IBD segments (human). This effect is not as prominent for IBD segments that match the Denisova genome." [1]


In other words, they checked the "content" of specific genomes (the human one, the Neanderthal one, the Denisovan one and the Archaic one) in different populations and showed if the populations carried them or did not carry them in their current genomes.


In the text they state: "IBD segments that match the Neandertal or the Archaic genome are found more often in Asians and Europeans than all IBD segments (human genome). This effect is not as prominent for IBD segments that match the Denisova genome, but still significant." [1]


What I think they are trying to say is the following (take a look at fig. 10 above please):


Asians and Europeans have roughly 200 - 300 IBD segments that are "human", 1,500 that are Neanderthal, some 400 that are Denisovan and about 800 that are Archaic. The "old" genome (2,700 segments) is larger than the "human" one (200 - 300 segments).


What they don't point out is that Americans (orange bar in fig. 10) have much higher numbers of ALL segments than Europeans or Asians, and in the case of Neanderthal IBDs, they have the highest values of all populations, including Africans. Africans also surpass both Europeans and Asians... so what is so striking about the Eurasians anyway? What is striking is the high content of "ancient" IBDs in Admixed Americans and Africans.


Could this mean, for instance, a very ancient peopling of America: an Out of Africa migration directly to America (i.e. H. erectus), or a later one, with Neanderthals, long before modern humans reached the New World?


Americans also have a much higher "human" IBD segment content than Europeans and Asians, second only to Africans.


Note that the “Archaic” element is much more significant than the Denisovan one in both Europeans and Asians... Is this due to an archaic genome carried by Neanderthals or by the Sima de los Huesos ancestor of Denisovans? Or what about the Dmanisi hominins in Georgia or even Homo erectus in Southern Eurasia?


Noticing the closeness between African and Admixed Americans, the authors quickly point out the "African" admixture in Americans (slave trading to the Americas after its discovery). This is, in my opinion, an incorrect assessment of the situation.


Mexican and Iberian anomalies, the correct interpretation


First of all, a "MXL" or "Mexican from Los Angeles, California" is a rather untidy way of studying traces of ancestral hominids in America... it is like looking at say, "Italians in Buenos Aires" or "Dutch in Cape Town" or even "Ashkenazi in New York", none of these groups, real and relevant as they may be, reflect the original people of those areas (Querandí, Khoi or the Lenape, respectively).


MXL, and the other "Americans" (CLM, Colombians in Medellin, Colombia and PUR Puerto Ricans) are an admixture of European, African and Amerindian. So it is difficult to come to any conclusions. Europeans, Africans and Asians on the other hand, are the "original" peoples of their respective continents and can tell us about the changes in the genomes of their respective geographic areas.


Nevertheless, we will try to work with these artificial constructs, the MXL, PUR and CLMs:


I came across a paper that takes the 1000 Genomes project data (Gravel et al., 2013) [2] and gives us a breakdown of the makeup of those "Admixed Americans" which is shown below: [2]


  • MXL. 47.6% Native American,  4.2% African, 48.2% European.
  • PUR. 12.8% Native American, 11.7% African, 75.5% European.
  • CLM. 25.6% Native American,  7.5% African, 66.9% European.

Being people originating in former Spanish Colonies (Colombia, Puerto Rico and California were part of the Spanish Empire until the XIXth century), the European content is predominantly Iberian, and mostly Spanish (IBS), which may also have some ancient admixture with African individuals in their genome [3].


We must ask ourselves why the correlation with Neanderthals and Denisovans in Admixed Americans is similar to the correlation of IBS when only half their genome is Iberian? It should differ because the other half is predominantly Amerindian, allegedly closer to Asians than to Africans. Since Asians are even more correlated to Neanderthals and Denisovans we must ask ourselves how come the correlation of Admixed Americans isn't higher than the European one?


I used a rough approximation to calculate the value for the Admixed Americans (very rough, and I am inclined to believe that it is flawed): I calculated the weighted average by multiplying the "Pearson correlation" of each of the source populations of a given Admixed American group by its weight in that group, and adding the products up to obtain a "calculated Pearson" for both Denisovans and Neanderthals for that group:


Example Neanderthal Pearson among MXL. See the image below (in brackets I indicate each of the populations that make up the MXL):


0.48 x 0.2 [EUR] + 0.04 x (-0.15) [AFR] + 0.48 x 0.4 [NAm] = 0.28


For NAm (Native Americans or Amerindians) I took the same value as that of their alleged ancestors: Asians.


admixture in Americans
Calculation of Pearson correlation for Admixed Americans
Copyright © 2014 by Austin Whittall

In red, under "Average", are the values for Neanderthals and Denisovans (0.23 and 0.129 respectively). A simple average suffices since the size of each American group is nearly the same (see Appendix B in [1]).
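The same weighted-average arithmetic can be sketched in a few lines of Python. The ancestry weights are the Gravel et al. [2] percentages listed earlier and the correlations are the approximate continental averages quoted in this post; the dictionary and function names are my own, and Native Americans are simply assumed to take the Asian value.

```python
# Ancestry proportions quoted above from Gravel et al. [2]
# (CLM = Colombians in Medellin).
ANCESTRY = {
    "MXL": {"NAm": 0.476, "AFR": 0.042, "EUR": 0.482},
    "PUR": {"NAm": 0.128, "AFR": 0.117, "EUR": 0.755},
    "CLM": {"NAm": 0.256, "AFR": 0.075, "EUR": 0.669},
}

# Approximate continental averages quoted in this post for Figures 1 and 2.
# Assumption: Native Americans (NAm) are given the Asian value.
SOURCE_R = {
    "Neanderthal": {"EUR": 0.2, "AFR": -0.15, "NAm": 0.4},
    "Denisovan": {"EUR": 0.1, "AFR": -0.075, "NAm": 0.25},
}

def calculated_r(population, hominin):
    """Ancestry-weighted average of the source populations' correlations."""
    weights = ANCESTRY[population]
    return sum(weights[src] * SOURCE_R[hominin][src] for src in weights)

def average_r(hominin):
    """Simple (unweighted) mean over the three Admixed American groups."""
    values = [calculated_r(pop, hominin) for pop in ANCESTRY]
    return sum(values) / len(values)

for hominin in SOURCE_R:
    for pop in ANCESTRY:
        print(pop, hominin, round(calculated_r(pop, hominin), 3))
    print("Average", hominin, round(average_r(hominin), 3))
```

With the unrounded weights, the MXL Neanderthal value comes out at about 0.28 and the two averages at about 0.23 and 0.129, matching the figures above. This only reproduces the arithmetic; it says nothing about whether averaging correlations this way is valid.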


When compared with the "Actual" values given by G. Povysil and S. Hochreiter [1], we find that the Denisovan figure is very similar but the Neanderthal value is definitely off the mark. Yes, I know that the calculation is rough and indeed surely wrong (see Appendix I at the bottom of this post to see why).


But I disagreed, even without any calculations, because a population that is basically a mixture of Asians (via Beringia) and Europeans must have a Pearson correlation somewhere between the values of those two populations. But the value given by Povysil and Hochreiter is ten times lower, and almost zero (no correlation)!


Furthermore, the data in figure 10 clearly shows that Americans have the highest Neanderthal IBD content of all groups.


There is something that is not quite right in this paper or I have not understood the data.


Additionally, Table 1 in [1] shows how IBDs are shared between populations. And it has some striking values:


Key: AFR = African, AMR = Admixed Americans, EUR = Europeans, ASN = Asian.


  • AFR / AMR = 28,710
  • AFR / ASN = 387
  • AFR / EUR = 986
  • AMR / ASN = 207
  • AMR / EUR = 1,008
  • ASN / EUR = 351

Once again, as with Fig. 10, Africans and Americans are close together: Africans and Americans share nearly 29,000 IBD segments despite AFR being only 4 - 12% of AMR ancestry, while Europeans (48 - 76% of AMR ancestry) only share 1,008 IBD segments with Americans, and Asians share even fewer (207) than EUR.


Africans share 74 times more IBD segments with Americans than with Asians and 29 times more than with Europeans. Why? Slave trade cannot explain this. Maybe an ancestral admixing (H. erectus or even earlier... Dmanisi?).
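For anyone who wants to check the ratios, here is a small sketch using the Table 1 counts listed above. The dictionary layout and the helper name ratio are my own.

```python
# Pairwise shared IBD segment counts, as listed above from Table 1 in [1].
SHARED_IBD = {
    ("AFR", "AMR"): 28710,
    ("AFR", "ASN"): 387,
    ("AFR", "EUR"): 986,
    ("AMR", "ASN"): 207,
    ("AMR", "EUR"): 1008,
    ("ASN", "EUR"): 351,
}

def ratio(pair_a, pair_b):
    """How many times more segments pair_a shares than pair_b."""
    return SHARED_IBD[pair_a] / SHARED_IBD[pair_b]

# Africans: sharing with Americans versus with Asians and with Europeans.
print(round(ratio(("AFR", "AMR"), ("AFR", "ASN")), 1))
print(round(ratio(("AFR", "AMR"), ("AFR", "EUR")), 1))
# Americans: sharing with Africans versus with Europeans.
print(round(ratio(("AFR", "AMR"), ("AMR", "EUR")), 1))
```

The first two ratios round to 74 and 29, the figures quoted above.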


In Appendix A [1] , they also indicate the "rare and low-frequency variants .... (0.5 - 5% frequency)", found in different quantities in each population. As expected (by me) Africans and Americans have the highest values:


  • AFR: 558,996 - 683,289
  • EUR: 143,987 - 155,270
  • ASN: 118,137 - 120,115
  • AMR: 163,737 - 214,430

The authors note that Africans have 4 times more rare alleles than all other populations and, trying to attribute America's high values to admixed African genes, add (bold mine): "if we ignore the Admixed Americans that have African admixture".


But this statement is off the mark: we have seen that Admixed Americans only have 4 - 12% African genetic content! How can they have such a high frequency of rare alleles? They have +28% more than Europeans and +60% more than Asians! These are supposedly "rare" and "ancient" markers, shouldn't they appear more frequently in the Old World?
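As a rough cross-check, the comparison can be sketched from the ranges listed above. Taking the midpoint of each range is my own simplification (the percentages quoted in the text were presumably worked out from the per-population counts in Appendix A), so the output is only expected to land in the same ballpark.

```python
# Ranges of rare and low-frequency variant counts, as listed above from Appendix A in [1].
RARE_VARIANTS = {
    "AFR": (558996, 683289),
    "EUR": (143987, 155270),
    "ASN": (118137, 120115),
    "AMR": (163737, 214430),
}

def midpoint(group):
    """Midpoint of the quoted range (a simplification, not the paper's method)."""
    low, high = RARE_VARIANTS[group]
    return (low + high) / 2

def excess_percent(group, reference):
    """Percentage by which `group` exceeds `reference`, based on range midpoints."""
    return (midpoint(group) / midpoint(reference) - 1) * 100

print(round(excess_percent("AMR", "EUR"), 1))
print(round(excess_percent("AMR", "ASN"), 1))
print(round(midpoint("AFR") / midpoint("EUR"), 1))
```

On midpoints, Admixed Americans come out roughly a quarter above Europeans and well over half above Asians, while Africans are about four times the European value.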


I even suspect that the massive loss of lineages in America post-discovery due to disease and war during the XVI century, wiped out many rare lineages and what we see is the tip of a once vast iceberg. The ghost of the ancient peopling of America.


It is a shame that the paper did not include "real" Amerindian and Papuan / Oceanian samples. Maybe future studies will do so. It is also a pity that when faced with a brink, they back off and stick to orthodoxy.


Appendix I


Why my "rough calculation" was flawed:


Actually, Admixed Americans are a mixture of "certain" members of three different populations, and the admixture of any individual can range from 0 to 100% and include any or all of these populations. That is, for example: "pure" American natives with no European or African ancestry on one hand and mixed descendants of Europeans, Africans and American natives on the other.


So the calculation is not a simple mathematical average. The resulting distribution depends on which members of the original populations take part in the final admixture, because the individuals of each population carry frequencies of Neanderthal or Denisovan alleles in their genomes which may differ from that of the whole group.


Just as an example, I made a simulation with three populations (American, African and European), each with a distribution that gave a Pearson correlation (r) identical to those reported by Povysil & Hochreiter [1] for Neanderthals. The outcome is shown below:

Pearson correlation simulation
A simulation of Correlations by admixing populations, for Neanderthals and MXL
Copyright © 2014 by Austin Whittall

On the left side I placed three clusters: the Europeans are shown with blue rhombuses, and their trend is the blue line with r=0.2; the Africans are the red squares, and the red line marks their r=-0.15; finally, American Natives are the green triangles, with a green line marking r=0.4 (for them I adopted the Asian values).


Note the dispersal of the dots (a cloud of dots with no real pattern), this shows how "slack" correlations are when they are far from 1 and close to 0.


On the right side is a mixture, which takes the Mexican "combination" of 48% European, 4% African and 48% Amerindian. So, with 16 values in each population, the mix was made up of 8 chosen from Europeans, 8 from Amerindians and 1 from Africans. These points were taken from the original distributions shown on the left, and the outcome is shown with the red circles and the red line. I chose the points at random.


The Pearson correlation value in the chart, for the mixed population, was r=0.11. But in other simulations, depending on the points chosen from each population, I obtained different correlations which varied widely, from r=-0.08 to r=+0.83.


This proves that my assumption was flawed. Nevertheless, the populations used by [1] consisted of more than 16 values or points: they had over 350 individuals in each population. So the dispersion "cloud" would not be similar to mine and it is likely that the final values of "r" would not be so different.
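The effect of sample size can be sketched with a small simulation. This is not the Excel sheet used for the chart: it draws fresh random points for every trial instead of picking them from a fixed set of 16, it assumes every group has the same mean and spread, and the group sizes for the "large" case (350 points in total) are my own choice. All names are hypothetical.

```python
import random
from math import sqrt

# Target correlations per source group (Neanderthal values quoted in this post;
# Native Americans are assumed to share the Asian value).
TARGET_R = {"EUR": 0.2, "AFR": -0.15, "NAm": 0.4}

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def draw_points(rho, size, rng):
    """Draw `size` (x, y) pairs whose underlying correlation is rho."""
    points = []
    for _ in range(size):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        points.append((x, rho * x + sqrt(1 - rho ** 2) * noise))
    return points

def simulate_mix(counts, trials, seed):
    """Return the r of the mixed sample for each of `trials` random draws."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        mix = []
        for group, size in counts.items():
            mix.extend(draw_points(TARGET_R[group], size, rng))
        xs, ys = zip(*mix)
        results.append(pearson_r(xs, ys))
    return results

# 8 + 8 + 1 points, as in the chart, versus 350 points in the same proportions.
small = simulate_mix({"EUR": 8, "NAm": 8, "AFR": 1}, trials=200, seed=1)
large = simulate_mix({"EUR": 168, "NAm": 168, "AFR": 14}, trials=200, seed=1)

print("17 points: r from", round(min(small), 2), "to", round(max(small), 2))
print("350 points: r from", round(min(large), 2), "to", round(max(large), 2))
```

With only 17 points the mixed "r" swings widely from one draw to the next, while with 350 points the spread is much narrower, which is the point made above about the larger samples in [1].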


Sources


[1] Gundula Povysil, Sepp Hochreiter, (2014). Sharing of Very Short IBD Segments between Humans, Neandertals, and Denisovans. bioRxiv April 7, 2014. doi: 10.1101/003988
[2] Simon Gravel et al., (2013) Reconstructing Native American Migrations from Whole-Genome and Whole-Exome Data. PLoS Genet. Dec 2013; 9(12): e1004023. doi: 10.1371/journal.pgen.1004023
[3] Johnson NA, et al., (2011). Ancestral Components of Admixed Genomes in a Mexican Cohort. PLoS Genet 7(12): e1002410. doi:10.1371/journal.pgen.1002410



Patagonian Monsters - Cryptozoology, Myths & legends in Patagonia Copyright 2009-2014 by Austin Whittall © 
