Friday, December 1, 2017

Descendants of ancient European (fair?) maidens in Central Asia's highlands

Several South Central Asian populations have a reputation for producing individuals who look surprisingly European, even the lighter shade sort of European from Eastern and Northern Europe. This is especially true of the Pamiri Tajiks, and that's unlikely to be a coincidence, because these people probably do harbor a lot of ancient Eastern European ancestry.

My own estimates, using various ancestry modeling methods, suggest that Pamiri Tajiks derive ~50% of their genome-wide genetic ancestry from populations closely related to, and probably derived from, Eneolithic/Early Bronze Age pastoralists from the Pontic-Caspian steppe of Eastern Europe, such as the Sredny Stog and Yamnaya peoples. Below is a simple Admixture graph using the mostly Yamnaya-derived Iron Age Sarmatians from Pokrovka, Russia, in far Eastern Europe, to illustrate the point. Note that Sarmatians were East Iranic-speakers, which is what Pamiri Tajiks are. The relevant graph file is available here.
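To make the idea of an ancestry proportion more concrete, here's a minimal Python sketch of a two-way mixture estimate. It is not the method behind my estimates or the graph above. It simply assumes that a target population's allele frequencies are a weighted average of two source populations, and solves for the weight by least squares. The function name and all of the frequencies are made up for illustration.

```python
def mixture_proportion(target, source_a, source_b):
    """Least-squares estimate of alpha in: target ~ alpha*A + (1 - alpha)*B."""
    num = sum((t - b) * (a - b) for t, a, b in zip(target, source_a, source_b))
    den = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if den == 0:
        raise ValueError("the two sources are identical at every marker")
    return num / den


# Hypothetical allele frequencies at five markers (not real data).
steppe_like = [0.10, 0.80, 0.35, 0.60, 0.05]
other_source = [0.70, 0.20, 0.55, 0.10, 0.45]

# A toy target built as an exact 50/50 mixture of the two sources.
toy_target = [0.5 * a + 0.5 * b for a, b in zip(steppe_like, other_source)]

alpha = mixture_proportion(toy_target, steppe_like, other_source)
print(f"estimated proportion from the first source: {alpha:.3f}")
```

Real data are much noisier than this, so proper tools also report standard errors and test whether the chosen sources are adequate in the first place.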

But, some of you might retort, this is all just statistical smoke and mirrors, and what it really shows is that these so-called Europeans came from Central Asia or even India.

Not so, because my models can't be twisted any which way, and they have strong support from uniparental marker data.

Many South Central Asian groups, and especially Indo-European-speakers, like the Tajiks, show moderate to high frequencies of two Y-chromosome haplogroups typical of Bronze Age Eastern Europeans: R1a-M417 and R1b-M269. This is old news to the regular visitors here and its implications are obvious, so if you still think that these haplogroups expanded from South Central Asia to Eastern Europe, rather than the other way around, then please update yourself (for some pointers, see here and here).

And now, courtesy of Peng et al. 2017, we also have a much better understanding of ancient European influence on the maternal gene pool of Pamiri groups (see here). The paper doesn't specifically cover the topic of European admixture in South Central Asia, but it nevertheless demonstrates it unequivocally.

Below are a couple of phylogenetic trees from the paper featuring a wide range of mitochondrial DNA (mtDNA) sequences shared between Europeans and Central and South Asians; quite a few of these lineages are rooted in Eastern Europe, as shown by both modern-day and ancient DNA, so they strongly imply gene flow, and indeed considerable maternal gene flow, from Eastern Europe deep into Asia.

Worthy of note are the lineages belonging to such relatively young (likely post-Neolithic) haplogroups as U5a1a1, U5a1d2b, U5a2a1, and U5b2a1, all of which have already been found in ancient remains from the Pontic-Caspian steppe.

I'm no longer wondering whether there were massive population movements from Eastern Europe to South Central Asia during the metal ages. It's a given that they happened, and I'm now looking forward to learning about the details from ancient DNA. For instance, what was the ratio of men to women amongst these migrants? And how fair were they exactly?

Tuesday, November 28, 2017

The ancient genomics revolution (Skoglund & Mathieson 2017 preprint)

Two former Harvard scientists, Pontus Skoglund and Iain Mathieson, are working on a new review paper on the wide range of scientific breakthroughs provided by ancient genomics over the past decade. The preprint is available at Dropbox here. There's also a thread about the preprint at Mathieson's Twitter account here.

I've read through it a couple of times, especially the parts about Europe, and haven't been able to spot any major problems; the authors obviously chose their words very carefully, and their geography is beyond reproach. [Edit: first problem spotted, see here]

Now, you might think that geography is easy, but apparently not when it comes to the location of the Pontic-Caspian steppe. Recent media articles have claimed that it's located in West Asia, and, I kid you not, even that it's hilly (for instance, see here), while scientists from Max Planck and other supposedly highbrow places seem to think that it's in Central Eurasia (see here). Nope, as Skoglund and Mathieson correctly point out, it's actually located in (far) Eastern Europe, while Central Eurasia is generally posited to be further to the east. From the preprint (emphasis is mine):

Anatomically modern humans were widely distributed in Europe by at least 42,000-45,000 BP (3; 41). The oldest genomic data from a modern human in Europe is the Oase 1 individual from present-day Romania dated to 37,600-41,600 BP. This individual, which had a direct Neanderthal ancestor in the past four to six generations, did not contribute detectable ancestry to later Upper Paleolithic populations (24). During the Upper Palaeolithic, a major transformation ~30,000-35,000 years ago was likely associated with the replacement of the Aurignacian with the Gravettian culture in western Europe (28). As the Last Glacial Maximum (LGM) came to an end and the ice sheets receded, Europe was repopulated, possibly from southern European and central Eurasian refugia (28). Another transformation may have taken place during an interstadial warm period ~14.5 kya, replacing the original recolonizers with a population that would come to form the Mesolithic populations of Europe (28; 93). These Mesolithic populations were outside the genetic diversity of present-day Europe (114; 131) and themselves display a clinal structure, with an east-to-west cline (32; 37; 38; 47; 57; 62; 72; 78; 112; 130). The origin of this cline is not clear, although it plausibly reflects two or more major sources of ancestry in the post-LGM or post-14.5kya expansions.

Starting from the southwest around 8,500 BP, the Mesolithic ancestry of Europe was largely replaced (29; 38; 42; 130; 131) as a new type of ancestry related to that found in Neolithic northwest Anatolia (73; 87) and, ultimately, to early farming populations of the Levant and Northern Iran (11; 56) expanded throughout Europe. This ancestry rapidly reached the extreme edges of Europe, with direct evidence of its presence in Iberia at 7300 BP (86), in Ireland at 5100 BP (14) and in Scandinavia at 4900 BP (131). This “Anatolian Neolithic” ancestry was highly diverged relative to the “hunter-gatherer” ancestry of the populations that previously inhabited Europe (FST ~ 0.1, similar to the divergence between present-day European and East Asian populations) (73; 132). Across Europe, its appearance was closely linked in time and space to the adoption of an agricultural lifestyle, and it now seems established that this change in lifestyle was driven, at least in part, by the migration. However, the Anatolian Neolithic migrants did not replace the hunter-gatherer populations. Over the next 4000 years, the two populations merged, and by 4500 BP, almost all European populations were admixed between these two ancestries, typically with 10-25% hunter-gatherer ancestry (29; 38; 42; 50; 62; 71; 73; 130; 131). Across Europe, this “resurgence” of hunter-gatherer ancestry (10) was independent–driven by local hunter-gatherer populations who lived in close proximity to farming groups (7; 62; 72; 130).

The next substantial change is closely related to ancestry that by around 5000 BP extended over a region of more than 2000 miles of the Eurasian steppe, including in individuals associated with the Yamnaya Cultural Complex in far-eastern Europe (1; 38) and with the Afanasievo culture in the central Asian Altai mountains (1). This “steppe” ancestry is itself a mixture between ancestry that is related to Mesolithic hunter-gatherers of eastern Europe and ancestry that is related to both present-day populations (38) and Mesolithic hunter-gatherers (46) from the Caucasus mountains, and also to the populations of Neolithic (11), and Copper Age (56) Iran. Steppe ancestry appeared in southeastern Europe by 6000 BP (72), northeastern Europe around 5000 BP (47) and central Europe at the time of the Corded Ware Complex around 4600 BP (1; 38). These dates are reasonably tight constraints, because in each case there is no evidence of steppe ancestry in individuals immediately preceding these dates (47; 72). Gene flow on the steppe was extensive and bidirectional, as shown by the eastward flow of Anatolian Neolithic ancestry–reaching well into central Eurasia by the time of the Andronovo culture ~3500 BP (1)–and the westward flow of East Asian ancestry–found in individuals associated with the Iron Age Scythian culture close to the Black Sea ~2500 BP (143).

Copper and Bronze Age population movements (14; 78 Martiniano, 2017 #8761; 85; 112), as well as later movements in the Iron Age and Historical period (70; 119) further distributed steppe ancestry around Europe. Present-day western European populations can be modeled as mixtures of these three ancestry components (Mesolithic hunter-gatherer, Anatolian Neolithic and Steppe) (38; 57). In eastern Europe, further shifts in ancestry are the result of additional or distinct gene flow from Anatolia throughout the Neolithic and Bronze Age in the Aegean (42; 51; 55; 72; 87), and gene flow from Siberian-related populations in Finland and the Baltic region (38).

And I really like this part; sounds ominous for the Out-of-India (OIT) crowd, doesn't it? Hopefully we won't have to wait too long for the relevant paper from Harvard, which, I can assure you, is coming sooner or later.

There are no published ancient DNA studies from South- or Southeast Asia. However, data from neighboring regions provides clues to the population history of this region. In particular, present-day South Asian populations share ancestry with Neolithic Iranian (11) and Steppe (56) populations. This strongly suggests Neolithic or Bronze Age contact between South Asia and west/central Eurasia, although only direct ancient DNA evidence from the region will resolve the timing and structure of this contact.


Pontus Skoglund and Iain Mathieson, Ancient genomics: a new view into human prehistory and evolution, preprint 2017

Wednesday, November 22, 2017

Ancient genomes from NE Europe suggest in tandem spread of Siberian admixture and Uralic languages into the region >3,500 ya

Max Planck's Thiseas Christos Lamnidis recently tweeted this image of a part of a poster that he's presenting on the population history of Northeastern (NE) Europe at the Human Evolution 2017 conference in Cambridge, UK (for the tweet see here):

If you can't make out the text in the image, this is what the introduction says:

European history has been shaped by migrations, and subsequent admixture. Evidence points to migrations linked to the advent of agriculture, and the spread of Indo-European languages [a b]. Little is known about the ancient population history of NE Europeans, specifically Uralic speakers. Here we analyse eleven ancient genomes from Finland and NW Russia and a high-coverage modern Saami genome, and show that northern Europe was shaped by gene flow from Siberia that began at least 3,500 ya. Today, this ancestry is found in modern populations of the region, especially Uralic speakers. Additionally, we show that ancestors of the Saami inhabited a larger territory in Finland during the Iron Age than today.

It's intriguing to me that Max Planck is looking so closely at these issues now, because back in 2015 I ripped into Max Planck's Paul Heggarty for some comments that he made about the potential link between Yamnaya-related admixture and Uralic languages (see here). This is what I said back then:

These are exceedingly naive and stupid comments from someone representing the Max Planck Institute. Perhaps as an ardent supporter of the Anatolian hypothesis he's feeling more than a little desperate at this point and clutching at straws? That's because anyone with even a basic grasp of European linguistics and genetics should know that:

- present-day Hungarians and Estonians speak Uralic languages, but they are of course overwhelmingly of Indo-European origin, which is easily seen in their genome-wide and uniparental DNA

- other Uralic speakers, further to the north and east, in the forest zone away from Indo-European influence, are clearly distinct from the vast majority of Indo-European speaking Europeans, because they show significant levels of recent Siberian ancestry, which was missing among the Yamnaya and Corded Ware people, and appears to be an Uralic-specific genetic signature

- therefore, it's highly unlikely that Uralic-speakers were also part of the Yamnaya > Corded Ware movement; rather, early Uralics in all likelihood began to move west across the forest zone well after the Yamnaya and related expansions from the steppe.

All of this is probably just a remarkable coincidence, but in any case, it's nice to see that the good people at Max Planck are now beginning to understand the processes that have shaped the genetics and linguistics of NE Europe.

Monday, November 13, 2017

Who's your (proto) daddy Western Europeans?

Considering the increasingly large numbers of paleogenomic samples being released online nowadays, it's no longer practical for me to try to highlight most archaeological cultures and even genetic clusters in my Principal Component Analyses (PCA) of the ancient world. Thus, from now on, I'll be focusing attention in such PCA on the main population shifts that have led to the formation of the modern-day West Eurasian gene pool and genetic substructures, like on the PCA plot below, which includes the new Lipson et al. 2017 data (available at the Reich Lab here).

The relevant PCA datasheet is available here. By grouping several hundred ancient samples into just nine clusters, I'm attempting to highlight four key processes and resulting genetic shifts in Europe, the Near East and Central Asia:

- European forager populations mixing with genetically much more southern early farmers of Near Eastern origin, mostly during the Neolithic, bringing about the total disintegration of the Europe to Siberia Hunter-Gatherer cline

- "Old Europeans" getting overrun and largely absorbed by Y-haplogroup R1-rich Kurgan pastoralists from the Pontic-Caspian steppe during the Eneolithic and Bronze Age, leading to the formation of at least one major new cline from the Bronze Age steppe into post-Kurgan expansion Europe

- the ancient Near East "imploding" or becoming significantly more compact in terms of genetic structure, likely due to a variety of major population expansions from the Chalcolithic onwards from the eastern and western parts of the Fertile Crescent, as well as probably the Caucasus and Europe (note how the post-Neolithic western Asian cluster stretches out towards Europe)

- fully nomadic and very wide-ranging pastoral and warrior cultures dominating the entire Eurasian steppe during the Iron Age, leading to the emergence of progressively more East Asian-admixed populations from west to east across the Eurasian steppe

An interesting outcome of the denser sampling from space and time in West Eurasia is that Y-haplogroup R1b, once so elusive in the ancient DNA record, is now popping up all over the place. The new Lipson et al. dataset, for instance, includes two R1b "Old Europeans" from Blatterhole in Germany dated to the Middle Neolithic. Below is the same PCA as above except with all of the ancients belonging to R1b marked with an X. The two Blatterhole samples are sitting in the largely empty space between the European/Siberian Hunter-Gatherer cline and most of the "Old Europe" cluster. The relevant PCA datasheet is available here.

So it may seem that we're back to square one in the long-running effort to pinpoint the origin of Y-haplogroup R1b-L51, which encompasses almost 100% of modern-day Western European R1b lineages, and thus probably ranks as Europe's most common Y-haplogroup. But at this stage I'd say no, because R1b-L51 is a subclade of R1b-M269, of which the oldest sample comes from the Bronze Age steppe. In fact, as can be seen in the above PCA, this sample is sitting in exactly the right spot to be one of those pastoralists who overran "Old Europe", or at least a very close relative thereof.

Or am I wrong? Feel free to let me know in the comments.

I didn't bother creating a similar plot of ancient samples belonging to Y-haplogroup R1a, because, unlike R1b, this marker is still non-existent in samples from outside of Eastern Europe and Siberia dating to before the late Neolithic. And I doubt that this is simply due to a lack of the right ancient material. Moreover, the recent discovery of Y-haplogroup R1a-M417, which encompasses almost 100% of all modern-day R1a lineages on the planet, in a North Pontic steppe sample belonging to the Eneolithic Sredny Stog culture means that it's game over for the naysayers as far as the steppe origin of most modern-day R1a lineages is concerned (see here and here).

In other words, if you're still hoping to see R1a, and especially R1a-M417, pop up in non-steppe derived ancient individuals in, say, such far away places as South Asia, then you'll probably be waiting forever.

Update 15/11/2017: After a couple of days of messing around with the Lipson et al. dataset, I'm certain that Late Copper Age sample Protoboleraz_LCA I2788 shows significant steppe-related admixture. This is the only sample from Lipson et al. with such an obvious signal of steppe-related input that had enough data to be analyzed individually by me with PCA and D-stats.

For the time being, amongst the best proxies for this signal appear to be Yamnaya_Samara and Samara_Eneolithic. But it's likely that the real source of the admixture is yet to enter the ancient DNA record, or at least my dataset. When it does, it'll probably be an Eneolithic pastoralist population from the North Pontic steppe.

Yamnaya_Samara also gives the best statistical fit as the single source population in qpAdm (see here). It's an important result, because it suggests that steppe peoples very similar to Yamnaya were already expanding on and out of the steppe as far back as ~3500 BCE, and perhaps a few hundred years earlier.
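For anyone unfamiliar with the D-stats mentioned in this update, here's a rough Python sketch of the allele-frequency form of the statistic that is commonly used in the literature. It is only an illustration and not my actual pipeline: the population labels and frequencies are invented, and real analyses use hundreds of thousands of markers plus a block jackknife to get standard errors and Z-scores.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-marker allele frequencies.

    A positive value means W shares more alleles with Y (or X with Z)
    than expected under a simple tree; a negative value means the reverse.
    """
    num = sum((wi - xi) * (yi - zi) for wi, xi, yi, zi in zip(w, x, y, z))
    den = sum(
        (wi + xi - 2 * wi * xi) * (yi + zi - 2 * yi * zi)
        for wi, xi, yi, zi in zip(w, x, y, z)
    )
    if den == 0:
        raise ValueError("no informative markers")
    return num / den


# Hypothetical frequencies at six markers (not real data).
outgroup = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
test_sample = [0.3, 0.7, 0.5, 0.6, 0.2, 0.9]
steppe_proxy = [0.4, 0.6, 0.7, 0.5, 0.3, 0.8]
farmer_proxy = [0.1, 0.9, 0.2, 0.8, 0.1, 1.0]

d = d_statistic(outgroup, test_sample, steppe_proxy, farmer_proxy)
print(f"D(outgroup, test; steppe, farmer) = {d:.4f}")
```

Swapping the first two populations flips the sign of the statistic, which is a handy sanity check when setting up a run.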

Thursday, November 9, 2017

Descendants of Greeks in the medieval Himalayas?

Below is an abstract from the upcoming Human Evolution 2017 conference (Cambridge, UK, November 20-22). When the paper comes out, it'll be interesting to see how Harney, Patterson et al. uncovered the Greek affinities of some of these individuals; uniparental markers, rare alleles? The accompanying pic is from Wikipedia.

The skeletons of Roopkund Lake: Genomic insights into the mysterious identity of ancient Himalayan travelers

Eadaoin Harney, Niraj Rai, Nick Patterson, Kumarasamy Thangaraj, David Reich

The high-altitude lake of Roopkund, situated over 5000 meters above sea level in the Himalayas, remains frozen for almost 11 months out of the year. When it melts, it reveals the skeletons of several hundred ancient individuals, thought to have died during a massive hail storm during the 8th century, A.D. There has been a great deal of speculation about the possible identity of these individuals, but their origins remain enigmatic. We present genome-wide ancient DNA from 17 individuals from the site of Roopkund. We report that these individuals cluster genetically into two distinct groups, consistent with observed morphological variation. Using population genetic analyses, we determine that one group appears to be composed of individuals with broadly South Asian ancestry, characterized by diffuse clustering along the Indian Cline. The second group appears to be of West Eurasian related ancestry, showing affinities with both Greek and Levantine populations.

Tuesday, October 31, 2017

Monday, October 30, 2017

On the wrong end of a steppe herder's cudgel (?)

From a new paper at the International Journal of Osteoarchaeology:

In this study, we examine trauma on human remains from the Tripolye site of Verteba Cave in western Ukraine. The remains of 36 individuals, including 25 crania, were buried in the gypsum cave as secondary interments. The frequency of cranial trauma is 30-44%; among the 25 crania, six males, four females and one adult of indeterminate sex displayed cranial trauma. Of the 18 total fractures, 10 were significantly large and penetrating suggesting lethal force. Over half of the trauma is located on the posterior aspect of the crania, suggesting the victims were attacked from behind. Sixteen of the fractures observed were perimortem and two were antemortem. The distribution and characteristics of the fractures suggest that some of the Tripolye individuals buried at Verteba Cave were victims of a lethal surprise attack.


Recent paleogenomic studies have indicated that the nomadic pastoralists of the Pontic-Caspian steppe were involved in large-scale population movements at precisely this time, expanding westward farther into continental Europe (Haak et al., 2015). Such a massive population movement likely resulted in lethally violent interactions between indigenous populations and the newly arriving migrants.

Madden et al., Violence at Verteba Cave, Ukraine: New Insights into the Late Neolithic Intergroup Conflict, International Journal of Osteoarchaeology, online: 27 October 2017, DOI: 10.1002/oa.2633
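The 30-44% range quoted in the abstract can be reproduced from the counts it gives, if we assume that the lower bound uses all 36 individuals as the denominator and the upper bound uses only the 25 crania. That reading is my assumption; the abstract doesn't spell it out. A quick Python check:

```python
# Counts as given in the abstract.
individuals = 36
crania = 25
with_trauma = 6 + 4 + 1  # males + females + adult of indeterminate sex
fractures = 18
penetrating = 10
perimortem, antemortem = 16, 2

# Assumed denominators: crania for the upper bound, all individuals for the lower.
pct_of_crania = 100 * with_trauma / crania
pct_of_individuals = 100 * with_trauma / individuals
pct_penetrating = 100 * penetrating / fractures

print(f"{with_trauma} of {crania} crania: {pct_of_crania:.1f}%")
print(f"{with_trauma} of {individuals} individuals: {pct_of_individuals:.1f}%")
print(f"{penetrating} of {fractures} fractures penetrating: {pct_penetrating:.1f}%")
```

The perimortem and antemortem counts also add up to the 18 fractures reported, so the numbers in the abstract are internally consistent.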

Genetic and linguistic structure across space and time in Northern Europe

I feel that I need to take a second pass at this, and demonstrate more clearly why my new PCA, the one that I introduced in the recent Tollense Valley warrior blog post (see here), should prove very useful for analyzing both genetic and ethnolinguistic links in Northern Europe between modern-day populations and ancient samples, particularly those from late prehistory to early history, which is when the main ethnolinguistic groups that today dominate Northern Europe formed. Judging by some of the reactions in the comments, not everyone was convinced, so let's try this again.

Below is a new version of the said PCA that focuses on several ancient individuals who, based on their archaeological contexts, should show strong genetic affinities to modern-day speakers of Celtic, Germanic and Slavic languages in Northern Europe. These are three Iron Age samples from what is now England, one Iron Age sample from what is now Sweden, and two Medieval samples from what is now Bohemia, Czech Republic, respectively. The relevant datasheet is available here.

And clearly these ancients do show the expected genetic affinities considering where they cluster relative to modern-day Northern Europeans in the two most significant dimensions of genetic variation. Moreover, despite the fact that the Anglo-Saxon and English Iron Age samples were all excavated from sites in eastern England, the Anglo-Saxons cluster between the English Iron Age individuals and the singleton Scandinavian Iron Age sample. This of course makes perfect sense, considering that the Anglo-Saxons were Germanic speakers with recent ancestry from very near to Scandinavia.

So everything seems in good order, and for now it's very difficult for me to consider that those Tollense Valley warriors who cluster alongside modern-day Slavic speakers on my PCA are not ethnolinguistically closer to them than to Celtic and Germanic speakers.

On the other hand, my standard PCA of West Eurasian genetic variation does a comparatively lousy job at matching ethnolinguistic origins with genetic structure, at least in Northern Europe. Note below, for instance, that the same Celtic and Germanic samples from England and Scandinavia form a tight cluster between the two Slavs from Bohemia. Hence, based on this PCA it would be very difficult, perhaps impossible, to correctly predict the ethnolinguistic ties of these ancients just by looking where they cluster relative to modern-day Germanics, Slavs and so on.

But this is not surprising, because this PCA is based on a wider, more diverse range of populations, and so rather than being dominated by relatively recent, ethnolinguistic-specific genetic drift within Northern Europe, it's much more reflective of deeper, more basic genetic relationships across West Eurasia.
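Here's a toy Python sketch of why the choice of reference populations matters so much in PCA. It is not how my plots are made: it uses three invented populations described by made-up allele frequencies at six markers, and finds the first principal component with a simple power iteration. Two "northern" populations differ slightly at the first two markers, while a "distant" population differs strongly at the other four.

```python
import math


def center(rows):
    """Subtract the per-marker mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_pc_scores(rows, iterations=100):
    """Scores of each row on the first principal component (power iteration)."""
    x = center(rows)
    # Asymmetric start vector, so it isn't orthogonal to the leading axis here.
    v = [float(i + 1) for i in range(len(x[0]))]
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in x]
        v = [sum(s * row[j] for s, row in zip(scores, x)) for j in range(len(v))]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    return [sum(a * b for a, b in zip(row, v)) for row in x]


# Hypothetical allele frequencies (not real data).
north_a = [0.55, 0.45, 0.2, 0.2, 0.2, 0.2]
north_b = [0.45, 0.55, 0.2, 0.2, 0.2, 0.2]
distant = [0.50, 0.50, 0.8, 0.8, 0.8, 0.8]

narrow = first_pc_scores([north_a, north_b])
wide = first_pc_scores([north_a, north_b, distant])

gap_narrow = abs(narrow[0] - narrow[1])
gap_wide = abs(wide[0] - wide[1])
print("north_a vs north_b on PC1, northern panel only:", gap_narrow)
print("north_a vs north_b on PC1, with distant population:", gap_wide)
```

With only the two northern populations in the panel, the leading axis is the difference between them. Add the distant population, and the leading axis is taken over by the much larger contrast, so the two northern populations end up essentially on top of each other along it.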

Saturday, October 28, 2017

Global distributions of lactase persistence alleles (Liebert et al. 2017)

The series of maps below is from a new paper by Liebert et al. at Human Genetics. Almost certainly, any population with a sizable level of the 13910*T allele has relatively recent (post-Mesolithic) ancestry from Europe. In that context, note the presence of 13910*T in South Asia and North Central Africa. Populations in these regions also show high frequencies of two Y-chromosome haplogroups that are present in samples from Mesolithic Eastern Europe: R1a and R1b-V88, respectively. It's hard to imagine that this is a coincidence.

Liebert, A., López, S., Jones, B.L. et al., World-wide distributions of lactase persistence alleles and the complex effects of recombination and selection, Hum Genet (2017).

Thursday, October 26, 2017

Ancient Guanches genetically most similar to modern-day Berbers (Rodríguez-Varela et al. 2017)

Over at Current Biology at this LINK. Emphasis is mine:

Summary: The origins and genetic affinity of the aboriginal inhabitants of the Canary Islands, commonly known as Guanches, are poorly understood. Though radiocarbon dates on archaeological remains such as charcoal, seeds, and domestic animal bones suggest that people have inhabited the islands since the 5th century BCE [1, 2, 3], it remains unclear how many times, and by whom, the islands were first settled [4, 5]. Previously published ancient DNA analyses of uniparental genetic markers have shown that the Guanches carried common North African Y chromosome markers (E-M81, E-M78, and J-M267) and mitochondrial lineages such as U6b, in addition to common Eurasian haplogroups [6, 7, 8]. These results are in agreement with some linguistic, archaeological, and anthropological data indicating an origin from a North African Berber-like population [1, 4, 9]. However, to date there are no published Guanche autosomal genomes to help elucidate and directly test this hypothesis. To resolve this, we generated the first genome-wide sequence data and mitochondrial genomes from eleven archaeological Guanche individuals originating from Gran Canaria and Tenerife. Five of the individuals (directly radiocarbon dated to a time transect spanning the 7th–11th centuries CE) yielded sufficient autosomal genome coverage (0.21× to 3.93×) for population genomic analysis. Our results show that the Guanches were genetically similar over time and that they display the greatest genetic affinity to extant Northwest Africans, strongly supporting the hypothesis of a Berber-like origin. We also estimate that the Guanches have contributed 16%–31% autosomal ancestry to modern Canary Islanders, here represented by two individuals from Gran Canaria.

Rodríguez-Varela et al., Genomic Analyses of Pre-European Conquest Human Remains from the Canary Islands Reveal Close Affinity to Modern North Africans, Current Biology (2017),