A natural barrier to lateral gene transfer from prokaryotes to eukaryotes revealed from genomes: the 70 % rule

Few topics in evolutionary biology have received as much attention in the last 20 years as lateral gene transfer (LGT, or horizontal gene transfer [HGT]) [1–3]: more than 11,000 papers have appeared on the topic since 1985, and those papers drew more than 30,000 citations in 2015 alone (Thomson Reuters Web of Science™ as of 21 April 2016). Cognizant biologists have learned one thing for certain about LGT: Not all papers bearing claims for LGT are evidence for the workings of LGT, especially when it comes to LGT from prokaryotes to eukaryotes, which is the focus of our paper. For example, the original report of the human genome in 2001 [4] carried claims for hundreds of cases of prokaryote-to-eukaryote LGT in our own DNA. Those claims were, however, quickly exposed as interpretation and annotation artifacts [5, 6]. More recently, two papers on tardigrade genomes have provided a clear case in point: One report said that 16.1 % of the genes in the tardigrade genome were recently acquired via LGT from various prokaryotes [7], while an independent sequencing project stated that there was virtually no LGT in the tardigrade genome [8]. The main difference between the two studies was that in one study [7] genes probably belonging to associated bacteria were annotated as tardigrade genes. Those genes were not present in the other genome study [8], whose longer scaffolds helped to filter out the contaminating sequences that had been interpreted as prokaryote-to-eukaryote LGT. Curiously, the claims for LGTs in the human genome, which were long ago refuted [5, 6], are now making their way back into the literature [9], based on analyses employing the same LGT identification software [10] used for the tardigrade genome that was reported to be LGT-rich [7].
Apart from the natural and well-documented process of gene acquisition from the ancestors of organelles in the wake of mitochondrial and plastid origin — endosymbiotic gene transfer [11, 12] — how much prokaryote-to-eukaryote LGT, if any, is really going on in nature?

Within the prokaryotes, LGT is best seen as a way of life. Several naturally occurring mechanisms of LGT among prokaryotes have been known for many decades: uptake of naked DNA from the environment (transformation), plasmid-mediated transfer (conjugation), transfer via phage particles (transduction), and transfer via gene transfer agents [13–18]. A great deal is known about the genes and proteins that mediate these LGT mechanisms in prokaryotes [19–21]. These LGT mechanisms merely introduce DNA into the prokaryotic cell; whether or not it recombines into the genome is governed by the genes and proteins that mediate DNA insertion and/or recombination [22, 23].

Importantly, the mechanisms that introduce DNA into the cell for LGT are the same as those that introduce DNA into the cell for normal recombination within prokaryotic species [24]. In prokaryotes, recombination is never reciprocal. It is always unidirectional from donor to recipient, and with transformation, transduction, or gene transfer agents, the donor and recipient need never physically meet. Prokaryotic genomes are highly dynamic in terms of gene content. They are typically replete with LGT, undergoing continuous gains (often from outside the species, genus, or family) and losses through deletion [2, 25–27]. Over time, these gains and losses lead to pangenome structures [12, 28–30], not only at the species level but at all taxonomic levels [12]. In prokaryotes, acquisition through LGT dwarfs the role of gene duplication in generating gene families within genomes [31]. Prokaryotic LGT is pivotal in the spread of antibiotic resistance [32] and in ecological adaptation [33]. The existence and extent of LGT in prokaryotes have challenged the traditional view of prokaryotic evolution as a fundamentally tree-like process and have prompted the use of more network-like representations to describe the evolutionary relationships among genomes [3, 34–36].

In contrast to prokaryotes, eukaryotes undergo recombination during meiosis and sex, and recombination is always reciprocal [37]. Although eukaryotes are descended from prokaryotes [38, 39], at eukaryote origin they apparently lost the LGT mechanisms typical of prokaryotes, because eukaryotes have so far not been observed to undergo inter-specific (or inter-phylum) conjugation, transformation, or transduction, nor have any genes or proteins been described in eukaryotes that would mediate prokaryotic-type LGT. As a consequence, prokaryotes clearly have pangenomes [12, 28–30], but eukaryotes apparently do not. Neither 1000 human genomes [40] nor 1135 Arabidopsis genomes [41] harbored any hint of evidence for the existence of a pangenome or pangenome-like structure. By contrast, the existence of pangenomes in prokaryotes became evident based upon only a handful of sequences per species [12, 28–30]. The only mechanism characterized as a natural source of new genes entering nuclear genomes is gene transfer from organelles [42]. Barring targeted gene transfer experiments [43] and endosymbiont genome insertions into insect chromosomes with contiguous sequences [44], reports of prokaryote-to-eukaryote LGT are based on sequence comparisons and annotations of individual genes. Thus, in contrast to LGT among prokaryotes, which is their natural mechanism to generate new gene combinations, the role of LGT in eukaryote evolution is controversial.

Some reports suggest that prokaryote-to-eukaryote LGT frequently occurs in phagotrophic, unicellular eukaryotes [45] and that there is continuous LGT from prokaryotes to vertebrates and other animals [9] as well as to plants [46] and to algae [47]. In only a few rare and well-documented cases can the sources of LGT to eukaryotes be pinpointed [44, 48]; in other cases, the prokaryotic donors are known for their ability to transfer DNA to eukaryotes [49]; and of course eukaryotes acquired many genes from the endosymbiotic ancestors of mitochondria and chloroplasts [50]. Yet for the vast majority of cases reported for prokaryote-to-eukaryote LGT, the mechanisms and specifics (how, when, and between which groups) remain obscure.

If the numerous claims for eukaryotes constantly acquiring prokaryotic genes through LGT [51–58] are true, then there would indeed seem to be no natural barrier for prokaryote-to-eukaryote LGT. That leads to two important questions: (1) If such claims are true, what are the implications for our understanding of evolution? That is not our question here; rather, we ask the second question: (2) Are such claims true? Importantly, asking whether eukaryotes are constantly acquiring genes from prokaryotes is not the same as asking whether prokaryote-to-eukaryote LGT ever occurs at all. After all, examples like the genome fragments that are present in insect genomes and that were acquired from bacterial endosymbionts of the insect lineage [44, 48] or Agrobacterium colonization in plants [49] show that sometimes genes do make their way from prokaryotes to eukaryotes. We are thus not going to ask whether the barriers to gene flux from prokaryotes to eukaryotes are absolute and have never been crossed during evolution, because we already know that they have been crossed, in particular at the origin of chloroplasts and mitochondria [50]. Rather, we are going to ask whether prokaryotic genes enter the eukaryotic lineage at a frequency that has detectable evolutionary impact and leaves clear evidence in the form of genes in eukaryotic genomes that were recently acquired from prokaryotes.

In previous work, we showed that acquisitions of prokaryotic genes by the eukaryotic lineage correspond to endosymbiotic events (the origins of mitochondria and chloroplasts) [50] and that many of the patterns of “patchy” gene distributions that some reports interpret as evidence for LGT [51, 52] are in fact more likely the result of differential loss [50] superimposed upon vertical inheritance. Those findings are not compatible with claims that eukaryotes are constantly and frequently acquiring genes from prokaryotes. Something has to give.

How can claims for abundant LGT from prokaryotes to eukaryotes be tested? If LGTs from prokaryotes to eukaryotes are as commonplace and as frequent as many papers assert [4, 9, 10, 45, 51–58], then eukaryote genomes should contain both anciently acquired prokaryotic genes and recently acquired prokaryotic genes. Furthermore, it should be possible, using robust measures, to uncover evidence for the presence of recently acquired genes. Here we look for recent LGTs from prokaryotic donors in eukaryotic genomes and, for direct comparison to a positive control where recent LGTs should be detectable, in prokaryotic genomes as well.