Phylogenomic analysis reveals genome-wide purifying selection on TBE transposons in the ciliate Oxytricha

Transposable elements (TEs) are genomic parasites present in all eukaryotic genomes. TEs fall into multiple classes, which occupy distinct fractions of the genome and show a wide variety of genomic activity. Despite these drastic differences, TEs play important roles in shaping the genome and facilitating genome evolution: they can promote genome rearrangements, contribute to the origin of new genes and alter gene expression [1–4].

Ciliates are unicellular eukaryotes that possess two types of nuclei, a transcriptionally active somatic nucleus and an archival germline nucleus [5]. The somatic nucleus develops from a copy of the germline through extensive genome rearrangements. In Oxytricha, the somatic macronucleus (MAC) is extremely gene dense, with ~16,000 short “nanochromosomes” that average 3.2 kb, most of which encode a single gene [6]. The germline micronucleus (MIC), on the other hand, exhibits a highly fragmented and complex genome architecture, with short gene segments (Macronuclear Destined Sequences, MDSs) interrupted by brief noncoding sequences (Internal Eliminated Sequences, IESs). The MDSs carry the information that is retained in the soma after development; intriguingly, they are often present in a permuted order or inverted orientation in the germline. Therefore, correct assembly of functional genes in the soma requires precise deletion of noncoding sequences and extensive reordering and inversion of gene segments that are “scrambled” in the germline. The somatic genome is free of transposons, although it contains some transposase-like genes [6]. Nearly 20% of the germline genome is occupied by TEs [7], all of which are eliminated during somatic development.
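The unscrambling logic described above (delete IESs, reorder MDSs, flip inverted MDSs) can be illustrated with a toy sketch. This is only an illustration under simplifying assumptions, not a model of the biological mechanism or of any analysis in this study: the Segment class, its fields and the example sequences are hypothetical, and the sketch assumes that the correct order and orientation of every MDS are already known.

```python
from dataclasses import dataclass
from typing import List, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Segment:
    """One germline segment (hypothetical toy representation)."""
    seq: str
    mds_index: Optional[int] = None  # position in the somatic gene; None marks an IES
    inverted: bool = False           # True if stored on the opposite strand


def descramble(germline: List[Segment]) -> str:
    """Assemble a somatic sequence from a scrambled germline locus."""
    # 1. Delete noncoding segments (IESs).
    mdss = [s for s in germline if s.mds_index is not None]
    # 2. Restore the somatic order of the retained segments.
    mdss.sort(key=lambda s: s.mds_index)
    # 3. Flip inverted segments back, then join.
    return "".join(
        reverse_complement(s.seq) if s.inverted else s.seq for s in mdss
    )


# Toy locus: MDS 3 comes first, and MDS 2 is stored in inverted orientation.
germline_locus = [
    Segment("ATGA", mds_index=3),
    Segment("TTT"),                               # IES
    Segment("ATGG", mds_index=1),
    Segment("GG"),                                # IES
    Segment("TTGG", mds_index=2, inverted=True),  # reverse complement of CCAA
]
somatic = descramble(germline_locus)
```

In this toy locus the three retained segments are reordered and the inverted one is flipped, so that somatic holds a single contiguous sequence. The sketch leaves out how the cell recognizes segment boundaries and determines the correct order.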

Ciliates provide novel model systems for studying transposable elements because multiple TEs, especially the transposases they encode, have been recruited to provide important cellular functions during somatic development [8, 9]. The macronuclear genomes of Tetrahymena and Paramecium each encode a homolog of the PiggyBac transposase that is expressed during development. Knockdown of the PiggyBac transposase results in a developmental defect, implicating it in nuclear development [10, 11]. Tc1/mariner transposons are the most prevalent transposons in ciliate germline genomes, including the Tec elements in Euplotes [12] and the Tennessee, Sardine and Thon elements in Paramecium [13, 14]. The terminal sequences of Paramecium IESs resemble the terminal inverted repeats of Tec elements in Euplotes [12, 15] and the ends of Tc1/mariner transposons [16], leading to the hypothesis that many IESs are remnants of TE insertions [17].

In Oxytricha, the telomere-bearing elements (TBEs) are another group of Tc1/mariner DNA transposons that have long been studied in ciliate germline genomes [18]. There is also phylogenetic evidence for recent insertion of TBEs [19]. TBEs encode three open reading frames (ORFs): a 42kD transposase, a 22kD ORF of unknown function and a 57kD ORF with zinc finger and kinase domains but unknown function (Fig. 1a). The 42kD transposase, together with the transposase encoded by Euplotes Tec elements and other Tc1/mariner transposases, belongs to a superfamily of transposase genes with a common DDE catalytic motif [20]. As with the PiggyBac transposase, knockdown of the TBE transposase leads to developmental defects, such as accumulation of unprocessed DNA and incorrectly rearranged nanochromosomes [21], suggesting that the TBE transposase has acquired an essential function in genome rearrangement. Because the transposase gene is present in many thousands of copies in the germline, this experiment was unique in knocking down such a high-copy target. Nowacki et al. concluded that the 42kD transposase has likely been recruited for its DNA-cleaving activity or another role in eliminating noncoding sequences, including the TBEs themselves [21, 22].

Fig. 1

Phylogeny of sampled Oxytricha TBE genes and orthologs identified in three other stichotrich ciliates. a Schematic map of TBE transposons. Gray arrows represent terminal inverted repeats (TIR). Orange arrows represent ORFs encoded by TBEs. b Phylogeny constructed with TBE 42kD transposases (29 TBE1, 27 TBE2.1, 26 TBE2.2 and 25 TBE3 42kD protein sequences). Clades formed by TBE1, TBE2 and TBE3 are labeled accordingly. TBE2.1 representatives are indicated in red and TBE2.2 in blue. Internal branches supported by posterior probability higher than 0.9 are colored in green. c Phylogeny constructed with TBE 22kD ORFs (32 TBE1, 39 TBE2.1, 30 TBE2.2 and 28 TBE3 22kD protein sequences). Colors are as above. d Phylogeny constructed with TBE 57kD ORFs (27 TBE1, 26 TBE2.1, 23 TBE2.2 and 21 TBE3 57kD protein sequences). Clades formed by TBE1, TBE2.1, TBE2.2 and TBE3 are labeled accordingly; colors are as above. The multiple sequence alignments were produced with MAFFT v6.956b and trimmed with trimAl v1.2 to remove excess gaps and poorly aligned regions. The unrooted Bayesian trees were produced with MrBayes v3.2.2 [35]. The three TBE orthologs are 1: Sterkiella histriomuscorum; 2: Tetmemena sp.; 3: Laurentiella sp. All posterior probability values are above 0.5. The scale bar below each phylogeny indicates substitutions per site.

A few studies have suggested that purifying selection acts on the 42kD transposase encoded by TBEs [21, 23, 24]. However, these studies were limited by the small number (up to 100) of TBE sequences that were previously available. The levels of selection acting on the 22kD and 57kD ORFs have not been reported before, and here we investigate them genome-wide. With the recent sequencing and assembly of the Oxytricha micronuclear genome [7], we are able to provide a thorough characterization of TBE sequences in the germline, including their genomic distribution and sequence features. We also infer the levels of selective constraint acting on the three transposon-encoded ORFs and identify homologs of TBE transposons in other ciliate genomes. Together, these results provide insights into the origin and evolution of TBE transposons in Oxytricha.