Extensive horizontal gene transfers between plant pathogenic fungi

Overview of Magnaporthales genomes

Magnaporthales comprises a group of fungal lineages with an evolutionary depth comparable
to tetrapods (i.e., human-frog divergence; Fig. 1a). The Magnaporthales lineages possess comparable genome sizes (39–42 Mbp) and total
gene numbers (12–13 K), which are typical of Sordariomycetes (Fig. 1b). To reconstruct a robust Sordariomycetes phylogeny, we identified 1453 highly conserved
single-copy genes across 22 taxa (see Methods). A maximum likelihood (ML) tree built
using multi-protein data comprising 20 % of the genes (291 genes and 226,915 amino
acid positions) with the strongest phylogenetic signal (see Methods) resulted in
a topology with 100 % bootstrap support for all interior nodes (Fig. 1b). This result is generally consistent with previous phylogenies that showed a sister
group relationship between Magnaporthales and Ophiostomatales (e.g., [16], [22]).

Fig. 1. Comparative analysis of Magnaporthales genomes. a Evolutionary rate comparison between Sordariomycetes and vertebrates. All interior
nodes have 100 % bootstrap support using a multi-protein concatenated dataset. Magnaporthales
and vertebrates are highlighted using thick branches in pink and black, respectively.
b Phylogenetic relationships among 19 lineages of Sordariomycetes, showing their genome
sizes (Mbp) and predicted gene numbers. The outgroup species are not shown in this
phylogeny. All interior nodes have 100 % bootstrap support using a multi-protein concatenated
dataset (shown in Additional file 1). The numbers shown at the selected nodes are gene-support frequencies/internode
certainty values. The black dots mark the five branches at which independent gene
losses are required to explain Magnaporthales-Colletotrichum gene sharing under the assumption of vertical gene transmission

Extended majority rule consensus and majority rule consensus (MRC) trees built using
the corresponding 291 single-gene ML trees resulted in the same topology (Fig. 1b). Of the 11 internodes that define or link orders (Fig. 1b), 10 have gene-support frequencies (GSF) above 50 %, that is, they are supported
by more than 50 % (at least 146) of the single-gene ML trees (Fig. 1b). All of these internodes have internode certainty (IC) values above 0.3 (see [23] for details), suggesting that the defined bipartitions are more than four times more likely
to exist than the most likely alternative bipartitions. The same topology and ML bootstrap
support values were obtained when using the 583 (40 %) genes with the strongest phylogenetic
signal and when using the full set of 1453 genes, although with decreasing GSF and
IC values (Additional file 1). These results show that Magnaporthales and Colletotrichum are distinct lineages separated in the tree by multiple, well-defined Sordariomycetes
lineages.
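As a rough illustration of how the GSF and IC thresholds relate to counts of gene trees, the sketch below computes both values for a single internode. It assumes the two-bipartition form of internode certainty described in [23], IC = 1 + p1 log2(p1) + p2 log2(p2), where p1 and p2 are the relative frequencies of the focal bipartition and its most frequent conflicting bipartition. The function names and the example counts are hypothetical and are not taken from this study.

```python
import math

def gene_support_frequency(n_supporting_trees, n_trees):
    """Fraction of single-gene trees that recover a given bipartition."""
    return n_supporting_trees / n_trees

def internode_certainty(n_main, n_alt):
    """Two-bipartition internode certainty (assumed form, see lead-in).

    n_main: number of gene trees supporting the focal bipartition
    n_alt:  number of gene trees supporting the most frequent conflicting one
    """
    total = n_main + n_alt
    if total == 0:
        raise ValueError("at least one supporting gene tree is required")
    ic = 1.0  # log2(2), the maximum entropy for two alternatives
    for count in (n_main, n_alt):
        p = count / total
        if p > 0:  # treat 0 * log2(0) as 0
            ic += p * math.log2(p)
    return ic

# Hypothetical counts: a bipartition recovered four times as often
# as its strongest competitor.
print(round(internode_certainty(80, 20), 3))
# 146 of 291 gene trees is the smallest count that exceeds 50 %.
print(gene_support_frequency(146, 291) > 0.5)
```

Under this formula, a 4:1 ratio between the focal and the competing bipartition gives an IC of about 0.28, so IC values above 0.3 correspond to a ratio of more than four to one.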

HGT marker genes derived from non-Pezizomycotina sources

To search for HGT candidates, we employed a phylogenomic approach to build single-gene
phylogenies for protein sequences from the specified query species. This approach
is conservative because many genes do not lead to highly supported phylogenies (or
no phylogenies at all) for different reasons such as lack of phylogenetic signal,
short sequence length, and few detectable homologs in the database (see Methods for
details). From the available Magnaporthales genomes, we used Magnaporthiopsis incrustans (a grass pathogen in Magnaporthales) as a representative species. We used the M. incrustans proteins as query against a local database that included NCBI RefSeq (version 55)
and genome and transcriptome data from 110 Pezizomycotina species (Additional file
2). We identified three instances in which M. incrustans genes and their Magnaporthales orthologs were derived from non-Pezizomycotina (NP)
sources via HGT (Additional file 3) with 85 % or more SH-like branch support [24] and 85 % or more UFboot support [25]. Limited numbers of foreign gene candidates were previously reported in its sister
lineage Pyricularia oryzae [10], [12], [15], [26].

When allowing the NP-derived foreign genes to be shared with one other Pezizomycotina
genus, we identified two NP-derived genes that are exclusively shared between M. incrustans (and Magnaporthales orthologs) and Colletotrichum (Fig. 2). An example is the monophyly of the Magnaporthales and Colletotrichum major facilitator superfamily transporter proteins that are nested within bacterial
homologs (Fig. 2a and Additional file 4). The other case represents the exclusive sharing of a putative alpha-1,2-mannosidase
that is derived from distantly related fungal lineages (Fig. 2b and Additional file 4). These two instances of exclusive gene sharing were confirmed using a two-way phylogenomic
approach. The principle behind this method is analogous to the reciprocal-best hit
approach widely used with BLAST searches. More specifically, in this case, we subjected
the Colletotrichum sequences in Fig. 2a, b to our phylogenomic pipeline to search for their sister lineages and recovered exclusive
gene sharing with Magnaporthales (see Methods for details).
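The reciprocal logic of the two-way approach can be sketched as a simple set operation. The example below is a simplified illustration rather than the pipeline used in this study: it assumes that each direction of the analysis has already been reduced to a mapping from a query gene to the genes of the other lineage recovered as its exclusive sister group. The function name and gene identifiers are hypothetical.

```python
def confirm_two_way(forward_hits, reverse_hits):
    """Keep forward candidates that are recovered in the reverse direction.

    forward_hits: dict mapping a query gene (e.g., from M. incrustans) to the
                  set of partner-lineage genes found as its exclusive sisters.
    reverse_hits: dict mapping a partner-lineage gene (e.g., Colletotrichum),
                  used as a query in turn, to the set of genes it recovers.
    Returns a dict of confirmed query genes and their supporting partners.
    """
    confirmed = {}
    for query, partners in forward_hits.items():
        # A partner supports the candidate only if its own tree
        # recovers the original query gene.
        supporting = {p for p in partners if query in reverse_hits.get(p, set())}
        if supporting:
            confirmed[query] = supporting
    return confirmed

# Hypothetical identifiers, for illustration only.
forward = {"mag_gene_1": {"col_gene_1", "col_gene_2"}, "mag_gene_2": {"col_gene_3"}}
reverse = {"col_gene_1": {"mag_gene_1"}, "col_gene_3": {"other_gene"}}
print(confirm_two_way(forward, reverse))
```

Requiring the exact query gene in the reverse result is a deliberately strict assumption; a looser variant could accept any ortholog from the same order.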

Fig. 2. Exclusive sharing of non-Pezizomycotina-derived horizontal gene transfer gene markers
in Magnaporthales and Colletotrichum. a Maximum likelihood (ML) tree of a major facilitator superfamily transporter. b ML tree of a putative alpha-1,2-mannosidase that participates in carbohydrate transport
and metabolism

Extensive gene transfer between Magnaporthales and Colletotrichum

Given the overall paucity of NP-derived genes in M. incrustans and two instances of exclusive sharing of such foreign gene markers with Colletotrichum, we tested the magnitude of gene transfers between M. incrustans and Colletotrichum using the two-way phylogenomic approach. Out of 9154 single gene phylogenies generated
using M. incrustans proteins as queries, we identified 93 (1.0 %) M. incrustans genes with a Colletotrichum provenance with 85 % or more SH-like branch support [24] and 85 % or more UFboot support [25] (Additional file 5). These 93 candidates represent 89 distinct transfer events followed by independent
duplications of four different genes (Additional file 5). These HGTs are located in relatively long M. incrustans contigs (coding ≥5 genes) and have orthologs in other Magnaporthales species. In
91 % (86/93) of the cases, at least one of the associated Colletotrichum genes is located in contigs or scaffolds encoding five or more genes. In 80 % (75/93)
of the instances, shared genes are present in two or more Colletotrichum species. Transfers of five genomic segments comprising 2–3 HGTs were identified between
the two lineages (Additional file 5). In all but one case, only limited regions of the entire length of contigs were
impacted by HGT in both lineages. One example is the transfer of a two-gene Magnaporthales
segment to the common ancestor of Colletotrichum. The phylogenies of the two genes with Magnaporthales-Colletotrichum groupings are shown in Additional file 6. These results, corroborated by the overall high quality of the fungal genome data,
suggest that most of the identified HGT instances between Magnaporthales and Colletotrichum are not explained by sequence contamination.

The nature and significance of HGT between Magnaporthales and Colletotrichum

Of the 93 putative instances of HGT, 45 likely resulted from gene transfers from Magnaporthales
to Colletotrichum (Additional file 5). One example is the phylogeny of a putative dimethylaniline monooxygenase in which
Colletotrichum sequences are nested within homologs from Magnaporthales (Fig. 3a and Additional file 4). Another 19 HGT instances were in the opposite direction (Additional file 5) including a NACHT and TPR domain-containing protein, whose phylogeny shows Magnaporthales
to be nested within Colletotrichum and its sister-group lineage Verticillium (Fig. 3b and Additional file 4). The directions of gene transfers for the remaining instances are unclear.

Fig. 3. The nature of horizontal gene transfer (HGT) between Magnaporthales and Colletotrichum. a Maximum likelihood (ML) tree of a putative dimethylaniline monooxygenase. This phylogeny
provides an example of a gene transfer from Magnaporthales to Colletotrichum. b ML tree of a NACHT and TPR domain-containing protein. This phylogeny provides an
example of a gene transfer from Colletotrichum to Magnaporthales. c Random sampling analysis of HGT gene clustering in the M. incrustans genome. We randomly sampled 93 genes from the M. incrustans data 5000 times (see Methods) and the number of genomic segments derived from these
replicates (represented by the histogram) ranged from 0 to 7. In over 99.9 % (4955)
of the replicates, six or fewer genomic segments resulted. Therefore, the chance is
less than 0.1 % to generate the eight genomic segments that were observed in the empirical
data (the thick black arrow). Similarly, the range of the genes that were included
in the genomic segments was 0–14 with over 99.9 % of the gene numbers being 12 or
fewer. Therefore, the chance is less than 0.1 % to generate a total of 18 genes that
are contained in genomic segments. These results suggest that the enrichment of physical
linkage in our HGT data cannot be explained solely by chance. d The proportion of carbohydrate-active enzymes, transporters, and peptidases among
the HGT set (gray color) in comparison to those in complete-genome data (white color).
The results of the significance tests are indicated for each comparison

About one-quarter of the gene transfers occurred in the stem lineage of Magnaporthales
(e.g., Figs. 2a and 3b, and Additional file 4). Considering the relatively recent emergence of Colletotrichum, these HGTs likely occurred between the Magnaporthales common ancestor and an ancient
lineage leading to extant Colletotrichum. Other HGT instances occurred more recently and are restricted to particular Magnaporthales
lineages (e.g., Fig. 3a and Additional file 4). Given the uncertainties that result from the varying sequencing depth and differential
gene loss among Magnaporthales clades, predictions about the timing of gene transfer
should be treated with caution. Nevertheless, these results strongly suggest that
Magnaporthales exchanged genes with the lineage leading to modern-day Colletotrichum.

We identified eight M. incrustans genomic segments (containing 18 genes) that contain two or more physically linked
genes of HGT origin (allowing one intervening non-HGT gene) (Additional file 5). We manually examined the genomic locations of the relevant Colletotrichum genes associated with the five genomic segments without non-HGT interruption (discussed
earlier). In almost all cases, the corresponding genomic segments were also found
in Colletotrichum genomes. Random sampling of 18 genes (5000 times) from the 9154 M. incrustans genes with single-gene phylogenies showed that the physical linkage of HGT genes
is significantly more than expected by chance alone (Fig. 3c). A similar result was obtained when using the Ophioceras dolichostomum (instead of M. incrustans) proteome as the input for the two-way phylogenomic analysis (Additional file 7). A total of 51 HGTs (51 distinct transfer events) were inferred between O. dolichostomum and Colletotrichum (Additional file 8). These results suggest that HGT between Magnaporthales and Colletotrichum often occurred as segmental transfers involving more than one gene.
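The random sampling test for physical linkage can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the Methods: genes are represented as hypothetical (contig, gene index) pairs, a genomic segment is taken to be a run of two or more sampled genes on the same contig with at most one intervening unsampled gene, and the sampling follows the Fig. 3c legend (93 genes drawn 5000 times). All function and variable names are invented for the example.

```python
import random
from collections import defaultdict

def count_segments(genes, max_intervening=1):
    """Count linked segments among the selected genes.

    genes: iterable of unique (contig_id, gene_index) pairs.
    Returns (number of segments, number of genes inside segments).
    """
    by_contig = defaultdict(list)
    for contig, index in genes:
        by_contig[contig].append(index)

    n_segments = 0
    n_genes = 0
    for positions in by_contig.values():
        positions.sort()
        run = 1  # size of the current run of linked genes
        for prev, cur in zip(positions, positions[1:]):
            if cur - prev <= max_intervening + 1:
                run += 1
            else:
                if run >= 2:
                    n_segments += 1
                    n_genes += run
                run = 1
        if run >= 2:  # close the last run on this contig
            n_segments += 1
            n_genes += run
    return n_segments, n_genes

def sampling_p_value(all_genes, n_sampled, observed_segments,
                     n_replicates=5000, seed=0):
    """Fraction of random samples with at least the observed segment count."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_replicates):
        sample = rng.sample(all_genes, n_sampled)
        if count_segments(sample)[0] >= observed_segments:
            hits += 1
    return hits / n_replicates

# Toy genome: 100 hypothetical contigs with 90 genes each.
toy_genome = [(c, i) for c in range(100) for i in range(90)]
print(sampling_p_value(toy_genome, n_sampled=93, observed_segments=8,
                       n_replicates=500))
```

The same loop can track the second element returned by `count_segments` to test the number of genes contained in segments, which the Fig. 3c legend reports alongside the segment count.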

We then asked, what is the functional significance of HGT between Magnaporthales and
Colletotrichum? From the perspective of taxonomy, out of the 1453 highly conserved single-copy orthologous
genes that were identified across 22 Pezizomycotina lineages (see Methods), none were
implicated in HGT. This suggests that Magnaporthales-Colletotrichum HGTs have a limited impact on highly conserved genes and likely do not pose significant
challenges for the reconstruction of a fungal tree of life. From the perspective of
functional impacts, we examined several functional categories associated with the
plant pathogenic lifestyle, including carbohydrate-active enzymes (CAZymes) [27] involved in cell wall degradation, membrane transporters, and peptidases involved
in pathogenesis [28]. We found a 2.6-fold enrichment of CAZymes in the M. incrustans gene set derived from HGT (31.2 %; 29/93; regardless of direction and timing of HGT,
Fig. 3d) when compared to the 9154-gene background data (11.7 %; 1075/9154). This enrichment
was statistically significant (P = 1 × 10⁻⁸; χ² test) and was not explained by post-HGT duplication of CAZyme-encoding genes in Magnaporthales.
The 29 transferred CAZymes represent 27 independent HGT events with only two genes
having resulted from post-HGT gene duplication. Enrichment of CAZymes among genes
that were transferred between Magnaporthales and Colletotrichum (P = 0.052; 19.6 % (10/51) in HGTs versus 11.0 % (999/9047) in genome background; χ² test) was also observed when analyzing the O. dolichostomum genome data (Additional file 7). Weak or non-significant differences were, however, found in the distribution of transporter
and peptidase genes (Fig. 3d and Additional file 7).
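The enrichment figures above can be checked with a short calculation. The sketch below computes the fold enrichment and a Pearson χ² test on a 2 × 2 table using the reported counts. It assumes that the HGT set and the genome background are treated as two separate groups and that no continuity correction is applied; the source does not state either choice, so the exact P values may differ from those of the original analysis. The function names are hypothetical.

```python
import math

def fold_enrichment(hits_a, total_a, hits_b, total_b):
    """Ratio of the proportion in set A to the proportion in set B."""
    return (hits_a / total_a) / (hits_b / total_b)

def chi_square_2x2(hits_a, total_a, hits_b, total_b):
    """Pearson chi-square test (1 d.f., no continuity correction).

    Returns (chi-square statistic, P value).
    """
    a, b = hits_a, total_a - hits_a  # set A: in category / not in category
    c, d = hits_b, total_b - hits_b  # set B: in category / not in category
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Counts reported in the text for CAZymes.
print(fold_enrichment(29, 93, 1075, 9154))   # M. incrustans
print(chi_square_2x2(29, 93, 1075, 9154))    # M. incrustans
print(chi_square_2x2(10, 51, 999, 9047))     # O. dolichostomum
```

Under these assumptions the calculation should return P values of the same order as those reported for the two genomes.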

Given that DNA transfer and integration are largely independent of gene functions,
these results suggest that HGTs with cell wall degradation functions were selectively
retained (twice as likely as average) after insertion into host genomes. This function-driven
selection is likely linked to the plant pathogenic lifestyles found in both lineages.
The Magnaporthales-Colletotrichum HGT connection may therefore have been facilitated by a shared ecological niche and
host. HGT occurs commonly between species that are in close proximity or have physical
contact (e.g., [29]–[31]).

Alternative explanations for Magnaporthales-Colletotrichum gene sharing

We examined three potential issues that might weaken our case for the 93 HGTs between
M. incrustans and Colletotrichum (i.e., poor sampling and extensive gene loss among taxa, phylogenetic artifacts,
and random chance). Regarding the first issue, when the corresponding genes were absent
in all other Sordariomycetes lineages (e.g., Fig. 2a), an explanation based on poor sampling and extensive gene losses in closely
related lineages would require the complete absence or loss of the impacted genes
in all five Sordariomycetes lineages (Fig. 1b and Additional file 9: Figure S1) that were well-sampled in this study (Additional files 2 and 10). Assuming that the node uniting Magnaporthales and Colletotrichum is the Sordariomycetes common ancestor, a total of five gene losses are required
to explain all Magnaporthales-Colletotrichum HGTs (HGT type I, see Additional file 9: Figure S1 for details). However, careful examination of the HGT gene trees derived
from the M. incrustans genome data revealed a total of 33 independent HGT events [type II (4 genes), type
III (12 genes), and type IV (17 genes)] that require more than five gene losses when
vertical inheritance with gene loss is assumed (Additional file 9: Figures S2, S3 and S4). For HGT types II and III, the corresponding genes are present
in additional Sordariomycetes lineages and form a sister group relationship (≥85 %
UFboot support) to the Magnaporthales-Colletotrichum monophyletic clade (e.g., Verticillium in Fig. 3b). This leads to phylogenetic conflicts because Magnaporthales and Colletotrichum are separated by additional Sordariomycetes lineages in the species tree shown in
Fig. 1b (see Additional file 9: Figures S2 and S3 for details). To explain these phylogenetic conflicts, one ancient
gene duplication and 11 independent gene losses are required when assuming vertical
inheritance and gene loss, whereas only one gene transfer (type II) and an additional
gene loss (type III) are required when HGT is allowed (Additional file 9: Figures S2 and S3). We also identified HGT cases (type IV), in which Colletotrichum species are nested among Magnaporthales or vice versa (with ≥85 % UFboot support
at the relevant nodes, Fig. 3a and Additional file 9: Figure S4). The phylogenetic conflicts raised in these HGTs require a total of one
ancient gene duplication and 11 independent gene losses when assuming vertical inheritance
and gene loss, whereas only one gene transfer (Type IV, scenario b) and an additional
gene duplication (Type IV, scenario a) are required when HGT is allowed (see Additional
file 9: Figure S4 for details). Whereas we cannot definitively exclude the possibility of
vertical inheritance and gene loss as an explanation for each HGT candidate identified
in this study, a total of 33 HGT cases (corresponding to HGT types II–IV, explained
in Additional file 9) are highly unlikely to be explained by the vertical inheritance and gene loss scenario.
The topologies and supporting values of these high confidence HGTs (available in Additional
file 11) were confirmed via examination of gene trees generated from two-way phylogenomics
and from the HGT validation procedure (see Methods). A total of 15 independent HGTs
(types II–IV) were found in O. dolichostomum genome data (Additional file 11).

For the second issue, we applied a novel implementation of two-way phylogenomics and
an additional round of phylogenomic analysis to search for and validate HGTs. These
analyses involve different sequence sampling strategies (taxonomically dependent and
independent sampling, and BLASTp hits sorted by bit-score and by sequence identity)
and different tree building methods (FastTree and IQtree) (see Methods for details).
The Magnaporthales-Colletotrichum HGTs are therefore unlikely to be primarily explained by phylogenetic artifacts.
Regarding the third issue, it is possible that analysis of large genomic datasets
might lead to observations of HGT that are explained solely by chance. However, random
sampling of the Magnaporthales gene set (see Methods) is unlikely to generate as many
physical linkages as we report in the empirical data (Fig. 3c and Additional file 7). The enrichment of physical linkages among HGT candidates (0.1 % chance by random
sampling, Fig. 3c and Additional file 7) is therefore unlikely to be accounted for solely by chance due to the large amount
of genome data being analyzed. Likewise, the observed enrichment of CAZyme genes (P = 1 × 10⁻⁸ in M. incrustans data, Fig. 3d; and P = 5 × 10⁻² in O. dolichostomum data, Additional file 7) in our HGT data is unlikely to be explained by random chance.