Metagenomic analysis of the microbiota in the highly compartmented hindguts of six wood- or soil-feeding higher termites

We analyzed amplicon libraries and metagenomic libraries obtained for six species
of higher termites to compare the structure and functional potential of the intestinal
microbiota in the major gut compartments. iTag sequencing analysis revealed strong
differences in bacterial community structure already at the phylum level, both between
the individual gut compartments of each termite and among the homologous gut compartments
of termites with different feeding strategies (Fig. 1). Spirochaetes represented the majority of bacteria in the P3 compartment of wood
and litter feeders but comprised only a minor proportion in the humus and soil feeders,
which is in agreement with previous reports based on bacterial clone libraries obtained
from total guts of congeneric species [8, 16, 17]. The presence of Fibrobacteres and the TG3 phylum exclusively in the gut microbiota
of wood and litter feeders matches previous observations [17, 18] and the characteristic association of these lineages with wood fibers [19].

Fig. 1. Bacterial community structure in the major hindgut compartments (P1, P3, and P4) of
six termite species. The relative abundance of major bacterial phyla in the iTag analysis
is shown; detailed classification down to the genus level is shown in Additional file
3: Table S1. Termite host abbreviations: Nc Nasutitermes corniger, Mp Microcerotermes parvus, Co Cornitermes sp., Th Termes hospes, Nt Neocapritermes taracua, Cu Cubitermes ugandensis

The bacterial community of the P1 compartment of most termite species was dominated
by Firmicutes, which is in agreement with previous reports on the microbiota of this
sometimes highly alkaline hindgut compartment [6–8]; the high proportions of Spirochaetes and Actinobacteria in certain termite species
are exceptional but not unprecedented [6, 8]. The bacterial communities in the P4 were generally more diverse than in the other
compartments and displayed an increasing abundance of Bacteroidetes, which matches
previous observations with Nasutitermes and Cubitermes species [6, 7]. The detailed classification results for all taxonomic ranks down to the genus level
are shown in Additional file 3: Table S1.

Metagenomic sequencing of the major hindgut compartments (P1, P3, and P4) of the six
termite species yielded an average library size of 42 Gbp (range, 30–70 Gbp), with
90 % of the bases (range, 68–99 %) in the assembled fraction (Table 1). The large number of bacterial contigs longer than 100 kbp and the strong size reduction
of the assemblies to 1.4 Gbp (range, 0.6–2.1 Gbp) after dereplication indicate a relatively
low diversity of the respective communities. In a pilot experiment with N. corniger and Cubitermes ugandensis, we also obtained smaller libraries (3–5 Gbp) for the crop (foregut), midgut, and
P5 compartments, with only 50 % of the bases in the assembled fraction. Because assembly
sizes after dereplication were about tenfold smaller (0.1–0.4 Gbp) (Table 1), these datasets were not included in the following analyses.

A BLASTp analysis of the metagenomes allowed assignment of the majority of the protein-coding
genes to the three top-level domains; only 10–38 % of the gene copies remained unclassified
(Fig. 2). In most libraries, the majority of genes were of bacterial origin. Archaeal genes
represented only a small fraction of the gene copies in all libraries, with highest
proportions (up to 4 %) in the P4 compartment, which is in agreement with the low
abundance of archaeal rRNA in termite hindguts [20]. However, in most P1 compartments, bacterial genes were outnumbered by genes assigned
to eukaryotes. Notable exceptions are Cornitermes sp. and Cubitermes ugandensis, where the P1 is almost as large as the P3 [7, 21]. This agrees with our expectation that the proportion of host DNA will be larger
in smaller compartments (which have a higher surface-to-volume ratio and hence relatively
more host tissue) and the observation that the density of the gut microbiota is generally
lower in the P1 than in other compartments [6, 7]. Nevertheless, the remaining information on the bacterial and archaeal microbiota
is sufficient to draw conclusions about symbiont-mediated functions in each gut compartment.
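
The geometric argument behind this expectation can be made explicit with a deliberately idealized sketch. Assuming, purely for illustration, that a gut compartment is approximated as a sphere of radius r (real compartments are tubular or irregular; the symbols r, S, and V are introduced here and are not part of the original analysis), the surface area per unit volume is

```latex
\frac{S}{V} = \frac{4\pi r^{2}}{\tfrac{4}{3}\pi r^{3}} = \frac{3}{r}
```

Under this assumption, halving the radius doubles the wall area relative to the lumen volume, which is consistent with a relatively larger contribution of host tissue in smaller compartments.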

Fig. 2. Assignment of protein-coding genes in the metagenomic libraries to the three top-level
domains. Taxonomic assignment is based on BLASTp analysis (top hit; 30 % identity threshold).
The abundance of a gene in a library was estimated using the length and read depth
of the gene in the respective assembly (read depth of 1 for unassembled reads). Termite
host abbreviations: Nc Nasutitermes corniger, Mp Microcerotermes parvus, Co Cornitermes sp., Th Termes hospes, Nt Neocapritermes taracua, Cu Cubitermes ugandensis
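
The abundance weighting described in the caption of Fig. 2 can be illustrated with a minimal Python sketch. The exact formula is not given in the text; the sketch assumes that the weight of a gene is the product of its length and its read depth, with a depth of 1 for genes on unassembled reads. The function name, the record layout, and all numbers are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def domain_shares(genes, unassembled_depth=1.0):
    """Return the relative abundance of each domain in one library.

    genes: iterable of (domain, length_bp, read_depth) tuples; read_depth is
    None for genes found on unassembled reads.
    Assumption (not stated in the source): weight = length_bp * read_depth.
    """
    totals = defaultdict(float)
    for domain, length_bp, depth in genes:
        if depth is None:
            # Genes on unassembled reads are counted with a read depth of 1.
            depth = unassembled_depth
        totals[domain] += length_bp * depth
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {domain: weight / grand_total for domain, weight in totals.items()}


# Hypothetical gene records for a single library (made-up values).
genes = [
    ("Bacteria", 900, 10.0),
    ("Eukaryota", 1200, 5.0),
    ("Archaea", 600, None),      # unassembled read, depth defaults to 1
    ("Unclassified", 300, 2.0),
]

shares = domain_shares(genes)
for domain, share in sorted(shares.items()):
    print(f"{domain}: {share:.3f}")
```

Weighting by read depth keeps a gene on a deeply covered contig from counting the same as a gene seen only once, which is the purpose of the correction described in the caption.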

The differences in bacterial community structure among the gut compartments were reflected
in the relative abundances of COG functional categories in the respective libraries,
which indicated that the functional potential of the bacterial gut microbiota differs
between feeding groups (Fig. 3). The tight clustering of the P3 compartments of wood- and litter-feeding termites,
with the inclusion of the P1 compartment from the litter-feeding termite, indicates
that patterns in functional potential of the gut microbiota are correlated with the
feeding strategy of the host. In addition, the P1 compartments of the wood-feeding termites and
of the humus-feeding T. hospes clustered separately from the other gut compartments, which indicates similarities between
homologous gut compartments regardless of feeding strategy. Detailed results of the
COG analysis are shown in Additional file 3: Table S2.

Fig. 3. Similarity of the functional potential of the microbiota in different gut compartments.
The analysis is based on non-metric multidimensional scaling (NMDS) of Bray-Curtis
similarities using the relative abundances of genes in different functional categories
(COGs), weighted by gene length and read depth in the respective assembly (see Additional
file 3: Table S2). The shape of the data points differentiates wood and litter feeders (circle) from humus and soil feeders (square); numbers indicate gut compartments P1, P3, and P4
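
The distance calculation underlying an ordination of this kind can be sketched as follows. The Python example computes pairwise Bray-Curtis dissimilarities (one minus the similarity) from relative abundances of functional categories; the resulting matrix is what an NMDS routine would take as input. The NMDS step itself is not implemented here. Sample names, COG category letters, and abundance values are hypothetical and do not come from Additional file 3: Table S2.

```python
def to_relative(profile):
    """Convert weighted category abundances to relative abundances."""
    total = sum(profile.values())
    return {category: value / total for category, value in profile.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles (dicts)."""
    categories = set(a) | set(b)
    difference = sum(abs(a.get(c, 0.0) - b.get(c, 0.0)) for c in categories)
    total = sum(a.get(c, 0.0) + b.get(c, 0.0) for c in categories)
    return difference / total if total else 0.0


def dissimilarity_matrix(samples):
    """Return sorted sample names and the pairwise dissimilarity matrix."""
    names = sorted(samples)
    relative = {name: to_relative(samples[name]) for name in names}
    matrix = [
        [bray_curtis(relative[i], relative[j]) for j in names] for i in names
    ]
    return names, matrix


# Hypothetical weighted abundances of three COG categories in three libraries.
samples = {
    "Nc_P3": {"C": 40.0, "G": 40.0, "E": 20.0},
    "Mp_P3": {"C": 38.0, "G": 42.0, "E": 20.0},
    "Cu_P3": {"C": 20.0, "G": 10.0, "E": 70.0},
}

names, matrix = dissimilarity_matrix(samples)
for name, row in zip(names, matrix):
    print(name, [round(value, 3) for value in row])
```

In this made-up example, the two libraries with similar category profiles are separated by a much smaller dissimilarity than either is from the third, which is the kind of pattern an ordination plot renders as tight clustering.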

A comparison of the bacterial community structure determined by iTag analysis with
the phylogenetic classification of protein-coding genes in the metagenomes revealed
large discrepancies already at the phylum level (Additional file 3: Table S1). While Fibrobacteres and the TG3 phylum were highly abundant in bacterial
communities of wood- and litter-feeding termites, they were strongly underrepresented
(Fibrobacteres) or undetected (TG3 phylum) in the taxonomic assignments of the protein-coding
genes (exemplified in Additional file 2: Figure S1). This discrepancy is explained by the lack of appropriate reference genomes
in public databases. The only sequenced genome from Fibrobacteres, the rumen isolate
Fibrobacter succinogenes, is only distantly related to the Fibrobacteres detected in this study [22], and the draft genome of Chitinivibrio alkaliphilus, the first isolate of the TG3 phylum [23], was not included in public databases at the time of analysis. The high abundance
of genes assigned to Proteobacteria, which contrasts strongly with their low proportion
in the iTag datasets, is also likely caused by the bias introduced by incorrect assignment
due to the lack of reference genomes.