Ecology of bacteria in the human gastrointestinal tract—identification of keystone and foundation taxa

Mapping biotic interactions

Summary statistics of the data used for time series modelling are presented in Table 1. The analysis identified many genus level interactions (Fig. 1, Additional file 1: Figures S1–S7), with considerable variation among subjects (Table 2). Figures presented in the main text are of I1 while I2–4 are discussed in the text,
and the corresponding figures are presented in supplementary information. The distributions
of the number of data points used to compute the models for each subject are presented
in Additional file 1: Figure S8. No relationship was observed between the number of data points used to
compute a model and the chance of observing a significant relationship at the 99 %
confidence level (Additional file 1: Table S1). When reducing the number of data points used to compute the models, the
results were qualitatively very similar to the full models (Table 2), with correlations between regression coefficients at 0.9 or more when reducing
the amount of data by 50 % (Additional file 1: Figure S9). However, for the same models, the number of observed significant interactions
relative to the total number of potential interactions declined in a near linear fashion
as the amount of data was reduced (Additional file 1: Figure S10). In this case, slopes among individuals were relatively consistent with
the chance of discovering a significant interaction reduced by 0.6–0.9 % when reducing
the data amount by 10 %.

Table 1. Summary statistics of data analysed in this study

Fig. 1. Genus level bacterial interactions for I1. The heat map describes the strength and
direction (? i,j
in Eq. 1) of highly significant interactions. Dependent variables are along the y axis and independent variables along the x axis, i.e. if you follow the column of given genus upward from the x axis until you reach a coloured cell, that cell indicates the effect of the given
genus (dependent) on the genus indicated on the y axis (independent). The colour key on the right-hand side indicates the sign and magnitude of interactions that were
significant at the 99 % confidence level. Cells representing non-significant relationships
are black. Genus names are coloured according to phylum provenance: black Actinobacteria, red Bacteroidetes, green Firmicutes, blue Proteobacteria, light blue Tenericutes, and pink Verrucomicrobia. Note that according to the NCBI taxonomy database, the family Erysipelotrichaceae and the genus Holdemania are classified as Firmicutes. Here, and throughout, we have remained consistent with the Greengenes classification
of these taxa as Tenericutes

Table 2. Summary of time series modelling results

In line with general expectations 15], there was a predominance of negative interactions (Figs. 1 and 2; Table 2; Additional file 1: Figures S1–S7 and S13–S15). I1 and I3 had more positive interactions relative to
I2 and I4 (Table 2). We determined the prevalence of pairwise interactions of varying signs in order
to assess the frequencies of apparent cooperation (+/+), competition (?/?), exploitation
(+/?), commensalism (+/0), with zero being a non-significant interaction and amensalism
(?/0). Although there was considerable variability between individuals, pairwise interactions
were dominated by competitive and amensal relationships (Fig. 3, Additional file 1: Figure S11). We did not observe a single instance of exploitation in any of the
individuals. Overall, biotic interactions were more negative among members of a phylum
than between members of different phyla when intra-genus interactions were included
in the analysis (p??0.001 for I1–I4). When intra-genus interactions were excluded, the test remained
significant for I1, I3 and I4 (p??0.001) but was only marginally significant for I2 (p?=?0.06).

Fig. 2. Ranked genus level degree of connectedness in I1. The bar charts show the total number of significant negative and positive genus level interactions.
Dependent (acted upon) interactions are shown in a while independent (acting on) interactions are in b. Genus names are coloured according to phylum provenance: black Actinobacteria, red Bacteroidetes, green Firmicutes, blue Proteobacteria, light blue Tenericutes and pink Verrucomicrobia

Fig. 3. Prevalence of observed pairwise interaction categories. The categories are indicated
on the x axis: cooperation (+/+), competition (?/?), commensalism (+/0, with zero being a
non-significant interaction) and amensalism (?/0). The y axis indicates the number of significant interactions in the specified categories
(Additional file 1: Figure S11) relative to the total number of pairwise interactions identified in
an individual (i.e. the number of pairwise interactions in which at least one taxon
in a given pair interacts with the other)

Variability in ecological network connectedness

The biotic interactions described above were categorized as positive or negative (i.e.
the ?s in 1 were larger or smaller than zero) and whether they represent a taxa acting upon other
taxa (independent variable, right hand term in Eq. 1) or taxa being acted upon (dependent
variable, left hand term in Eq. 1). These categories are referred to as positive independent
(PI), positive dependent (PD), negative independent (NI) and negative dependent (ND).
The sum of all four categories thus represents the degree of connectedness of a taxon.

We observe a large variation in the degree of connectedness across the different genera
(Fig. 2, Additional file 1: Figures S13–S15). We examined whether the ratio between positive and negative interactions
in which a genus is involved is independent of the genus’ total degree of connectedness
and found that more connected genera had disproportionately large numbers of positive
interactions relative to less-connected genera in I1 and I3 (p??0.001 for both, linear model), but not I2 and I4 (p?=?0.845 and 0.974, respectively).

Bacteriodetes as a candidate foundation taxon

The degree of connectedness in an ecological network can be seen as a proxy for a
taxon’s importance in structuring the dynamics of the ecosystem, i.e. its influentialness.
Bacteroidetes is an abundant taxon (48.8 % mean relative abundance) that is highly connected to
the other ecosystem members, in particular through positive interactions (Fig. 2, Additional file 1: Figures S13–S15). When we normalize the degree of connectedness to the number of
genera in each phylum, Bacteroidetes emerge as the main contributors of positive interactions (Fig. 4, Additional file 1: Figure S16). This observation is supported by linear models testing for differences
in the mean number of positive interactions between the different phyla in three of
the four individuals (I1, p?=?0.025; I3, p??0.001; I4, p??0.001). For I2, even though Bacteroidetes had the highest mean number of positive interactions, the low total number of positive
interactions identified in this individual limits the statistical power to detect
significant differences. Thus, the Bacteroidetes fit the description of a foundation taxon sensu 6].

Fig. 4. Total and mean connectedness of six main phyla in I1-4. a–d show counts of positive and negative interactions significant at the 99 % confidence
level for I1-4, respectively. e–h show the mean connectedness of the phyla, i.e. the counts in a–d divided by the number of observed genera within the respective phyla in each individual.
Each bar combines interactions viewed both as dependent and independent. Dependent
and independent interactions can be seen separately in Additional file 1: Figure S16

Actinobacteria as a candidate keystone taxon

Actinobacteria constitute on average 1.8 % of the GI microbiotas of the four individuals, comprising
comparatively few genera. When the degree of connectedness is normalized to relative
abundances, it becomes apparent that this group is very influential in the community
even though it is relatively scarce (Fig. 5). This pattern is particularly striking in I1, I2 and I4. I3 has a much higher mean
abundance of Actinobacteria (6.1 %), making it less apparent. The central role of the Actinobacteria in the biotic interaction networks, due to large number of both negative and positive
interactions, suggests that it may constitute a keystone taxon. This contrasts with
the Bacteroidetes that are characterized both by high relative abundances and many positive interactions.

Fig. 5. Phylum level degree of connectedness divided by relative abundances. The bars show the number of positive dependent (PD), positive independent (PI), negative dependent
(ND) and negative independent (NI) interactions observed within each of the six main
phyla divided by the mean relative abundance of each phylum (horizontal bars below
the panels). The ratios have been converted to percentages. a–d show data for I1-4, respectively

Relative abundances of Actinobacteria are strongly linked to community beta diversity when viewed across individuals (R2
?=?0.99, p??0.001; Fig. 6). Proteobacteria and Tenericutes occur at abundances comparable to those of Actinobacteria, but no such relationship was observed for these groups. Within individuals, we also
observe that variation in diversity is linked with Actinobacterial abundances (p??0.001 for all four individuals; Additional file 1: Figure S17). These results further strengthen the position of Actinobacteria as a keystone taxon.

Fig. 6. Between-individual relationship between Actinobacteria relative abundance and total community diversity (Shannon index). On the log scale,
the relationship is linear (R2
?=?0.99, p??0.001). This relationship was not observed for Tenericutes or Proteobacteria

Concordance with interaction maps from a RE approach

The RE approach 12], 16], although fundamentally different from the time series analysis approach taken here,
produces results that are qualitatively similar, i.e., they estimate biotic interactions
that can be categorized in the same way as our results into PI, PD, NI and ND. Comparison
finds concordance between our results, and RE results in that more negative than positive
interactions were observed (Additional file 1: Figure S16 E, J) and that more competition was observed within than between groups
(p??0.001; Additional file 1: Figure S18). This follows the general expectation that stronger competition would
stem from a higher degree of metabolic overlap, which is linked with phylogenetic
relatedness.

We also used the complementary and competitive indices as proxies for ecological connectivity
(higher value?=?more connected), summing the indices of species within each of the
six main phyla for comparison with time series analysis results (Fig. 4, Additional file 1: Figure S16). The results from RE are similar to the results from time series analysis
in that Firmicutes and Bacteroidetes are the most highly connected groups (Additional file 1: Figure S16A–E). However, in contrast to time series analysis results, the six phyla
are roughly equal in terms of mean connectedness estimated from the RE approach (Additional
file 1: Figure S16 J).

Comparison of co-occurrence modelling and time series analysis

Co-occurrence modelling, sometimes used in cross-sectional community studies for inference
of biotic interactions, is based on the rationale that negatively and positively correlated
occurrence patterns arise from negative and positive interactions, respectively 17]. Here, we analysed co-occurrence of taxa within each of the individuals and compared
coefficients of contemporaneous correlation to regression coefficients from time series
analysis. In each individual, we found a negative correlation (?0.68, ?0.48, ?0.70
and ?025 for I1–I4, respectively; Spearman correlations) between co-occurrence and
time series modelling results. A more detailed analysis (generalized additive models)
showed these relationships to be highly significant and mostly non-linear (Additional
file 1: Figure S12).

Using the lens of enterotyping to view longitudinal data

In order to see if enterotypes are stable over time, we looked at temporal patterns
of relative abundances of Bacteriodes, Ruminococcus, and Prevotella as well as a species rich fourth group of Clostridiales belonging to the Lachnospiraceae family and the genus Blautia (Additional file 1: Figure S19). These are the main bacterial groups associated with distinct enterotypes
in previous publications 14], 18]. For I2, I3 and I4, there was no strong evidence of enterotype clustering (average
silhouette widths of 0.23, 0.18 and 0.29, respectively). In addition, for these individuals,
there was generally poor agreement between silhouette widths and Calinski-Harabasz
indices for determining optimal cluster numbers, further indicating a poor fit to
the data. For I1, however, there was support for two distinct clusters (Fig. 7a, b). This is in agreement with previously reported results 19].

Fig. 7. Enterotype switching in I1. a Average silhouette widths of different cluster numbers indicate that two clusters
provide a reasonable fit to the data. b PCoA plot of samples from I1 showing two distinct clusters. c Relative abundance of Prevotella in samples corresponding to enterotype 1 (red) and 2 (green). d Relative abundance of Bacteroides in samples corresponding to enterotype 1 (red) and 2 (green)

Enterotype switching was related to the Bacteroides/Prevotella balance (Fig. 7c, d). Enterotype 1, characterized by high Bacteroides abundances, was predominant (278 days; 84 %) with type 2 observed for 54 days (16 %).
There was some evidence of temporal clustering of enterotype 2 observations (p?=?0.015, Wald-Wolfowitz test; Additional file 1: Figure S20), but sporadic switching at single time points occurred throughout the
time series. Interestingly, in the case of I1, there appears to be a strong dynamic
coupling between Bacteroides and Prevotella (Fig. 1, Additional file 1: Figure S1) with Bacteroides exerting a strong positive influence on Prevotella and a less pronounced yet still significant positive reciprocal interaction. This
relationship was not observed in the three other individuals and may be related to
the switching phenotype observed for I1.