Genetic import and phenotype specific alleles associated with hyper-invasion in Campylobacter jejuni

Hyper-invasive strains are distributed across the C. jejuni phylogeny

All six isolates identified as hyper-invasive in our previous studies [15] (Table 1) were genome sequenced using the Illumina sequencing platform. The phylogenetic
position of the six isolates was then determined from a core genome alignment incorporating
all available C. jejuni reference genomes as well as the 128 genomes previously sequenced by Sheppard et al. [28]. The resulting phylogeny clearly showed that the hyper-invasive phenotype is not
a lineage-specific trait (Fig. 1). Three of the hyper-invasive strains (01/51, 01/35 and 01/41) are part of the host
generalist and widely distributed clonal complex 21 [29], many members of which have been assayed for invasion phenotype and shown not to be hyper-invasive
[15]. These three strains do not group together but are distributed across
the phylogeny. Additionally, two further hyper-invasive strains, 01/04 and EX114,
which belong to clonal complexes 677 and 682, respectively, are found at the opposite
end of the species phylogeny, supporting the hypothesis that hyper-invasion
is an acquired trait that can occur in any C. jejuni lineage.

Table 1. List of strains used for pan-genome analysis in our study

Fig. 1. Maximum likelihood phylogeny derived from the core-genome alignment of 131 C. jejuni isolates. Isolates with a known hyper-invasive phenotype have their taxa identifier
names highlighted in red. The three clades identified as containing hyper-invasive
strains have branches indicated in red

Evidence of genetic import from C. lari and C. doylei in the capsular polysaccharide region of hyper-invasive strains

Since our earlier study identified a role for LPS modification in invasion [17], and a previous DNA microarray study (data not shown) indicated that the capsular
polysaccharide (CPS) region of C. jejuni strain 01/51 was divergent from that in other campylobacters, this region was studied
in more detail in all six of the hyper-invasive genomes to investigate whether other
surface structures might play a role in this phenotype. The overall structure of this
region in the hyper-invasive strains is similar to other reported CPS loci with a
central variable region flanked by conserved genes associated with capsule transport
and assembly. The size of the locus varies amongst the hyper-invasive strains, with
strains 01/04 and 01/35 having the smallest locus of just 14,746 bp and 01/10 having
the largest locus of 35,655 bp, including the flanking transport and assembly regions.
The number of genes in the central variable region ranges from just eight in strains
01/04 and 01/35 to 22 in strain 01/10. Genes associated with O-methyl phosphoramidate (MeOPN) modification (cj1415c-1418c) are present in all of the hyper-invasive strains; however, the conserved heptose
biosynthesis genes (hddC, gmhA, hddA and dmhA) are present in only one of these strains, 01/10. All of the hyper-invasive strains
have a common set of six genes (cj1413c–cj1420c) adjacent to the kpsC gene orthologue (Fig. 2), which are absent from the previously characterised low invasive strain 81116. This
set of genes is also present in NCTC11168, which displayed high invasion levels in
our previous study [15].

Fig. 2. Comparison of the capsular polysaccharide (CPS) locus of the six hyper-invasive strains
(C. jejuni 01/51, 01/10, 01/35, 01/41, EX114 and 01/04) with that in the low invasive strain
C. jejuni 81116 and the hyper-virulent C. jejuni strain IA3902, isolated from aborted sheep. CPS loci from all strains were compared
using BLASTn and visualised using EasyFig. CDSs are colour-coded to indicate putative
gene function, with the conserved kps genes coloured grey. Grey scale indicates BLASTn similarity between CDSs

Comparison of the CPS locus across a wider set of Campylobacter genomes showed that the capsule locus of 01/51 was identical to that reported
in the reference genome of C. jejuni strain IA3902, a known hyper-virulent strain isolated from an aborted sheep
[30] (see Fig. 2). BLASTn analysis of capsule genes with no identity to other known C. jejuni genes in both 01/51 and 01/10 returned high-identity hits to CPS genes in C. jejuni subsp. doylei and C. lari (Fig. 3). This cross-species similarity was also observed within the capsular variable region
of IA3902, which, as mentioned above, is known to be hyper-virulent (Fig. 3). This similarity was not observed for the capsule locus in C. jejuni 81116, which is the only available low invasive reference strain (Fig. 3). Our data suggest that the phylogenetically disparate hyper-invasive strains share
similarities in import of non-C. jejuni genes into the capsule locus. We tested the ability of the hyper-invasive strains
to survive exposure to human serum in comparison to low invasive reference strains
and found no differences (data not shown).

Fig. 3. Comparison of the capsular polysaccharide (CPS) locus of two of the C. jejuni hyper-invasive strains (C. jejuni 01/51 and 01/10), C. jejuni 81116 and the hyper-virulent C. jejuni strain IA3902, isolated from aborted sheep, with the CPS locus from C. jejuni subsp. doylei 269.97 and C. lari RM2100. CPS loci from all strains were compared using BLASTn and visualised in EasyFig.
CDSs are colour-coded to indicate putative gene function. Grey scale indicates BLASTn
similarity between CDSs

Core genome analysis revealed a number of hyper-invasive unique genes with similarity
to genes from other Campylobacter species

Given our observation of genetic import at the capsule locus, and the lack of phylogenetic
clustering of hyper-invasive strains, we decided to further investigate any other
instances of shared gene content that may account for the observed hyper-invasive
phenotype. A pan-genome was constructed using LS-BSR from the six hyper-invasive
strains, C. jejuni 81116 and an additional six strains previously sequenced by our research group [31] that are known to be low-invasive. Group-specific genetic loci were then identified
using the accompanying compare-BSR.py script [32]. A number of loci were identified that were either unique to the hyper-invasive strains
or present in these strains as allelic variants of CDSs found in C. jejuni (Table 2). This finding was further confirmed by BLASTx comparison of each individual CDS
to the entire non-redundant nucleotide database. This analysis verified that many
of the hyper-invasion associated loci were either unique to the hyper-invasive strains
or only had orthologues in other species such as C. jejuni subsp. doylei, C. coli and C. lari (Table 2), indicating possible import of these loci, as observed in the capsule locus (Fig. 3). Many of these CDSs encoded hypothetical proteins or proteins involved in sugar and acyl transfer reactions
(Table 2).
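
The group-specific filtering step described above can be illustrated with a minimal Python sketch. It assumes the BLAST score ratio (BSR) matrix is held as a dictionary mapping each locus to per-genome BSR values, and it uses illustrative cut-offs of 0.8 (locus present) and 0.4 (locus absent). The function name, locus names, genome labels and thresholds are all hypothetical; they are not taken from the study and may not match the behaviour of the LS-BSR scripts.

```python
from typing import Dict, List


def group_specific_loci(
    bsr: Dict[str, Dict[str, float]],
    group: List[str],
    others: List[str],
    present: float = 0.8,  # assumed cut-off: locus counted as present
    absent: float = 0.4,   # assumed cut-off: locus counted as absent
) -> List[str]:
    """Return loci present in every genome of `group` and absent from all `others`."""
    hits = []
    for locus, scores in bsr.items():
        in_whole_group = all(scores[g] >= present for g in group)
        in_any_other = any(scores[g] >= absent for g in others)
        if in_whole_group and not in_any_other:
            hits.append(locus)
    return sorted(hits)


# Invented toy matrix: two "hyper-invasive" and two "low-invasive" genomes.
toy_bsr = {
    "locus_A": {"hyper_1": 0.98, "hyper_2": 0.95, "low_1": 0.10, "low_2": 0.05},
    "locus_B": {"hyper_1": 0.99, "hyper_2": 0.97, "low_1": 0.96, "low_2": 0.99},
    "locus_C": {"hyper_1": 0.90, "hyper_2": 0.30, "low_1": 0.00, "low_2": 0.00},
}

unique = group_specific_loci(toy_bsr, ["hyper_1", "hyper_2"], ["low_1", "low_2"])
print(unique)
```

In this toy matrix only locus_A passes: locus_B is also present in the comparison group, and locus_C is missing from one member of the target group. Loci with intermediate BSR values in the target group would correspond to the allelic variants mentioned in the text and would need a separate rule.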

Table 2. List of loci identified as associated with hyper-invasive C. jejuni isolates

Interestingly, there was sequence variability in some well-defined C. jejuni virulence genes, with alleles conserved in all hyper-invasive strains. Most striking
was a set of genes identified as a second cdt operon by BLASTx comparison. The cdtA, cdtB and cdtC genes showed 81 % similarity to BAJ52735, 81 % similarity to BAJ52756 and 83 % similarity to BAJ52757 from C. lari, respectively. The secondary cdt operon was present in addition to the classic cdt operon found in abundance across the species. A visual comparison of the cdt loci, equivalent to that performed for the capsule locus, was undertaken. The nucleotide
sequence similarity levels of 60–80 % suggest the secondary cdt operon is a paralog of the classical operon. However, the secondary cdt operon showed more similarity at a phylogenetic level to the classical C. jejuni cdt operon than to those of C. jejuni subsp. doylei and C. lari, and its genes were in the reverse orientation relative to the classical cdt operon (Additional file 1: Figure S1).
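
To make the similarity percentages concrete, the following Python sketch computes percent identity between two pre-aligned nucleotide sequences. It is a simplified stand-in for the identity values that BLAST reports, not a reimplementation of BLAST; the function name and the example sequences are invented for illustration.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity over aligned columns, skipping columns that are gaps in both."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1  # a gap opposite a base never reaches here as a match
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Invented example: one substitution and one gap across ten aligned columns.
identity = percent_identity("ATGCATGCAT", "ATGAATGC-T")
print(identity)
```

Here a gap opposite a base is counted as a mismatch, which is one of several common conventions; other tools exclude gapped columns from the denominator and will report slightly different values.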

We focussed on the flagella genes and the secondary cdt operon and performed phylogenetic analysis of these regions. For the flagella genes,
we extracted this locus from the core genome alignment and then determined a maximum
likelihood phylogeny (Fig. 4). The phylogenetic tree clearly showed that the flgD, flgE and flgL sequences are highly conserved in all six of the hyper-invasive strains and are divergent
from the majority of alleles found across the species. The hyper-invasive flagella
genes form a distinct secondary clade that contains nine other C. jejuni strains that have not been characterised for invasion capacity (Fig. 4). For the second cdt operon, we aligned the sequences from the hyper-invasive strains against all
cdtA alleles available on the Campylobacter BIGSdb website (http://pubmlst.org/campylobacter/) and created a maximum likelihood phylogeny (Fig. 5). This approach was taken because the cdt operon was not present in every genome included in Fig. 1, and to ensure as comprehensive an analysis as possible against the full diversity
of the cdt locus within C. jejuni. The phylogenetic separation of the cdt operon is even more apparent, with the six hyper-invasive secondary cdt operons forming a clade highly divergent from all other C. jejuni cdt alleles that have been sequenced (Fig. 5). BLAST analysis suggested the hyper-invasive cdt genes were similar to those found in C. jejuni subsp. doylei and C. lari.

Fig. 4. Maximum likelihood phylogeny of the concatenated sequences of flgD, flgE and flgL extracted from all 131 genomes used to create the core genome phylogeny. The hyper-invasive
isolates are highlighted in red

Fig. 5. Maximum likelihood phylogeny of all available alleles of cdtA from the Campylobacter BIGSdb database aligned with the cdtA gene identified in the secondary cdt operon of the hyper-invasive isolates. The hyper-invasive
strains are highlighted in red

To confirm that this allele sharing between the hyper-invasive strains was not a random
event, we determined the likelihood of any given gene allele being shared between
strains belonging to the three phylogenetic clades containing the hyper-invasive strains
(clades marked in red in Fig. 1). A pan-genome matrix of all strains belonging to the three clades was created in
LS-BSR and the number of alleles shared between these three clades was measured. A total
of 1551 genetic loci were compared and, of those, 53 shared an identical allele across
the three clades containing hyper-invasive isolates, giving a probability of random
allele sharing between these clades of 0.038. This provides strong statistical support
that the cdt and flagella gene alleles shared between the hyper-invasive strains are not random
and are associated with the shared phenotype of hyper-invasion.
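
The allele-sharing calculation can be sketched in Python as follows. The sketch assumes each locus is represented by one set of allele identifiers per clade and counts a locus as shared when at least one allele occurs in every clade; the data structure, function name and toy alleles are hypothetical and do not reproduce the LS-BSR matrix used in the study. Applied to the counts given in the text, the plain ratio 53/1551 is approximately 0.034, so the reported value of 0.038 may derive from a calculation that is not described in this section.

```python
from typing import Dict, List, Set, Tuple


def shared_allele_fraction(
    alleles_by_locus: Dict[str, List[Set[str]]],
) -> Tuple[int, int, float]:
    """Count loci for which at least one allele is found in every clade.

    `alleles_by_locus` maps a locus name to a non-empty list of allele-ID
    sets, one set per clade (an assumed, simplified representation).
    """
    shared = 0
    for clade_sets in alleles_by_locus.values():
        if set.intersection(*clade_sets):
            shared += 1
    total = len(alleles_by_locus)
    return shared, total, shared / total


# Invented toy data: four loci across three clades.
toy = {
    "locus_1": [{"a1"}, {"a1", "a2"}, {"a1"}],   # a1 occurs in all three clades
    "locus_2": [{"b1"}, {"b2"}, {"b1"}],         # no allele common to all
    "locus_3": [{"c1"}, {"c2"}, {"c3"}],         # all different
    "locus_4": [{"d1", "d2"}, {"d2"}, {"d3"}],   # shared by only two clades
}
n_shared, n_total, fraction = shared_allele_fraction(toy)
print(n_shared, n_total, fraction)

# Plain ratio from the counts stated in the text (53 of 1551 loci).
ratio_from_text = round(53 / 1551, 3)
print(ratio_from_text)
```

A single proportion of this kind describes how often alleles are shared by chance across the three clades; assessing several loci jointly would require an explicit model of how independent those loci are.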