High quality draft genome of Lactobacillus kunkeei EFB6, isolated from a German European foulbrood outbreak of honeybees


The genome statistics are provided in Table 3. The high quality draft genome sequence consists of 55 contigs that account for a
total of 1,566,851 bp and have a G + C content of 37 mol%. Of the 1,455 predicted genes,
1,417 were putatively protein-encoding, 35 represented putative tRNA genes and three
putative rRNA genes. For the majority of the protein-encoding genes (75%) a function
could be assigned. The distribution of these genes into COG functional categories
[36] is shown in Table 4.
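The gene counts above can be cross-checked with a few lines of arithmetic. The following Python sketch uses only the numbers stated in the text; the variable names are illustrative, and the count of genes with an assigned function is an estimate derived from the rounded 75% figure, not a value reported in this study.

```python
# Gene counts as stated in the text above.
total_genes = 1455
protein_coding = 1417
trna_genes = 35
rrna_genes = 3

# The three gene classes should account for every predicted gene.
assert protein_coding + trna_genes + rrna_genes == total_genes

# Estimate only: 75% is a rounded figure, so the true count may differ slightly.
approx_with_function = round(0.75 * protein_coding)

# Share of predicted genes that are protein-encoding.
coding_fraction = protein_coding / total_genes

print(f"approx. genes with assigned function: {approx_with_function}")
print(f"protein-encoding share of all genes: {coding_fraction:.1%}")
```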

Table 3. Genome statistics

Table 4. Number of genes associated with the general COG functional categories

Table 5. Primers used in this study

Insights into the genome

Five different Lactobacillus species were used for genome comparisons with L. kunkeei EFB6 based on blastp [37]. Results are shown in Figure 3. All five species are of interest as probiotics, inhabit the gastrointestinal tract
of animals or humans, or are used in the production of fermented food.

Figure 3. L. kunkeei EFB6 artificial circular chromosome map. Comparisons (blastp) of L. kunkeei EFB6 chromosome to Lactobacillus acidophilus 30SC (NC_015213), Lactobacillus plantarum 16 (NC_021514), Lactobacillus brevis ATCC 367 (NC_008497), Lactobacillus johnsonii NCC 533 (NC_005362), and Lactobacillus rhamnosus ATCC 8530 (NC_017491), using the BRIG software [38] are shown in black, purple, red brown, cyan, blue and green, respectively. Gene regions
used for detailed analyses are depicted in an outer circle and marked in red.

Orthologous proteins were identified with the program Proteinortho
5.04 [39], using the protein content deduced from 232 lactobacilli genomes as references
(GenBank database as of 28.02.2014). For this purpose ncbi_ftp_download v0.2, cat_seq
v0.1 and cds_extractor v0.6 were used [40]. With an identity cutoff of 50%, we identified 425 proteins in L. kunkeei EFB6 without orthologs in any other Lactobacillus species. Among these unique L. kunkeei EFB6 proteins, we selected seven proteins for detailed analyses.
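The effect of an identity cutoff on such a search can be illustrated with a small Python sketch. It is a deliberate simplification and not the Proteinortho procedure, which relies on reciprocal best hits and further criteria that are not modeled here. The sketch assumes BLAST tabular output (outfmt 6, with query ID, subject ID and percent identity in the first three columns); the function name, the protein IDs and the example rows are hypothetical.

```python
def proteins_without_hits(query_ids, blast_rows, min_identity=50.0):
    """Return the query IDs that have no hit at or above min_identity.

    blast_rows: iterable of tab-separated lines in BLAST outfmt 6
    (qseqid, sseqid, pident, ...). Only the first three columns are read.
    """
    matched = set()
    for row in blast_rows:
        fields = row.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        query, identity = fields[0], float(fields[2])
        if identity >= min_identity:
            matched.add(query)
    # Preserve the input order of the queries.
    return [q for q in query_ids if q not in matched]


# Made-up example rows (hypothetical IDs, not data from this study).
example_rows = [
    "prot_A\tref_1\t82.5\t300\t52\t1\t1\t300\t1\t300\t1e-150\t520",
    "prot_B\tref_2\t41.0\t250\t147\t3\t5\t254\t2\t250\t1e-40\t160",
    "prot_B\tref_3\t49.9\t100\t50\t0\t1\t100\t1\t100\t1e-20\t90",
]
unique = proteins_without_hits(["prot_A", "prot_B", "prot_C"], example_rows)
print(unique)
```

In this toy input, prot_A has a hit above the cutoff, prot_B only has hits below 50% identity, and prot_C has no hits at all, so the last two are reported as lacking a match.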

Analysis of the 89-kb region shown in Figure 3 revealed five ORFs (LAKU_4c00030-LAKU_4c00070) without orthologs in any genome derived
from lactobacilli deposited in GenBank (as of 28.02.2014). Furthermore, no homologs
could be identified in any other sequenced microbial genome (NCBI nr-database as of
05.03.2014) by using blastp (e-value cutoff of 1e-20). Except for LAKU_4c00060 (7,521
amino acids), we could identify an N-terminal signal peptide and a non-cytoplasmic
domain (Figure 4A) using the Phobius domain prediction software [41]. LAKU_4c00040 (4,579 amino acids) and LAKU_4c00070 (3,129 amino acids) contain coiled-coil
structures. Except for LAKU_4c00050 (8,342 amino acids), all ORFs show weak similarity
to large surface proteins or extracellular matrix-binding proteins found in bacteria
such as Staphylococcus, Streptococcus, Burkholderia, Weissella, Mannheimia, and Marinomonas, but also in Lactobacillus and Pediococcus. Since L. kunkeei EFB6 is the first sequenced genome harboring this cluster, we designed specific
primer pairs for detection of each ORF in other Lactobacillus strains by PCR (Table 5). As shown in Figure 4B, all five ORFs were present in other L. kunkeei strains isolated from honey and wine. On the basis of domain prediction and IMG’s
bidirectional best hits [32], we assume that this gene cluster encodes cell surface or secreted proteins involved
in cell adhesion or biofilm formation.

Figure 4. Domain prediction (A) of the 89-kb region found in L. kunkeei EFB6 and its presence in other lactobacilli (B). A combined transmembrane topology and signal peptide predictor [41] was used to determine putative domains. The yellow blocks represent signal peptides;
the white color of the arrows shows the non-cytoplasmic part. Red blocks represent
transmembrane regions and blue blocks predicted coiled-coil structures. To test whether
this region exists in other L. kunkeei strains, we designed specific primer pairs for each ORF (Table 5, Figure 4A). Predicted PCR product sizes are depicted in white boxes. The presence of the genes
was tested for L. kunkeei EFB6, L. kunkeei HI3 (isolated from honey), L. kunkeei DSM 12361 (isolated from wine), and L. johnsonii DSM 10533 (isolated from human blood) (Figure 4B). The obtained PCR product sizes matched the predicted sizes (Table 5, Figure 4A). For L. johnsonii DSM 10533, no PCR product could be obtained.

During genome comparison, we identified two additional proteins (LAKU_24c00010 and
LAKU_24c00050) without a homolog in any of the publicly available genome sequences.
These proteins show only weak sequence similarity to known proteins and might be involved
in cellular adhesion. LAKU_24c00010 contains a signal peptide, transmembrane helices
and 29 DUF1542 domains, which are typically found in cell surface proteins. In Staphylococcus aureus, it has been shown that some DUF1542-containing proteins are involved in cellular
adhesion and antibiotic resistance [42]. LAKU_24c00010 showed the highest sequence identities to the matrix-binding protein
(WP_010490864) of “Lactobacillus zeae” KCTC 3804 (40%) [43] and the extracellular matrix binding protein (YP_005866289) of Lactobacillus rhamnosus ATCC 53103 (36%) (Figure 5).

Figure 5. Tblastx comparison of L. kunkeei ORF LAKU_24c00010 to matrix binding proteins of L. rhamnosus ATCC 53103 and “L. zeae” KCTC 3804. The graphical presentation was done with Easyfig software (minimum blast hit length
of 200 bp and a maximum e-value of 1e-100) [44]. LAKU_24c00010 shows similarities to WP_010490869, WP_010490864 and WP_010490862
of “L. zeae” KCTC 3804, but also to YP_005866289 (L. rhamnosus ATCC 53103). The ORFs used for comparison are labeled with NCBI accession numbers.
The blast identity is shown in a colored scale ranging from 31% (yellow) to 100%
(red).

Additionally, LAKU_24c00050 contains N-terminal transmembrane helices, two mucin-binding
protein domains as well as a C-terminal Gram-positive anchoring domain. This domain
combination is typical of bacterial surface proteins. LAKU_24c00050
showed similarity to the Mlp protein (WP_004239242) of Streptococcus mitis and other mucus-binding proteins (Figure 6). Because lactobacilli colonize mucosal surfaces, they have been
investigated as potential recombinant mucosal vaccines [45].

Figure 6. Tblastx comparison of MucBP domain-containing proteins. Comparison of MucBP domain-containing proteins was performed using the program Easyfig
(minimum blast hit length of 50 bp and maximum e-value of 1e-10) [44]. LAKU_24c00050 shows similarity to ORFs of Streptococcus mitis NCTC 12261 (NCBI accession numbers inside arrows, which represent ORFs used for comparison).
Additionally, LAKU_24c00050 shows similarity to WP_003144513 of Gemella haemolysans ATCC 10379 and CCC15643 of Lactobacillus pentosus IG1 [46]. The blast identity is shown in a colored scale ranging from 20% (yellow) to 100%
(red).

In the genome of L. kunkeei EFB6, we identified genes encoding all proteins of the general secretory (Sec) pathway
and putative polysaccharide biosynthesis proteins, which may participate in capsule
or S-layer formation. Recently, Butler et al. (2013) [47] detected a lysozyme produced by L. kunkeei Fhon2N and suggested a bacteriolysin or class III bacteriocin function. In L. kunkeei EFB6, we identified four genes belonging to the glycoside hydrolase family 25. Enzymes
of this family are known to possess lysozyme activity. Two of the deduced proteins
(LAKU_13c00160 and LAKU_32c00010) contain a signal peptide, indicating secretion of
the proteins. LAKU_19c00290 harbors transmembrane helices and is probably anchored
in the cell wall. LAKU_6c00080 contains neither a putative signal peptide nor transmembrane
helices.

Rapid test PCR

Specific primer pairs were designed to test other strains by PCR for the presence
of the 89-kb region, which harbors five open reading frames (ORFs). Genomic DNA of
the L. kunkeei strains EFB6, HI3 and DSM 12361, and Lactobacillus johnsonii DSM 10533 was used as template for PCR amplifications employing the thermal cycler
peqSTAR 2X (PEQLAB Biotechnologie GmbH, Erlangen, Germany). PCR amplification was
performed with the BIO-X-ACT™ Short DNA Polymerase (Bioline, Luckenwalde, Germany)
and an initial denaturation step at 98°C for 2 min, followed by 30 cycles of denaturation
for 20 s at 96°C, annealing for 20 s at 60°C and elongation for 30 s at 68°C. Subsequently,
a final elongation step of 10 min at 68°C was performed. PCR products were purified
employing the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany).
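
The cycling program described above can be written down as a small data structure, which makes it easy to check the total programmed hold time. The following Python sketch uses only the temperatures, durations and cycle number given in the text; the class and function names are illustrative, and ramp times between temperatures are ignored because they depend on the thermal cycler and are not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int   # temperature in degrees Celsius
    seconds: int  # hold time in seconds


# Program as described in the text.
initial = Step("initial denaturation", 98, 120)
cycle = [
    Step("denaturation", 96, 20),
    Step("annealing", 60, 20),
    Step("elongation", 68, 30),
]
final = Step("final elongation", 68, 600)
n_cycles = 30


def hold_time_seconds(initial, cycle, n_cycles, final):
    """Sum of programmed hold times; ramping between steps is not included."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


total = hold_time_seconds(initial, cycle, n_cycles, final)
print(f"programmed hold time: {total // 60} min {total % 60} s")
```

With these values the programmed hold time is 2,820 s (47 min); the actual run takes longer once ramping is included.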