High quality draft genome sequence and analysis of Pontibacter roseus type strain SRC-1T (DSM 17521T) isolated from muddy waters of a drainage system in Chandigarh, India

Genome project history

This organism was selected for sequencing on the basis of its phylogenetic position [25,26]. It is a part of the Genomic Encyclopedia of Type Strains, KMG-I project [27], a follow-up of the GEBA project [28], which aims to increase the sequencing coverage of key reference microbial genomes and to generate a large genomic basis for the discovery of genes encoding novel enzymes [29]. KMG-I is a Genomic Standards Consortium project [30]. The genome project is deposited in the Genomes OnLine Database [21], the annotated genome is publicly available from the IMG Database [31] under the accession 2515154084, and the permanent draft genome sequence has been deposited at GenBank under accession number ARDO00000000. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI) using state-of-the-art technology [32]. The project information is briefly summarized in Table 2.

Table 2. Project information

Growth conditions and DNA isolation

Pontibacter roseus DSM 17521T was grown aerobically in DSMZ medium 948 (Oxoid nutrient broth) [33] at 30°C. Genomic DNA was isolated using a Jetflex Genomic DNA Purification Kit (GENOMED 600100) following the standard protocol provided by the manufacturer, with the following modifications: an additional incubation (60 min, 37°C) with 50 µl proteinase K, followed by the addition of 200 µl protein precipitation buffer (PPT). DNA is available through the DNA Bank Network [34].

Genome sequencing and assembly

The draft genome of Pontibacter roseus DSM 17521T was generated at the DOE-JGI using Illumina technology [35]. An Illumina Std shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform, which generated 12,071,874 reads totaling 1,810.8 Mbp. All general aspects of library construction and sequencing performed at the JGI are publicly available [36]. All raw Illumina sequence data was passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts.
The following steps were then performed for assembly: (1) filtered Illumina reads were assembled using Velvet (version 1.1.04) [37], (2) 1–3 Kbp simulated paired-end reads were created from Velvet contigs using wgsim [38], (3) Illumina reads were assembled with simulated read pairs using Allpaths-LG (version r41043) [39]. Parameters for the assembly steps were: (1) Velvet (velveth: 63 -shortPaired and velvetg: -very_clean yes -exportFiltered yes -min_contig_lgth 500 -scaffolding no -cov_cutoff 10), (2) wgsim (-e 0 -1 100 -2 100 -r 0 -R 0 -X 0), (3) Allpaths-LG (PrepareAllpathsInputs: PHRED_64 = 1 PLOIDY = 1 FRAG_COVERAGE = 125 JUMP_COVERAGE = 25 LONG_JUMP_COV = 50, RunAllpathsLG: THREADS = 8 RUN = std_shredpairs TARGETS = standard VAPI_WARN_ONLY = True OVERWRITE = True). The final draft assembly contained 12 scaffolds. The total size of the genome is 4.6 Mbp and the final assembly is based on 562.0 Mbp of Illumina data, which provides an average 122.8× coverage of the genome. Additional information about the organism and its genome sequence and their associated MIGS record is provided in Additional file 1.
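The read and coverage figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that average coverage is the number of bases used for assembly divided by the genome size. The function and constant names are illustrative and are not part of any JGI tool; all numeric inputs are taken from the text.

```python
# Figures as reported in the text; none are measured independently here.
READS = 12_071_874            # reads generated on the HiSeq 2000
RAW_YIELD_MBP = 1_810.8       # total raw sequence, in Mbp
ASSEMBLY_INPUT_MBP = 562.0    # Illumina data used in the final assembly, in Mbp
GENOME_SIZE_MBP = 4.6         # genome size as rounded in the text, in Mbp
REPORTED_COVERAGE = 122.8     # average fold coverage as reported


def mean_read_length_bp(total_mbp: float, n_reads: int) -> float:
    """Average read length in bp implied by total yield and read count."""
    return total_mbp * 1e6 / n_reads


def fold_coverage(input_mbp: float, genome_mbp: float) -> float:
    """Average coverage, assumed here to be input bases / genome size."""
    return input_mbp / genome_mbp


def implied_genome_size_mbp(input_mbp: float, coverage: float) -> float:
    """Genome size in Mbp implied by the input bases and a stated coverage."""
    return input_mbp / coverage


if __name__ == "__main__":
    print(f"mean read length: {mean_read_length_bp(RAW_YIELD_MBP, READS):.1f} bp")
    print(f"coverage from rounded size: {fold_coverage(ASSEMBLY_INPUT_MBP, GENOME_SIZE_MBP):.1f}x")
    print(f"implied genome size: {implied_genome_size_mbp(ASSEMBLY_INPUT_MBP, REPORTED_COVERAGE):.2f} Mbp")
```

Under these assumptions the read count and yield imply reads of about 150 bp. Dividing 562.0 Mbp by the rounded 4.6 Mbp gives slightly less than the reported 122.8×, and the reported coverage in turn implies a genome size just under 4.6 Mbp, so the small gap is consistent with rounding of the genome size.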

Additional file 1. Associated MIGS record.


Genome annotation

Genes were identified using Prodigal [40] as part of the JGI genome annotation pipeline [41], followed by a round of manual curation using the JGI GenePRIMP pipeline [42]. The predicted CDSs were translated and used to search the NCBI nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE [43], RNAmmer [44], Rfam [45], TMHMM [46], SignalP [47] and CRT [48]. Additional gene functional annotation and comparative analysis were performed within the IMG platform [49].