Comparative transcriptome sequencing and de novo analysis of Vaccinium corymbosum during fruit and color development

Blueberry (Vaccinium corymbosum) is an economically important small fruit crop and a member of the Ericaceae family, which includes blueberry, cranberry (V. macrocarpon), lingonberry (V. vitis-idaea), rhododendron, and more than 400 other species [1, 2]. Three major types of blueberry are harvested commercially: lowbush (Vaccinium angustifolium), highbush (V. corymbosum), and rabbiteye blueberry (V. ashei or V. virgatum). Although most blueberry species originated in North America, many are now widely grown in Asia, Europe, South America, Africa, Australia and New Zealand, owing in part to their high in vitro antioxidant capacity [3, 4]. Blueberry is becoming a major crop in China, where it is cultivated widely from temperate to subtropical regions. There are currently three major areas of blueberry cultivation in China: Jilin and Liaoning provinces, Shandong province, and the Yangtze River area [5].

Worldwide demand for and consumption of blueberry have increased greatly in recent years because of its beneficial effects on human health. These positive effects are generally attributed to high levels of flavonoids [6], which have been linked to improved night vision, prevention of macular degeneration, and reduced heart disease [7, 8]. Therefore, it is crucial to elucidate the molecular mechanisms that trigger the biosynthesis and accumulation of anthocyanin metabolites during fruit and color development. The blueberry genome is large (600 Mb/haploid genome), and genomic information is limited compared with that of plants such as grape [9], which has constrained molecular studies of blueberry. Over the past decades, most attention has been focused on plant cold resistance, cultivation, and effects on human health [10, 11]. Because massive amounts of information can be obtained from genome-scale expression data, RNA sequencing has become a powerful technology for profiling the transcriptome [12]. To date, RNA sequencing has been reported in bilberry (Vaccinium myrtillus) and cranberry [13, 14]. Recently, transcriptome sequences of blueberry were analyzed during cold acclimation and at different developmental stages of fruit by EST sequencing or RNA sequencing [2, 15]. So far, transcriptome sequences have been generated using next-generation sequencing from northern highbush [2], half-high [16], and southern highbush blueberry [17]. However, information is still limited regarding the control of horticultural traits, such as the molecular mechanisms regulating blueberry maturation and flavonoid metabolism.

To gain new insights into these molecular mechanisms at the transcriptome level, we performed transcriptome sequencing and gene expression profiling of the northern highbush blueberry variety ‘Sierra’ across berry development using Illumina sequencing. More than 13.67 Gbp of data were generated and assembled into 186,962 transcripts and 80,836 unigenes. Large numbers of simple sequence repeats (SSRs) and candidate genes potentially involved in plant growth, metabolic pathways, and hormone pathways were identified. In addition, the RNA-Seq expression profiles and functional annotations have been made publicly available (accession number: SRR2910056). We believe that this study provides a new and more powerful resource for interpreting high-throughput gene expression data for blueberry species.