Identification, characterization, and gene expression analysis of nucleotide binding site (NB)-type resistance gene homologues in switchgrass

The identification and validation of plant disease resistance genes is a major focus in molecular investigations of plant-pathogen interactions. While other studies have aimed to understand the molecular mechanisms controlling switchgrass resistance to switchgrass rust [13, 30], none of them has mined the currently available draft switchgrass genome (v 1.1) for potential NB-LRR resistance gene homologs (RGHs). In this research, a homology-based computational approach was used to identify 1011 unique RGHs in the switchgrass genome. Approximately 266 % more RGHs were identified than in a similar study that detected switchgrass RGHs from EST sequences and PCR-based cloning [30]. The total number of RGHs in switchgrass may change with further refinement of the switchgrass genome, although the percentage of RGHs identified in this study is similar to that reported for rice [58].

Structural analysis of the 1011 switchgrass RGHs provided useful insights into their putative molecular functions. We identified several of the major features expected in plant NB-LRRs [20, 54] in switchgrass. Additionally, the large number of putative RGHs indicates that there is a substantial repertoire of disease resistance genes in the switchgrass genome and that a robust immunity potential is present in the event of pathogen invasion. However, further studies are needed to validate the sub-cellular localization and functions of these proteins.

The NB domain of plant R proteins has been shown to act like a molecular switch and to function in signal transduction pathways. In analyzing the 1011 switchgrass RGHs, 600 were found to contain a full-length NB domain, while the remaining sequences were missing one or more highly conserved motifs (P-loop/Walker A, Kinase 2, RNBS, and GLPL) [15]. This could be explained by the assembly of the switchgrass genome (v 1.1), which consists of 18 pseudomolecules and several thousand additional contigs ranging in size from 1000 to 88,021 bp [31]. Incomplete NB-LRR gene sequences in switchgrass could be a result of incomplete duplications or transversions, or of incomplete assembly or annotation, or they may actually be pseudogenes. Some NB-LRR pseudogenes have been shown to code for non-functional or truncated protein products [59]. Interestingly, the evolution of a pseudogene at the Pid3 gene locus has been found to promote disease in susceptible rice cultivars [60].
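The full-length versus partial distinction described above can be illustrated with a minimal sketch. It assumes that motif detection has already been done by some upstream tool and that each gene is represented by a list of detected motif names. The motif labels, the function names, and the toy gene identifiers are hypothetical and are not taken from the pipeline used in this study.

```python
# Motif names follow the four conserved NB motifs listed in the text;
# the exact label strings are an assumption made for this illustration.
REQUIRED_NB_MOTIFS = ("P-loop", "Kinase-2", "RNBS", "GLPL")


def classify_nb_domain(detected_motifs):
    """Return (status, missing), where status is 'full-length' only if
    all four conserved motifs were detected, and 'partial' otherwise."""
    found = set(detected_motifs)
    missing = [m for m in REQUIRED_NB_MOTIFS if m not in found]
    status = "full-length" if not missing else "partial"
    return status, missing


def summarize(motif_hits):
    """Count full-length and partial NB domains across a gene set."""
    counts = {"full-length": 0, "partial": 0}
    for motifs in motif_hits.values():
        status, _ = classify_nb_domain(motifs)
        counts[status] += 1
    return counts


# Toy input with made-up gene identifiers (not real switchgrass loci).
example_hits = {
    "gene_A": ["P-loop", "Kinase-2", "RNBS", "GLPL"],
    "gene_B": ["P-loop", "GLPL"],
    "gene_C": [],
}

if __name__ == "__main__":
    print(summarize(example_hits))
```

A partial call from such a check does not by itself distinguish a true pseudogene from an assembly or annotation gap; it only flags the sequence for closer inspection.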

Sequence duplication and divergence are also prominent in NB-LRR genes. Phylogenetic analysis of the switchgrass and Brachypodium NB-LRRs found that the majority of the Brachypodium NB-LRRs have been conserved in switchgrass. However, several switchgrass RGHs, including ten predicted to contain a protein kinase domain, appear to have emerged after the two species diverged.

Unique domains other than the NB-LRRs identified in the switchgrass RGHs have also been identified in other species, and could play roles similar to those reported recently in Arabidopsis. In Arabidopsis, the RRS1-R gene encodes an NB-LRR protein that contains a WRKY domain that acts as a decoy to intercept effector molecules secreted by Pseudomonas syringae pv. pisi and Ralstonia solanacearum [61].

Two switchgrass RGHs, Pavir.Fa01782 and Pavir.Hb00484, were found to have a C-terminal domain homologous to thioredoxin proteins. One switchgrass RGH, Pavir.Bb01048, was predicted to have a C-terminal glutaredoxin domain. Thioredoxin and glutaredoxin proteins participate in oxidation/reduction reactions and have been associated with increased disease resistance in tobacco [62, 63] and increased disease susceptibility in Arabidopsis [64]. Therefore, an NB-LRR containing a thioredoxin domain may function in disease resistance by reducing pathogen-induced oxidative stresses. Since glutaredoxin has been shown to promote disease susceptibility [64], a glutaredoxin domain in an NB-LRR disease resistance gene may function as a decoy, similar to the mechanism described by Sarris et al. [61].

Five switchgrass RGHs were predicted to contain unique domains that are involved in DNA binding. One switchgrass gene, Pavir.J03445, was found to contain a WD40 domain at the C-terminus. Plant genes that contain WD40 domains have been shown to be differentially regulated during pathogen infection, suggesting that these genes may be important regulators of defense-related responses [65, 66]. Another switchgrass gene, Pavir.J20878, which contains an N-terminal B3 DNA-binding domain, was also found to contain a C-terminal WRKY DNA-binding domain. This further supports a role for Pavir.J20878 in DNA binding and transcriptional regulation.

Aside from DNA binding, several other smaller functional categories were identified. One switchgrass RGH, Pavir.Hb01174, is predicted to function in sugar binding. This particular RGH contains a C-terminal jacalin-like lectin domain. Jacalin-like lectin domains bind carbohydrates, mainly mannose and galactose, and have been shown to play an important role in disease resistance [67]. For example, the RTM1 gene of Arabidopsis encodes a protein that contains a jacalin-like lectin domain, and this protein is critical for inhibiting long-distance movement of the tobacco etch virus [68]. Additionally, three switchgrass RGHs, Pavir.Ib02384, Pavir.Ib02433, and Pavir.J18369, are predicted to function in protein trafficking and vesicle movement, a function that has not yet been demonstrated for plant disease resistance genes. These proteins contain a domain at their N-termini that shows strong homology to a major sperm protein of nematodes. A previous report identified a similar domain in the N-terminal region of the VAP27 protein of tomato, which has been shown to interact with the Cf9 resistance protein; however, no direct role for VAP27 in disease response has been established [69]. To our knowledge, this is the first report of a major sperm protein domain attached to the N-terminus of an NB-containing resistance gene.

Several other domains found in the switchgrass RGHs have been linked to disease response in other plants. These include calmodulin (Pavir.Hb00190) [70, 71], WD domain (Pavir.J03445) [72, 73], and transposable elements (Pavir.Fb0150) [74, 75]. This highlights the diversity and potential for gene diversification of RGHs encoded by the switchgrass genome.

The switchgrass cultivars ‘Alamo’ and ‘Dacotah’ exhibit significantly different disease responses after exposure to switchgrass rust (Puccinia emaculata). ‘Alamo’ is relatively resistant to the rust pathogen whereas ‘Dacotah’ is highly susceptible [9]. Polymorphisms within RGHs may contribute to the difference in disease resistance phenotype that is observed between ‘Alamo’ and ‘Dacotah’. In our study, we identified 2634 variants between ‘Alamo’ and ‘Dacotah’, including SNPs, MNPs, indels, and replacements. Approximately 89 % of the variants detected were SNPs. The predominance of SNPs could be attributed to the fact that single nucleotide changes in the coding regions of genes are less likely than insertions or deletions to disrupt the reading frame, a disruption that often introduces premature stop codons. SNPs could also explain the disease response phenotypes of ‘Alamo’ and ‘Dacotah’, as they could alter the amino acid sequence of resistance genes and potentially disrupt gene function. Since these SNPs are associated with defense-related genes, they could be further developed into molecular markers for use in breeding for disease resistance.
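As an illustration of how variants might be binned into the four classes named above, the following sketch classifies a variant from its reference and alternate alleles and reports the share of SNPs. The length-based rules and the VCF-style alleles (with a shared leading anchor base for indels) are assumptions made for this example; the variant caller used in this study may define the classes differently.

```python
from collections import Counter


def classify_variant(ref, alt):
    """Classify a variant from its reference and alternate alleles.

    Assumed rules (illustrative only):
      SNP          one base replaced by one base
      MNP          equal-length alleles longer than one base
      indel        the shorter allele is a prefix of the longer one
      replacement  any other change of unequal length
    """
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) == len(alt):
        return "MNP"
    shorter, longer = sorted((ref, alt), key=len)
    if longer.startswith(shorter):
        return "indel"
    return "replacement"


def variant_summary(variants):
    """Return per-class counts and the fraction of variants that are SNPs."""
    counts = Counter(classify_variant(ref, alt) for ref, alt in variants)
    total = sum(counts.values())
    snp_fraction = counts["SNP"] / total if total else 0.0
    return counts, snp_fraction


# Toy (ref, alt) pairs; these are not variants from the study.
toy_variants = [("A", "G"), ("C", "T"), ("AT", "GC"), ("A", "ATT"), ("AT", "GCC")]

if __name__ == "__main__":
    counts, frac = variant_summary(toy_variants)
    print(dict(counts), f"SNP fraction = {frac:.2f}")
```

Under these rules, only the indel and replacement classes change allele length and can therefore shift the reading frame when that change is not a multiple of three bases.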

In addition to polymorphisms within the RGHs, differential expression of NB-LRR disease resistance genes may contribute to the different phenotypes observed between the two cultivars. It is believed that R genes are expressed at relatively low levels in unchallenged plant cells in anticipation of pathogen attack [14]. Indeed, we found that 338 RGHs displayed expression evidence in both cultivars in an unchallenged state, supporting the idea that R genes are basally expressed in healthy plant cells [14]. There are several possible reasons why no expression evidence was found for the remaining genes. First, these genes could be expressed at extremely low levels in healthy plant cells and thus escaped detection at the sequencing coverage used in this study. Second, the expression of these genes could be induced upon pathogen detection rather than maintained basally in healthy plant cells. Finally, some of these genes may be pseudogenes and may not be expressed under any conditions. Further studies are needed to evaluate the expression patterns of the switchgrass RGHs and to determine the exact role, if any, that these genes play in switchgrass disease response.

Developmental regulation of specific RGHs could also contribute to disease resistance phenotypes at different stages of plant growth. RNA-sequencing from the flag leaves of field-grown cultivar ‘Summer’ provided a unique opportunity to examine switchgrass RGH expression over the course of a growing season [44]. The first group of developmentally regulated RGHs (module 1, Fig. 6a) is of particular interest since these genes are up-regulated at the end of July. The end of July and the beginning of August are optimal times for switchgrass rust infection; thus, these genes may play an important role in immediate defense responses against foliar pathogens like switchgrass rust. Correspondingly, the transcripts for these RGHs decreased over the remaining harvests, supporting the idea that these genes are involved in the earlier stages of disease response. Field-grown switchgrass plants appear to be more susceptible to switchgrass rust as they begin to flower and set seed (unpublished data). As displayed in modules 5 and 8, 119 of the 755 RGHs (16 % of the genes) exhibited peak gene expression in at least one biological replicate during the last sampling point (9/19). The remaining 84 % of the genes displayed peak gene expression over the first four sampled time points. Since fewer RGHs showed preferential expression during the later stages of the growing season, these results are consistent with switchgrass plants allocating resources to other processes, such as flowering and nutrient remobilization, rather than to disease resistance in the later stages of development.
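The proportion of RGHs peaking at the last sampling point can be checked with simple arithmetic, and the same idea can be sketched for arbitrary expression profiles. The sketch below assumes one expression value per gene per time point, which is a simplification of the replicate-level criterion described above; the function names and toy profiles are hypothetical.

```python
def peak_timepoint(expression):
    """Return the index of the highest expression value.

    Ties are resolved in favor of the earliest time point.
    """
    return max(range(len(expression)), key=lambda i: expression[i])


def fraction_peaking_last(profiles):
    """Fraction of genes whose expression peaks at the final time point.

    `profiles` maps a gene identifier to a list of expression values,
    one per time point, with the same number of points for every gene.
    """
    if not profiles:
        return 0.0
    last = len(next(iter(profiles.values()))) - 1
    late = sum(1 for values in profiles.values() if peak_timepoint(values) == last)
    return late / len(profiles)


# Toy profiles over five time points; identifiers and values are made up.
toy_profiles = {
    "gene_A": [1.0, 5.0, 3.0, 2.0, 1.0],
    "gene_B": [0.5, 0.7, 0.9, 1.2, 4.0],
    "gene_C": [2.0, 2.0, 1.0, 1.0, 0.5],
    "gene_D": [0.1, 0.2, 3.0, 0.4, 0.3],
}

if __name__ == "__main__":
    print(f"Toy fraction peaking last: {fraction_peaking_last(toy_profiles):.2f}")
    # Counts reported in the text: 119 of 755 RGHs peaked at the last point.
    print(f"Reported share: {119 / 755 * 100:.1f} %")
```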