H3K4 tri-methylation breadth at transcription start sites impacts the transcriptome of systemic lupus erythematosus

H3K4me3 breadth has recently been identified as a key regulator of cell type identity [5]. Very broad domains of H3K4me3 were identified as lineage-specific markers in both human and mouse. These broad domains typically extended over 5 kb and were highly remodeled during cell differentiation. Lineage-specific transcription factors were enriched within these broad domains, supporting the concept that the domains are critically associated with lineage commitment. The H3K4me3 mark is deposited by members of the COMPASS/Trithorax family of methyltransferases and is removed by the JARID family of demethylases [22]. The H3K4me3-marked nucleosome can be dynamically acetylated by p300 and CBP [23], and this subsequent step may represent a mechanism by which H3K4me3 regulates transcription. We wished to understand the effects of H3K4me3 breadth at TSSs, a subject that had not been previously addressed. We had already identified significant changes in H3K4me3 peak height in the setting of SLE. This study was undertaken specifically to evaluate the effects of H3K4me3 peak breadth in a human disease state.

We found that TSS H3K4me3 patterns were not markedly changed in SLE. This was surprising because monocyte behavior is markedly changed in SLE [12, 24–26], and the transcription changes underlying the altered behavior are also substantial [11, 14–16].

We noted that the TSS and the downstream H3K4me3 changes were most closely aligned with differential transcription in SLE. Upstream changes in H3K4me3 were not directly associated with differential transcription once the data were corrected for the dependence of adjacent regions on each other. The downstream extended category of H3K4me3 was the pattern most strongly associated with inflammation and immune responses. It is therefore expected that this would be the set of genes most altered in the setting of SLE. The nucleosome downstream of the TSS is functionally important. H3K4me1 tends to locate next to the outer edge of H3K4me3 (Fig. 1c), so its peak breadth is correlated with H3K4me3 peak breadth. H3K4me1 is required for the recruitment of factors that interact with H3K4me3 [27]. This association may therefore relate to restriction of the activities nucleated on the downstream nucleosome. Many TSSs in this study had downstream H3K4me3 extended to ~650 bp, where H3K36me3, a transcriptional elongation mark, starts to increase (Fig. 1c). Release of RNA polymerase from pausing occurs at this location, and histone acetylation of this downstream nucleosome appears to be central to the process and is at least partly dependent on H3K4me3 [23, 28–30]. Therefore, modifications at this downstream nucleosome may control pivotal events in transcriptional elongation. These findings also have important implications for the analysis of ChIP-seq data, which typically focuses on the nucleosome upstream of the TSS. These data highlight the importance of analytic approaches that comprehensively assess changes across the TSS region.
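The idea of correcting for dependence between adjacent regions can be illustrated with a first-order partial correlation. This is only a sketch of one possible correction, not the method used in this study, which is not specified here. The variable names (`upstream`, `tss`, `expression`) and the toy values are hypothetical. In the toy data, the upstream signal and expression each track the TSS signal but have no direct relationship with each other, so their raw correlation is high while the partial correlation is close to zero.

```python
from math import sqrt
from typing import Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def partial_corr(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Correlation of x and y after removing the linear effect of z on both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical per-gene changes (SLE vs. control); values are illustrative only.
tss = [1.0, 1.0, -1.0, -1.0]          # H3K4me3 change at the TSS
upstream = [1.5, 0.5, -0.5, -1.5]     # upstream change: TSS change plus its own noise
expression = [1.5, 0.5, -1.5, -0.5]   # expression change: TSS change plus separate noise

raw = pearson(upstream, expression)
adjusted = partial_corr(upstream, expression, tss)
print(f"raw r = {raw:.2f}, partial r given TSS = {adjusted:.2f}")
```

In this constructed example the apparent upstream association disappears once the TSS signal is accounted for, which is the qualitative behavior described above for the upstream region.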

This conclusion seemingly contradicts what was described in the seminal paper on H3K4me3 breadth [5]. That paper reported the effect of H3K4me3 breadth on cell identity by comparing different cell types, whereas the current study focused on one cell type under different pathological states. Another source of the different conclusions is the difference in how broad H3K4me3 peaks were defined. In that paper, the broad H3K4me3 domains spanned up to 60 kb and had a minimum length of over 4 kb in hESC. The domains were largely intergenic. This study, on the other hand, looked only at a much narrower region of 1 kb around TSSs, which might have a more direct association with transcription level and variability. The breadth of narrow and broad peaks usually differed by only hundreds of base pairs.
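To make the TSS-centered notion of breadth concrete, the sketch below measures how far a peak extends upstream and downstream of a TSS, taking strand into account. It is a minimal illustration under stated assumptions, not the pipeline used in this study. The `Peak` type, the function names, the 1 kb clipping window on each side, the 500 bp threshold, and the category labels are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Peak:
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def tss_extents(tss: int, strand: str, peak: Peak, window: int = 1000) -> Tuple[int, int]:
    """Return (upstream_bp, downstream_bp) of a peak relative to a TSS.

    The peak is clipped to +/- `window` bp around the TSS (assumed value).
    Returns (0, 0) if the clipped peak does not cover the TSS.
    The downstream count includes the TSS base itself.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    lo = max(peak.start, tss - window)
    hi = min(peak.end, tss + window)
    if not (lo <= tss < hi):
        return (0, 0)
    left, right = tss - lo, hi - tss
    # On the minus strand, genomic "right" is upstream of the gene.
    return (left, right) if strand == "+" else (right, left)


def classify(upstream: int, downstream: int, threshold: int = 500) -> str:
    """Assign a hypothetical breadth category using an assumed threshold."""
    if upstream == 0 and downstream == 0:
        return "no peak at TSS"
    up_ext, down_ext = upstream >= threshold, downstream >= threshold
    if up_ext and down_ext:
        return "symmetric broad"
    if down_ext:
        return "downstream extended"
    if up_ext:
        return "upstream extended"
    return "narrow"


# Example: a plus-strand TSS with a peak reaching 650 bp downstream.
up, down = tss_extents(10_000, "+", Peak(9_800, 10_650))
print(up, down, classify(up, down))
```

With measurements of this kind, two peaks can fall into different categories while differing in total breadth by only a few hundred base pairs, which is the scale of difference discussed above.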

This study makes unique contributions by defining H3K4me3 patterns at TSSs and by identifying the nucleosome downstream of the TSS as directly associated with transcription. Given that many genes have the transcriptional initiation-elongation transition in this region [31, 32], it is plausible to hypothesize that an increase in downstream H3K4me3 facilitates the transition by making the nucleosome more accessible to the elongation machinery. Nevertheless, the study has some important limitations. The number of SLE patients was relatively small, and the cohort included patients with mild or moderate disease activity. As higher levels of disease activity could bring additional gene expression changes as well as H3K4me3 changes, our study may underestimate the effects. An additional limitation was the focus on the annotated TSS region. We chose to focus on the TSS in order to link our findings with our RNA-seq data.

In summary, our study highlights the importance of examining H3K4me3 breadth patterns as well as peak height in evaluating ChIP-seq data. This is also one of the first studies to examine the changes in H3K4me3 patterns related to a disease state. Furthermore, data mining analyses of additional data sets suggested that the association between transcription and downstream H3K4me3 is common to inflammatory responses. Our results emphasize the stability of the patterns and the importance of the downstream nucleosome in regulating gene transcription.