Sex-specific chromatin landscapes in an ultra-compact chordate genome

Compared to vertebrates, the protochordate Oikopleura dioica has undergone strong secondary genome compaction and morphological simplification, characteristic of the rapidly evolving larvacean lineage. O. dioica is the only known larvacean with separate sexes, and separate sexes are rare in tunicates in general [77], indicating an ancestral hermaphroditic state from which separate sexes and heteromorphic sex chromosomes have evolved more recently in only a few species. In this study, we compared the chromatin landscapes of the O. dioica ovary and testis and assessed differences between the autosomes and sex chromosomes. A greater diversity of chromatin states was deployed in the testis than in the ovary, paralleling the previously observed amplification of testis-specific histone variants [33].

Chromatin states in O. dioica included the typical epigenetic signature of active promoters conserved among other metazoans (H3K4me3, H3K27ac, H3K18ac, weak H3K4me1 and H3K79me2 [12]). Tissue-specific promoter states also often marked gene bodies. This has also been observed in other species on frequently transcribed genes [78] and may be a consequence of very short coding regions and introns and/or high RNAPII processivity in O. dioica.

Histone acetylations are generally associated with accessible chromatin and are related to transcriptional activation. We found, however, silent developmental genes marked by H4ac, particularly in the testis (state 15). In endocycling nurse nuclei, silent developmental genes were enriched in H3K79me3 (state 1) (e.g., sox2 locus; Additional file 2: Fig. S12), although these two marks also co-occurred in both sexes to some extent, as resolved by the 50-state model (states 43–45, Additional file 2: Fig. S5). All three methylation states of H3K79 are mediated by Dot1 [79]. This mark acts to regulate endocycle progression, and its peak at G1/S prohibits re-replication in mammalian cells [3]. H4 acetylation and H4K20me1 peak during M and early G1 and decrease during S-phase [80] to ensure replication origin licensing and a chromatin state accessible to replication factors [81]. Thus, one potential reason for differential H4ac (testes) and H3K79me3 (ovaries) marking could be related to the different mitotic versus endocycling cell cycle modes. Alternatively, these nucleosomes may be subject to histone variant exchange in the testis. O. dioica sperm chromatin does not undergo histone-to-protamine replacement that is typically preceded by massive H4 hyperacetylation in both human and Drosophila [82]. Instead, histone H3.3 is replaced by three isoforms of the O. dioica testis-specific variant H3t [33], and nucleosomes are retained. Amino acid substitutions surrounding K79 within the H3t isoforms most likely preclude Dot1 binding and its ability to methylate these histones. H4 is also replaced by the male-specific H4t, which has two residue changes adjacent to K20 that may hinder the methylation of this residue. Together with the retention of histones in O. dioica sperm, the association of generally activating H4ac with silent developmental genes in the testes could be viewed as a potentiating transgenerational mark. The lack of a transition to protamines, and extremely rapid embryonic and larval development, may have increased weighting on the intergenerational transmission of potentiated epigenetic states. Resolving individual lysine acetylations on H4 would be a step toward better understanding this unusual marking.

Along with promoter structure, enhancers determine cell-type specificity of gene expression and are important in developmental switching of metazoan genes. O. dioica intergenic and intronic spaces are very limited compared to vertebrate genomes, and the presence of enhancers remains poorly defined. We identified a number of candidate enhancer regions in this compact genome, in both the ovary and the testis, using intersections of typical enhancer PTMs and p300-bound regions. Histone PTMs marking enhancers in human, fly, cnidarians and worm (H3K4me1 and 2, H3K27ac, H3K79me2 and 3 [12, 16, 83]) were identified in actively differentiating embryonic cells and tissue cell cultures. The O. dioica ovary and testis comprise terminally differentiated, highly specialized cell types, and the lower abundance of typical enhancer marks is in accordance with recent models of gene regulation during differentiation [84]. Moreover, genes organized in operons, particularly those devoted to maternal transcripts in the ovary, might have high transcriptional rates by default and be more subject to translational regulation via mTOR signaling [57]. The longer introns of O. dioica developmental genes exhibit more conserved intron positioning and also contain HCNEs [25]. We observed enrichment of chromatin states 1 and 15 on HCNEs. State 15 was under-represented in regions bound by p300 in the testis, but in the ovary, p300 co-localized with actively transcribed gene promoter state 5 or with 5mC in silent intergenic regions. The detection of typical active developmental enhancer activity could be further addressed through analysis of embryonic stages.
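
As an illustration of the intersection step mentioned above (candidate enhancers defined by overlap between enhancer-associated PTM regions and p300-bound regions), the following is a minimal sketch only, not the pipeline used in this study. The function name `intersect_regions`, the half-open coordinate convention, and the example coordinates are hypothetical; each input list is assumed to hold non-overlapping regions on a single chromosome.

```python
from typing import List, Tuple

# Hypothetical representation: half-open [start, end) regions on one chromosome.
Region = Tuple[int, int]


def intersect_regions(ptm: List[Region], p300: List[Region]) -> List[Region]:
    """Return the overlaps between two sets of regions.

    Assumes the regions within each input list do not overlap one another
    (for example, peaks that have already been merged).
    """
    a, b = sorted(ptm), sorted(p300)
    i = j = 0
    overlaps: List[Region] = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # non-empty overlap
            overlaps.append((start, end))
        # Advance whichever region ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return overlaps


# Made-up coordinates, for illustration only.
ptm_regions = [(100, 300), (500, 700)]
p300_regions = [(250, 550), (650, 900)]
candidates = intersect_regions(ptm_regions, p300_regions)
print(candidates)
```

In practice, dedicated interval tools would normally be used for this step; the sketch only makes the logic of the overlap explicit.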

We found evidence of Polycomb interactions on autosomes and X-chromosomes in both sexes and propose roles of PRC2 in dosage compensation. We also observed in the O. dioica male germ line that the somatic H3K36 methyltransferase SETD2 is repressed, leaving NSD-(mes-4)-like and PRDM9-like to deposit H3K36 methylation. Complementary patterns of H3K36me and H3K27me3 in testis indicate regulation of transcript levels in the germline analogous to mes-4 (NSD-like) catalyzed H3K36me in C. elegans, which antagonizes PRC on germline genes such that PRC is excluded from autosomes but remains on (and represses) the X-chromosome. The mutual antagonism between PRC2 and mes-4 (NSD-like) is thought to be important in the transgenerational inheritance of germline-specific transcriptional programs in C. elegans [61, 85–87]. Thus, similar mechanisms may be operating in O. dioica, and it would be of future interest to determine the terminal chromatin states of mature sperm in this regard. In C. elegans, this proposed mechanism operates in the absence of DNA methylation or histone replacement by protamines during spermatogenesis. Both of these latter mechanisms are features of mammalian spermatogenic programs, though protamine replacement is not complete and some histones remain. Interestingly, O. dioica is intermediate in that DNA methylation is present whereas histone replacement by protamines is absent [33], perhaps offering a useful comparative reference for the evolution of mechanisms assuring transgenerational inheritance of germline transcriptional programs.

Given the rarity of separate sexes among tunicates, O. dioica offers an interesting model of more recent independent evolution of heteromorphic sex chromosomes. The O. dioica Y-chromosome contains a pseudoautosomal region, accumulated mobile elements, and a few male-specific genes, indicating that Y-chromosomal degeneration progressed rapidly [25]. Evolutionary forces driving Y-chromosomal sequence decay are well studied [88], but little is known about autosomal epigenome transitions toward the largely heterochromatic nature of ancient Y-chromosomes. We observed strong co-localization of heterochromatic marks and 5mC on this chromosome. This cooperation is an ancient feature based on HP1-mediated DNA methyltransferase recruitment [89, 90]. Patterns of histone PTMs on the Y-chromosome reflect the functional state and evolutionary history of the sequences [88]. The combination of histone and DNA modifications on the O. dioica Y-chromosome appears to have adapted to repress the activity of accumulated mobile DNA elements [25].

Our finding of unusual chromatin states containing typically heterochromatic marks (H3K9me2, H3K9me3, H4K20me3) combined with active transcription-related ones (H3K4me3, H3K36me3, H3K27ac, H4K20me1, H3K79me1) might represent a transition state in the time course of heteromorphic chromosome evolution. To our knowledge, such patterns of H3K4me3 and H3K36me3 overlapping heterochromatin signatures have been observed only in mammalian imprinted loci [91] and C. elegans mobile elements [14]. In the analysis of the epigenomes in nine human cell types, a group of endogenous retroviruses was enriched in a complex chromatin state that consisted of a number of histone modifications including H3K36me3, H3K4me3 and H3K9me3 [92]. However, these states have not been reported specifically on the Y-chromosome. O. dioica has little heterochromatin, consistent with a scarcity of transposable elements (TEs) [30] and generally reduced noncoding regions. Most TEs are present on the largely non-recombining Y-chromosome. The genome contains active Tor-family retrotransposons [30] that are transcribed in primordial germ-cell-adjacent somatic cells during embryonic development and in the adult testis [31]. These elements are specifically activated, often from their own non-LTR internal promoters, rather than de-repressed genome-wide. The presence of H3K4me3 and H3K36me3 on TEs could be a consequence of TE transcription, with the co-occurrence of H3K9me3 arising from the location of some copies of these elements in repressive chromatin environments, whereas others are active, either within one animal or in different individuals. Notably, these unusual states were restricted to marking certain classes of TEs in O. dioica.

The short life cycle and ability to rapidly regulate gamete output over 3 orders of magnitude [45, 56, 93] may also be relevant to the extent of paused RNAPII signatures we observed in the gonads. Such a strategy would be compatible with more rapid transcriptional response to nutrient availability in adjusting gamete number. Finally, it is suspected that the width of broad chromatin domains contributes to the heritability of epigenetic states because of random segregation of nucleosomes to daughter cells during genome replication [35]. Thus, the probability of loss of an epigenetic state will increase as a function of decreasing breadth of chromatin domains. This poses a challenge to epigenetic regulation and inheritance on the compact O. dioica genome, where intergenic regulatory regions are frequently on the order of one nucleosome. Indeed, we found that chromatin state domain widths in O. dioica were generally smaller than those in their chordate relatives, the vertebrates, and rarely exceeded 7 nucleosomes in size. Nonetheless, this is evidently compatible with fidelity of heritable transmission of epigenetic states in this species.
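
The dependence of epigenetic state loss on domain width can be made concrete with a toy calculation. The sketch below is an illustrative assumption, not the model of [35]: it supposes that each marked parental nucleosome in a domain segregates independently to a given daughter chromatid with probability 0.5, and that the state is lost on that chromatid only if none of the marked nucleosomes is inherited. The function name `loss_probability` and its parameters are hypothetical.

```python
def loss_probability(n_nucleosomes: int, p_inherit: float = 0.5) -> float:
    """Probability that a daughter chromatid inherits none of the marked
    parental nucleosomes in a domain of the given width.

    Toy assumption: nucleosomes segregate independently, each reaching
    this daughter with probability p_inherit.
    """
    if n_nucleosomes < 1:
        raise ValueError("domain must contain at least one nucleosome")
    if not 0.0 <= p_inherit <= 1.0:
        raise ValueError("p_inherit must lie between 0 and 1")
    return (1.0 - p_inherit) ** n_nucleosomes


# Compare domain widths from one to seven nucleosomes.
for width in (1, 2, 4, 7):
    print(width, loss_probability(width))
```

Under these assumptions, the per-replication loss probability halves with each additional nucleosome, from 0.5 for a single-nucleosome domain to about 0.008 for a seven-nucleosome domain, which illustrates why very narrow domains pose a challenge for heritable transmission in the absence of additional maintenance mechanisms.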