Functional role of dimerization and CP190 interacting domains of CTCF protein in Drosophila melanogaster


dCTCF contains an N-terminal dimerization domain

The 11 zinc fingers of the CTCF proteins are highly conserved in bilaterian phyla
[24], [25] (see schematic in Fig. 1a and Additional file 1: Figure S1A). In contrast, the NTDs and CTDs were poorly conserved and there was
little sequence similarity even between proteins from different dipteran families
(Additional file 1: Figure S1A). Sequence alignment of CTCF proteins from species within the Drosophila genus revealed much more extensive homology in the NTDs and CTDs, including several
very well conserved sequence blocks (Additional file 1: Figure S1B). A plausible hypothesis is that these conserved sequences may serve
as protein interaction modules that are important for dCTCF activities.

Fig. 1. a Domain structure of the dCTCF protein. b Sephacryl S200 size-exclusion chromatography of dCTCF terminal domains. (N-terminal
domain is thioredoxin-tagged.) Positions of molecular weight markers are shown. c Cross-linking of dCTCF N-terminal thioredoxin-tagged deletion derivatives using increasing
concentrations of glutaraldehyde (GA). Proteins were separated on 5–12 % gradient SDS-PAGE gels and visualized with
silver-staining. d Summary of the results from chemical cross-linking mapping experiments and limited
proteolysis of the dCTCF–NTD multimerization domain. For further experiments see Additional
file 2: Figure S2. e Superdex 200 size-exclusion chromatography of dCTCF 1–163 amino acids without thioredoxin.
f Analysis of dCTCF protein N-terminal dimerization using yeast two-hybrid assay. Relative
N- or C- terminal position of AD/BD is shown. AD GAL4 activation domain, BD GAL4 DNA binding domain

One of the interactions that could be mediated by these modules was the dimerization
or multimerization of the dCTCF protein. This possibility was suggested by insulator
bypass experiments in which pairs of multimerized and appropriately oriented CTCF
binding sites could mediate long-distance regulatory interactions [15], [16]. Additionally, studies on vertebrate CTCF have suggested that it can form dimers
[58], [59]. To test for the presence of homo-dimerization/multimerization modules in the N-
or C-terminus of the dCTCF protein, we fractionated bacterially expressed thioredoxin-fused
NTD or CTD proteins by size-exclusion chromatography. As shown in Fig. 1b, the thioredoxin NTD fusion 1–288 had a hydrodynamic molecular mass significantly
larger (~250 kD) than that predicted for the monomer (45 kD). Similar results were
obtained for the CTD 612–818 protein (Fig. 1b).
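
The comparison between the apparent and predicted masses can be expressed as a simple ratio. The following is a minimal sketch, not part of the original analysis: the function name is hypothetical, and the ratio is only a rough indicator because, as discussed in this section, disordered regions can inflate the hydrodynamic mass measured by size-exclusion chromatography.

```python
def apparent_subunit_count(apparent_mass_kd: float, monomer_mass_kd: float) -> float:
    """Ratio of the apparent (hydrodynamic) mass to the predicted monomer mass.

    Illustrative helper only: a ratio above 1 is consistent with multimers,
    but also with an extended or disordered monomer or with aggregation.
    """
    if monomer_mass_kd <= 0:
        raise ValueError("monomer mass must be positive")
    return apparent_mass_kd / monomer_mass_kd


# Values quoted in the text for the thioredoxin NTD 1-288 fusion.
ratio = apparent_subunit_count(250.0, 45.0)
print(f"apparent/monomer mass ratio: {ratio:.1f}")
```

A ratio well above one cannot by itself separate a specific multimer from non-specific aggregation, which is why the cross-linking experiments that follow are needed.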

The presence of these larger complexes could be explained by either a module-dependent
multimer formation or by the presence of intrinsically disordered regions that lead
to non-specific protein aggregation. To distinguish between these possibilities we
used glutaraldehyde cross-linking to probe for complex formation. In the case of the
CTD 612–818 protein, glutaraldehyde cross-linking was quite inefficient, suggesting
that it likely forms non-specific aggregates (Additional file 2: Figure S2A). On the other hand, consistent with the results of the size-exclusion
chromatography, cross-linking of the NTD 1–288 protein gave a high yield of a multimeric
band of ~200 kD (Fig. 1c).

To further pinpoint the interaction module, we generated three C-terminal deletions
(see Fig. 1d). The smallest deletion, NTD 1–205, gave a cross-linked band of ~100 kD. The largest
C-terminal deletion, NTD 1–125, also gave a cross-linked product; however, the yield
was quite low compared to the NTD 1–205 protein (compare Additional file 2: Figure S2A to Fig. 1c). This suggests that key interaction sequences were located between amino acids 125
and 205. This suggestion was supported by NTD 1–163, which was much more efficiently
cross-linked than NTD 1–125 (Fig. 1c). Though NTD 1–163 gave a prominent cross-linked band at the approximate size expected
for the tetramer (~120 kD), there was also a ladder of larger bands. This ladder was
likely due, at least in part, to the presence of the thioredoxin moiety in the fusion
protein, as only two cross-linked bands were observed when thioredoxin was removed
(Additional file 2: Figure S2A). Taken together, these findings map the N-terminal dCTCF:dCTCF multimerization
module to sequences spanning the region between amino acids 125 and 163 and suggest
that this module likely mediates the formation of dimers or possibly tetrameric complexes. Further support for the formation of multimeric complexes (tetrameric or even
larger) came from size-exclusion chromatography of the NTD 1–163 protein (lacking
the thioredoxin moiety), which gave a predicted mass of 120 kD (Fig. 1e). However, it is also possible that disordered regions of the protein retard complex
mobility during size-exclusion chromatography.

Several additional lines of evidence localized the dCTCF multimerization module to
this region of the NTD. First, two internal deletions (Fig. 1d) that lacked sequences from this interval failed to cross-link efficiently (Additional
file 2: Figure S2A). Second, two terminally truncated proteins, NTD 125–180 and NTD 70–163
(Fig. 1d), that contained this part of the NTD were cross-linked efficiently (Additional file
2: Figure S2A). Third, protease digestion indicated that the region containing the
interaction module had an ordered structure. We subjected the thioredoxin NTD 1–205
fusion protein to limited proteinase K or trypsin digestion and then analyzed the
resulting protease-resistant products by matrix-assisted laser desorption/ionization
time-of-flight (MALDI-TOF) mass spectrometry (Additional file 2: Figure S2B; Additional file 3: Table S1). Both proteases generated two resistant-to-digestion products. One corresponded
to thioredoxin, while the other corresponded to a dCTCF NTD peptide extending from amino acid 84 to 188.

To independently demonstrate that the NTD contains a dCTCF multimerization module
we used two different in vivo assays. The first was a yeast two-hybrid assay (Fig. 1f). Sequences encoding the NTD 1–288 amino acids were fused in-frame to the yeast GAL4
DNA binding domain (BD) and activation domain (AD). Because steric hindrance can interfere
with transcriptional activation in the two-hybrid system, the NTD 1–288 sequence was
placed at both the N-terminus (NTD-AD and NTD-BD) and the C-terminus (AD-NTD and BD-NTD)
of the fusion protein. Fig. 1f shows that activation was observed in only one configuration, NTD-BD and AD-NTD.
Similar results were obtained when the NTD was tested with a full-length dCTCF protein
(not shown).

In the second assay, we ectopically expressed a 3xFLAG-tagged fusion protein consisting
of the N-terminal 302 amino acids of dCTCF, a nuclear localization signal and the
bacterial LexA DNA BD in Drosophila S2 tissue culture cells. The S2 cells were co-transfected with a plasmid encoding
the firefly luciferase protein, whose expression is dependent upon a minimal TATA-box
promoter and upstream 4xLexA binding sites (Fig. 2a). Measurements of luciferase activity relative to a Renilla luciferase co-transfection
control indicated that the 3xFLAG-N-terminal dCTCF-LexA fusion protein weakly activated
firefly luciferase expression from the 4xLexA-TATA reporter (Fig. 2b). By contrast, no activation was observed for a luciferase reporter that lacked the
4xLexA binding sites or when the 3xFLAG-tagged fusion protein had the LexA DNA BD
but not the dCTCF N-terminal domain. The chromatin immunoprecipitation (ChIP) experiments
in Fig. 2c show that when the N-terminal dCTCF fusion protein was tethered to the 4xLexA-TATA
reporter via the LexA BD, it could interact with and recruit endogenous full-length
dCTCF. Because CP190 antibodies were also able to immunoprecipitate the 4xLexA-TATA
reporter, it would appear that the full-length dCTCF protein could in turn recruit
CP190 to the 4xLexA-TATA reporter (via the CTD of the full-length dCTCF protein: see
below).

Fig. 2. a Schematic drawing of luciferase reporter constructs. b Firefly luciferase expression from the five reporters shown in a when co-transfected with empty vector, with a vector encoding a 3xFLAG-tagged-(nuclear
localization signal)-LexA fusion protein, or with a vector encoding the 3xFLAG-tagged
N-terminal dCTCF-(nuclear localization signal)-LexA fusion protein. A plasmid encoding
the Renilla luciferase under the control of the actin promoter was used to correct
for variations in transfection efficiency, and expression of the firefly luciferase
was normalized in each case to Renilla luciferase. Each transfection experiment was
performed in three independent biological replicates and each lysate was measured
in four technical replicates. Error bars show standard deviations of measurements
of all summarized replicates. c Chromatin immunoprecipitation of S2 cells co-transfected with the 4xLexA TATA-box
reporter or the basic promoterless reporter and either of two fusion protein expression
constructs, the 3xFLAG-tagged (nuclear localization signal) LexA construct or the
3xFLAG-tagged-N-terminal dCTCF-(nuclear localization signal)-LexA construct. Fixed
and processed S2 chromatin samples were immunoprecipitated with antibodies directed
against (as indicated) the dCTCF N-terminus, the dCTCF C-terminus, CP190, or FLAG,
and then assayed for the presence of sequences corresponding to the 4xLexA TATA reporter
or the basic reporter constructs as indicated. Each chromatin immunoprecipitation
experiment was performed in three independent biological replicates. Error bars show
standard deviations of summarized biological replicates after quadruplicate PCR measurements
in each experiment. The results are presented as a percentage of input DNA. Basic no promoter, bla basic promoterless reporter, Hsp70 firefly luciferase with an hsp70 promoter, TATA firefly luciferase with a minimal TATA-box promoter, 4xlex bs firefly luciferase with four copies of the LexA recognition sequence, 4xlex bs + TATA firefly luciferase with four copies of the LexA recognition sequence linked to a
minimal TATA-box promoter
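
The normalization described in this legend (firefly signal divided by the Renilla signal from the same lysate, then summarized across replicates) can be sketched as follows. This is an illustrative sketch only: the function names and the readings are hypothetical, not data from this study, and the sample standard deviation is assumed for the error bars.

```python
from statistics import mean, stdev


def normalized_activity(firefly: list[float], renilla: list[float]) -> list[float]:
    """Divide each firefly reading by the Renilla reading from the same lysate."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired")
    return [f / r for f, r in zip(firefly, renilla)]


def summarize(ratios: list[float]) -> tuple[float, float]:
    """Mean and sample standard deviation over all pooled replicate ratios."""
    return mean(ratios), stdev(ratios)


# Hypothetical luminometer readings (arbitrary units), one pair per measurement.
firefly_readings = [200.0, 300.0, 400.0]
renilla_readings = [100.0, 100.0, 100.0]

ratios = normalized_activity(firefly_readings, renilla_readings)
avg, sd = summarize(ratios)
print(f"normalized activity: {avg:.2f} +/- {sd:.2f}")
```

Dividing by the co-transfected Renilla signal cancels differences in transfection efficiency between wells, so that reporters and effector constructs can be compared on one scale.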

The CP190 BTB dimer interacts with the C-terminal domain of dCTCF

CP190 has an N-terminal BTB-POZ (BTB for BR-C, ttk and bab and POZ for Pox virus and
Zinc finger) protein–protein interaction domain, which is followed by an aspartic
acid-rich D domain, a microtubule targeting domain, four C2H2 zinc fingers (which
bind non-specifically to DNA), and finally a glutamic acid-rich C-terminal domain
(Fig. 3a). Previously it was shown that the CP190 protein interacts with dCTCF [54], [55]. While interacting modules in the two proteins were not identified, it was found
that the BTB domain is required for the binding of CP190 to chromatin [60].

Fig. 3. a Domain structure of the Drosophila CP190 protein. b Mapping dCTCF and CP190 interaction modules using the yeast two-hybrid assay. c Analysis of interactions between purified recombinant GST-dCTCF-CTD and 6xHis-CP190
by GST-pull-down assay. GST-dCTCF-CTD bound to glutathione agarose beads was incubated
with bacterially expressed 6xHis-CP190. After successive washes, the GST-dCTCF-CTD
protein was eluted from the beads with excess glutathione. d Analysis of interactions between recombinant GST-dCTCF-CTD and CP190 from Drosophila S2 cells nuclear lysate by GST-pull-down assay. An S2 nuclear extract was incubated
with recombinant GST-dCTCF-CTD bound to glutathione agarose beads. After washing and
elution with excess glutathione, CP190 and GAF association was assayed by western
blotting. e Immunoprecipitation of FLAG-tagged dCTCF full-length and deletion mutants with CP190
antibodies. f Mapping of CTCF-interaction region within CP190 protein using GST-pull-down assay.
AD activating domain, BD binding domain, S2 Schneider 2 cells

To identify modules in dCTCF that mediate CP190 interaction, we first used the yeast
two-hybrid assay. dCTCF was subdivided into the NTD, the zinc fingers plus the CTD,
and the CTD alone and each was fused to the GAL4 activation domain. The NTD failed
to interact, while the full-length protein, the zinc fingers plus the CTD, and the
CTD alone gave transcriptional activation when combined with full-length CP190 (Fig. 3b). The localization of the CP190 interaction module in dCTCF to the CTD was confirmed
by GST pull-down experiments. A GST-fusion protein containing the dCTCF-CTD 612–818
domain was found to pull down both bacterially expressed CP190 and CP190 in Drosophila S2 cell nuclear extracts (Fig. 3c,d).

To further pinpoint the sequences within the 612–818 CTD that are important for contacting
CP190, we expressed two different FLAG-tagged C-terminal deletions, dCTCFΔ610–723
and dCTCFΔ774–818, in Drosophila S2 cells. Figure 3e shows that FLAG-tagged wild-type dCTCF and a control N-terminal deletion could be
precipitated by CP190 antibodies from the S2 extracts. In contrast, neither of the
smaller dCTCF-CTD deletions was precipitated by CP190 antibodies from S2 cells. Taken
together, these findings indicate that an apparently rather large sequence is required
to mediate a dCTCF–CP190 association that is stable in S2 nuclear extracts.

We used a similar strategy to localize the region in the CP190 protein that mediates
interactions with dCTCF. For the yeast two-hybrid experiments, full-length dCTCF was
fused to the GAL4 activation domain, while different sub-fragments from CP190 were
fused to the GAL4 DNA BD. These experiments map a dCTCF interaction module to the
CP190 BTB domain (Fig. 3b). This was confirmed by GST pull-down experiments (Fig. 3f) which showed strong protein–protein interactions between the CP190 BTB domain (1–126)
and dCTCF. In addition, weak interactions were detected between the dCTCF-CTD and
GST–CP190 fusions spanning the microtubule interaction domain (see CP190 308–517 and
308–468 in Fig. 3f).

We have previously shown that the CP190 BTB domain exists as a stable homodimer [61]. This observation raised the possibility that a CP190 dimer could simultaneously
bind two dCTCF proteins, linking them together in the same manner that the Bcl6 BTB
dimer is thought to bring together two SMRT co-repressors [62]. However, glutaraldehyde cross-linking experiments argue that the predominant complex
consists of a CP190 BTB dimer linked to a single CTD protein. Figure 4a shows that the CP190 BTB domain alone formed a stable dimer that could be readily
captured by glutaraldehyde cross-linking. When the BTB domain was present in a twofold
excess over the dCTCF–CTD 612–818 protein, the cross-linked CP190 BTB dimer disappeared and was replaced by a band migrating with an apparent molecular
weight of ~130 kD. While this cross-linked complex migrated more slowly than we would
have predicted, we interpret it to be a 2x(CP190 BTB):CTD 612–818 trimer based on the stoichiometry of the two proteins. This conclusion
was supported by cross-linking experiments in the presence of increasing amounts of
the CTD protein (Fig. 4b). At CTD to BTB ratios of 1:4 and 1:2, the predominant cross-linked species was
the ~130 kD 2x(CP190 BTB):CTD trimer, whereas there was little if any of the CP190 BTB dimer. Only at ratios of 1:1 or 2:1 did we observe a larger species that could correspond
to the CP190 BTB dimer linked to two CTD proteins or to some other more complex structure(s). However,
under these conditions a significant fraction of the CTD protein appeared to be free
monomer, and this would also argue that the preferred configuration is a 2x(CP190 BTB):CTD heterotrimer.

Fig. 4. a Analysis of complexes between dCTCF-CTD and thioredoxin-tagged CP190-BTB mixed at
a molar ratio of 1:2 using the chemical cross-linking reagent glutaraldehyde. Proteins
were visualized by Coomassie staining. a indicates position of CP190 BTB-domain monomer, b position of dCTCF-CTD, c dimer of CP190 BTB, d complex between CP190 BTB dimer and dCTCF-CTD. b Analysis of stoichiometry of interaction between dCTCF-CTD and CP190-BTB mixed in
different molar ratios, and cross-linked with 0.2 % glutaraldehyde after 1 h incubation,
visualized by silver-staining. a indicates position of CP190 BTB-domain monomer, b position of CTCF-CTD, c dimer of CP190 BTB, d complex between CTCF-CTD and CP190 BTB in molar ratio 1:2, and e higher order complex between CTCF-CTD and CP190 BTB with unknown stoichiometry. GA glutaraldehyde

Rescuing a dCTCF null allele

If the NTD and CTD interaction modules are critical for dCTCF function, then dCTCF
proteins that lack these modules should be defective. As a prelude to assaying the
activity of NTD and CTD mutant proteins in vivo, we determined whether dCTCF mutant flies could be rescued by a transgene expressing the wild-type dCTCF protein.
Several putative dCTCF null alleles have been reported [54], [55], [63]. Flies homozygous for these mutations (dCTCF30.6, dCTCFY+1, dCTCFY+2, dCTCF30) mainly died during the larval–pupal stages. For the wild-type rescue construct we generated
P-element transformants of a hybrid fusion gene that expresses the dCTCF cDNA under the control of the ubiquitously expressed hsp83 promoter [64], [65]. To identify the transgene proteins, a sequence encoding a 3xFLAG epitope was introduced
at the beginning of the dCTCF open reading frame. Five independent transgene inserts
were recovered on the first and second chromosomes; however, none of these transgenes
rescued the lethal effects of the four dCTCF alleles. One possible reason why the transgenes were unable to complement the four dCTCF alleles is that they did not express as much protein as the endogenous dCTCF gene. An alternative possibility is that there were additional lethal lesions on
the chromosomes carrying these particular dCTCF mutations.

A fifth predicted dCTCF null allele, GE24185, has been described [55]. The viability of adults homozygous for the GE24185 mutation is reduced by a third or more, while F2 flies do not survive. The GE24185 mutation was generated by insertion of an EPS transposon in reverse orientation into the third exon of the dCTCF gene (Fig. 5a). The EPS transposon contains an hsp70 minimal promoter that drives transcription in the opposite orientation to the dCTCF gene [66]. The promoter is under control of a GAL4-responsive enhancer. As would be expected
from its insertion site, the GE24185 insertion disrupts expression of the dCTCF protein. Extracts prepared from F1 adults homozygous
for the GE24185 mutation showed no dCTCF-specific bands when probed with antibodies directed against
N-terminal or C-terminal regions of the dCTCF protein (Fig. 5b). Unlike the other dCTCF alleles, we found that two copies of the hsp83-dCTCF+ transgene rescued the F1 and F2 lethal phenotypes of the GE24185 mutation.

Fig. 5. a Schematic diagram showing the GE24185 transposon insertion into the dCTCF gene. b Western blots of protein extracts prepared from wild-type and homozygous GE24185 mutant flies. c Schematic representation of dCTCF constructs used to rescue the GE24185 mutation. d Abdomen and cuticle preparations (bottom row) of wild-type and homozygous GE24185 mutant flies in the absence or presence of the hsp83:dCTCF transgenes as indicated. Arrows in GE24185 and dCTCFΔN;GE24185 indicate the presence of a rudimentary A7 tergite and hairs on the A6 sternite. Arrows
in dCTCFΔC;GE24185 indicate an A5 to A4 transformation of the tergite. wt wild type, A4-A7 abdominal segments 4-7

To confirm these findings, we generated two imprecise excisions, GEx52 and GEx56, by introducing the P transposase. As indicated in Additional file 4: Figure S3, both imprecise excisions disrupted the coding sequence and were expected
to encode only a truncated protein containing the first ~158 N-terminal amino acids.
Both excision derivatives had the same phenotypic effects as GE24185 and were complemented by the hsp83-dCTCF+ transgene. These results support the conclusion that GE24185 is a null allele of the dCTCF gene [55].

As was shown previously [55], adult flies homozygous for the GE24185 mutation as well as the two excision derivatives had a mild but highly penetrant
held-out wing phenotype and thin bristles throughout the animal, and exhibited a series
of homeotic phenotypes in posterior parasegments indicative of a loss of Abd-B activity. These homeotic phenotypes were temperature dependent. They were typically
observed in flies raised at 25 °C, while they were much less frequent when the flies
were raised at 18 °C. One of these phenotypes was the presence of a rudimentary A7
segment in males as would be expected for a loss-of-function transformation of PS12
into PS11 (Fig. 5d). Another was protruding and rotated male genitalia. Also unlike wild-type males,
GE24185 males had bristles on the A6 sternite and sometimes also patchy pigmentation of the
A5 tergite. The former phenotype is characteristic of a PS11 to PS10 transformation,
while the latter is expected for a PS10 to PS9 transformation. While A7 and A8 do
not form cuticular structures in adult males, they contribute to the cuticle in females.
Homeotic transformations of the A7 sternite into A6 were evident in surviving GE24185 females (Additional file 5: Figure S4). Adult mutant females had significantly reduced egg production, and produced
no viable offspring when mated to homozygous mutant males. These lethal effects could,
however, be rescued by mating the homozygous mutant females to heterozygous balancer
males. This finding indicates that zygotic dCTCF expression can compensate for the absence of maternally derived dCTCF.

Selective depletion of dCTCF from the bithorax complex in GE24185 pupae

Mohan et al. (2007) found that though the levels of dCTCF were substantially reduced
in GE24185 larvae, maternally derived protein could still be detected at approximately 25 %
of the sites in salivary gland polytene chromosomes that are normally observed in
wild-type polytenes [55]. One idea suggested by this observation is that the homeotic transformations evident
in GE24185 adults arise because dCTCF is selectively lost from the bithorax complex (BX-C).
When dCTCF depletion compromises BX-C insulator function, this might enable Polycomb
response elements (PREs) in silenced cis-regulatory domains to repress neighboring active cis-regulatory domains and thus downregulate Abd-B expression in a manner that changes segmental identity. Alternatively, or in fact
in addition, the proper functioning of the Abd-B promoter could require dCTCF.

Mohan et al. [55] addressed this question by examining dCTCF association with BX-C in the brain of
wild-type and GE24185/Df(3L)0463 larvae and found that dCTCF was absent from most insulators in the complex, but was
detected at the Abd-B promoter. We have repeated these experiments using chromatin prepared from pupae
because it is during this stage that the adult cuticle is elaborated. We selected
six dCTCF binding sites from BX-C: the Mcp, Fab-6, Fab-8 insulators [67]–[77], Fab-3 region, Fab-4 region, and the Abd-B promoter region (Fig. 6) [78]. We also selected the CG1354 promoter region (9A1) [55] and four regions that were identified by Schwartz et al. [56] as requiring dCTCF to block the spread of H3K27me3 in the BGL3 cell line [79]. In addition to testing dCTCF association with these sequences, we also assayed CP190.

Fig. 6. Histograms show dCTCF or CP190 occupancy in chromatin isolated from mid-late pupae
at sequences containing the BX-C insulators Fab-3, Fab-4, Mcp, Fab-6, Fab-8, the Abd-B promoter, and several previously defined dCTCF insulators (9A1, 21E2, 24C4, 27B2
and 57B4R). Cross-linked chromatin prepared from wild-type (WT) (y1 w1118) pupae and homozygous GE24185 (GE/GE) mutant pupae was immunoprecipitated with antibodies directed against the N-terminal
domain of dCTCF and CP190. Sequences from tub, rpl32, and 62D regions were used as
negative controls for dCTCF and CP190 association. 62D is an example of a sequence
in which CP190 occupancy is independent of dCTCF. The left axis shows the scale for
dCTCF enrichment, while the right axis shows the scale for CP190 enrichment. Each
chromatin immunoprecipitation experiment was performed in at least two independent
biological replicates. Error bars show standard deviations of quadruplicate PCR measurements.
The results are presented as a percentage of input DNA
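
The "percentage of input" scale used for these histograms is commonly derived from qPCR threshold cycles. The sketch below shows one standard way to do this and is an assumption, not the authors' stated procedure: it presumes a perfect doubling per cycle, and the function name, the 1 % input fraction, and the Ct values are all hypothetical.

```python
def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Percent of input DNA recovered in an immunoprecipitate.

    ct_ip          -- threshold cycle of the immunoprecipitated sample
    ct_input       -- threshold cycle of the saved input aliquot
    input_fraction -- fraction of the chromatin kept as input (e.g. 0.01 for 1 %)

    Assumes 100 % amplification efficiency (signal doubles every cycle).
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    return 100.0 * input_fraction * 2.0 ** (ct_input - ct_ip)


# Hypothetical example: the IP crosses threshold one cycle before a 1 % input.
print(f"{percent_input(ct_ip=24.0, ct_input=25.0, input_fraction=0.01):.2f} % of input")
```

Expressing enrichment relative to input lets sites with different primer efficiencies or copy numbers share an axis, which is what allows dCTCF and CP190 occupancy to be compared across insulators.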

ChIP experiments with chromatin isolated from wild-type pupae using antibodies directed
against the N-terminal region of dCTCF confirmed that it was bound to the insulators
and the Abd-B promoter in BX-C and to the CG1354 promoter (9A1) and four BGL3 insulators (Fig. 6). However, the extent of enrichment of Fab-3, Fab-4, and the Abd-B promoter was about half that of the other BX-C insulators (Mcp, Fab-6, and Fab-8) and also of the CG1354 promoter (9A1) and four BGL3 insulators. As expected, the enrichment of the BX-C
dCTCF sequences was substantially reduced in ChIPs of homozygous GE24185 pupae. The extent of reduction was not, however, uniform. Near-background levels
of dCTCF were observed for Fab-3, Fab-4, Fab-8, and the Abd-B promoter in GE24185 mutant pupae, while residual dCTCF could still be detected at Mcp and, to a lesser extent, Fab-6. By contrast, only one BGL3 insulator, 24C4, showed a loss of dCTCF equivalent to
that seen for most of the BX-C sequences. Three of the other insulators, 9A1, 21E2,
and 57B4R, showed only very modest reductions. Though the loss of dCTCF at the fourth
insulator, 27B2, was more substantial, the occupancy level in mutant pupae was still
about the same as that seen for several of the BX-C elements in wild type.

The rather modest reductions in dCTCF evident at several non-BX-C insulators as well
as the residual dCTCF that was retained at two of the BX-C insulators in the absence
of a zygotic source of dCTCF protein would suggest that a significant amount of maternally
derived dCTCF remains up to at least the pupal stage in GE24185 mutant animals. Moreover, it would appear that the protein is preferentially retained
at a subset of the dCTCF insulators. However, an alternative explanation for the apparent persistence of maternal
dCTCF is that our antibody recognized some other protein species that happened to
bind to the insulators that were pulled down in ChIPs of the mutant pupae. To exclude
this possibility, we used an antibody directed against the C-terminal region of the
dCTCF protein for ChIP experiments (Additional file 6: Figure S5). ChIPs with this antibody paralleled those obtained with the N-terminal
antibody. Substantial amounts of dCTCF persisted at several non-BX-C insulators in
GE24185 mutant pupae, while there was still some residual dCTCF remaining at the BX-C insulators
Mcp and Fab-6.

With the exception of Mcp, all of the BX-C insulators and the Abd-B promoter had less CP190 than the BGL3 insulators. The effects of the GE24185 mutation on CP190 association with the BX-C and BGL3 sequences also followed a pattern
similar to that observed for dCTCF. For all of the BX-C insulators, loss of dCTCF
was accompanied by a loss of CP190. For the other insulators, the reduction in CP190
occupancy was, with one exception, roughly comparable to that seen at the insulator
for dCTCF. For example, dCTCF levels were reduced about 40 % for 9A1, while CP190
was reduced about 50 %. The one exception was 27B2, which lacked CP190 in GE24185 mutant pupae, yet retained significant dCTCF occupancy. To confirm that the loss
of CP190 occupancy at dCTCF insulators was not due to a reduction in CP190 protein
levels, we probed western blots of extracts prepared from wild-type and GE24185 mutant pupae (Additional file 7: Figure S6).

Role of CTDs and NTDs in functional activity of the dCTCF protein

To examine the in vivo functions of the N-terminal multimerization domain and the C-terminal CP190 interacting
domain, we generated hsp83 transgenic lines expressing FLAGx3-tagged dCTCF proteins lacking these domains. For
the multimerization domain, we deleted sequences between amino acid 90 and amino acid
170 (dCTCFΔN). This deletion spans the region required for dCTCF–dCTCF interactions in vitro.
For the CP190 interaction module, we used the C-terminal 774–818 deletion (dCTCFΔC) that eliminates interactions between CTCF and CP190 in S2 cells (Fig. 5c). The activities of two independent transgenic lines expressing the deleted proteins
were tested in the GE24185 mutant background.

As described above, the control transgene, dCTCF+, encoding the wild-type protein fully complemented the zygotic and maternal effect
lethality of the GE24185 mutation (Fig. 5d). It also rescued the thin bristles phenotype and the loss-of-function homeotic transformations
evident in PS11-14 (Fig. 5 and Additional file 5: Figure S4). However, in approximately 10 % of the dCTCF+
males we observed a partial loss of pigmentation in the tergite of abdominal segment
A5, which is characteristic of a loss-of-function transformation of A5 (PS10) to A4
(PS9) [80]. The held-out wing phenotype was also not rescued. The dCTCFΔC transgenes resembled dCTCF+. They fully rescued the zygotic and maternal effect lethality of the GE24185 mutation, the thin bristles, and the PS11-14 homeotic transformations in males and
females. As with dCTCF+, we also observed a partial loss of pigmentation on the A5 tergite; however, the
frequency was somewhat higher (50 % as compared to 10 %) and the size of the depigmented
patches was typically larger. In contrast to dCTCFΔC, the dCTCFΔN transgene only partially ameliorated the zygotic and maternal effect lethality of
GE24185, and dCTCFΔN transgenic flies had reduced viability and were only semi-fertile. In addition, dCTCFΔN
did not rescue the thin bristles phenotype or the homeotic transformations seen in
the abdominal segments of GE24185 adult males and females (Fig. 5d and Additional file 5: Figure S4).

Chromatin association of dCTCF+, dCTCFΔC, and dCTCFΔN

As a prelude to analyzing the chromosome association of the mutant dCTCF proteins,
we first examined the expression of transgenic wild-type and mutant dCTCF proteins.
For this purpose we probed fly extracts with antibodies directed against the FLAG
tag. As shown in the western blot in Additional file 8: Figure S7, the mutant proteins were expressed at nearly equivalent levels. When
we probed western blots with antibodies directed against dCTCF, we found that the
levels of proteins produced by the transgenes were about twofold less than that of
the endogenous gene (not shown). This would suggest that the incomplete rescue of
two of the GE24185 phenotypes (loss of A5 pigmentation and held-out wings) by the
hsp83:dCTCF+ transgene is likely due, at least in part, to the insufficient expression of dCTCF.

Next, we examined the association of transgenic wild-type and deletion mutant dCTCF
with the insulators and Abd-B promoter in BX-C, CG1354 promoter (9A1), and the BGL3 insulators. In ChIPs of chromatin isolated from GE24185 hsp83:dCTCF+
pupae we found that the occupancy levels of the transgenic dCTCF+ at most of these sites were reduced about twofold compared to the endogenous protein
in wild-type flies (Fig. 7). This reduction would be consistent with the lower levels of dCTCF in the GE24185 hsp83:dCTCF+
flies. However, there were three exceptions. For two of these, the Fab-6 and Fab-8 insulators, the reductions in dCTCF occupancy were greater than twofold. dCTCF occupancy
at Fab-6 was reduced by nearly tenfold while it was reduced by almost fourfold at Fab-8. While Fab-8 still retained levels of dCTCF comparable to several other BX-C sites, only a small
amount of dCTCF was detected at Fab-6. Because one function of the Fab-6 insulator in PS10 cells is to prevent inactivation of the iab-5 cis-regulatory domain by blocking the spread of Polycomb-dependent silencing from the
PRE in the adjacent iab-6 cis-regulatory domain, the substantial reduction in dCTCF association with the Fab-6 insulator could potentially account for the persistence of the A5–A4 transformation
in a subset of the GE24185 flies rescued by hsp83:dCTCF+. The other exception, the BGL3 57B4R insulator, had near wild-type levels of dCTCF.

Fig. 7. Histograms show dCTCF or CP190 occupancy in chromatin from mid-late pupae at sequences
containing the BX-C insulators Mcp, Fab-6, Fab-8, the Abd-B promoter, and several other previously defined dCTCF insulators (9A1, 21E2, 24C4,
27B2, 57B4R). Chromatin was isolated from homozygous GE24185 mutant pupae that also carry the
hsp83:dCTCF+, hsp83:dCTCFΔN, or hsp83:dCTCFΔC
transgene. The tub sequence was used as the negative control. The left axis shows
the scale for dCTCF enrichment, while the right axis shows the scale for CP190 enrichment.
Each ChIP experiment with 2- to 3-day pupae was performed in at least two independent
biological replicates. Error bars show standard deviations of quadruplicate PCR measurements.
The results are presented as a percentage of input DNA. WT wild type

As with dCTCF+, occupancy of dCTCFΔN and dCTCFΔC at dCTCF sites in BX-C and at the
BGL3 insulators was reduced compared to wild type (Fig. 7). At most sites, the levels of
dCTCFΔN occupancy were very close to those for dCTCF+. In contrast, dCTCFΔC
occupancy levels were in most instances slightly lower (less than twofold) than those of either
dCTCF+ or dCTCFΔN. Though the effects were small, they suggest that chromatin association of the dCTCF
C-terminal deletion was partially compromised. Because this domain mediates interactions
with CP190, this finding supports the idea that dCTCF binding to chromatin can
be stabilized by interactions with CP190.

CP190 occupancy requires dCTCF but not necessarily the dCTCF-CTD

We also tested whether the reductions in CP190 occupancy evident in GE24185 mutants could be rescued by the hsp83:dCTCF transgenes encoding the wild-type and mutant proteins. Supporting the idea that dCTCF
functions in CP190 recruitment, we found that dCTCF+ and dCTCFΔN
promote CP190 occupancy at sites bound by dCTCF in vivo (Fig. 7). For BX-C insulators the effects on CP190 occupancy seemed to correlate with the
levels of the transgene dCTCF associated with the insulator. For example, at Mcp, where the dCTCF+ and dCTCFΔN
transgene proteins were present at only about half the level of the endogenous dCTCF,
CP190 occupancy was about 60 % of wild type (see Fig. 7). Similarly, at Fab-8, the transgene dCTCF proteins and CP190 were present at levels about 30 % of
wild type. CP190 occupancy for two BGL3 insulators, 24C4 and 27B2, also depended upon
dCTCF. In GE24185 mutants, CP190 was not detected at either of these insulators, while association
was restored by the dCTCF+
and, to a somewhat lesser extent, the dCTCFΔN
transgenes. Because the three other insulators (9A1, 21E2, and 57B4R) retained significant
levels of both dCTCF and CP190 in GE24185 mutants, it was not clear whether dCTCF is essential for CP190 occupancy or
is one of several factors that contribute to CP190-insulator association. Thus, although
CP190 occupancy in dCTCF+ and dCTCFΔN
flies at these three insulators was near wild type, the extent to which transgene
dCTCF protein contributed to the rescue was not entirely clear.

Further insight into the role of dCTCF in CP190 occupancy came from ChIPs of dCTCFΔC
transgene embryos (Fig. 7). There seemed to be three classes with respect to the requirement for the dCTCF-CTD.
In the first class were the BGL3 insulators 24C4 and 54B4R and also Fab-8. In this class, CP190 occupancy required the dCTCF-CTD and was substantially reduced
in dCTCFΔC
transgene flies compared to wild type or the two other dCTCF rescue transgenes (Fig. 7). The second class was represented by Mcp and 27B2. As at 24C4, 54B4R and Fab-8, CP190 occupancy at these two insulators depended upon dCTCF and was reduced to near-background
levels in GE24185 flies. However, unlike the insulators in the first class, the dCTCFΔC
transgene could partially rescue CP190 association. In the third class was the Abd-B promoter. Although CP190 occupancy at the Abd-B promoter required dCTCF (see Fig. 6), the requirement seemed to be independent of the dCTCF-CTD, and occupancy was fully rescued
by the dCTCFΔC
transgene.

dCTCF is required to properly initiate Abd-B expression in the embryo

The visible phenotypic defects in GE24185 adult flies arise from alterations in the patterns of gene expression induced by
the gradual depletion of maternal dCTCF as the animals develop. It seemed possible
that the effects on gene regulation might differ if dCTCF were completely absent at
the onset of embryonic development instead of being present at near-normal levels
and then slowly lost. To explore this possibility, we examined the expression of three
genes, the homeotic gene Abd-B, the segment polarity gene engrailed (en) [81], and the Notch pathway gene insensitive (insv) [82], in the progeny of GE24185 mothers and fathers. Unlike the progeny of heterozygous parents, these embryos lack
both maternal and zygotic dCTCF. Because the greatly reduced fecundity of GE24185 mothers made embryo collections problematic, we restricted our analysis to mid-embryogenesis.

The pattern of Abd-B expression during mid-embryogenesis in wild-type embryos is dynamic
[83], [84]. In stage 10 germ band extended embryos, Abd-B protein is expressed in parasegments
PS13 and PS14, while little or no protein is evident in more anterior parasegments.
Abd-B protein first begins to accumulate at detectable levels in more anterior parasegments
towards the end of stage 11 at the onset of germ band retraction. Only a low level
of protein is initially observed in PS12. As the germ band retracts, Abd-B levels
increase in PS12, and protein begins to accumulate at detectable levels in PS11. Finally
at the end of germ band retraction in stage 13, low levels of Abd-B are found in PS10.
Panels e-g in Fig. 8 show the pattern of Abd-B expression in a stage 10 dCTCF m-z- embryo.
For the purposes of comparison, a slightly older stage 11 wild-type embryo is shown in
panels a-c. Abd-B expression in the dCTCF m-z- embryo differed in two respects from wild
type. First, the levels of Abd-B in both PS13 and PS14 of the dCTCF m-z- embryo were
noticeably higher than those found in the corresponding parasegments of the wild-type
embryo (compare panels c and g). Second, while Abd-B could not be detected in PS12 in the
stage 11 wild-type embryo, it was prematurely expressed in PS12 in the stage 10
dCTCF m-z- mutant embryo. The differences in both timing and level of expression seen in
stage 10/11 wild-type and dCTCF m-z- embryos were also evident in older embryos. In the
stage 12 wild-type embryo shown in Fig. 9a,b, there was at most only a very low level of
Abd-B protein in PS12, while Abd-B did not appear to be expressed in PS11. In contrast,
Abd-B was readily detected in both PS12 and PS11 of the stage 12 dCTCF m-z- embryo
(Fig. 9e,f). Moreover, protein could even be seen in a cluster of cells in PS10. In addition
to being prematurely expressed in more anterior parasegments, Abd-B protein was present
at higher levels in PS13 and PS14 than in the wild-type control [83], [84].

Fig. 8. Expression of Abd-B and Insv in stage 10/11 wild-type and dCTCF m-z- embryos.
Stage 11 wild-type and stage 10 dCTCF m-z- embryos (from a cross of homozygous GE24185 parents)
were probed with antibodies directed against Abd-B (mouse monoclonal
1A2E9 from Developmental Studies Hybridoma Bank) and Insv (a rabbit polyclonal; gift
of Tsutomu Aoki) and visualized by confocal microscopy. Parasegments are indicated
in the figure. Arrows in panels f and g indicate Abd-B expression in PS12. a Wild type: merged image. b Wild type: Abd-B. c Wild type: Abd-B. d Wild type: Insv.
e dCTCF m-z-: merged image. f dCTCF m-z-: Abd-B. g dCTCF m-z-: Abd-B. h dCTCF m-z-: Insv.
Red/Gray Abd-B, Blue Insv

Fig. 9. Expression of Abd-B, Insv, and En in stage 12 wild-type and dCTCF m-z- embryos.
Stage 12 wild-type (a-d) and dCTCF m-z- (e-h) embryos
were probed with antibodies directed against Abd-B (panels a, b, e, and f), Insv (panels
c and g), and En (panels d and h). Arrows in panel f point to Abd-B protein expression in PS12, PS11, and PS10 in the stage
12 dCTCF m-z- embryo. By contrast, arrows in panel b indicate that little or no Abd-B was detected in PS12 or PS11 of the wild-type
embryo. See text for details

The segment polarity gene en and the Notch pathway gene insv were expressed in a stripe-like pattern in each parasegment in the ectoderm of wild-type
germ band extended embryos. However, while all of the cells in the en stripes appeared to express essentially the same level of En protein (Fig. 9d,h), only a subset of the cells in the insv stripes expressed Insv (Figs. 8d and 9c). In the case of En, there were no obvious changes in the stripe pattern or in the
level of protein in the cells expressing En in the dCTCF m-z- mutant.
In contrast, there was a substantial increase in the number of cells that
expressed Insv in dCTCF m-z- embryos.
The level of Insv protein in these cells also appeared to be elevated. These
changes were evident in both the stage 10 embryo in Fig. 8h and the stage 12 embryo in Fig. 9g.

We also examined the expression of Abd-B in GE24185 embryos rescued by the hsp83:dCTCF+,
hsp83:dCTCFΔN, and hsp83:dCTCFΔC
transgenes. The pattern of Abd-B expression in GE24185 hsp83:dCTCF+
transgenic embryos resembled wild type. This was also true for GE24185 dCTCFΔC
transgenic embryos. In the case of hsp83:dCTCFΔN,
we occasionally observed stage 11–15 embryos that appeared to have slightly elevated
levels of Abd-B.