To locate elements regulating the human CD8 gene complex, we mapped nuclear matrix attachment regions (MARs) and DNase I hypersensitive (HS) sites over a 100-kb region that included the CD8B gene, the intergenic region, and the CD8A gene. MARs facilitate long-range chromatin remodeling required for enhancer activity and have been found closely linked to several lymphoid enhancers. Within the human CD8 gene complex, we identified six DNase HS clusters, four strong MARs, and several weaker MARs. Three of the strong MARs were closely linked to two tissue-specific DNase HS clusters (III and IV) at the 3′ end of the CD8B gene. To further establish the importance of this region, we obtained 19 kb of sequence and screened for potential binding sites for the MAR-binding protein, SATB1, and for GATA-3, both of which are critical for T cell development. By gel shift analysis we identified two strong SATB1 binding sites, located 4.5 kb apart, in strong MARs. We also detected strong GATA-3 binding to an oligonucleotide containing two GATA-3 motifs located at an HS site in cluster IV. This clustering of DNase HS sites and MARs capable of binding SATB1 and GATA-3 at the 3′ end of the CD8B gene suggests that this region is an epigenetic regulator of CD8 expression.
As thymocytes progress through development, they undergo induction and repression of a number of cell surface molecules. Changes in the expression of the CD4 and CD8 T cell surface glycoproteins best characterize the stages of ontogeny and ultimately define the two major lineages of mature T cells (CD4+CD8− or CD4−CD8+). Gene knockout studies with CD4 and CD8α have shown that the entire respective lineage does not develop (1, 2), making it likely that factors that regulate CD4 and CD8 expression also regulate the functional commitment of the cell. Therefore investigating the mechanisms controlling CD4 and CD8 expression should increase our understanding of thymocyte development.
CD8 can be expressed in mice and humans as an αα homodimer or an αβ heterodimer. Human CD8β can also be expressed as a ββ homodimer in transfected COS cells and transgenic mice (3). The CD8A and B genes, closely linked at a distance of ∼36 kb in mice and 56 kb in humans (see Fig. 1⇓), are coexpressed on most CD8+ T cells. The close linkage and coexpression of the CD8 genes suggest coordinate regulation. However, regulation is not always coordinated, particularly in cells which seem to be extrathymically derived, as subsets of human NK cells (4) and gut intraepithelial lymphocytes (IELs)3 (5, 6) express only the CD8αα homodimer.
Toward identifying cis-acting transcriptional regulatory elements in the CD8 loci, large fragments of genomic DNA have been used to make transgenic mice. A 95-kb human genomic fragment beginning ∼25 kb upstream of the CD8B gene and containing the entire CD8B gene afforded developmentally correct expression on thymus-derived T cells in transgenic mice, indicating that CD8 lineage-specific regulatory sequences must be located within that fragment (7). Likewise, an 80-kb murine genomic fragment from 2 kb upstream of the CD8B gene to 25 kb downstream of the CD8A gene allowed appropriate expression in transgenic mice (8). The 80-kb murine genomic fragment contained four clusters of DNase hypersensitive (HS) sites (9) which were analyzed for enhancer activity in transgenic mice. One cluster at the 3′ end of the CD8B gene and two in the intergenic region had enhancer activity. The results indicate that there are separate elements for CD8 expression in the thymus vs the periphery, and possibly also for CD8α vs CD8β (9, 10, 11, 12, 13).
To locate regulatory regions within the human CD8 gene complex, we mapped the DNase HS sites which often colocalize with cis-acting transcriptional regulators. In addition, since many lymphoid gene enhancers are closely linked to matrix association regions (MARs) (14), we also undertook the mapping of MARs in the human CD8A and B loci. MARs, interspersed in genomes on average every 50–100 kb, have been identified as specialized genomic sequences that tightly associate with the nuclear matrix, an RNA- and protein-containing fraction that remains after high salt extraction and DNase I treatment. MARs serve to anchor chromatin loops to the nuclear matrix and have been shown to facilitate long-range chromatin remodeling and accessibility (15, 16). We identified several MARs spanning the human CD8 loci. Interestingly, three strong MARs are closely linked to DNase HS clusters III and IV, located at the 3′ end of the CD8B gene, making this region a prime candidate for an element regulating CD8 gene expression.
To further establish the significance of this region, we analyzed the candidate regulatory region for binding of SATB1, a tissue-specific MAR-binding protein (17). Studies of SATB1 knockout animals showed that SATB1, expressed primarily in thymocytes, regulates temporal and spatial expression of multiple genes during T cell development, and is required for maturation of CD4+ and CD8+ T cells (18). Within the MARs associated with the human CD8 DNase HS clusters III and IV, we found two fragments that bound SATB1 with high affinity in an EMSA.
Further analysis of the sequence in the region of HS clusters III and IV revealed several potential binding sites for GATA-3, a transcription factor widely expressed in embryonic tissues, but whose expression is mainly limited in adult animals to T cells and NK cells (19, 20, 21). Targeted deletion of GATA-3 was embryonically lethal (22), but using Rag complementation, it was determined that GATA-3 is required for development of the earliest T cell progenitors (23). GATA-3 binding sites have been reported in several T cell-specific genes (24, 25, 26, 27, 28, 29, 30, 31), including the murine CD8 promoter/enhancer (32). In the present study, EMSA analysis revealed that an oligonucleotide with a double GATA-3 binding site, corresponding to the cluster IV HS site located in a strong MAR, binds GATA-3 strongly, while others in the HS cluster III and IV region also bind. Colocalization of GATA-3 binding sites, SATB1 binding sites, MARs, and DNase HS clusters suggests this region is a candidate regulator of human CD8 expression.
Materials and Methods
All cell lines were grown in RPMI 1640 medium supplemented with 2 mM l-glutamine, 100 U of penicillin/ml, 100 μg of streptomycin/ml, and 10% FCS. Cell lines used included the EBV-transformed B cell line UC and the human T cell lines HPB.ALL (CD4+CD8α+β+), JM (CD4+CD8α+β−), and Jurkat (CD4+CD8−). Cells were cultured at 37°C in a water-saturated atmosphere of 5% CO2.
DNase I hypersensitivity mapping
DNase I hypersensitivity mapping was performed as described previously (7). Briefly, DNA purified from DNase-treated nuclei was restriction digested and used to prepare Southern blots which were probed with the following fragments (see Fig. 1⇑): 1, 0.9-kb NotI-EcoRI fragment from the 3′ end of clone 647; 2, 0.5-kb NcoI-BamHI fragment from the 3′ end of clone 646; 3, a 0.8-kb BamHI-KpnI fragment from the 5′ end of clone 1229; 4, 0.9-kb KpnI-Stu fragment from 1229; 5, 1-kb EcoRV-BamHI fragment from the 3′ end of clone 1231; 6, 0.6-kb Xho-BamHI fragment from the 3′ end of 1230; and 7, 0.6-kb BssHII fragment including CD8α exon II. Fragments 1, 2, 3, 5, and 6 were used to probe DNA restriction digested with BamHI, fragment 4 with Kpn-digested DNA, and fragment 7 with BglII-digested DNA. Fragment 6 was also used to probe DNA double digested with Xho and SphI.
Nuclei from the human T cell line JM (CD4+CD8α+β−) were obtained by hypotonic lysis and purified by centrifugation through a cushion of 2 M sucrose. Nuclear matrices were isolated in the continuous presence of 250 μM PMSF and 10 μg/ml leupeptin, as described previously (33, 34). Briefly, isolated nuclei were digested with 100 μg/ml DNase I in 10 mM NaCl, 3 mM MgCl2, 10 mM Tris (pH 7.4), 0.25 M sucrose, and 1 mM CaCl2 for 1 h at room temperature, and nonmatrix proteins were extracted with 2 M NaCl. Nuclear matrix binding was determined in an in vitro DNA-binding assay (34). Briefly, plasmids were restriction digested and the fragments were end-labeled with [γ-32P]ATP. After preincubation of nuclear matrices at room temperature with 100 or 200 μg of unlabeled Escherichia coli DNA as a nonspecific competitor, labeled plasmid fragments were added and the incubation was continued for 2 h. Insoluble matrix proteins were pelleted and washed to remove unbound DNA, treated with proteinase K, and the remaining DNA was electrophoresed on agarose gels. Gels were dried on Whatman 3 MM paper (Whatman, Clifton, NJ) and exposed to X-OMAT AR film. A 6-kb HindIII fragment from the p34 region upstream of ZNF 127, which contains strong MARs on 2.7- and 3.3-kb fragments generated by BamHI digestion (35, 36), was used as a positive control in these assays. The vector DNA from which the test fragments were excised served as an internal negative control. To show the relative intensities and sizes of the input radiolabeled probe fragments, 5% of the total radioactivity added to each MAR-binding assay (the probe) was electrophoresed beside 50% of the bound fragments. The intensities of the insert (Ii) and plasmid (Ip) bands within each assay were quantified by densitometry using the NIH Image program (rsb.info.nih.gov/nih-image). Adjusted relative binding ratios were calculated according to the following equation (37): (Ii,bound / Ip,bound) × (Ip,probe / Ii,probe), where the subscripts "bound" and "probe" refer to the matrix-bound lane and the input probe lane, respectively.
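As an illustration of the adjusted relative binding ratio defined above, the following minimal Python sketch evaluates the equation for one assay. The function name, argument names, and example intensities are hypothetical and are not taken from the original analysis; only the formula itself comes from the text.

```python
def adjusted_binding_ratio(insert_bound, plasmid_bound, insert_probe, plasmid_probe):
    """Return (Ii,bound / Ip,bound) x (Ip,probe / Ii,probe).

    All four arguments are densitometry band intensities in arbitrary
    units; they must be positive so that both quotients are defined.
    """
    values = {
        "insert_bound": insert_bound,
        "plasmid_bound": plasmid_bound,
        "insert_probe": insert_probe,
        "plasmid_probe": plasmid_probe,
    }
    for name, value in values.items():
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    return (insert_bound / plasmid_bound) * (plasmid_probe / insert_probe)


# Hypothetical intensities: the insert is enriched 8-fold over the vector
# in the bound lane, while both bands are equally intense in the probe lane.
ratio = adjusted_binding_ratio(800.0, 100.0, 400.0, 400.0)
print(ratio)
```

Dividing by the probe-lane ratio normalizes for differences in labeling and fragment size between the insert and the vector, so a ratio near 1 would indicate no preferential matrix binding of the insert.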
Base composition (Percent A + T) plots were produced using the MacVector software program (Accelrys, Princeton, NJ) with a window size of 50 nt. Maps of MAR-binding motifs were created by entering the specific motifs (38) into an enzyme filter in the MapDraw program (version 4.0) of Lasergene software (DNAstar, Madison, WI). Repetitive content in each sequence was identified using the RepeatMasker database (http://ftp.genome.washington.edu/cgi-bin/RepeatMasker/). Free energy values were calculated with the nearest-neighbor algorithm program Web-Thermodyn (http://wings.buffalo.edu/gsa/dna/dk/WEBTHERMODYN/index.html) using the program’s default values. We have previously found that the presence of free energy values of <90 in four or more successive windows was a specific indicator of in vitro MARs at the α and β globin and imprinted loci (J. M. Greally, unpublished data). The several sequences fitting this criterion are indicated as low free energy (LFE) regions in Fig. 4⇓.
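The two window-based criteria described above can be sketched in Python as follows. This is not the MacVector or Web-Thermodyn implementation: the free energy values are assumed to be supplied as a precomputed list (one value per window, in whatever units the program reports), the step size between windows is an assumption, and the function names are hypothetical.

```python
def percent_at_windows(seq, window=50, step=50):
    """Percent A + T in successive windows of `window` nt.

    The 50-nt window follows the text; the step size is an assumption.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        at_count = chunk.count("A") + chunk.count("T")
        values.append(100.0 * at_count / window)
    return values


def low_value_runs(values, threshold=90.0, min_run=4):
    """Find runs of at least `min_run` successive windows below `threshold`.

    Returns (start, end) window indices, with `end` exclusive.
    """
    runs = []
    run_start = None
    for i, value in enumerate(values):
        if value < threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                runs.append((run_start, i))
            run_start = None
    # Close a run that extends to the last window.
    if run_start is not None and len(values) - run_start >= min_run:
        runs.append((run_start, len(values)))
    return runs


# Hypothetical per-window free energy values: only the first run of four
# consecutive sub-threshold windows qualifies as an LFE region.
energies = [95, 80, 85, 70, 88, 92, 60, 60, 60, 99]
print(low_value_runs(energies))
```

Separating the run detection from the energy calculation keeps the criterion (four or more successive windows below 90) explicit and independent of the thermodynamic parameters used upstream.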
The SATB1 gel shift assay was performed as described previously (39), using recombinant SATB1, a GST fusion protein consisting of aa 346–763 of human SATB1 (GST-SATB1). This protein contains both the MAR-binding domain and homeodomain. To estimate the Kd values for SATB1-binding sites, gel shift assays were run using 0.5 ng of DNA probe and increasing concentrations of SATB1. The dried gels were exposed to a phosphor imager screen, and the amounts of remaining free probe in each lane were quantitated. The Kd values were determined by calculating the concentration of SATB1 required to shift 50% of the probe DNA.
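The half-shift estimate described above can be sketched as a linear interpolation between the two titration points that bracket 50% shifted probe. This is a simplified illustration, not the authors' quantitation procedure; the function name and the example titration values are hypothetical, and equating Kd with the protein concentration at half-maximal shift assumes that the probe concentration is well below Kd.

```python
def estimate_kd(concentrations, fraction_free):
    """Estimate the protein concentration that shifts 50% of the probe.

    `concentrations` must be in ascending order; `fraction_free` gives the
    fraction of probe remaining unbound in each lane (0 to 1). Returns None
    if the titration never crosses 50% shifted.
    """
    if len(concentrations) != len(fraction_free):
        raise ValueError("inputs must have the same length")
    shifted = [1.0 - f for f in fraction_free]
    for i in range(1, len(concentrations)):
        low, high = shifted[i - 1], shifted[i]
        if low <= 0.5 <= high and high > low:
            c_low, c_high = concentrations[i - 1], concentrations[i]
            # Linear interpolation between the bracketing lanes.
            return c_low + (0.5 - low) * (c_high - c_low) / (high - low)
    return None


# Hypothetical titration (nM protein, fraction of free probe remaining).
kd = estimate_kd([0.01, 0.1, 1.0], [0.9, 0.6, 0.2])
print(kd)
```

Because the estimate depends on which lanes bracket the midpoint, replicate titrations would be expected to give a range of values rather than a single number.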
The GATA-3 gel shift assays were performed essentially as described previously (27). The binding reactions contained 10 mM HEPES (pH 7.9), 50 mM NaCl, 1 mM DTT, 1 mM EDTA, 1 mM MgCl2, 10% glycerol, 50 μg/ml poly(dI)-poly(dC), 200 μg/ml BSA, 1× protease inhibitor mixture (catalogue no. 1836170; Roche, Mannheim, Germany), and nuclear extract prepared from Jurkat cells stimulated for 8 h with 1 mM Bt2 cAMP and 25 ng/ml PMA. Abs used for the supershifts were anti-GATA-3 (Santa Cruz Biotechnology, Santa Cruz, CA) and an isotype control, anti-human CD8β (Immunotech, Westbrook, ME). The top strand sequences of the oligonucleotides used in the GATA-3 EMSAs are shown below. The number in parentheses indicates the location in plasmid 1231 for oligonucleotides 1–3 and in plasmid 1230 for oligonucleotides 4–7. Bold sequences highlight the putative GATA-3 motif, and underlined sequences indicate that the motif is on the complementary strand: 1, atctccatcagatctcttgggtg (3858); 2a, ctctctccatatcagcaataag (541); 2b, tcgttttcttatcattcatgtg (4569); 3, tgcataccctatcttgaaatttgtgggt (5844); 4, ttgaaaagagatctaaaattga (2909); 5, caaaaattagatcaagtagataaaattta (5013); 6, aaaatttgttatcaccttttaa (5262); and 7, tattatgcagatatcagataagattcatgaag (5400), which contains three GATA-3 motifs. The sequence of the IL-5 promoter GATA-3-positive control oligonucleotide is cctctatctgattgttagca (27).
The GenBank accession number for the 19-kb nucleotide sequence from the 3′ end of the human CD8B gene is AY032722.
DNase hypersensitivity mapping
We had previously identified a tissue-specific DNase HS cluster upstream of the human CD8B gene (HS I) (7) and another in the last intron of the CD8A gene (HS VI) (40). The entire human CD8 gene complex was subcloned to facilitate a comprehensive mapping of DNase HS sites (Fig. 1⇑). We determined that these clones contain the whole CD8 gene complex, except for a small region between clones 1231 and 1230 containing CD8β exon IX, which we obtained by PCR. Cell lines used in this work were HPB.ALL, a CD4+CD8α+CD8β+ thymoma cell line, and UC, a B cell line used as a control for tissue specificity. In the present work, we identified four additional DNase HS clusters in the human CD8 loci, making a total of six HS clusters (Fig. 1⇑).
The data for DNase HS cluster III, located between CD8β exons VIII and IX, are shown in Fig. 2⇓A. Of the seven bands marked, those at 3.7 and 3.4 kb appeared to be specific for HPB.ALL cells. The 5.6- and 4.5-kb bands can be seen in the UC cells in Fig. 2⇓A, and bands appearing to correspond to the 3-, 2.4-, and 1.3-kb bands were seen in other blots made from UC cells treated with 25 U/ml DNase (data not shown).
The data for DNase HS cluster IV and part of cluster V are shown in Fig. 2⇑B. Of the bands seen, those at 4.7, 3.8, and 2.9 kb appeared to be specific for HPB.ALL cells, whereas those at 0.7, 0.5, and 0.4 kb were also seen in UC cells. These latter three small bands at the 3′ end of 1230 have been placed in cluster V in Fig. 1⇑ because they are closer to the other sites in cluster V than they are to the three tissue-specific DNase HS sites in cluster IV. There are several more weak DNase HS sites within cluster V, seen at 1.3, 1.0, 0.8, 0.5, 0.3, 0.25, and 0.1 kb from the 5′ end of 1264 (data not shown). These bands, although weak, appear to be tissue specific.
There are several DNase HS sites within cluster II (Fig. 1⇑ and data not shown). Sites which map to 1.5, 1.1, and 1.0 kb from the 3′ end of clone 646 appear to be tissue specific, in that they were not seen in UC cells, whereas sites at 3.2 and 3.5 kb from the 3′ end of 646 were also seen in UC cells. Cluster II contains additional sites, which map to 3.5, 3.0, 1.7, 1.6, 1.3, and 1.1 kb from the 5′ end of 1229. Of these, only the 3.5-kb band appeared to be specific for HPB.ALL cells.
In summary, we identified four new DNase HS clusters in the human CD8 gene complex. These are DNase HS cluster II (HS II), located just downstream of CD8β exon VII, HS III, between CD8β exons VIII and IX, and HS IV and V, in the intergenic region, 3–5 and 8–11 kb downstream of CD8β exon IX, respectively. We also observed four weak but tissue-specific DNase HS sites 0–0.8 kb upstream of the first exons of both CD8A and CD8B, in the promoter regions (Fig. 1⇑ and data not shown).
An in vitro MAR-binding assay was used to map the MARs in the CD8 loci. Over the whole region, there were four strong MARs and several weaker MARs. In plasmid 1235-4, located 13–17 kb upstream of the CD8B gene, there is a strong MAR in the 5′ 2-kb fragment and, in addition, at least one weaker MAR (Fig. 3⇓A). Since the 0.9- and 1.6-kb fragments are juxtaposed, it is possible that there is only one MAR split between them. Another strong MAR, located between CD8B exons VIII and IX in fragment 1231, is shown in Fig. 3⇓B. This strong MAR is closely linked to the tissue-specific DNase HS sites in cluster III (Fig. 4⇓A). There is also a weaker MAR at the 5′ end of clone 1231. The data for clone 1230, shown in Fig. 3⇓C, indicate a cluster of MARs in the center of plasmid 1230, with two strong MARs flanking a weaker one. Two of the three tissue-specific DNase HS sites of cluster IV are located within the weak MAR and the third is within the strong MAR on the 3′ end of 1230 (Fig. 4⇓B). We found six additional weaker MARs in the CD8 complex (data not shown). Thus, distributed over the CD8 gene complex were 4 strong MARs and 10 weaker MARs, as indicated in Fig. 1⇑.
Given that MARs are generally AT-rich sequences, we considered whether this frequency of MARs could be attributed to the human CD8 genes possibly being located in an AT-rich isochore, as was the case for the 15q11-13 imprinting center (35). We determined the isochore type to which the CD8 genes belonged by manually scanning the third base of each codon in the coding regions for guanine cytosine (GC) content (percent GC3). CD8A and B genes were grouped into their corresponding isochores based on the following criteria: L1 + L2, GC poor (GC3 < 57%); H1 + H2, GC high (57% < GC3 < 75%); and H3, GC rich (GC3 > 75%) (41, 42). The GC3 contents of the CD8A and B genes were 71 and 84%, placing them well into the GC-high and GC-rich isochores, respectively (data not shown). This indicates that the frequency of MARs we observe is not simply due to location of the CD8 genes in an AT-rich isochore.
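The GC3 calculation and isochore assignment described above can be sketched in Python as follows. The function names are hypothetical, and the handling of values exactly equal to 57% or 75% (which the stated criteria leave unspecified) is an assumption made here for completeness.

```python
def gc3_percent(cds):
    """Percent G + C at the third position of each complete codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third_bases = cds[2:usable:3]
    if not third_bases:
        raise ValueError("coding sequence must contain at least one codon")
    gc_count = sum(1 for base in third_bases if base in "GC")
    return 100.0 * gc_count / len(third_bases)


def isochore_class(gc3):
    """Assign an isochore group from percent GC3.

    Boundary values (exactly 57 or 75) are placed in H1 + H2 here; this is
    an assumption, since the criteria in the text use strict inequalities.
    """
    if gc3 < 57.0:
        return "L1 + L2 (GC poor)"
    if gc3 <= 75.0:
        return "H1 + H2 (GC high)"
    return "H3 (GC rich)"


# The GC3 values reported in the text for CD8A and CD8B.
print(isochore_class(71.0))
print(isochore_class(84.0))
```

Applied to the reported values, the classification reproduces the assignments given in the text: 71% falls in the GC-high group and 84% in the GC-rich group.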
Since clones 1231 and 1230 contain strong MARs closely linked to tissue-specific DNase HS sites, we considered them likely candidates for containing regulatory elements. Therefore, we sequenced the 19 kb spanned by the plasmid 1231, plasmid 1230, the PCR product between 1231 and 1230, and 4 kb from the 5′ end of 1264. We located the CD8β exons VIII and IX within this region as shown in Fig. 1⇑.
The sequence of a typical MAR is ∼65% AT rich (43) and contains a region of 150–200 bp that has a high potential for base unpairing (44, 45, 46). A Thermodyn analysis of the 19-kb sequence predicted seven potential LFE regions (Fig. 4⇑). Only one of the seven LFE regions did not localize to the biochemically determined MARs.
A mapping of repetitive elements in clones 1231 and 1230 is shown in Fig. 4⇑. There were eight whole or partial Alu elements. Interestingly, five of the six long interspersed nuclear elements were within fragments that bound to the nuclear matrix. The strong MAR in 1231 overlapped on its 5′ end with a Tigger 1 element, an interspersed repeat that resembles DNA transposons, which move by excision and reintegration into the genome without an RNA intermediate.
Several sequence motifs have been associated with MARs (17, 37, 44, 47, 48, 49). We mapped ATATTT and vertebrate topoisomerase sites and found that there was some clustering in the biochemically defined MARs, but this clustering was not absolute, particularly for the topoisomerase (Fig. 4⇑, A and B). In addition, we analyzed the sequence for potential SATB1-binding sequences. SATB1 is an interesting MAR-binding protein in that it does not recognize a DNA sequence but rather it is believed to recognize ATC sequences indirectly by the altered sugar-phosphate backbone determined by the ATC sequence context (50). The ATC sequence defined by one strand consists of exclusively As, Ts and Cs, excluding Gs, and at least 65% AT content. When the ATC stretches are clustered, it potentially confers high base unpairing propensity (44). With only one exception, the long ATC sequence stretches within our 19-kb sequence were confined to regions containing MARs (Fig. 4⇑).
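The ATC sequence definition given above (a stretch on one strand containing only A, T, and C, with at least 65% AT content) can be sketched as a simple scan. This is an illustrative approximation, not the procedure used to produce Fig. 4: the minimum stretch length of 20 bases and the function name are assumptions. Stretches lacking C on the given strand are reported as ATC stretches on the complementary strand.

```python
import re


def atc_stretches(seq, min_len=20, min_at=0.65):
    """Find maximal G-free (strand '+') or C-free (strand '-') runs.

    Returns sorted (start, end, strand) tuples with `end` exclusive.
    `min_len` is a hypothetical cutoff; `min_at` follows the 65% AT
    criterion stated in the text. A run of only A and T satisfies both
    patterns and is therefore reported once per strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", r"[ATC]+"), ("-", r"[ATG]+")):
        for match in re.finditer(pattern, seq):
            run = match.group()
            at_fraction = (run.count("A") + run.count("T")) / len(run)
            if len(run) >= min_len and at_fraction >= min_at:
                hits.append((match.start(), match.end(), strand))
    return sorted(hits)


# Hypothetical test sequence: a 20-base G-free stretch flanked by Gs.
example = "GG" + "ATTCTAATATTTCATATTAC" + "GG"
print(atc_stretches(example))
```

Clusters of such stretches, rather than isolated ones, would be the regions of interest for base-unpairing propensity, so in practice the hits would be examined for proximity to one another.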
To test for SATB1 binding to the potential ATC sequence stretches, we performed EMSAs using recombinant GST-SATB1. Ten fragments, ranging in size from 190 to 580 bp, and a control fragment were tested. Two fragments from the regions with strong matrix binding activity were found to bind purified SATB1. One fragment, from plasmid 1231, showed very strong binding (Fig. 5⇓B) with a Kd of 0.04–0.15 nM. A fragment from plasmid 1230 bound SATB1 more weakly, with a Kd of 1.6–2 nM (data not shown). Locations of these SATB1 binding sites are mapped in Fig. 4⇑. For comparison, the Kd values were 0.3–1 nM for in vitro SATB1 binding to six other MAR probes, including fragments from the IgH and β globin MARs (51), and ranged from 1 to 29 nM for 16 SATB1-binding sequences identified using chromatin immunoprecipitation studies with anti-SATB1 Ab and T cell nuclear extracts (39).
Both SATB1 binding sites are in Thermodyn predicted LFE regions, which is consistent with previous findings that SATB1 binds to DNA regions with a high propensity to base unpair (52). In the very high-affinity binding fragment from 1231, there are 3 stretches of ATCs of 28, 31, and 27 bases, separated by 8 and 18 bases, respectively (Fig. 5⇑C). Our other SATB1-binding site had two stretches of ATCs of 37 and 33 bases, separated by 33 bp (Fig. 5⇑D). The presence of multiple potential SATB1 sites separated by 25 bp or less has been noted in other fragments that bound SATB1 (39).
Another transcription factor that is potentially important for CD8A gene expression is GATA-3. We therefore analyzed the 19-kb region for putative GATA-3 sites using either the motif GATA or GATC with appropriate 5′ and 3′ bases according to Ko and Engel (53). We tested eight oligonucleotides for GATA-3 binding by EMSA analysis. We focused on oligonucleotide 5, which has two tandem GATA-3 binding sites separated by 3 bases and is located at a tissue-specific HS site. This oligonucleotide gave a band that was shifted with an anti-GATA-3 Ab. A mutated oligonucleotide with the AGATCA site mutation, M1, was able to compete fairly well as compared with the wild-type oligonucleotide (Fig. 6⇓A), indicating that the M1 site was not as critical for GATA-3 binding as the AGATAA site, which when mutated (M2 mutant) could not compete. To test the other oligonucleotides, we performed cold competition studies (Fig. 6⇓B). Oligonucleotide 7, with three potential GATA-3 sites, competed strongly and the others to a lesser extent. The IL-5 promoter oligonucleotide, which contains a known double binding site for GATA-3, competed for binding less well than did oligonucleotide 5. This may have to do with the arrangement of the two GATA-3 sites in both oligonucleotides. In contrast to the tandem GATA-3 sites in oligonucleotide 5, the two GATA-3 sites in the IL-5 oligonucleotide overlap and are on opposite strands. Interestingly, oligonucleotide 7 is located ∼400 bp 3′ of oligonucleotide 5 within the same MAR. This oligonucleotide, as well as oligonucleotides 2a, 2b, and 3, showed supershifts with the anti-GATA-3 Ab (data not shown).
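The first step of the motif screen described above, locating GATA and GATC cores on either strand, can be sketched as follows. This is a simplified illustration: it does not apply the 5′ and 3′ flanking-base criteria of Ko and Engel (53) that were used in the actual screen, and the function and variable names are hypothetical.

```python
# Core tetranucleotides as read on the top strand. TATC is the reverse
# complement of GATA; GATC is its own reverse complement, so a single
# occurrence represents the core on both strands.
CORES = {"GATA": "+", "TATC": "-", "GATC": "+/-"}


def find_gata_cores(seq):
    """Return (position, core, strand) for each GATA/GATC core in `seq`.

    Positions are 0-based offsets on the top strand.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 3):
        core = seq[i:i + 4]
        if core in CORES:
            hits.append((i, core, CORES[core]))
    return hits


# Top strand of oligonucleotide 5 as listed in Materials and Methods.
oligo5 = "caaaaattagatcaagtagataaaattta"
print(find_gata_cores(oligo5))
```

For oligonucleotide 5, the scan finds one GATC core and one GATA core, corresponding to the AGATCA and AGATAA sites that are separated by 3 bases.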
Gene transcription is dictated by regulatory elements to which transcription factors bind and by appropriate chromatin modification (epigenetic regulation). Because chromatin remodeling appears to be a critical component of gene transcription, DNase I HS sites, indicators of relatively open chromatin, are often used as signposts for regulatory elements. However, not all HS sites contain regulatory elements. Another indicator is colocalization of matrix attachment regions with HS sites. For example, there is a single MAR adjacent to both the Ig L chain enhancer (33) and the TCRβ enhancer (54). Also, MARs flank both the IgH (55) and the TCRδ (56) enhancers. Therefore, identifying regions of the human CD8 loci that contain HS sites near MARs seemed a logical approach to more rapidly identify strong candidates for regulatory elements.
By mapping DNase I HS sites and MARs in the human CD8 gene complex, we found two regions at the 3′ end of the CD8B gene in which both HS sites and strong MARs were colocalized. The tissue-specific DNase I HS cluster III, located between the last two CD8β exons, was adjacent to a strong MAR. Another tissue-specific HS cluster, IV, located 3′ of the last exon, was flanked by strong MARs. Further support for the potential importance of these regions in CD8 gene expression is the presence of a very strong SATB1 binding site in the MAR linked to HS cluster III and another site in the 5′ MAR linked to HS cluster IV. Although we have discussed these regions separately, they encompass an 11-kb region that may actually function as a locus control region-like regulatory unit.
Positive functional effects by MARs on enhancer activity have been observed (56, 57, 58, 59, 60). Studies have shown that MARs are required for demethylation of the Igκ locus (61, 62) and generation of long-range accessibility of chromatin in the Igμ locus (15, 63). An elegant study by Forrester et al. (16) demonstrated that MARs could facilitate long-range chromatin remodeling. They studied the Igμ enhancer, which is flanked by MARs, for its ability to activate the VH promoter over a distance of 150 bp or 1.2 kb upstream of a methylated or demethylated gene in stable transfectants of B cell lines. They found that the enhancer alone induced local chromatin remodeling, giving rise to a DNase I HS site and local demethylation, which was sufficient to activate transcription when the enhancer was 150 bp from the promoter. However, for enhancer-mediated promoter activation over a distance, both MARs were required for methylated μ gene expression. The MARs in combination with the μ enhancer could induce acetylation of histones at a distal position. This may explain why the μ MARs were found to predominantly function in germline transmission but not in transient transfection assays where chromatin remodeling does not need to occur (60). Because the HS sites linked to the strong MARs that we have identified are located at least 20 kb from either CD8 promoter, it is possible that the MARs associated with these putative enhancers could promote a similar long-range interaction.
The higher order chromatin structure that may be required for tissue-specific expression of the CD8 genes may in part be regulated by the presence of SATB1. SATB1-binding sequences isolated from a T cell line were localized to the nuclear matrix at the base of chromatin loops in vivo (39). However, in a breast cancer cell line, in which SATB1 is absent, this was not the case for at least one of these SATB1-binding sequences, indicating that in vivo, anchoring of certain MARs onto the nuclear matrix is cell type specific. The hypothesis was put forth that SATB1 binding to the base of chromatin loops in vivo would create a specific chromatin loop domain structure that was involved in T cell-specific gene regulation. The SATB1-binding MARs in the CD8 region may likewise form a chromatin loop structure that is tissue specific. The very high-affinity SATB1 binding site in the strong MAR located in the last intron of the CD8B gene and the other site in the strong MAR located 4.5 kb downstream could lead to the formation of a loop domain that might facilitate CD8 gene expression, possibly through long-range histone acetylation of the CD8 gene. Because of the differences in affinity for SATB1 between the two sites, the formation of specific loop structures may vary depending on the concentration of SATB1.
The other protein that we found to bind to the candidate regulatory region, GATA-3, is also likely to affect CD8A gene expression. In the mouse, a region in HS cluster II located 4–5 kb upstream of the CD8A gene (32) contains two GATA-3 binding sites which function in in vitro assays. These GATA-3 binding sites are within the murine CD8 gene thymocyte-specific enhancer, which also contains a SATB1-binding MAR (64). Interestingly, GATA-3 levels are high in CD4/CD8 thymocytes and then decline as they mature (65). Therefore, GATA-3 may be most important for CD8 expression in the double-positive T cell stage and would bind to the thymocyte-specific enhancer. The strong GATA-3 binding site that we found in the human CD8 tissue-specific HS cluster IV is potentially a functional site in that it has two GATA motifs 3 bp apart; double GATA motifs are often found in functional GATA sites.
Although it would be very informative to be able to compare the location of HS sites between the two species, exact comparisons are not possible because the human gene complex has ∼20 kb more DNA in the intergenic region (56 kb) compared with the murine region (36 kb), and the murine CD8B gene lacks exons VIII and IX. Despite this caveat, some sites may be comparable (Fig. 7⇓). For instance, the murine HS cluster IV at the end of the murine CD8B gene corresponds to our cluster II. In transgenic animal studies, this region contained regulatory element(s) which directed expression to double-positive thymocytes and mature CD8+ T cells, but not to CD8αα IEL (9, 12, 13).
Another murine regulatory region, associated with DNase HS cluster III, located in the intergenic region ∼16 kb upstream of the murine CD8A gene, contained an enhancer that was specific for mature CD8+ T cells and CD8αα IEL (10, 11). A fragment from murine DNase HS cluster II, just upstream of the murine CD8A gene, when analyzed alone did not display enhancer activity, but did direct expression to double-positive thymocytes when combined with cluster III (9). The location of human DNase HS clusters III, IV, and V in the intergenic region may be similar to these murine HS clusters in the intergenic region (8). The finding that one of the murine clusters did not function unless linked to another cluster indicates that a large regulatory unit composed of multiple HS clusters is likely to be regulating murine CD8A gene expression.
To address the functional significance of potential regulatory elements in the human CD8 gene complex, we have continued to test in transgenic animals portions of the 95-kb genomic CD8 fragment that gave tissue-specific expression in transgenic animals. Focusing on the region described in this article, we have linked cluster III plus MAR or clusters IV and V plus MARs to a genomic human CD8A marker gene and analyzed for expression in transgenic animals. High level expression in a small percentage of the murine CD8 T cells was noted with both constructs in all transgenic lines (two to five lines per transgene, our unpublished data). However, our failure to obtain expression in most of the murine CD8 T cells may reflect a requirement for both regions together to achieve transgene expression in large numbers of cells. Such studies are currently in progress.
We thank Dr. Carol Webb for the MAR assay protocol. We are very grateful to Dr. Anuradha Ray for the GATA-3 EMSA protocol, nuclear extracts, and helpful discussions. We also thank Lei Yan and Sarah Cho for technical assistance and Roberta O’Brien for secretarial assistance.
↵1 This work was supported by National Institutes of Health Grant CA48115 (to P.B.K.), National Research Service Award 5 F32 AI09700 (to L.J.K.), National Institutes of Health/National Institute of Diabetes and Digestive and Kidney Diseases Grants DK02467 and DK45676 (to J.M.G.), and National Institutes of Health Grant R01 CA39681 (to T.K.-S.).
↵2 Address correspondence and reprint requests to Dr. Paula B. Kavathas, 333 Cedar Street, P.O. Box 208035, New Haven, CT 06520-8035. E-mail address:
↵3 Abbreviations used in this paper: IEL, intraepithelial lymphocyte; MAR, matrix attachment region; HS, hypersensitive; LFE, low free energy; GC, guanine cytosine.
- Received December 10, 2001.
- Accepted February 20, 2002.
- Copyright © 2002 by The American Association of Immunologists