Division of Biology, California Institute of Technology, Pasadena, CA 91125
| Abstract |
|---|
| Introduction |
|---|
Two transitions in early T cell development are of particular interest for lineage choice mechanisms: the onset of T lineage gene expression ("specification"), and the final exclusion of any fate except a T cell fate ("commitment"). Both transitions occur among intrathymic, early T lineage cell populations (Pro-T cells), which are still negative for the mature T cell markers CD4 and CD8 and do not yet express TCRs. These double-negative (DN) cells within the mouse thymus are divided into four stages on the basis of their expression of the surface markers Kit, CD25, and CD44 (reviewed in Refs. 6, 7, 8). The first thymocyte population, DN1, maintains much lineage plasticity and, under special conditions, is capable of producing macrophages, NK cells, or dendritic cells, with a minute subset apparently capable of generating B cells as well (9, 10, 11, 12). It is not yet clear whether DN1 cells are distinct from prethymic precursors in gene expression pattern. However, as cells enter the next stage, DN2, they express sharply increased levels of Pro-T cell genes such as those encoding pTα, CD3ε, CD25, Rag1, and IL-7Rα (CD127) (10, 13, 14, 15, 16), and some rearrangement begins at the DJβ and VJγ TCR loci (17, 18). In hemopoietic precursors (derived from fetal liver) that are differentiating in vitro in response to Notch/Delta signaling, the first-appearing DN2 phenotype cells display the same dramatic increase in expression of these genes (19). DN2 cells have undergone "specification" but are not yet committed to the T lymphocyte pathway; a high proportion of DN2 cells are still able to differentiate into NK cells, macrophages, or dendritic cells (10, 20, 21, 22, 23). At the DN3 stage, thymocytes stop dividing, further increase expression of the Pro-T differentiation genes as well as Notch target genes (24), and undergo extensive TCR rearrangements. Only at this stage do they become committed to a T cell fate in vivo. Cells only progress beyond the DN3 stage through successful TCR gene rearrangement and TCR-dependent selection, at which time they graduate from Pro-T cell status, to give rise to up to five types of T cells: αβ CD4, αβ CD8, γδ, NKT, or regulatory T cells.
Of all these developmental transitions, surprisingly little is known about the stages encompassing "T lineage specification," that is, the DN1 to DN2 transition. The regulatory participants in these early stages have not been sufficiently characterized to explain the outcome, although Notch/Delta signaling plays a role (19, 25). Traditional methods to identify all the transcription factors that play key roles in early stages of T lineage specification have found limited success, in part because transcription factors are typically present in low copy numbers. T cell precursors at the earliest stages are represented in vivo at any one time by tiny numbers of cells, providing very limiting material for standard microarray analysis. In addition, small fold changes in transcription factor abundance or changes in transcription factor ratios may generate dramatic shifts in cell state. Identification of truly novel regulatory factors and novel isoforms of known factors has also been limited by the comprehensiveness of the microarrays used (26, 27, 28, 29) and by the microarrays' nucleotide probe design.
To circumvent these problems, we used a subtractive hybridization technique (30) to probe a mouse Pro-T cell cDNA macroarray library. The subtractive technique allowed us to enrich a Pro-T cell probe for message not shared by progenitor/premyeloid cells and to identify genes (known and previously unknown) that might be specifically up-regulated during the initiation of the T lineage program. Clones selected by the subtraction were sequenced and their patterns of expression characterized in detail using sensitive, quantitative real-time PCR (qRT-PCR) on a range of highly purified cell populations. The Pro-T cell library was generated by random priming of mRNA from SCID thymocytes, which consist of DN1-3 stage Pro-T cells and NK precursors (31). The resulting macroarray library represents the actual spectrum of transcripts in the DN2 and DN3 cell populations (those immediately preceding lineage commitment) and also contains multiple clones of genes with abundant transcripts, providing opportunities to sample alternate splice variants.
Our subtraction protocol has identified genes that are specifically enriched in T-lineage as opposed to early B or myeloid lineage precursors. Enriched genes include novel transcription factor candidates, chromatin remodeling factors, RNA binding molecules and helicases, a select group of signaling molecules and adaptors, and novel or functionally uncharacterized genes. In this study, we present the resulting expression profiles of >35 of these genes expressed during T lineage specification. A remarkable feature of the whole ensemble of these Pro-T cell genes is the high frequency of "legacy" genes that are expressed strongly in both stem cells and Pro-T cells, although down-regulated elsewhere. In this context, the very select group of regulatory genes that are specifically induced coincidentally with T lineage specification takes on an unexpected significance.
| Materials and Methods |
|---|
C57BL/6J, B6.CB17-Prkdcscid/SzJ (B6-scid) (The Jackson Laboratory) and B6-Rag2null mice (originally from E. Palmer, Basel Institute, Basel, Switzerland) were bred and maintained in specific pathogen-free facilities at Caltech.
Thymus and BM samples were taken from animals 5-7 wk old. The animals used were bred and maintained under sterile conditions at Caltech.
cDNA library
The C.B-17-scid thymocyte, random-primed, cDNA library was constructed in the pSPORT1 vector (Invitrogen Life Technologies), and was arrayed and spotted at high density onto Hybond-N+ nylon filters (Amersham Biosciences) using the Q-BOT robot (Genetix) as described previously (31).
Cell populations for library generation and library screening
The two types of cells used as sources of RNA for the subtraction protocol were a bulk population of Pro-T cells and a population of progenitor/premyeloid cells. To obtain large numbers of Pro-T cells in the DN1-DN3 stages, we took advantage of the Rag2 knockout mouse, in which thymocyte development arrests at DN3. In the wild-type mouse thymus, DN3 cells account for only 1% of thymocytes, but even without sorting, Rag2 knockout thymocytes consist of 90% DN3 Pro-T cells, with the remaining cells being DN1, DN2, NK, or thymic stromal cells. Because these cells were not pure Pro-T cells, we refer to them in the text as "Pro-T plus." A progenitor/premyeloid population was obtained from Lin−Kit+ BM cells matured in culture toward a myeloid fate. Specifically, BM cells that were Kit+Gr1−CD11b−Ter119−CD19− from Rag2ko mice were cultured for 42 h at 37°C in 5% CO2 in IMDM with 10% heat-inactivated FCS supplemented with IL-3 (100 µl of WEHI-3B cell supernatant/ml medium), stem cell factor (Kit ligand) (100 µl of BHK-MKL cell supernatant/ml medium), and 10 ng/ml rIL-6 (PeproTech). At the time of harvest, the morphological appearance of the cells ranged from undifferentiated, blast-like cells to mature granulocytes.
Cell populations for quantitative RNA expression analysis
In addition to the subtraction populations, sorted cell populations were obtained for analysis by qRT-PCR. Two hemopoietic progenitor populations, a Pro-B population, and a population of myeloid cells were all sorted from the BM of Rag2ko mice. LSK CD27− cells were Kit+Sca-1+CD27−Gr1−CD11b−Ter119−CD19−. LSK CD27+ cells were Kit+Sca-1+CD27+Gr1−CD11b−Ter119−CD19−. Pro-B cells were CD19+Kit+/−Gr1−CD11b−Ter119−. Sorted BM myeloid cells were Gr1+CD11b+Kit−Ter119−.
Initial expression screening made use of the unsorted Rag2ko thymocytes called Pro-T plus. For more detailed analysis, five populations of DN Pro-T cells were purified from C57BL/6 mouse thymi by cell sorting, essentially as described previously (13, 24). Each of these DN subsets is CD4−CD8−CD3ε−Ter119−F4/80− and Gr1−. DN1 cells are Kit+CD44+CD25−. DN2 cells are Kit+CD44+CD25+. DN3a cells are KitlowCD44−CD25+CD27−. DN3b cells are CD44−CD25+CD27+, and DN4 cells are CD44−CD25−. All Abs used in this study were obtained from eBioscience or BD Pharmingen.
Thymocytes and BM cells were obtained from animals immediately after euthanasia. Cells were incubated in CBSS (5.4 mM KCl, 0.3 mM Na2HPO4, 0.4 mM KH2PO4, 4.2 mM NaHCO3, 137 mM NaCl, and 5.6 mM D-glucose (pH 7.4))/1% BSA with clone 2.4G2 anti-CD32/CD16 (FcγRIII/II) supernatant on ice for 10 min, followed by washing and addition of Abs for staining. Cells stained with biotin-conjugated Abs were washed through a layer of FCS before staining with streptavidin-PECy5 (eBioscience). Stained cells were sorted using a FACS Aria cell sorter (BD Immunocytometry Systems). All sorted fractions were reanalyzed immediately for purity, and all fractions used here were at least 96% pure.
Preparation of fetal liver cells as input for OP9-DL1 and OP9-control cultures is described below.
Generation of subtractive probe
A Pro-T plus unselected cDNA probe and a T lineage-enriched subtracted probe were generated from the pool of enriched Pro-T cells (1 x 10^7 cells; Pro-T plus) and the population of myeloid and multipotent progenitors (9 x 10^6 cells) described above. RNA was isolated from each population of cells by the Qiagen RNeasy Lipid Minikit (Qiagen), and the RNA was analyzed for purity and integrity by Agilent Bioanalyzer and RNA 6000 Nanochips (Agilent). Messenger RNA was isolated from total RNA by the Ambion Poly(A) Purist Kit (Ambion). The mRNA was also evaluated for purity and concentration by the Bioanalyzer.
The subtraction method was adapted from Rast et al. (30). cDNA was synthesized using enzymes and buffers from a Clontech Marathon cDNA synthesis kit (BD Biosciences), but with the LT7 random-BT primer: 5'-(biotin)-CGGAGGTAATACGACTCACTATAGGGAGNNNNNN-3' (34 nt). Qiagen PCR purification columns were used to purify samples between first-strand and second-strand synthesis stages.
Different linkers were used for Selectate (the Pro-T plus-derived cDNA) and Driver (the progenitor/myeloid-derived cDNA) to avoid nonspecific subtraction. Linkers contain a 3' dideoxy residue to prevent filling in of the overhang, and a 5' phosphate for blunt-ended ligation to the cDNA. The Selectate linker sequences were 5'-GGGTGCTGTATTGTGTACTTGAACGGGCGGCCGCA-3' and 3'-dideoxy-CGCCCGCCGGCGT-P-5'. The Driver linker sequences were 5'-GCCAACGTATGTAAGGTTGAGTTCCGGGCAGGT-3' and 3'-dideoxy-CCCGTCCA-P-5'. Linkers were annealed by placing the linker pairs in a 1:1 molar ratio at a concentration totaling 1 µg/µl in 10 mM Tris (pH 7.9) and 100 mM NaCl, heating in a heating block to 95°C for 5 min, then turning off the block and allowing the linkers to anneal as the block cooled to room temperature (30 min). Ligation efficiency was evaluated by comparing their electrophoresis migration on a 2% agarose gel relative to the untreated mixture. These linkers were ligated to either the Selectate or Driver cDNA for 16 h at 16°C with DNA ligase.
cDNA with linkers attached was PCR amplified with LT7 primer (5'-CGGAGGTAATACGACTCACTATAGG-3') and a primer specific for either the Selectate linker (5'-GGGTGCTGTATTGTGTACTTGAACG-3') or for the Driver linker (5'-GCCAACGTATGTAAGGTTGAGTTCC-3') to produce 600 ng of product. The resulting Selectate was size selected for 300- to 500-bp product by electrophoresis of the PCR in an agarose gel, excising the appropriate region of the gel, and electroeluting cDNA from the gel. The electroeluate was precipitated and resuspended in 50 µl of water or T low E (10 mM Tris and 0.1 mM EDTA, pH 7.8). Driver product was precipitated and resuspended in 16 µl of water.
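The primer design described above can be checked with a short sketch: each amplification primer should match the 5' end of the strand it is meant to prime from. The sequences are copied from the text; the function name `is_5prime_prefix` is a hypothetical helper introduced only for illustration, and the check ignores the random hexamer (N) tail of the LT7 random-BT primer.

```python
# Sequences as given in the text (5' to 3').
LT7_RANDOM_BT = "CGGAGGTAATACGACTCACTATAGGGAGNNNNNN"
LT7_PRIMER = "CGGAGGTAATACGACTCACTATAGG"
SELECTATE_LINKER = "GGGTGCTGTATTGTGTACTTGAACGGGCGGCCGCA"
SELECTATE_PRIMER = "GGGTGCTGTATTGTGTACTTGAACG"
DRIVER_LINKER = "GCCAACGTATGTAAGGTTGAGTTCCGGGCAGGT"
DRIVER_PRIMER = "GCCAACGTATGTAAGGTTGAGTTCC"


def is_5prime_prefix(primer: str, template: str) -> bool:
    """Return True when the primer is identical to the 5' end of the template."""
    return template.startswith(primer)


# Each primer is paired with the longer oligo it is derived from.
PAIRS = {
    "LT7": (LT7_PRIMER, LT7_RANDOM_BT),
    "Selectate": (SELECTATE_PRIMER, SELECTATE_LINKER),
    "Driver": (DRIVER_PRIMER, DRIVER_LINKER),
}

results = {name: is_5prime_prefix(p, t) for name, (p, t) in PAIRS.items()}
```

Because the Selectate and Driver primers match only their own linkers, the two cDNA pools can be amplified independently, which is consistent with the stated purpose of using different linkers.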
Size-selected Selectate was amplified by PCR (primers listed above), and 1 µg of Selectate was set aside for production of the unsubtracted library probe. Selectate (3 µg) was subjected to single-strand purification by Dynal Streptavidin beads, according to the manufacturer's instructions (Invitrogen Life Technologies). RNase-free technique was used from this point forward for both Selectate and Driver. The Ambion MEGAshortscript kit (Ambion) was used according to the manufacturer's instructions to transcribe the Driver cDNA into RNA. Single-stranded Selectate DNA (200 ng) from Pro-T plus cells was mixed with Driver RNA (30 µg) in a 10-µl total volume, denatured at 95°C, iced, then hybridized at 65°C for 40 h. Double-stranded and single-stranded products were separated by hydroxylapatite chromatography (30, 32), and the eluate containing the single-stranded product was desalted and concentrated. The single-stranded product was used to manufacture the subtracted, radioactive probe by Ambion Maxiscript kit using Amersham 800 Ci/mmol 32P-UTP.
Subtractive hybridization protocol
Macroarray filters were sequentially hybridized with cDNA from Pro-T plus Rag2ko thymocytes, stripped of probe, then hybridized with subtracted probe, i.e., probe enriched for mRNA that was not shared by the progenitor/premyeloid population. Hybridization intensity for each probe was measured by a PhosphorImager (Molecular Dynamics, GE Healthcare) using BioArray software (Genetix). Representative data are shown in Fig. 1.
(f) is the whitening filter, S(f) is the signal in Fourier space, and N(f) is the noise in Fourier space. A Wiener filter is often used to filter random, usually small-scale, noise from data leaving mostly the large-scale correlations; this is its normal use in image analysis. However, because our clones were randomly placed on the blots, the only correlations expected in our data are the relatively small-scale ones between spot pairs. Hence, all large-scale correlations are likely due to systematic noise such as inhomogeneous probing or washing of the blots. Therefore, we used the whitening filter to remove such correlations (J. E. Moore, unpublished results). The ratio of the average spot-pair hybridization intensity after subtraction to that before subtraction was termed enrichment: (intensity for subtracted probe) ÷ (intensity for the unsubtracted probe) = enrichment. The logarithms of the enrichments for 73,728 spot pairs were calculated, and a clone was deemed to be significantly enriched when the logarithm of its enrichment was more than three SDs above the mode of its blot (Fig. 1C). Clones more than four SDs above the mode were selected for special attention (see Table I). The modes were calculated by a process called "estimating the rate of an inhomogeneous Poisson process by Jth waiting times" (33), briefly outlined here. For each blot, the logarithms of the enrichments were ordered. A window size, J, was chosen, which for these calculations was 1/24 of the number of spot pairs on a blot, or 768; other reasonable values of J do not appreciably change the estimates. An integer, I, was chosen so as to minimize the difference between the Ith and (I + J)th logarithms, and the mode was estimated by averaging these.
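The enrichment call and the mode estimate outlined above can be sketched as follows. This is an illustration under stated assumptions, not the authors' software: the base-10 logarithm, the use of the population SD of all log enrichments on a blot, and the function names `estimate_mode` and `enriched_clones` are all choices made here, and the whitening step is omitted.

```python
import math
import statistics


def estimate_mode(values, window):
    """Mode estimate from the narrowest window of `window` ordered steps.

    The values are sorted, the index I that minimizes x[I + J] - x[I] is
    found (J = window), and the mode is the average of those two values.
    """
    ordered = sorted(values)
    if not 0 < window < len(ordered):
        raise ValueError("window must be between 1 and len(values) - 1")
    best = min(range(len(ordered) - window),
               key=lambda i: ordered[i + window] - ordered[i])
    return 0.5 * (ordered[best] + ordered[best + window])


def enriched_clones(subtracted, unsubtracted, n_sd=3.0, window_divisor=24):
    """Return (indices of enriched spot pairs, mode, cutoff) for one blot.

    Assumption: the SD is taken over all log enrichments on the blot.
    """
    logs = [math.log10(s / u) for s, u in zip(subtracted, unsubtracted)]
    window = max(1, len(logs) // window_divisor)  # J = 1/24 of the spot pairs
    mode = estimate_mode(logs, window)
    cutoff = mode + n_sd * statistics.pstdev(logs)
    hits = [i for i, value in enumerate(logs) if value > cutoff]
    return hits, mode, cutoff
```

With 18,432 spot pairs on a blot, `window_divisor=24` gives J = 768, matching the value quoted in the text; passing `n_sd=4.0` reproduces the stricter threshold used to flag clones for special attention.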
GOToolBox analysis
The GO-Stats function of the GOToolBox website (35) was used to perform a hypergeometric analysis of statistically relevant over- or under-represented terms within our data set as compared with the Mouse Genome Informatics database of genes. The Benjamini & Hochberg correction for multiple testing was applied. Selected results of searches in the categories of Biological Processes and Cellular Components are reported herein.
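For readers unfamiliar with this kind of test, the over-representation calculation can be sketched with the standard library alone. This is a generic illustration of a hypergeometric upper-tail test followed by the Benjamini-Hochberg adjustment; it is not GOToolBox code, and the function names and the example numbers in the usage note are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for n genes drawn from a reference set of N genes,
    of which K are annotated with the GO term of interest."""
    upper = min(n, K)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

As a usage example, `hypergeom_upper_tail(k=20, n=90, K=1500, N=20000)` would give the probability of seeing at least 20 annotated genes among 90 submitted genes by chance, given those made-up reference counts. Under-representation would use the lower tail instead.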
Bioinformatic databases
The following databases were used: www.ncbi.nlm.nih.gov, http://genome.ucsc.edu, www.ensembl.org/Mus_musculus, http://crfb.univ-mrs.fr/GOToolBox/index.php, and www.informatics.jax.org.
Coculture of fetal liver cells with OP9 cells
Hemopoietic progenitors cocultured with BM stromal cells (OP9 cell line) will develop into B lymphocytes in vitro. When OP9 stromal cells are transfected to stably express the Notch ligand Delta-like 1 (OP9-DL1), progenitor cells will develop into T lymphocytes in coculture (36). Mouse fetal liver cells (containing hemopoietic progenitors) were cocultured with OP9-control or OP9-DL1 cells exactly as described previously (19). In short, Kit+Lin− (Lin = Gr1, Ter119, F4/80, CD19) cells from day 14 to 14.5 mouse embryo livers were obtained by FACS sorting. Kit+Lin− fetal liver cells were cocultured with OP9 control cells or with OP9-DL1 cells (36). Cocultures were harvested for RNA analysis by forceful pipetting at indicated time points. In some experiments, to test the effect of delayed addition or withdrawal of Notch signals, progenitor cells were transferred to secondary cultures at day 4. OP9-control and OP9-DL1 cocultures were harvested, Kit+CD27+Lin− cells were isolated from each culture by sorting, and these were each split and used to seed fresh monolayers of OP9 control and OP9-DL1 cells, to be harvested at the indicated later time points.
To compare the time courses of T lineage differentiation from distinct precursor subsets, Kit+Lin− fetal liver cells were fractionated into Kit+Sca-1+ ("LSK"), Kit+Sca-1lowCD27+Flk2/Flt3 (CD135)+IL-7Rα (CD127)− ("Flk+"), and Kit+Sca-1lowCD27+CD135+CD127+ ("CLP-like") subsets, as described elsewhere (T. Taghon, M. A. Yui, and E. V. Rothenberg, submitted for publication). These subsets were then cocultured with OP9-control or OP9-DL1 cells for 17 days before harvesting for RNA.
Quantitative real-time PCR
qRT-PCR was performed on diluted samples of cDNA using SYBR Green PCR Master Mix in an ABI PRISM 7700 Sequence Detector (Applied Biosystems). In all figures comparing expression levels of multiple genes, the measurements for all genes shown were conducted on the same cDNA samples. The ΔCt method was used for all expression measurements, with a fixed threshold to enable direct comparison between test genes and the GAPDH standard. Primers were designed, using Primer3 software (37), to have optimal melting temperatures and to cross introns. The primers were BLAST tested for gene specificity before being synthesized (Operon Biotechnologies). Each primer pair was evaluated for acceptable dose-response titration slopes and amplification. Primer sequences are as follows: Ablim, forward, CTGGCAGCTCAGAGGAGTTC, and reverse, CGCAGCTGGGATGATAATG; Aff3, forward, CAACAGAGAGCAGCGCAACA, and reverse, CCCGTCTCCATATTGCACACTT; AI449175, forward, GCTCCTTCCCAGAAGACTCTC, and reverse, TCAGGCTCTTCAAAATGGTCTT; Akap8, forward, AAATTGAGAAACGGCGTCAG, and reverse, AATGTGCGGCTTCAATCTTT;
β-actin, forward, ACACCCGCCACCAG, and reverse, TACAGCCCGGGGAG; Bat2, forward, ATACTGCCACAAGCCGAAAG, and reverse, TCAGGTCCACTCCACTGTCA; Bcl11a, forward, GTCTGCACACGGAGCTCTAA, and reverse, CACTGGTGAATGGCTGTTTG; Bcl11b, forward, GGGCGATGCCAGAATAGAT, and reverse, GGTAGCCTCCACATGGTCAG; Crsp7, forward, ATGGTGGCAGTGTTGGAAGT, and reverse, GGTTTTCTTGCGGACATCAT; Ctdsp1, forward, CCAGTGAACAATGCGGACTT, and reverse, CCCATTCGCTGTAGGAACTC; Ddx17, forward, AGACAAAGAGGCGCTGTGAT, and reverse, CCTTTCCAGATCGGAACTCA; Ddx19b, forward, GCCAAGTAGAGCCTGCAAAC, and reverse, ACTTGCCCATCTGCTCAATC; Deltex1, forward, GAGGATGTGGTTCGGAGGTA, and reverse, CCCTCATAGCCAGATGCTGT; Deltex3L, forward, CGGACACCTACGAGGTGAAG, and reverse, TTTCCAGGACAATGGTCACA; Eva1, forward, TCACAGCCCTTTGTCCTACA, and reverse, AGTTAGCGCATCTCCCACAG; FgfrL1, forward, TGCAAATACCATGGGCTACA, and reverse, GCTTGTGGATGACGATGAAG; Fkbp5, forward, AACGAAGGAGCAACGGTAAA, and reverse, AATCGGAATGTCGTGGTCTT; FUS, forward, CAGCAACGAGCTGGAGACTG, and reverse, TCTGGCTTAGGTGCCTTACACTG; GAPDH, forward, ACTCCACTCACGGCAAATTCA, and reverse, GCCTCACCCCATTTGATGTT; GATA3, forward, GAGGTGGTGTCGCATTCCAA, and reverse, TTTCACAGCACTAGAGACCCTGTTA; Gpr56, forward, TTGCAGCAGCTTAGCAGGTA, and reverse, GTCTCCCAGGAAGCTCACAG; Grap, forward, GTGTGACGAGCAACCACTGA, and reverse, TCCACAACTTCCACGATGTC; Heb-alternate, forward, GTGCTTATCCTGTCCCTGGAATG, and reverse, TGGCTTGGGAGATGGGTAAC; Heb-canonical, forward, GAGAAGAAGACCGCTCCATGAT, and reverse, TGGCTTGGGAGATGGGTAAC; Helz, forward, TGATGGGCTATTTGGGTGTT, and reverse, CTGGAGGGCCATGTCATAGT; Huwe1, forward, GGTTGCTGCCACAGCTATTT, and reverse, CACCAACCTTTGCTGGAGAT; Ldb1, forward, TGAAGTTGGCTCCACCTTAGT, and reverse, GCTCCTTCGGCGAGTACAG; MLL1, forward, TGCCCATAGCCCAT, and reverse, TCTGTGAATGAGGC; MLL2, forward, GTGCAGCAGAAGATGGTGAA, and reverse, AGAGCAGCCAGCAGGTCTAA; Myb, forward, AGCGGGAATCGGATGAATCT, and reverse, GAGCAGAAGAAGTTTCCCGATTT; Mxd4, forward, CCGAACAACAGGTCTTCACA, and reverse, CGCTTCAGAAGGCTCAGAGT; Prss16, forward, CCCAAACAAGGGTGGTTAGA, and reverse, CTTGGCCAGTTCTGTGTTGA; Ptpn7, forward, CTTACACGCTGGACGCTACA, and reverse, TCCAGGTCTTCAGGGTTGAC; PU.1, forward, GCGCTGGCACCTTTTTGTAT, and reverse, CAATAATTTTACTTGTCTTTAGTGGTTA; Rab2, forward, TGCCAAGACTGCGTCTAATG, and reverse, GCTGAGGGCCAATTTTAATG; Rabgap1, forward, CCTCCCAGTGGTTCCTTACA, and reverse, GGGCGACATTAAAGATGACAC; Scl, forward, CAACAACAACCGGGTGAAGA, and reverse, ATTCTGCTGCCTCCATCGTT; Senp2, forward, TAAGGTTCTCGGCACCATTC, and reverse, GGCTGGGATCTCATCAGTGT; Spatial, forward, GACACAAGAGGCAGCCTACAG, and reverse, GGATGCACCAGGAGGACTT; Tcf7 (aka T cell factor-1 (TCF-1)), forward, CAAGGCAGAGAAGGAGGCTAAG, and reverse, GGCAGCGCTCTCCTTGAG; Tmem131, forward, GCCCTCCCTAGACCCAACTG, and reverse, GCTTCCAAGTAGGCTGTTCCA; Trim44, forward, TCTGTGTCCTGTGTCCAGTCATT, and reverse, CAGTCCACCGGAATCTTTGC; Zcchc11, forward, TGACAGTGCTTCAGGGATTG, and reverse, TAGCCTCTGCTCAGGTGTCA; Zfp27, forward, TTTTTGCCAGCAGCAGATAG, and reverse, CTGCACCACATCCCGATAG; Zfp30, forward, TGCCTACGAGAGGGATCTGT, and reverse, CCTTGTTCCAACAGGGTGA; and Zfp109, forward, GCTGCTCAGAGGAAGCTGTA, and reverse, CCCCAGTGAAAGGCATCTTA.
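Threshold-cycle (Ct) values measured against the GAPDH standard can be converted to relative expression as sketched below. The sketch assumes an ideal amplification efficiency of 2 (perfect doubling per cycle), which the text does not state; the function name and the Ct values in the usage note are hypothetical.

```python
def relative_expression(ct_gene, ct_reference, efficiency=2.0):
    """Expression of a test gene relative to the reference gene.

    delta_ct = Ct(gene) - Ct(reference); a higher Ct means less template,
    so relative expression is efficiency ** (-delta_ct).
    Assumption: the same fixed threshold was used for both genes.
    """
    delta_ct = ct_gene - ct_reference
    return efficiency ** (-delta_ct)


def fold_difference(sample_a, sample_b):
    """Fold difference in GAPDH-normalized expression between two samples.

    Each sample is a (ct_gene, ct_reference) tuple.
    """
    return relative_expression(*sample_a) / relative_expression(*sample_b)
```

For example, a gene with Ct 25 in a sample where GAPDH has Ct 20 is expressed at 2^-5, or about 3% of the GAPDH level, under the perfect-doubling assumption.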
Data display as heat maps
The heat maps were generated in the Excel program by arranging expression data in a table with the genes forming the rows and the conditions forming the columns. For each gene, the expression data are normalized by dividing by the geometric mean of that gene's maximum and minimum expression values. All of the normalized values between 1/√3 and √3 are assigned to the middle bin, yellow. Each step from dark blue (lowest expression) to red (highest expression) in the rainbow represents a 3-fold decrease or increase, respectively, in the boundaries of the bin, except for the final bins; dark blue ranges from 0 to 1/(27√3), and red ranges from 27√3 to infinity.
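The binning rule can be written compactly, as in the following sketch. It assumes the middle bin spans 1/√3 to √3 (a 3-fold bin centred on 1) and that there are four 3-fold steps on either side of it, which is what end boundaries at 27√3 and 1/(27√3) imply. Bins are returned as integers from -4 (dark blue) through 0 (yellow) to +4 (red); the function names are hypothetical, and expression values are assumed to be strictly positive.

```python
import math


def normalize(values):
    """Divide each value by the geometric mean of the gene's max and min."""
    scale = math.sqrt(max(values) * min(values))
    return [v / scale for v in values]


def bin_index(normalized, max_step=4):
    """Map a normalized value to a 3-fold bin centred on 1.

    Bin k covers 3**(k - 0.5) up to 3**(k + 0.5); values beyond the last
    boundary on either side are clamped into the end bins.
    """
    if normalized <= 0:
        return -max_step
    step = math.floor(math.log(normalized, 3) + 0.5)
    return max(-max_step, min(max_step, step))


def heat_map_row(values):
    """Bin indices for one gene across all conditions."""
    return [bin_index(v) for v in normalize(values)]
```

Because each gene is scaled by the geometric mean of its own extremes, its highest and lowest conditions always land the same number of bins above and below yellow, so the colors report the dynamic range of each gene rather than absolute abundance.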
| Results |
|---|
To identify previously uncharacterized genes that might act during the earliest stages of the T cell developmental program, we performed a hybridization screen for T-lineage-enriched transcripts in a macroarrayed cDNA library of 70,000 clones from mouse Pro-T (DN1-DN3) and pre-NK cells. This library, generated in our lab (31), had yielded novel and informative Pro-T cell transcripts before (38) and provided an opportunity to recover unannotated genes as well as alternative transcripts that might not be represented in microarrays. To establish a baseline, the library was initially probed with Rag2ko thymocyte cDNA (Pro-T plus; because this mutation prevents β-selection, this population is primarily DN3 cells). It was then probed with "subtracted probe," consisting of Rag2ko thymocyte cDNA from which message shared by a myeloid-biased progenitor population was subtracted. T lineage-specific cDNAs were those that hybridized with specifically increased intensities to the subtraction-enriched probe (Fig. 1). Clones thus identified as enriched (see Materials and Methods) were sequenced and mapped to their coordinates in the mouse genomic sequence. More than 1000 sequences were analyzed to retrieve genes specific to early T lineage cells.
An early indication of the robustness of the subtraction was that 348 clones, one third of the enriched clones, were found to represent genes already known to be up-regulated in or unique to Pro-T cells (Table I). One of these genes, Tcf7 (TCF-1), encodes a transcription factor with known essential roles in T cell development (39, 40, 41), while the others encode pre-TCR and TCR components, signaling molecules (Lck and LAT), the mutagenic DNA polymerase DNTT (terminal deoxynucleotidyl transferase), and distinctive cell surface markers of Pro-T cells. We excluded from consideration 217 clones that represented ribosomal RNA, 51 clones of mitochondrial origin, and 120 clones with significant alignments to short or long interspersed nuclear elements (SINEs or LINEs). Also, 154 of the clones (24%) aligned to unidentified RIKEN sequences in the databases or were not significantly similar to any known sequence in the bacterial or animal NCBI database. (These sequences have been reported and are presented in Supplementary Table I.) The 92 genes represented by the remaining clones are the focus of this report.
Candidate genes for early T cell function
Table I lists the transcripts that were identified as enriched in Pro-T plus cells relative to premyeloid cells by 3 or 4 SDs above the mode. These genes were identified by high-quality sequence matches (typically >500 bp, all were >100 bp) to documented exons. In addition, select matches that include intronic or immediately flanking sequences are listed, as long as they did not include SINE or LINE homologies, because novel alternative splicing, polyadenylation, and promoter use isoforms would also be of interest.
Ninety of the 92 genes listed in Table I were submitted to GOToolBox (35) for classification by Gene Ontology. Hypergeometric statistical analysis was performed using the GOToolBox GO-Stats function. Only Spatial and Prss16 were omitted from this analysis, for reasons described below. Selected GOToolBox results are listed in Table II. Among the subtraction-enriched transcripts, those encoding transcriptional regulators were markedly over-represented relative to the Mouse Genome Informatics (MGI) database (p < 4 x 10^-6). Also significantly over-represented in the enriched data set were transcripts predicted to encode components of the ubiquitin cycle (p = 0.0012), Wnt receptor signaling components (p < 3 x 10^-6), and proteins with a nuclear localization (p < 4 x 10^-7). Wnt signaling and Notch signaling components as well as transcriptional regulators generally were of interest because of the critical roles of these signaling pathways in early T cell development (42, 43, 44, 45). These results suggested that our subtraction-enriched clones could be a rich source of potential regulatory genes for the early stages of T cell development.
Comparison of gene expression patterns between Pro-T and Pro-B cells and multilineage progenitors
The significance of new candidate regulatory genes for T cell lineage determination could be quite different depending on their expression pattern in the above cell populations. We identified three general categories of expression: 1) inherited from a stem-cell precursor, a category we termed "legacy"; 2) expressed in a general "pan-lymphoid" pattern; and 3) actually induced in developing precursors through a T lineage-specific process. We analyzed the patterns of expression of 43 genes from the screen in sorted populations of wild-type mouse hemopoietic cells. By using gene-specific qRT-PCR, we were able to compare expression quantitatively in highly purified, sorted cells from very small populations. Pro-T plus cell populations were compared not only with sorted Gr-1+Mac-1+ myeloid cells and CD19+ Pro-B cells from Rag-knockout BM, but also with sorted populations of enriched hemopoietic stem/progenitor cells (Lin−Kit+Sca-1+CD27−) and multipotent lymphomyeloid progenitors (Lin−Kit+Sca-1+CD27+) (19, 46). These results are presented in Fig. 2 as qRT-PCR graphs and in Fig. 3 as a clustered heat map.
The populations used for this analysis were validated by analyses of regulatory landmarks for the stem cell to T cell transition (Fig. 2A), namely, the genes encoding the stem cell transcription factor SCL/Tal1, the T cell transcription factor GATA3, the myeloid transcription factor PU.1, the Bcl11b relative that is required for B cell development, Bcl11a, and the direct Notch target gene Deltex1, which encodes an E3 ubiquitin ligase (47) (Fig. 2A). These showed the expected patterns of expression for the cell populations. SCL was expressed highly in the progenitor subsets but not the others; PU.1 was expressed highly in the progenitors and Pro-B cells and was further enriched in myeloid cells, but down-regulated in the Pro-T cells; GATA3 was up-regulated specifically in the Pro-T cells; and Bcl11a was highest in the Pro-B cells but specifically down-regulated in the Pro-T and myeloid cells.
As shown in Fig. 2, the majority of the subtraction-selected genes was verified to show at least 2-fold more expression in Pro-T cells than in the sorted BM myeloid cells (yellow or white vs pink bars, Fig. 2), and in most cases the difference was at least 10-fold (please note log scale in Fig. 2). The exceptions were Senp2, Ctdsp1, and Rab2, which showed weak enrichment if any. Trappc2l had <3-fold enrichment. The genes that were T-enriched showed various patterns of expression relative to Pro-B cells (blue bars) and to the stem and multipotent progenitor cells (green bars). However, as will be described below, remarkably few of these genes were truly T lineage restricted.
Dominance of "legacy genes" and pan-lymphoid genes in Pro-T vs premyeloid-enriched gene set
Many of the genes that were differentially expressed between Pro-T and premyeloid cells were expressed at similar levels in Pro-T cells and in Pro-B cells (≤2x difference and/or within error), implying functions shared in early T and B lineage development. These pan-lymphoid genes include Aff3 (LAF4), Crsp7, Mll1, Mll2, Mxd4, Zfp27, Ddx17, Trim44, Zcchc11, Gpr56, Grap, and Akap8. Although boundaries between classes are not sharply defined, Myb, FUS, Ablim, Huwe1, and Bat2 could be considered pan-lymphoid as well. All of these genes except Ablim and Grap were also expressed at similar levels in the multilineage LSK CD27+ precursors, implying that their lymphoid function may be inherited from a pluripotent precursor. These similarities are evident in the heat map shown in Fig. 3.
Genes specifically up-regulated as part of the T lineage developmental choice would be expected to be expressed more highly in Pro-T cells than in either Pro-B or BM myeloid cells (>2x), and a number of genes were found to have this pattern. However, even within this set, the majority had expression levels in one or both of the progenitor populations (LSK CD27− and LSK CD27+) similar to (within 2x) that found in the Pro-T cell population. These genes include Zfp109, Zfp30, Ldb1, Rabgap1, Ptpn7, and Ddx19b. Genes such as Myb, AI449175, and Helz were also expressed most highly in the Pro-T cell samples, but the magnitudes of their up-regulation relative to stem and progenitor cells were only 2-3x. None of these transcripts were as T lineage enriched as GATA3 or Tcf7 (Fig. 2A) or even as the canonical form of HEB (HEBcan). These newfound genes therefore are not specifically induced during T lineage specification, but instead represent multipotent precursor legacy genes that T lineage cells continue to express, even while other lineages down-regulate them.
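The 2-fold criteria used in the preceding paragraphs can be expressed as a simple decision rule. The sketch below is a deliberate simplification for illustration only: the dictionary keys, the category labels, and the order in which the rules are applied are assumptions made here, the text itself notes that boundaries between classes are not sharply defined, and measurement error is ignored.

```python
def within_fold(a, b, fold=2.0):
    """True when two positive expression values differ by no more than `fold`."""
    return max(a, b) / min(a, b) <= fold


def classify(expr, fold=2.0):
    """Assign a gene to an expression category from relative expression levels.

    `expr` maps the hypothetical keys "pro_t", "pro_b", "myeloid",
    "lsk_cd27_neg", and "lsk_cd27_pos" to positive expression values.
    """
    pro_t = expr["pro_t"]
    if pro_t / expr["myeloid"] < fold:
        return "not T-enriched"
    if within_fold(pro_t, expr["pro_b"], fold):
        return "pan-lymphoid"
    if pro_t / expr["pro_b"] > fold:
        progenitor_like = any(
            within_fold(pro_t, expr[key], fold)
            for key in ("lsk_cd27_neg", "lsk_cd27_pos")
        )
        return "legacy" if progenitor_like else "T lineage-induced"
    return "higher in Pro-B"
```

Under this rule, a gene expressed at comparable levels in Pro-T cells and a progenitor population but low in Pro-B and myeloid cells is labeled "legacy", whereas a gene that is low everywhere except Pro-T cells is labeled "T lineage-induced".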
Genes specifically up-regulated in T lineage precursors
Against this background, the T lineage specificity of a select group of genes from our screen stood out (Figs. 2 and 3). These included transcripts of three genes encoding transcription factors, Tcf7 (TCF-1), Bcl11b, and HEBalt (the alternative promoter use form of HEB); the RING finger protein Dtx3L; the signaling adaptor protein Fkbp5; and two products of unknown function, Tmem131 (RW1, Neg) and Eva1 (epithelial V Ag). Tmem131 and Fkbp5 were up-regulated by slightly less than an order of magnitude (9-fold) from precursors (Fig. 3, green to gold). HEBalt levels in Pro-T cells were much higher than those in LSK cells, but were also up-regulated substantially in Pro-B cells, in agreement with a previous report (48). Bcl11b was unusual for the magnitude and specificity of its up-regulation (Fig. 3, dark blue to red), even greater than the up-regulation of the known T lineage factor Tcf7 (TCF-1, Fig. 3, blue to orange) and comparable to that of Deltex1 (Fig. 3, dark blue to red). These genes are investigated in more detail below.
Subtraction-enriched thymic stromal genes and a gene with shared lymphoid and stromal expression
Because mRNAs for the library construction and the subtraction protocol were obtained from nonsorted Rag2ko and SCID thymocytes with some contamination by stromal epithelial cells, our screen would be expected to enrich for stromal-specific cDNAs as well as for thymic lymphocyte-specific ones. Two genes selected by the subtraction, encoding the serine protease Prss16 and Spatial (Titest, 1700021K02Rik), were also specifically expressed in the Pro-T plus population (Fig. 2C and data not shown). However, their transcripts were not found in sorted hemopoietic populations (Fig. 4A), in accord with their annotation as stromal-specific genes. Unlike Spatial or Prss16, a third "stromal" annotated gene, Eva1 (epithelial V-like Ag 1), was verified to be expressed in sort-purified DN3 cells (Fig. 4A). Eva1, thought to be a homotypic adhesion molecule and previously found only in thymic stroma, liver, and other epithelial tissues (49), was expressed within the T lineage in a stage-specific and transient way, beginning at the DN2 stage and peaking at the preselection DN3a (24) stage (Fig. 4B). It is possible that Eva1 mediates homotypic adhesion interactions between thymocytes and thymic stroma. Expression of Eva1 by DN3 cells would have easily been overlooked in immunohistochemical assays (49) because the percentage of DN3 cells among thymocytes in the wild-type thymus is low (1%). We found nine noncanonical transcripts for Eva1, a benefit of the macroarray library, that appear to encode at least four novel transcripts with previously unreported exons or promoter regions (Fig. 4C and Supplemental Table I). Regulation of Eva1 is potentially interesting because the Eva1 gene is located on chromosome 9 in the only significant physical cluster of Pro-T cell genes identified in our study. It is immediately adjacent to cd3e (within 32 kb) and within 161 kb of Mll1, which flanks the cd3g/cd3d/cd3e cluster on the other side (data not shown).
The potential roles of the few T lineage-specific transcription factors and signaling molecules identified in Fig. 2 should depend on the developmental stages at which they are induced. Tcf7 (TCF-1) has been extensively studied (29, 30, 31), and its expression has been shown to increase gradually through the DN1 to DN4 progression (8), but the other genes are less well characterized. To determine the timing of up-regulation of these T lineage-biased genes, we analyzed their expression in five subpopulations of T cell precursors sorted from wild-type mouse thymus (DN1, DN2, DN3a, DN3b, and DN4), in direct comparison with the two subpopulations of hemopoietic progenitors (Lin−Kit+Sca-1+CD27− and Lin−Kit+Sca-1+CD27+), Pro-B cells, and the sorted BM myeloid population used above. This comparison spans the range of early T lineage milestones: entry into the thymus during the transition to the DN1 stage; "specification" at the DN1 to DN2 transition; "commitment" and proliferation arrest at the DN2 to DN3a transition; and β-selection or γδ-selection, via DN3b and DN4 intermediates (13, 24, 50).
The results are shown in Fig. 5 as qRT-PCR graphs, and Fig. 6 shows DN subsets ± LSK population expression results for a more extensive set of genes in heat map form. For developmental reference standards, we measured GATA3 expression as a model T lineage-specific positive regulator (Fig. 5A); Myb expression as a key regulator used by both multipotent progenitor cells and Pro-T cells (Fig. 5B); and PU.1 and SCL as progenitor-cell regulators that are shut off precipitously between the DN1 and DN3 stages (Fig. 5, G and H). The Notch signaling target gene Deltex1 (Dtx1) was also analyzed (Fig. 5L). Deltex1 was not transcribed in the two hemopoietic progenitor populations (see Fig. 5M, detection thresholds), but its expression was up-regulated 100-fold above background levels at the DN1 and DN2 stages, in agreement with the critical role of Notch signaling in T cell specification. Interestingly, it showed a further up-regulation at the DN3a stage to >2000-fold over the background (Fig. 5L), suggesting a second discrete phase of Notch activity (51).
10-fold decline after the DN3a stage, in agreement with a previous report (48), and consistent with the early hit-and-run positive function this basic helix-loop-helix factor variant appears to play in T cell development (48).
The gene with the most singular pattern of expression in our analysis encodes Bcl11b, a zinc finger factor that usually acts as a transcriptional repressor (52, 53, 54). Bcl11b appeared strictly T lineage-specific relative to stem and progenitor populations and, in contrast to HEBalt, was expressed at only trace levels in Pro-B and myeloid cells (Fig. 5J). In addition, unlike all the other T lineage genes, Bcl11b transcripts showed little expression in DN1 cells, but increased 500-fold between the DN1 and DN2 stages, with only a fewfold further up-regulation to the DN3a stage (Fig. 5J). The magnitude of this increase dwarfed the increase seen in GATA3 expression over the same interval (Fig. 5A). Unlike HEBalt, Bcl11b expression then remained fairly level through the DN4 stage, and its expression continued in peripheral T cells (52) (data not shown). Bcl11b up-regulation was accompanied by the reciprocal down-regulation of its relative, Bcl11a, which was strongly expressed in progenitor cells and non-T cells but down-regulated by two orders of magnitude between the DN1 and DN3 stages (Fig. 5I). This analysis implies that the ratio of Bcl11b to Bcl11a in thymocytes shifts dramatically during progression from DN1 to DN4, by over four orders of magnitude, a finding that is particularly relevant in light of reports that chromosomal translocations affecting Bcl11b expression are found in 20% of pediatric T cell acute lymphoblastic leukemias (55, 56).
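The "over four orders of magnitude" estimate follows from multiplying the two reciprocal fold changes. The short Python check below uses only the approximate figures quoted in this paragraph. The variable names are illustrative, and the additional "fewfold" Bcl11b increase up to DN3a is left out, so the result should be read as a rough lower bound.

```python
import math

# Approximate fold changes quoted in the text.
bcl11b_up = 500.0    # Bcl11b increase between DN1 and DN2
bcl11a_down = 100.0  # Bcl11a decrease between DN1 and DN3 (two orders of magnitude)

# The Bcl11b:Bcl11a ratio changes by the product of the two fold changes.
ratio_shift = bcl11b_up * bcl11a_down
orders = math.log10(ratio_shift)

print(f"Ratio shift: at least {ratio_shift:.0f}-fold "
      f"(about {orders:.1f} orders of magnitude)")
```

Including the further fewfold rise in Bcl11b would push the shift closer to five orders of magnitude, consistent with the statement in the text.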
Other signaling genes and transcription factor genes selected by the screen showed less dramatic increases with earlier or later peaks of expression. Dtx3L, encoding a putative interaction partner of Deltex1, roughly paralleled Deltex1 (Fig. 5, F and L) and Eva1 (Fig. 4B) in expression in T lineage populations. Showing little expression in the stem/progenitor cells, Dtx3L was already detectably up-regulated at the DN1 stage and increased to a peak in the DN3a stage. Like several Notch pathway targets (24), Dtx3L was sharply down-regulated after β-selection in the DN3b and DN4 stages. FK506-binding protein 5 (Fkbp5) was already expressed at significant levels in the multilineage precursor populations, but its T lineage-specific up-regulation also reached a peak at the DN3 stage, albeit with changes of lower amplitude (Fig. 5E). Fkbp5 is a modulator of the glucocorticoid receptor (57). These signaling molecules and cell surface receptors thus appear to be most strongly expressed at a stage coinciding with T lineage commitment, cell cycle arrest, and TCR gene rearrangement, but after initial T lineage specification has begun.
The genes with similar levels in prethymic progenitor cells and Pro-T plus cell fractions (Fig. 2) confirmed their "legacy" patterns of expression by the continuity and constancy of their expression patterns throughout the early DN stages. Myb expression showed little change from prethymic stages throughout the Pro-T stages, with only a gentle increase from DN1 to DN3 and a steeper drop after β-selection (Fig. 5B). Two relatively novel KRAB-domain zinc finger transcription factors with "legacy" patterns of expression, Zfp30 and Zfp109, were up-regulated between the LSK CD27− and LSK CD27+ stages of prethymic differentiation and continued their expression through the DN1 to DN3 stages with a decrease after β-selection (Fig. 5, C and D).
Gene expression analyses in these DN and BM subsets were also conducted for Helz, Ddx19b, and Tmem131 (Fig. 6A), and further analysis of DN thymocyte subsets was performed on Aff3, Grap, Ldb1, Mll1, Mll2, Trim44, Atxn2l, Tcf7, and FUS, in comparison with the Notch target gene HES1 (Fig. 6B). None of these matched the T lineage specification-associated induction of Bcl11b. Aff3 actually declined steadily from the DN2 to the DN4 stage after an early plateau. Tcf7, FUS, Grap, Helz, Ldb1, Mll1, Mll2, Tmem131, and Trim44 remained steady or increased gently to the DN3 stage, with a decline thereafter, but the range of expression was narrow. Ddx19b followed the same pattern, after an initial drop between the LSK CD27+ prethymic stage and the DN1 stage. In a companion study of >80 Pro-T cell-expressed transcription factors (E.-S. David, G. Buzi, L. Rowen, R. Butler, R. A. Diamond, M. K. Anderson, and E. V. Rothenberg, manuscript in preparation), only Bcl11b demonstrated T lineage specificity and >100-fold up-regulation at the DN1 to DN2 transition.
Bcl11b induction by Notch/Delta signaling
Notch/Delta signaling induces expression of the known T lineage regulatory genes GATA3 and Tcf7 with a characteristic time course in fetal liver-derived hemopoietic precursors (19, 58, 59). OP9 stromal cells normally support B cell differentiation of hemopoietic precursors, but OP9 cells engineered to express the Notch ligand Delta-like1 (OP9-DL1 cells) support T cell development. Time course analysis of hemopoietic precursor cells in coculture with OP9 control or OP9-DL1 cells provides a second way to look at the earliest events involved in T lineage specification, separable from any technical issues about the correct identification and purification of precursor subsets. We therefore cultured fetal liver-derived hemopoietic precursors on OP9-DL1 or OP9-control stroma and compared the expression kinetics of Bcl11b with those of Tcf7, Deltex1, the T lineage gene CD3ε, Eva1, and legacy or pan-lymphoid transcription factor genes as shown in Fig. 7. Samples were obtained as described previously (19), representing 2-day intervals in a time course of 10 days of culture overall.
For most of the other genes identified as T cell specific in this screen, the OP9-DL1 response kinetics of fetal liver progenitors (Fig. 7) gave results consistent with their steady-state expression patterns in adult prethymic cells and DN thymocytes (Figs. 5 and 6). Fkbp5 (Fig. 7) and Tmem131 (data not shown) were up-regulated in response to Notch/Delta signaling, in agreement with their identification as Notch target genes (28), but with a very shallow change in magnitude. Eva1 was also induced in a Notch/Delta-dependent manner. The legacy genes Myb and Mll1 showed virtually unchanging expression from the fetal liver progenitor stage throughout the differentiation time course, in the presence or absence of DL1, in keeping with their shared use in T, B, and stem cells.
The OP9 kinetic assays did discriminate between Bcl11b and HEBalt usage in T vs B lineage differentiation. HEBalt is expressed in B lineage as well as T lineage precursors (48), but in the adult, in vivo-derived populations its expression appeared to be strongly biased toward the T lineage (Fig. 2B). In fetal liver precursors differentiating in vitro, however (Fig. 7), HEBalt was induced in the absence of DL1 (turquoise line, magenta line; B cell conditions) only a fewfold less strongly than in the presence of DL1 (navy line, yellow line; T cell conditions). It also appeared to be expressed at substantial levels in the fetal liver-derived starting populations (Fig. 7), as in the BM LSK populations (Fig. 2B). In contrast, Bcl11b expression was completely dependent on the T lineage differentiation conditions. Thus, our screen identifies Bcl11b as a singularly specific early component of the T cell program in vivo and in vitro.
Kinetics of Bcl11b induction depend on developmental state of prethymic precursors
The exponential increase of Bcl11b RNA expression over a 4- to 6-day period, as shown in Fig. 7, suggested that in addition to the Notch-dependent process inducing Bcl11b transcription, the frequency of cells competent to express the gene may also be increasing or that Bcl11b may exert a positive feedback effect on its own expression. We therefore tested whether the kinetics of Bcl11b induction under these conditions were dependent on the developmental status of the input cells. We took advantage of the fact that distinct subsets of prethymic precursors in the fetal liver progress to the DN2 stage with faster or slower kinetics in the OP9-DL1 system: CLP-like (Lin−Kit+CD27+CD135+CD127+) cells differentiate faster, while stem-like LSK cells (Lin−Kit+Sca-1+) show a lag (T. Taghon, M. A. Yui, and E. V. Rothenberg, submitted for publication), and "Flk+" cells (Lin−Kit+Sca-1lowCD27+CD135+CD127−) give intermediate responses. Fig. 8 shows that none of these populations express detectable Bcl11b initially ("0" time points). When cocultured with OP9-DL1, Bcl11b is rapidly up-regulated in the CLP-like cells, to levels approaching maximal within 2 days, while the LSK cells take 7 days to reach the same level and the Flk+ cells require at least 3 days (Fig. 8). These kinetics are in excellent agreement with the time it takes for each population to generate DN2 and later stage cells in vitro (Fig. 8, line graphs) (T. Taghon, M. A. Yui, and E. V. Rothenberg, submitted for publication). Thus, the duration of Notch-Delta signaling required to turn on Bcl11b depends on the initial developmental state of the responding cells.
|
| Discussion |
|---|
|
|
|---|
This study is distinguished from related studies of early T cell development in part by its focus on the transition from prethymic progenitors (specifically LSK CD27− and LSK CD27+ from BM and Lin−Kit+CD27+ fetal liver precursors) into the first intrathymic stages. Although several other studies have compared populations from DN2 through β-selection, the earlier transitions have remained more obscure. Furthermore, the de novo, gene cloning approach of our work identified a number of genes not previously studied in T cell development, including those encoding transcriptional regulators Aff3, MLL2, Zfp27, Zfp109, and Zfp30, RNA-binding proteins Helz, FUS and Ddx19b, signaling component Grap, the immunophilin Fkbp5, the Lim-binding protein Ldb1, as well as uncharacterized products like Eva1, Tmem131, and Trim44 (Table III). A comparison of our findings and the results of microarray studies by Hoffmann et al. (27), Tabrizifard et al. (26), Dik et al. (28), and Lee et al. (29) shows general agreement where results for comparable samples were given, but also highlights the impact of microarray chip comprehensiveness and probe design. Microarray data from these sources regarding the genes in Fig. 2 are presented in Table III. Hoffmann et al. (27) and Tabrizifard et al. (26) studied T cell development in the mouse, but the Hoffmann work did not examine prethymic or DN1 populations and Tabrizifard et al. (26) started only with DN1 cells and reported results only for transcription factors. These studies were constrained by the limits of their microarrays as shown by the dashes in Table III (not included on the chip used) and possibly also by detection threshold issues (Table III "n/r", gene expression not reported). Dik et al. (28) evaluated gene expression in human hemopoietic cells roughly equivalent to the LSK CD27+ through DN transitions and extended to the late CD4 and CD8 single-positive T cell stages. These authors also noted selective up-regulation of Bcl11b at a point in human T cell development similar to our murine data. However, they found Bcl11a expression reinduced in CD4+ ISP cells, representing a later T lineage population, and this was not supported by our findings. The recent work by Lee et al. (29) used a highly comprehensive human microarray chip but their analysis focuses on the later transitions, from β-selection to naive CD4+ T cell. Without early thymocyte and hemopoietic populations for comparison, the Lee paper answers different questions about gene expression in T cell development.
and pTα at the DN1 to DN2 transition is accompanied by only a 2- to 4-fold greater expression of GATA3 and Tcf7, with no detectable increase in known Notch target gene expression to suggest enhanced Notch signaling (24). Therefore, any additional factors that might provide a gain in T lineage-specific regulatory function during the DN1 to DN2 transition would be of great interest as combinatorial participants in the lineage specification process. A priori, it was assumed that many transcription factors would be up-regulated in this interval as the cells began T lineage differentiation, and that the challenge would be to detect a subset with functional importance. The results instead revealed remarkable continuities between the multilineage stem and progenitor cells and the Pro-T cells.
The strategy we used to select the genes of interest was deliberately designed to enable us to recover legacy genes as well as strictly T lineage-specific ones. This decision was based on the evidence for low-level "multilineage priming" of stem cells for expression of genes used in other hemopoietic lineages (60, 61). The surprise was that the Pro-T cell-enriched genes identified by this approach were so dominated by legacy genes, many of them virtually unchanging in their levels of expression from the multipotent progenitor to the Pro-T cell state. It is tempting to speculate that this inherited assemblage of regulatory factors may contribute to the remarkable maintenance of developmental plasticity in intrathymic Pro-T cells until just before β-selection (10, 12, 17, <