Key Points
BioID analysis reveals extensive transcription factor interactions.
GATA3 and TCF7 share binding sites with ARID1a in T cell progenitors.
ARID1a is essential for normal T cell development.
Abstract
Maturation of lymphoid cells is controlled by the action of stage and lineage-restricted transcription factors working in concert with the general transcription and chromatin remodeling machinery to regulate gene expression. To better understand this functional interplay, we used Biotin Identification in human embryonic kidney cells to identify proximity interaction partners for GATA3, TCF7 (TCF1), SPI1, HLF, IKZF1, PAX5, ID1, and ID2. The proximity interaction partners shared among the lineage-restricted transcription factors included ARID1a, a BRG1-associated factor complex component. CUT&RUN analysis revealed that ARID1a shared binding with TCF7 and GATA3 at a substantial number of putative regulatory elements in mouse T cell progenitors. In support of an important function for ARID1a in lymphocyte development, deletion of Arid1a in early lymphoid progenitors in mice resulted in a pronounced developmental arrest in early T cell development with a reduction of CD4+CD8+ cells and a 20-fold reduction in thymic cellularity. Exploring gene expression patterns in DN3 cells from Wt and Arid1a-deficient mice suggested that the developmental block resided in the DN3a to DN3b transition, indicating a deficiency in β-selection. Our work highlights the critical importance of functional interactions between stage and lineage-restricted factors and the basic transcription machinery during lymphocyte differentiation.
Introduction
The generation of mature blood cells from multipotent progenitors in the bone marrow (BM) is a complex process dependent on the coordinated actions of stage and lineage-restricted transcription factors (TF). One example of this is T lymphocyte development progressing through multiple defined differentiation stages in the BM and thymus. The earliest lymphoid-restricted progenitor cells in the BM are the lymphoid-primed multipotent progenitors displaying full lymphoid but reduced erythroid and megakaryocytic potential (1). Primitive T cell progenitors are believed to migrate from the BM to thymus as lineage-restricted, although not yet committed, cells (2–4). After reaching the thymus, the early thymic progenitor (ETP) (2–4) undergoes terminal commitment and maturation into functional T lymphocytes (5). This step-wise process can be divided into distinct stages based on the expression of specific surface markers, in which the most immature cells lack both CD4 and CD8, defining them as double negative (DN) cells (5). This progenitor population can be further subfractionated based on differential expression of the surface markers c-KIT, CD44, and CD25 defining four basic stages of development, DN1 (ETP) to DN4 (5). The maturation process is regulated by the expression of stage-specific TFs (6) able to drive the development of the progenitors and cause terminal lineage commitment by coordinated activation and repression of their target genes. This TF network includes PU.1 (SPI1) (7–10), TCF7 (TCF1) (11, 12), GATA3 (13–15) and BCL11b (16–19). Activation of Bcl11b expression causes terminal T lineage commitment in a subpopulation of DN2 cells, resulting in a reduction in c-KIT levels to define the DN2b stage (16–18). In addition to these DNA binding regulatory proteins, the process is critically dependent on the action of the TF ID2, which modulates the functional activity of E-proteins to instruct T lineage as opposed to innate lymphoid cell differentiation (20).
To investigate how the TFs required for T cell development interact with the basal transcription machinery, we used an in vivo protein biotinylation system (Biotin Identification [BioID]), in which an abortive mutant of the Escherichia coli biotin ligase, BirA*, is fused to a protein of interest to effect biotinylation of proteins in a proximity-dependent manner (21, 22). The biotinylation reaction involves covalent attachment of biotin to proteins via the generation of a biotinoyl-5′-AMP intermediate, which reacts with lysine residues of acceptor proteins within ∼10 nm (23). This labeling radius corresponds to ∼30 nt of DNA or a distance corresponding to a folded protein complex of ∼3000 kDa (http://www.calctool.org/CALC/prof/bio/protein_size), allowing for labeling of proteins in the proximity of the bait with a reasonable specificity. Once biotinylation is achieved, proteins are extracted from cells under denaturing conditions, isolated using streptavidin affinity purification, and identified by mass spectrometry (MS) (Fig. 1A).
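As a rough check of the ∼30 nt figure, the labeling radius can be converted into a length of DNA. The sketch below assumes canonical B-form DNA with a rise of 0.34 nm per base pair, a textbook value that is not stated in the text; the function name is illustrative only.

```python
# Assumption: B-form DNA has a rise of ~0.34 nm per base pair (textbook value,
# not taken from the paper).
B_DNA_RISE_NM_PER_BP = 0.34


def radius_to_base_pairs(radius_nm: float,
                         rise_nm_per_bp: float = B_DNA_RISE_NM_PER_BP) -> float:
    """Length of linear B-DNA (in base pairs) spanned by a distance in nm."""
    if rise_nm_per_bp <= 0:
        raise ValueError("rise per base pair must be positive")
    return radius_nm / rise_nm_per_bp


# A 10 nm labeling radius spans 10 / 0.34 = 29.4 bp, i.e. roughly 30 nt.
print(round(radius_to_base_pairs(10.0)))
```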
BioID revealed that the lineage determining TF interact with a large set of transcriptional activator and repressor complexes including chromatin remodeling complexes, histone modifiers, and other TFs. One of the most prominent interactors of the classical T lineage TF was the BAF (SWI/SNFA) complex protein AT-rich, interactive domain-containing protein 1a (ARID1a) (24, 25). This gene was recently reported to be of importance for normal blood cell development and stem cell function (26). Exploring the impact of lymphoid-restricted targeted deletion of Arid1a in mice revealed a prominent block at the DN3 stage, suggesting a critical function for this protein in early T cell development.
Materials and Methods
Animal models
ARID1aflox/flox mice (27) were crossed with mice carrying a human CD2 promoter–driven iCRE (28), intended to cause inactivation of the Arid1a gene in early lymphocyte development (29). Transgenic mice were on C57BL/6 (CD45.2) background and all experiments were performed using littermate controls, with a mixture of males and females in all experimental groups. Arid1a genotyping was performed using the forward (F-5′-GTAATGGGAAAGCGACTACTGGAG-3′) and reverse primers (R-5′-TGTTCATTTTTGTGGCGGGAG-3′) to generate a 669-bp Wt product, an 845-bp Arid1aFlox product and a 298-bp product from the deleted allele. The data presented are generated from 4- to 5-wk-old mice unless otherwise stated. Animal procedures were performed with consent from the local ethics committee at Linköping University (Linköping, Sweden).
Cell culture for BioID
Flp-In T-REx 293 cells stably expressing FLAGBirA*-GATA3, FLAGBirA*-TCF7, FLAGBirA*-HLF, FLAGBirA*-PU.1, FLAGBirA*-PAX5, FLAGBirA*-IKAROS, FLAGBirA*-ID1, and FLAGBirA*-ID2 were generated using the Flp-In system, according to the manufacturer’s instructions. After hygromycin B selection (DMEM + 10% FBS + 200 μg/ml hygromycin B), 5 × 150 cm2 plates of subconfluent (80%) cells were incubated for 24 h in complete media supplemented with 1 μg/ml tetracycline (Sigma-Aldrich) and 50 μM biotin (BioShop, Burlington, ON, Canada). Cells were collected and pelleted (1500 rpm, 5 min), the pellet was washed with PBS, and dried pellets were snap frozen. Two biological replicates were prepared for each sample.
Biotin-streptavidin affinity purification for MS
g) for 30 min. Clarified supernatants were incubated with 30 μl of packed, pre-equilibrated streptavidin-Sepharose beads (GE Healthcare) at 4°C for 3 h. Beads were collected by centrifugation (2000 rpm, 2 min), washed six times with 50 mM ammonium bicarbonate pH 8.3, and treated with TPCK-trypsin (16 h at 37°C; Promega, Madison, WI). The supernatant containing the tryptic peptides was collected and lyophilized. Peptides were resuspended in 0.1% formic acid and one-sixth of the sample was analyzed per MS run.
Mass spectrometric analysis
m/z (within a range of 10 ppm; exclusion list size = 500) were excluded from analysis for 5 s. For protein identification, Thermo .RAW files were converted to the .mzXML format using ProteoWizard (v3.0.10800; 4/27/2017) (30). Using X!Tandem (Jackhammer TPP 2013.06.15.1) (30) and Comet (2014.02 rev. 2) (30), converted files were searched using the Trans-Proteomic Pipeline (TPP v4.7 Polar Vortex rev 1, Linux build 201705011551) against the Human RefSeq Version 45 database (containing 36113 entries). Search parameters specified a parent ion mass tolerance of 10 ppm, and an MS/MS fragment ion tolerance of 0.4 Da, with up to two missed cleavages allowed for trypsin. No fixed modifications were used. Variable modifications of Deamidated (NQ):Oxidation (M):Acetyl (Protein N-term):GG (K):Acetyl (N-term) were allowed. Proteins identified with an iProphet cut-off of 0.9 (corresponding to ≤1% FDR) and at least two unique peptides were analyzed with SAINT Express v.3.6. Control runs (20 runs for BioID; all from cells expressing the FlagBirA* epitope tag only) were collapsed to the four highest spectral counts for each prey, and high confidence interactors were defined as those with Bayesian false discovery rate ≤0.01. Two biological replicates (i.e., separate transfections) were each subjected to two MS runs (two technical replicates).
FACS
All FACS analyses were performed on fresh tissues. Lineage stains were performed on single-cell suspensions subjected to RBC lysis by ice cold erythrocyte lysis buffer (150 mM NH4Cl, 10 mM NaHCO3, 1 mM EDTA) and Fc blocked (CD16/CD32 [FC] 93; eBioscience, San Diego, CA). Populations were defined as described in the figure legends. Analyses of cell populations in spleen and thymus in Figs. 4–6 and Supplemental Fig. 3 were performed on fresh (Fc)-blocked thymocytes using CD45, CD4, CD8a, CD3, CD44, CD25, CD5, TCRbeta, CD69, NK1.1, c-KIT, and Lin (GR1, Mac1, TER119, CD19, CD11c). The T cell populations were characterized as DN (Lin−CD45+CD4−CD8−), double-positive (DP) (Lin−CD45+CD4+CD8+), CD4SP (Lin−CD45+CD4+CD8−), and CD8SP (Lin−CD45+CD4−CD8+), and NK cells were defined as CD45+NK1.1+. The DN population was further subdivided into DN1 (Lin−CD45+CD3−CD4−CD8−CD44+c-KIT+CD25−), DN2 (Lin−CD45+CD3−CD4−CD8−CD44+c-KIT+CD25+), DN2a (Lin−CD45+CD3−CD4−CD8−CD44+CD25+KitHigh), DN2b (Lin−CD45+CD3−CD4−CD8−CD44+CD25+KitLow), DN3 (Lin−CD45+CD3−CD4−CD8−CD44−CD25+c-KIT−), and DN4 (Lin−CD45+CD3−CD4−CD8−CD44−CD25−c-KIT−). The early progenitor stain in Supplemental Fig. 3 was performed using the following Abs: IL-7Ra, FLT3, SCA1, KIT, LY6D, GFRA2, and Lin (Gr1, Mac1, Ter119, CD19, CD3e, NK1.1, CD11c, B220). B cell stains in Supplemental Fig. 3 were performed using fresh BM stained with lineage mixture Lin (GR1, Mac1, Ter119, CD3, NK1.1, CD11c), CD45, CD43, and CD19. The B cell populations were characterized as in (31): pro-B (Lin−CD19+IgM−IgD−CD43+Kit+CD25−), immature B (Lin−CD19+IgM+IgD−), mature B (Lin−CD19+IgMmixIgD+), and late pro-B (Lin−CD19+IgM−IgD−CD43+Kit−CD25+) cells. Lineage stains in Supplemental Fig. 4 were conducted using CD19, NK1.1, Mac1, Gr1, CD11c, CD3, CD8a, and CD4. Analysis and cell sorting were performed on a BD FACSAria (BD Biosciences, San Jose, CA) using propidium iodide (Invitrogen, Paisley, U.K.) as viability marker.
Gates for the defined populations were set using fluorescence minus one controls.
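The DN1 to DN4 marker combinations listed above can be read as a simple decision rule. The following is a minimal sketch, assuming each marker has already been called positive or negative for a cell (in practice these calls come from the fluorescence minus one gates); the function and argument names are hypothetical, and the DN2a/DN2b split is not modeled because it depends on c-KIT intensity rather than a positive/negative call.

```python
from typing import Optional


def classify_dn_stage(lin: bool, cd45: bool, cd3: bool, cd4: bool, cd8: bool,
                      cd44: bool, ckit: bool, cd25: bool) -> Optional[str]:
    """Return 'DN1'..'DN4' for a cell, or None if it falls outside these gates.

    Each argument is True when the cell is positive for that marker.
    """
    # Parent gate shared by all DN stages: Lin- CD45+ CD3- CD4- CD8-
    if lin or not cd45 or cd3 or cd4 or cd8:
        return None
    # DN1 and DN2 are CD44+ c-KIT+ and differ in CD25
    if cd44 and ckit:
        return "DN2" if cd25 else "DN1"
    # DN3 and DN4 are CD44- c-KIT- and differ in CD25
    if not cd44 and not ckit:
        return "DN3" if cd25 else "DN4"
    # Any other combination is not covered by the definitions above
    return None


print(classify_dn_stage(lin=False, cd45=True, cd3=False, cd4=False, cd8=False,
                        cd44=False, ckit=False, cd25=True))
```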
Quantitative RT-PCR
Quantitative PCR (Q-PCR) analysis of sorted cells was performed as previously described (32). Assays-on-demand probes (Applied Biosystems, Foster City, CA) used were: Hprt, Mm00446968_m1; Ikzf3, Mm01306721_m1; myb12, Mm00485340_m1; Arid1a, Mm00473838_m1; Arid1b, Mm01338353_m1; and Arid2, Mm00558381_m1.
TCRβ recombination analysis
Cells were sorted and DNA/RNA was extracted using Qiagen’s Allprep DNA/RNA Microkit according to the manufacturer’s instructions. The TCRβ-DJ assays were adapted from (33) and the recombination events were quantified by real-time Q-PCR using FastStart Universal SYBR green Master (ROX) (Roche Diagnostics). Normalization to albumin amplification was performed as in (34). For the TCR-VDJ recombination assay, we used a mixture of different Vh family primers (35) along with Jβ2–3′ 5′-TGAGAGCTGTCTCCTACTATCGATT-3′, and for the TCR-DJ recombination assay, Dβ1–5′ 5′-CAGCCCCTTCAGCAAAGAT-3′ and Jβ1–3′ 5′-CCTAAGTTCCTTTCCAAGACCAT-3′. From the cDNA synthesized from the sorted cells, the transcribed TCRβ genes were analyzed using mixtures of different Vh family primers (35) along with a constant gene primer (36).
RNA sequencing and data analysis
DN3 cells from Arid1a+/+ or −/− mice were sorted from thymus in triplicate and total RNA was isolated using the RNeasy Micro Kit (Qiagen, Hilden, Germany). Libraries for sequencing were constructed using NuGEN’s Ovation Ultralow Library systems (NuGEN Technologies, San Carlos, CA) and were subject to 76 cycles single-end sequencing on a NextSeq500 (Illumina, San Diego, CA). Bam files were created from resulting fastq files by mapping to mouse reference genome (mm10/GRCm38) using STAR (2.6.0b-1) (37). Downstream analyses were conducted using the HOMER platform (v4.8 and v4.10) (38). For analysis of statistical significance among differentially expressed genes, the data were analyzed using analyzeRepeats.pl (with the -count exons, -condenseGenes, and -noadj options) followed by the getDiffExpression.pl command using DESeq2 (39). Parameters for differential expression were set to an FDR <0.05 and a |log2 fold-change| ≥1.
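The differential expression cutoffs stated above amount to a two-condition filter on each gene. The sketch below illustrates that filter only; it is not the HOMER or DESeq2 implementation, and the record type, function name, and example values are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class GeneResult(NamedTuple):
    gene: str
    log2_fold_change: float
    fdr: float


def select_differential(results: Iterable[GeneResult],
                        fdr_cutoff: float = 0.05,
                        min_abs_log2fc: float = 1.0) -> List[GeneResult]:
    """Keep genes with FDR < fdr_cutoff and |log2 fold-change| >= min_abs_log2fc."""
    return [r for r in results
            if r.fdr < fdr_cutoff and abs(r.log2_fold_change) >= min_abs_log2fc]


# Toy values for illustration only; these are not results from the study.
toy = [
    GeneResult("geneA", 2.3, 0.001),    # passes both cutoffs
    GeneResult("geneB", -1.0, 0.04),    # passes: |log2FC| is exactly 1
    GeneResult("geneC", 0.4, 0.0001),   # fails the fold-change cutoff
    GeneResult("geneD", 3.0, 0.05),     # fails: FDR must be strictly below 0.05
]
print([r.gene for r in select_differential(toy)])
```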
Assay for transposase accessible chromatin sequencing and data analysis
Eighty thousand DN3 cells were sorted from Arid1aWt or Arid1a−/− mice. Resulting cells were processed for assay for transposase accessible chromatin (ATAC) sequencing (ATAC-seq) library preparation essentially as described in (40). Libraries were single-end sequenced for 76 cycles on a NextSeq500.
Our ATAC data from Arid1a+/+ or Arid1a−/− DN3 cells were mapped to the mm10 reference genome using Bowtie2 (Version 2.3.4.2) (41, 42) using single read standard settings. ATAC-seq data from DN2, DN3, and DN4 cells in Fig. 7A (duplicate samples) were retrieved from Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/) (GSE100738) (43) and were paired-end processed with the ENCODE ATAC-Seq/DNase-Seq Pipeline (https://github.com/kundajelab/atac_dnase_pipelines) for ATAC-peak finding with the following settings: -species mm10 -enable_idr -no_xcor -auto_detect_adapter -mapq_thresh 10 -multimapping 4 -rm_chr_from_tag mito -nth 24. The fastq files were also mapped to mm10 using Bowtie2 (Version 2.3.4.2) (41, 42) using paired read standard settings, the mitochondrial chromosome was filtered out using samtools (http://www.htslib.org/doc/samtools.html), duplicate reads were filtered out using Picard tools MarkDuplicates (2.18.23) (https://broadinstitute.github.io/picard/), and tag directories were created using the HOMER platform (38) (makeTagDirectory). A combined DN3 and DN4 ATAC-peak file was created using mergePeaks.pl in HOMER and significantly differential peaks were identified using getDiffExpression.pl (settings: -peaks, -edgeR, -fdr 0.05, -log2fold 1). Heatmaps were created using annotatePeaks.pl (settings: -size 5000 -hist 100 -ghist) in HOMER, clustered in Cluster3 (44) and visualized using JavaTreeview (45). Histograms were created in HOMER using annotatePeaks.pl (settings: -size 5000 -hist 25). BedGraph files using normalization to 10 million reads were created using makeUCSCfile in HOMER and were uploaded to the UCSC Genome Browser (http://genome.ucsc.edu) (46) for visualization.
Cleavage under targets and release using nuclease (CUT&RUN)
CUT&RUN was performed similarly to (47) with minor changes. One million DN3-like Scid.adh.2C2 cells per sample were washed in 1 ml of ice-cold 1× PBS followed by an additional wash in 1 ml nuclear extraction buffer (20 mM HEPES [pH 7.9], 10 mM KCl, 0.5 mM spermidine, 0.1% Triton-X 200, 20% glycerol). Nuclei were pelleted by centrifugation at 600 × g for 3 min and then resuspended in 600 μl of nuclear extraction buffer. Thirty microliters of BioMag Concanavalin A magnetic beads (86057-3; Polysciences) were resuspended and washed twice with 1 ml of binding buffer (20 mM HEPES [pH 7.9], 10 mM KCl, 1 mM CaCl2, 1 mM MnCl2) and resuspended in 300 μl of binding buffer. The beads were added to the nuclei and incubated for 10 min with rotation at 4°C followed by blocking for 5 min in 1 ml of wash buffer (20 mM HEPES [pH 7.5], 150 mM NaCl, 0.5 mM spermidine, 0.1% BSA) + 2 mM EDTA pH 8. Nuclei were washed twice with 1 ml of wash buffer and resuspended in 250 μl of wash buffer. Five micrograms of the following Abs were added to 250 μl of wash buffer and added to the nuclei followed by incubation for 3 h with rotation at 4°C: rabbit monoclonal αTCF1 (C63D9[2203S], Lot 8; Cell Signaling Technologies), rabbit monoclonal αARID1a (D2A8V (12354S), Lot 3; Cell Signaling Technologies), and polyclonal normal rabbit IgG (12-370, Lot 3109265; Millipore). The nuclei were washed twice and resuspended in 250 μl of wash buffer. Three micrograms of in-house made pA-MNase was added to 250 μl of wash buffer, mixed with the cells, incubated for 1 h with rotation at 4°C and subsequently washed twice with 1 ml of wash buffer. The nuclei were resuspended in 150 μl of wash buffer and incubated for 30 min in an ice-water bath with 2 mM CaCl2. One hundred and fifty microliters of RSTOP+ buffer (200 mM NaCl, 20 mM EDTA [pH 8], 4 mM EGTA [pH 7.8], 50 μg/ml RNase A, and 10 pg/ml fragmented (150-bp size) Drosophila genomic DNA) was added to quench the MNase.
Fragments were released at 37°C for 30 min followed by pelleting of the beads and transfer of the supernatant to new tubes. Three microliters of 10% SDS and 2.5 μg of proteinase-K were added and incubated at 70°C for 10 min followed by purification using Zymo Research chromatin immunoprecipitation (ChIP) DNA Clean and Concentrator. All buffers contain cOmplete, EDTA-free Protease Inhibitor Cocktail unless stated. Libraries for sequencing were essentially prepared as previously described for ChIP sequencing (ChIP-seq) (48) with two modifications: 1) prior to adapter ligation all purification steps were carried out using Zymo Research ChIP DNA Clean and Concentrator to not lose fragments smaller than 100 nt and 2) size-selection with Ampure XP beads was performed after adapter ligation with a targeted mean fragment size of 150–300 bp. Libraries were paired-end sequenced on an Illumina Nextseq 500 (read-length: 2 × 38 bp).
CUT&RUN data processing
Reads were trimmed with Trim-galore in paired-end mode (Babraham Bioinformatics) and mapped to the mm10 genome (Gencode version GRCm38) with Bowtie2 (2.3.5) (42). Bam files were generated using Samtools v.1.9 (49) and PCR duplicates were removed with Picard MarkDuplicates (2.21.4) followed by filtering of reads for MAPQ-score ≥10 using Samtools. The filtered bam-file was subsequently converted back to the sam-format and fragments larger than 120 bp were filtered out followed by conversion back to the bam-format using Samtools. Peaks were called from the filtered bam-file using MACS2 callpeak (50) with default settings, using the normal IgG as the control. Because of the sparsity of CUT&RUN data, the duplicates were merged prior to peak calling. Peaks overlapping between T cell TFs and cofactors were derived using mergePeaks.pl from the HOMER package.
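The MAPQ and fragment-size filters described above can be illustrated on SAM text records, where column 5 holds MAPQ and column 9 holds the signed template length (TLEN). This is a minimal sketch and not the authors' actual commands; the function name is hypothetical, and dropping records with TLEN of 0 (fragment length unknown) is an assumption made here.

```python
def keep_sam_record(line: str, min_mapq: int = 10,
                    max_fragment_bp: int = 120) -> bool:
    """Return True if a SAM line should be kept.

    Header lines (starting with '@') are always kept. Alignment lines are kept
    when MAPQ >= min_mapq and 0 < |TLEN| <= max_fragment_bp.
    """
    if line.startswith("@"):
        return True
    fields = line.rstrip("\n").split("\t")
    mapq = int(fields[4])        # SAM column 5: mapping quality
    tlen = abs(int(fields[8]))   # SAM column 9: signed template length
    # Assumption: TLEN == 0 means the fragment length is unknown, so drop it.
    return mapq >= min_mapq and 0 < tlen <= max_fragment_bp


# Toy records for illustration only.
records = [
    "@SQ\tSN:chr1\tLN:195471971",
    "r1\t99\tchr1\t100\t42\t38M\t=\t150\t88\tACGT\tIIII",    # kept
    "r2\t99\tchr1\t100\t42\t38M\t=\t400\t338\tACGT\tIIII",   # fragment too long
    "r3\t99\tchr1\t100\t3\t38M\t=\t150\t88\tACGT\tIIII",     # MAPQ too low
    "r1\t147\tchr1\t150\t42\t38M\t=\t100\t-88\tACGT\tIIII",  # mate of r1, kept
]
kept = [r for r in records if keep_sam_record(r)]
print(len(kept))
```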
Tag directories were created using the HOMER package (38), and bed-files were made from the tag directories with tagDir2bed.pl. BigWigs were generated from the bed-files using Bedtools genomeCoverageBed (51) and bedGraphToBigWig and normalized to 10 million reads. BigWig files were uploaded to the UCSC Genome Browser (http://genome.ucsc.edu) (52) for visualization.
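Normalization to 10 million reads means rescaling each coverage value by the ratio of 10 million to the library's total read count, so that tracks from libraries of different depth are comparable. The sketch below shows the arithmetic only; the function name is hypothetical and it does not reproduce the Bedtools or HOMER implementations.

```python
from typing import List, Sequence


def scale_to_ten_million(coverage: Sequence[float], total_reads: int,
                         target_reads: int = 10_000_000) -> List[float]:
    """Rescale per-position coverage to a library size of target_reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    factor = target_reads / total_reads
    return [value * factor for value in coverage]


# A library of 20 million reads is scaled down by a factor of 0.5.
print(scale_to_ten_million([2, 4, 0], total_reads=20_000_000))
```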
ChIP-seq data analysis
Fastq files of ChIP-seq data from GATA3 in Scid.adh.2c2 cells (GSE93755) (10) and HDAC2, MTA2, LSD1, CHD4, RING1b, REST, and BCL11b from DN3 cells (GSE110305) (17) were retrieved from GEO and aligned to mouse reference genome mm10 using Bowtie2 (2.3.4.3) (41) with standard single-end settings. Further analyses were performed using the HOMER package (v4.8 and v4.10) (38). Peaks were identified with findPeaks.pl against a matched control sample using the settings -P .1 -LP .1 -poisson .1 -style factor. TF peak reproducibility was determined by a HOMER adaptation of the irreproducible discovery rate package (53) (Karmel A. 2015. homer-idr: Second pass updated) according to https://sites.google.com/site/anshulkundaje/projects/idr. Only reproducible high-quality peaks, defined by normalized scores of at least 10 tags per 10 million and an acceptable irreproducible discovery rate score, were submitted to further analysis. Motif enrichment analysis was performed with the findMotifsGenome.pl command of the HOMER package.
Data availability
RNA-seq, ATAC-seq, and CUT&RUN data generated for this paper have been deposited in GEO (https://www.ncbi.nlm.nih.gov/geo/) under the accession number GSE131673. Proteomics data have been deposited in the Mass Spectrometry Interactive Virtual Environment database under the accession number MSV000082188.
Results
BioID reveals complex interactions between lymphoid-restricted TF and the basic transcriptional machinery
To gain insight into the regulatory networks associated with early T cell development, we analyzed the interactome of a set of key regulators of early lymphoid development, IKZF1 (IKAROS) (54, 55), HLF (56), GATA3 (14, 15), PU.1 (7, 8), TCF7 (11, 12), ID1, and ID2 (20) using BioID (Fig. 1A). We also included the B lineage TF PAX5, critical for normal B cell development (57), to delineate whether there exist any fundamental differences in the ability of T and B lineage–associated proteins to interact with the basic transcription machinery. These factors exert their functions in a variety of different developmental stages and cell lineages (6). Hence, to avoid the complications of using multiple cell lines with putatively widely differing proteomes, we performed our analysis in the human embryonic kidney cell line 293 T-REx, allowing for a direct comparison of TF interactomes in an identical, well-characterized, cellular context. This cell line carries a tetracycline-responsive cassette into which TF-encoding cDNA is inserted to achieve tetracycline-responsive induction of expression (58). Following streptavidin affinity purification and MS/MS, 2002 high confidence proximity interactors (PXIs) were identified among 848 unique proteins. Although some proximity partners were biotinylated uniquely by individual TF fusions (Fig. 1B, Supplemental Dataset 1), several were identified in multiple TF complexes (Fig. 1B, Supplemental Dataset 1). Importantly, many of the proteins identified as PXIs for several TF were components of the general transcription machinery, as supported by gene ontology analysis of the dataset (Supplemental Dataset 1). This analysis highlighted significant enrichment in gene ontology categories related to transcriptional regulation, chromatin remodeling, and histone modifiers.
BioID reveals extensive interactions between the basal transcription machinery and lineage specific transcription factors. (A) Illustrates the basic principles underlying the identification of high confidence proximity interactions using BioID. (B) Cytoscape (version 3.7.1) generated map displaying high confidence (FDR ≤ 0.01) PXIs of the transcription factors TCF7 GATA3, HLF, PAX5, PU.1, IKAROS (IKZF1) and the transcriptional regulators ID1 and ID2. Bait proteins are represented by white squares and are presented using an edge-weighted spring-embedded layout (based on a “force-directed” paradigm as implemented by Kamada and Kawai). Relative distances between bait proteins indicate the number of shared interaction partners. High confidence PXIs are represented by colored round nodes. Prey color indicates “indegree,” or the number of individual bait proteins that interact with a given prey. Data are based on two independent experiments and based on four MS runs, as described in Materials and Methods. Detailed protein interaction data can be found in the Supplemental tables.
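The "indegree" used to color prey nodes is simply the number of distinct baits for which a prey passes the confidence cutoff. The sketch below computes it from a list of (bait, prey, BFDR) records; the function name, the placeholder prey names, and the BFDR values are hypothetical and do not come from the dataset.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def prey_indegree(interactions: Iterable[Tuple[str, str, float]],
                  max_bfdr: float = 0.01) -> Dict[str, int]:
    """Count, per prey, the distinct baits with BFDR <= max_bfdr."""
    baits_per_prey: Dict[str, Set[str]] = defaultdict(set)
    for bait, prey, bfdr in interactions:
        if bfdr <= max_bfdr:
            baits_per_prey[prey].add(bait)
    return {prey: len(baits) for prey, baits in baits_per_prey.items()}


# Toy records for illustration only (placeholder preys, invented BFDR values).
toy_interactions = [
    ("GATA3", "PREY_A", 0.00),
    ("TCF7", "PREY_A", 0.01),
    ("PU.1", "PREY_A", 0.05),   # fails the cutoff, not counted
    ("GATA3", "PREY_B", 0.00),
    ("GATA3", "PREY_B", 0.00),  # repeated bait is counted once
]
print(prey_indegree(toy_interactions))
```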
GATA3, TCF7, PU.1, and IKZF1 BioID data suggested that these factors are all associated with both transcriptional activator and repressor complexes, well in line with their complex role in T cell differentiation (5). In most cases, we detected multiple components of the same regulatory complexes, including, e.g., constituents of BAF, histone acetyltransferase (HAT), histone methyltransferase (HMT), polycomb group (PCG) and histone deacetylase (HDAC) complexes (Fig. 2, Supplemental Fig. 1, Supplemental Dataset 1). Interestingly, IKZF1 displayed a more restricted interactome, associating primarily with histone modifiers and cofactors involved in transcriptional repression (Supplemental Fig. 1). PAX5 BioID (Supplemental Dataset 1) identified multiple complexes involved in chromatin remodeling (including the INO80 and Cohesin complexes), and PU.1 (Supplemental Fig. 1) displayed specific interactions with members of the adaptor protein complexes 2 and 3. The ID1 and ID2 interactomes were distinct from those of the other TF characterized (Fig. 1B, Supplemental Fig. 1, Supplemental Dataset 1), consistent with their unique function in transcriptional regulation (59). Although minimal interactions were detected between the ID proteins and chromatin remodeling complexes or histone modifiers, robust interactions were observed with factors involved in cell cycle regulation and centrosome formation, in line with the finding that the ID proteins are associated with the anaphase promoting complex (60).
GATA3 (A) and TCF7 (B) interact with multiple regulatory complexes and TF. GATA3 high confidence PXIs identified using BioID. Protein names were imported into Cytoscape 3.7.0 for visual representation: GATA3 interactors are represented by round nodes and grouped according to gene ontology. Previously reported interacting partners indicated with a thick outline.
The validity of the BioID data is supported by a number of previously reported interactions. For example, 1) the ID protein interaction partners TCF12 and TCF3 (Supplemental Fig. 1, Supplemental Dataset 1) were identified in this study (61); 2) GATA3 (Fig. 2A, Supplemental Dataset 1) was previously demonstrated to recruit the histone deacetylase HDAC3 to chromatin (62); and 3) consistent with several previous reports, IKZF1 (Supplemental Fig. 1, Supplemental Dataset 1) BioID identified Mi-2β (CHD4), an ATP-dependent nucleosome-remodeling enzyme, and components of the NuRD complex (63), supporting an IKZF1/CHD4/NuRD functional axis (64). Moreover, using RNA sequencing (RNA-seq) data from DN3 cells (Supplemental Dataset 2) we identified mRNAs (fragments per kilobase of transcript per million mapped reads >0.5) for ∼85% of the high confidence interactors identified in 293 cells, supporting that a majority of the identified proteins are expressed in T cell progenitors. Hence, BioID analysis allowed for the identification of a multitude of putative collaboration partners for T lineage TF.
Because the BioID analysis identifies proteins in the proximity of the TFs rather than just factors binding directly to the bait in living cells, verification of potentially relevant PXIs using genome wide DNA binding analysis such as ChIP- or CUT&RUN-seq analysis is most informative. To explore the extent of overlapping binding of some identified PXIs and a key regulator of T lineage differentiation, GATA3, in T cell progenitors, we reanalyzed previously published data from murine T cell progenitors (10, 17) (Supplemental Fig. 2). Site enrichment analysis based on GATA3 ChIP-seq data from T cell progenitors identified binding sites for a number of the PXIs (Supplemental Fig. 2A, Supplemental Dataset 1). These included an ETS binding site (ELK4) potentially targeted by the PXI FLI1 and binding sites for TCF7L2 (TCF4). We also detected enrichment of binding sites for HOXA2 and, although the binding of HOX proteins to any given site in vivo is known to be complex (65), we identified several HOX proteins among the GATA3 PXIs (Fig. 2A, Supplemental Dataset 1). The BioID analysis suggested a potential interaction between GATA3 and the T lineage restriction factor BCL11b, an interaction of special interest due to their interplay in early T cell development (16). Reanalysis of ChIP-seq data from T cell progenitors (10, 17) revealed that GATA3 displayed overlapping binding at more than 50% of the identified BCL11b sites (Supplemental Fig. 2B), including both promoter and putative enhancer elements as exemplified by the Zinc finger protein 715 (Zfp715) and the growth factor receptor bound 2 (Grb2) genes (Supplemental Fig. 2C).
In addition to the identification of putative classical DNA binding collaboration partners, the BioID analysis provides an insight into coregulatory complexes associated with the TF networks. Reanalysis of ChIP-seq data (10, 17) revealed that several of the complexes identified as potential cofactors for GATA3 in the BioID analysis, including the NuRD complex proteins HDAC2, MTA2, KDM1a (LSD1), and CHD4, displayed overlapping binding with between 42 and 51% of the GATA3 bound sites (Supplemental Fig. 2D). Similar data were obtained for the polycomb protein RING1b, binding at 40% of the GATA3 bound sites, whereas >50% of REST bound regions overlapped with GATA3 targets (Supplemental Fig. 2D). To expand our analysis, we performed CUT&RUN analysis (66) to identify binding sites for TCF7 as well as overlapping binding with the identified cofactor complexes in the T cell progenitor cell line Scid.adh.2C2 (P2C2). This analysis verified that several of the coregulatory complexes identified in the BioID analysis indeed shared binding sites with TCF7 in T cell progenitors (Supplemental Fig. 2E). These data support the idea that the BioID experiments allowed for the identification of relevant cofactors for T lineage specific TF.
In addition to the ability of BioID technology to identify interactions in living cells, it has an additional advantage in that the proximity-based analysis allows for identification of multiple components of a cofactor complex, facilitating the analysis of the data from a biological perspective. In our dataset this is exemplified by the BRM/BAF complex, as the GATA3 PXIs included 16, the TCF7 PXIs 14, and the PU.1 PXIs 15 protein components of this chromatin remodeling complex (Fig. 2, Supplemental Fig. 1). One of the most prominently biotinylated proteins in this complex was ARID1a (24, 25), a factor previously reported to be important for normal lymphocyte development (26). To explore the landscape of ARID1a binding in T cell progenitors (P2C2 cells), we performed CUT&RUN analysis to identify regions bound by this BAF complex protein. Motif enrichment analysis identified both GATA and TCF among the top four enriched motifs (Fig. 3A). Thirty-six percent of the ARID1a bound regions were found to overlap with detectable levels of TCF7 (Fig. 3B–D), whereas GATA3 binding overlapped with 68% of the ARID1a peaks (Fig. 3B–D). Comparing the binding of ARID1a to that of a set of TF and transcriptional cofactors revealed extensive overlap with, among others, RUNX1 and BCL11b (Fig. 3D), providing an independent line of support for the idea that ARID1a is a cofactor for key transcriptional regulators in T cell development.
ARID1a shares binding sites with key T cell transcriptional regulators. (A) Motif enrichment analysis of ARID1a CUT&RUN peaks was performed using findMotifsGenome in HOMER (mm10, -size 200). Rank, motif, p value, percent in target/background and best match as well as other known matching motifs are listed. (B) UCSC-genome browser (http://genome.ucsc.edu) visualization of ARID1a, TCF7 and GATA3 binding to the Tcf7 locus. (C) Venn diagrams displaying the level of overlapping binding between ARID1a and TCF7 or ARID1a and GATA3 in pro–T cells. (D) Diagram displaying the fraction of overlapping binding of ARID1a and other proteins identified as part of the GATA3 or TCF7 proximity interactome (Fig. 2, Supplemental Fig. 1B). Fastq files of ChIP- or CUT&RUN-seq data from P2C2 or sorted DN3 cells were retrieved from GEO (GSE93755 and GSE110305) and processed as described in Materials and Methods. Overlapping peaks were identified using the mergePeaks command in HOMER (-size given).
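The overlap percentages reported for ARID1a correspond to the fraction of peaks in one set that intersect at least one peak in another set. The following is a simplified stand-in for that calculation using plain interval intersection on half-open coordinates; it is not HOMER's mergePeaks algorithm, and the function name and toy coordinates are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open [start, end)


def fraction_overlapping(query_peaks: Sequence[Peak],
                         reference_peaks: Sequence[Peak]) -> float:
    """Fraction of query peaks that overlap at least one reference peak."""
    if not query_peaks:
        return 0.0
    # Group reference intervals by chromosome for quicker lookup
    by_chrom: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, start, end in reference_peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in query_peaks:
        # Two half-open intervals overlap when each starts before the other ends
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(query_peaks)


# Toy coordinates for illustration only.
query = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
reference = [("chr1", 150, 250), ("chr2", 300, 400)]
print(fraction_overlapping(query, reference))
```

Only the first query peak intersects a reference peak here, so the fraction is one third. Note that the measure is asymmetric: the fraction of ARID1a peaks overlapping TCF7 generally differs from the fraction of TCF7 peaks overlapping ARID1a.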
ARID1a is critical for normal lymphocyte development
As our analysis indicated that ARID1a is an important cofactor for key regulators of transcription in T cell development, we wanted to investigate the potential importance of this protein in the generation of T lymphoid cells. To this end, we crossed Arid1aflox/flox mice (27) with mice carrying a human CD2 promoter–driven CRE (28) to cause deletion of lox-site flanked DNA in early lymphocyte development (29). PCR analysis verified the efficient deletion of the floxed allele in early T lineage progenitor cells including CD4+CD8+ DP and CD3+CD4−CD8− DN cells (Supplemental Fig. 3A). Gross anatomical examination of the spleen did not reveal any striking differences between Arid1aflox/flox/wt (Arid1aWt) and Arid1aflox/flox/hCD2Cre (Arid1a−/−) mice with regard to spleen size or cell number (Fig. 4A). Analysis of T lineage cells in the spleen revealed a reduction in the CD3+ population as a consequence of lower numbers of both CD4 and CD8 single-positive (SP) cells (Fig. 4B, 4C). The number of NK cells was not significantly different; however, the relative proportion of TCRβ+NK1.1+ cells was reduced in the ARID1a-deficient mice (Fig. 4D). Hence, we conclude that ARID1a is important for normal formation of the T-lymphoid compartment.
Loss of functional ARID1a results in reduced levels of peripheral lymphocytes. (A) Number of cells recovered from the spleen of Arid1aflox/flox/Wt (Arid1aWt) or Arid1aflox/flox/hCD2Cre (Arid1a−/−) mice. (B) Representative FACS plots of Lin−CD45+ cells identifying CD4- and CD8-expressing splenic T cells from Arid1aWt or Arid1a−/− mice. (C) Diagrams displaying the estimated cell numbers of CD3+ (Lin−CD45+CD3+), CD4SP (Lin−CD45+CD4+CD8−), and CD8SP (Lin−CD45+CD4−CD8+) cell populations in the spleen of 4- to 5-wk-old Arid1aWt (n = 7) or Arid1a−/− (n = 9) mice. (D) Diagrams displaying the numbers of NK1.1+ cells (CD45+NK1.1+) and the percentage of NK1.1+ cells with surface expression of TCRβ in the spleen of 4- to 5-wk-old Arid1aWt (n = 7) or Arid1a−/− (n = 9) mice. Each dot represents one animal and the statistical analysis was performed using an unpaired t test. ***p < 0.001, ****p < 0.0001.
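As an illustration of the statistics named in the figure legends, the sketch below computes an unpaired t statistic and maps a p value to the asterisk notation (*p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001). It rests on assumptions: the legends do not state whether equal variances were assumed, so the pooled-variance (Student) form is used here, and the "ns" label and the function names are hypothetical. The p value itself requires the t distribution (for example via a statistics package) and is not computed here.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Student's unpaired t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance; assumes equal variances in both groups.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def significance_stars(p):
    """Map a p value to the asterisk notation used in the figure legends."""
    for threshold, stars in ((0.0001, "****"), (0.001, "***"),
                             (0.01, "**"), (0.05, "*")):
        if p < threshold:
            return stars
    return "ns"  # assumption: label for p >= 0.05, not taken from the article


# Toy values only (assumption: not measurements from the article).
t_stat, dof = unpaired_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(t_stat, dof, significance_stars(0.0005))
```

With one value per animal, as in the dot plots, the two input lists would hold the per-animal cell numbers for each genotype.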
To identify potential disturbances in T cell development, we examined the cellular composition of the primary lymphoid organs, that is, the BM and thymus. As would be expected from the use of a lymphoid-restricted CRE transgenic strain, we were unable to detect any significant alterations in the numbers of multipotent Lin−Sca1+Kit+ (LSK) cells, the lymphoid-primed multipotent progenitor cells, or in the common lymphoid progenitor compartment (Supplemental Fig. 3B). Neither did we detect any significant alterations in the earliest B cell progenitors including the CD19−GFRA2+ BLP population (67) (Supplemental Fig. 3B). A high-resolution analysis of the CD19+ populations in the BM (31) identified a modest reduction in the mature (IgM+IgD+) compartment, which was also reflected in a reduced number of CD19+ cells in the spleen (Supplemental Fig. 3C, 3D).
In support of an important role of ARID1a in early T cell development, we noted that the thymus size was dramatically reduced in Arid1a−/− mice, as reflected by a more than 95% reduction in cellularity (Fig. 5A). The reduction in thymic cellularity was reflected in reduced numbers of CD4SP, CD8SP, and DP cells, whereas the number of CD45+ DN cells was comparable to that observed in wild-type (Wt) mice (Fig. 5B, 5C). Although the number of CD4SP cells was reduced, the relative proportion of these cells was rather comparable to what we observed in Wt mice (Fig. 5B). As it has been reported that the inactivation of the SWI/SNF component SMARCA4 (BRG1) results in deregulated expression of CD4 (68), we investigated the levels of TCRβ, CD5, and CD69 on the CD4+CD8− thymocytes (Fig. 5D–F). This revealed that the vast majority of the CD4SP cells detected in Arid1a−/− animals lacked expression of CD5 as well as TCRβ, leading us to conclude that they do not represent conventional CD4+CD8− thymocytes.
ARID1a deficiency results in impaired T cell development. (A) Absolute cell numbers in the thymus of 4- to 5-wk-old Arid1aflox/flox (Arid1aWt) (n = 7) or Arid1aflox/flox/hCD2Cre (Arid1a−/−) (n = 9) mice as determined by FACS analysis. (B) Representative FACS plots of Lin−CD45+ cells, displaying the thymic T cell populations from 4- to 5-wk-old Arid1aWt or Arid1a−/− mice. (C) Diagrams showing the estimated number of DN (Lin−CD45+CD4−CD8−), DP (Lin−CD45+CD4+CD8+), CD4SP (Lin−CD45+CD4+CD8−), and CD8SP (Lin−CD45+CD4−CD8+) cell populations in the thymus of 4- to 5-wk-old Arid1aWt (n = 7) or Arid1a−/− (n = 9) mice. (D and E) Representative histograms and diagrams displaying the expression of CD5 and TCRβ on CD4SP thymocytes from 4- to 5-wk-old Arid1aWt (n = 7) or Arid1a−/− (n = 9) mice. Each dot represents one animal and the statistical analysis was performed with an unpaired t test. (F) Representative FACS plots displaying the expression of CD69 and TCRβ on CD4SP thymocytes from 4- to 5-wk-old Arid1aWt or Arid1a−/− mice. The arrow indicates the developmental trajectory. ***p < 0.001, ****p < 0.0001.
The DN progenitor population can be further subdivided based on differential expression of the surface markers CD44, CD25, and c-KIT. The most immature DN1 progenitors (ETPs) are CD44+CD25−c-KIT+ and the subsequent DN2 cells are CD44+CD25+c-KIT+, whereas DN3 cells express CD25 but lack CD44 and c-KIT expression. DN4 cells lack expression of all three markers but generate CD4+CD8+ DP progenitors upon further differentiation. Estimating the relative proportions of DN populations, we detected a decrease in DN1 cells, whereas the DN2 compartment appeared relatively intact (Fig. 6A, 6B). Subdividing the DN2 population into DN2a and DN2b cells based on the level of c-KIT expression revealed that the loss of Arid1a resulted in skewing of the DN2 population, with a decreased proportion of DN2a cells and an increased fraction of DN2b cells in ARID1a-deficient mice. The DN3 stage was overrepresented in the Arid1a−/− mice, whereas we detected a reduction of DN4 cells in these animals (Fig. 6A, 6B). From this, we conclude that T cell development is arrested in the DN3 to DN4 transition in the absence of Arid1a.
Developmental progression from the DN3 to the DN4 stage is dependent on ARID1a. (A) Representative FACS plots of DN (Lin−CD45+CD3−CD4−CD8−) thymic T cell populations divided into DN1-4 and DN2a-2b from 4- to 5-wk-old Arid1aflox/flox/wt (Arid1aWt) or Arid1aflox/flox/hCD2Cre (Arid1a−/−) mice. (B) Diagrams displaying the relative fractions of cells representing DN populations in the thymus of Arid1aWt (n = 5) or Arid1a−/− (n = 4) mice. T cell progenitors are defined as DN1 (Lin−CD45+CD3−CD4−CD8−CD44+c-KIT+CD25−), DN2 (Lin−CD45+CD3−CD4−CD8−CD44+c-KIT+CD25+), DN2a (Lin−CD45+CD3−CD4−CD8−CD44+CD25+KitHigh), DN2b (Lin−CD45+CD3−CD4−CD8−CD44+CD25+KitLow), DN3 (Lin−CD45+CD3−CD4−CD8−CD44−CD25+c-KIT−), and DN4 (Lin−CD45+CD3−CD4−CD8−CD44−CD25−c-KIT−). Each dot represents one animal and the statistical analysis was performed with an unpaired t test. (C) Diagrams with Q-PCR data comparing the relative expression of Ikzf3 and Mybl2 in DN3 cells from Arid1aWt (n = 5) or Arid1a−/− (n = 4) mice. Each dot represents one sample from one animal analyzed by triplicate Q-PCR analysis. Splenic CD3+ and CD19+ cells were used as reference. The statistical analysis was performed using an unpaired t test. (D) SYBRGreen-stained agarose gel with RT-PCR products amplified from VDJ recombined and transcribed TCRβ genes in Arid1aWt and Arid1a−/− DN3 cells as well as control populations (total CD3+ and CD19+ cells as well as tail genomic DNA [Tail] and nontemplate control [NTP]). A combined set of V-primers was used, generating two distinct bands dependent on the design of the primers for the different V-gene families. cDNA generated from CD19+ B cells was used as negative control. *p < 0.05, **p < 0.01, ****p < 0.0001.
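The marker definitions for DN1 to DN4 listed in (B) can be read as a simple lookup table. The sketch below expresses that gating logic in Python under stated assumptions: cells have already been gated as Lin−CD45+CD3−CD4−CD8−, marker positivity has already been called as true or false, and the DN2a/DN2b split (KitHigh versus KitLow) is not modeled because it depends on a continuous intensity threshold. The function names and the "unassigned" label are hypothetical.

```python
from collections import Counter

# Marker combinations in the order (CD44, CD25, c-KIT), following the
# definitions given in the legend to panel (B).
DN_STAGES = {
    (True, False, True): "DN1",
    (True, True, True): "DN2",
    (False, True, False): "DN3",
    (False, False, False): "DN4",
}


def classify_dn(cd44, cd25, kit):
    """Return the DN stage for one pre-gated DN cell, or 'unassigned'."""
    return DN_STAGES.get((cd44, cd25, kit), "unassigned")


def stage_fractions(cells):
    """Relative fraction of each stage among (cd44, cd25, kit) tuples."""
    counts = Counter(classify_dn(*cell) for cell in cells)
    total = sum(counts.values())
    return {stage: n / total for stage, n in counts.items()} if total else {}


# Toy input (assumption: invented cells, not data from the article).
toy_cells = [(True, False, True), (False, True, False),
             (False, True, False), (False, False, False)]
print(stage_fractions(toy_cells))
```

Cells with marker combinations outside the four definitions fall into the "unassigned" group rather than being forced into a stage, which mirrors how events outside the drawn gates are excluded in a FACS plot.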
As we detected an increase of cells in the DN3 compartment already in 4-wk-old mice and as ARID1a has been reported to be mutated in both leukemia and lymphoma (69–72), we wanted to investigate if the expanded population reflected a malignant or premalignant state. To this end we monitored Arid1a−/− mice to the age of 45–55 wk but were unable to detect malignant expansion of lymphoid cells in the aged animals (Supplemental Fig. 4A).
Because the progenitor cells in the DN3 population are undergoing selection for functional TCRβ recombination, the cells can be divided into preselection DN3a cells and proliferating postselection DN3b cells (73). These populations can be identified based on the surface expression of CD27 but also on expression of Ikzf3 and Mybl2, which are expressed at higher levels in DN3b than in DN3a cells (73). Comparing RNA-seq data from Arid1aWt and Arid1a−/− DN3 cells (Supplemental Dataset 2) did not reveal any differences in CD27 mRNA levels; however, the expression of both Ikzf3 and Mybl2 was reduced in the absence of ARID1a as determined by both RNA-seq and Q-PCR analysis (Fig. 6C, Supplemental Dataset 2). The RNA-seq data also suggested that the absence of ARID1a resulted in higher expression of CD25, a finding supported by analysis of surface CD25 expression (Supplemental Fig. 4C). Estimating the size of the DN3 cells from Arid1aWt and Arid1a−/− mice using forward scatter in the FACS analysis (Supplemental Fig. 4B) suggested that the average size of the DN3 cells from Arid1a-deficient mice was smaller than that of the corresponding Arid1aWt cells. Hence, the developmental block observed would be consistent with a deficiency in functional TCRβ expression (73). This would be in line with the finding that rearrangement of the TCRβ gene depends on a functional SWI/SNF complex (74). However, investigation of the recombination status (Supplemental Fig. 4D, 4E) and expression of VDJ-encoding TCRβ message (Fig. 6D) failed to detect any major differences in ARID1a-deficient versus control DN3 cells. This argues against the developmental block being a consequence of failed TCRβ recombination or expression.
Because ARID1a is part of a chromatin remodeling complex of apparent critical importance for the DN3 to DN4 transition, we wanted to explore whether loss of ARID1a is associated with epigenetic alterations. To this end we took advantage of existing ATAC-seq data from ex vivo analyzed DN cells, which allowed us to identify ∼2300 peaks with a significant change in genomic accessibility in the developmental transition from the DN3 to the DN4 stage (Fig. 7A). By assigning these regions to specific genes using proximity annotation, we identified epigenetic changes at putative regulatory elements linked to a set of potentially relevant genes including Cd25, Rag2, Lcp2 (Slp76), and Cd2 (Supplemental Dataset 3). Using ATAC-seq analysis to compare the accessibility of these elements in Arid1aWt and Arid1a−/− DN3 cells (Fig. 7B–D), we noted that 772 (roughly one third) of the dynamically regulated sites displayed lower accessibility in the absence of ARID1a. These included regions in proximity to the Cd8a, Cd44, Lmo2, and Rag2 genes (Supplemental Dataset 3). Hence, ARID1a is an important factor for the formation of the normal epigenetic landscape in DN3 cells.
ARID1a is important for the normal regulation of the epigenetic landscape in the DN3 to DN4 transition. (A) Heatmap based on ATAC-seq data from Wt DN2a, DN3, and DN4 cells retrieved from GEO (GSE100738) and processed as described in Materials and Methods. The heatmap of DN2a, DN3, and DN4 ATAC signal (-size 5000 -hist 100) is presented at ATAC-peak positions with significantly different enrichment in DN3 versus DN4 cells. (B) ATAC-seq data from DN3 cells extracted from Arid1aflox/flox/wt (Arid1aWt) or Arid1aflox/flox/hCD2Cre (Arid1a−/−) mice are visualized in a heatmap at the same positions as in (A). Heatmap data generated in HOMER were clustered by k-means with eight (A) or six (B) clusters, 100 runs, and centered correlation using Cluster 3 (44) and visualized using JavaTreeview (45). (C) BedGraphs of an ARID1a-dependent ATAC peak at the promoter of the Slc5a9 gene. (D) Histogram displaying the ATAC signal coverage at ARID1a-dependent sites [based on cluster in (B)]. (E) Diagrams with Q-PCR data comparing the relative expression of Arid1a, Arid1b, and Arid2 in DN populations from Arid1aWt (n = 4) or Arid1a−/− (n = 4) mice. Each dot represents one sample from one animal analyzed by triplicate Q-PCR analysis. The statistical analysis compares the level in a given population in Arid1aWt or Arid1a−/− mice and was performed using an unpaired t test. *p < 0.05, **p < 0.01, ***p < 0.001.
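Linking accessible regions to genes by proximity annotation can be illustrated with a nearest-site lookup. The Python sketch below is one possible reading, not the procedure used in the article: it assumes that each peak is assigned to the gene whose transcription start site (TSS) lies closest to the peak midpoint on the same chromosome, and that the TSS list per chromosome is sorted by position. The function name, the gene labels, and all coordinates are hypothetical.

```python
from bisect import bisect_left


def nearest_gene(peak, tss_by_chrom):
    """Return (gene, distance) for the TSS closest to the peak midpoint.

    `peak` is (chrom, start, end); `tss_by_chrom` maps a chromosome to a
    list of (position, gene) tuples sorted by position. Returns None when
    the chromosome has no annotated TSS.
    """
    chrom, start, end = peak
    sites = tss_by_chrom.get(chrom)
    if not sites:
        return None
    midpoint = (start + end) // 2
    positions = [pos for pos, _ in sites]
    i = bisect_left(positions, midpoint)
    # Only the sites flanking the midpoint can be the nearest ones.
    candidates = sites[max(i - 1, 0): i + 1]
    pos, gene = min(candidates, key=lambda site: abs(site[0] - midpoint))
    return gene, abs(pos - midpoint)


# Toy annotation with invented gene labels and coordinates (assumption).
toy_tss = {"chr1": [(1000, "geneA"), (5000, "geneB")]}
print(nearest_gene(("chr1", 1800, 2000), toy_tss))
print(nearest_gene(("chr1", 3900, 4300), toy_tss))
```

A nearest-TSS rule is simple but can mislink distal elements that regulate a gene further away, so assignments of this kind are best treated as putative, in line with the wording used in the text.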
ARID1a and ARID1b are mutually exclusive components of the BAF complex (24, 25) and it has previously been reported that ARID1a and ARID1b display partially redundant functions (24, 75, 76). To determine the transcriptional activity of the Arid genes in normal as well as ARID1a-deficient T cell progenitors, we sorted DN populations and performed quantitative expression analysis of Arid1a, Arid1b, and Arid2 (Fig. 7E). The expression of Arid1a was gradually downregulated during early T cell development. Gene expression analysis of the dysfunctional Arid1a allele revealed an upregulation of mRNA levels in the DN4 cells generated in Arid1a−/− mice. In the Arid1aWt mice, Arid1b displayed a similar pattern of expression, with the highest mRNA levels in DN1 cells. In Arid1a−/− cells, the expression of Arid1b in DN1, DN3, and DN4 cells was significantly upregulated as compared with the corresponding control cells. DN4 cells from Arid1a-deficient mice also displayed higher expression levels of Arid2, a component of the polybromo-associated BAF (PBAF) (SWI/SNFB) complex (77). This suggests a critical dose-dependent role for ARID proteins in T cell development.
Discussion
In this study, we report that the BAF complex component ARID1a associates with a set of TF of essential importance in T cell development. Using a proximity assay (21, 22), we identified putative collaboration partners for TF in living cells and unraveled complex regulatory networks in T cell development. The BioID approach does not distinguish between factors that directly interact with the bait protein and factors that are part of the same functional complex. This apparent lack of distinction does not, however, hamper the functional analysis because, from a biological perspective, it is more critical to identify participation in multimodular functional complexes, or even functionally proximal environments, than to distinguish direct interactions. As our analysis is focused on TF, we show in this study that ChIP-/CUT&RUN-seq analysis, identifying overlapping binding, represents an excellent way to verify that two factors are indeed binding to the same regulatory element in the genome. Our analyses show that lineage-specific TF interact with a multitude of broadly expressed proteins, highlighting that stage and lineage-restricted TF act in an intricate interplay with the basal transcriptional machinery. It should, however, be considered that our BioID analysis was conducted outside of the natural context in which these TF exert their normal biological function. This has the disadvantage that we cannot identify interactions with proteins that are not expressed in HEK cells. However, the use of one and the same cellular context for all of the bait proteins allows for a direct comparison of the interplay with the general transcription machinery. Among the PXIs we identified several TF including BCL11b, IKZF3, RUNX1, and TCF3, all with essential roles in early T cell development (78), as well as RBPJ, the DNA-binding component of the Notch signaling pathway.
Notch-1 signaling is a key event in normal as well as malignant lymphocyte development, and the association of RBPJ with GATA3, TCF7, and PU.1 supports the idea of a direct integration of stage-specific TF networks and Notch target induced gene activation (79). The PXIs also include coactivators such as EP300 and CREBBP (80) and SMARCA4 (BRG1) (68), as well as corepressors such as HDAC3 (81) and NCOR1 (82), all with essential roles in T cell development. Because we provide support that mRNAs encoding the majority of the PXIs are expressed in T cell progenitors, and provide evidence for overlapping binding between GATA3 and TCF7 and several putative cofactors identified in the BioID analysis, we believe that our experiments in HEK cells identify relevant partners and can therefore be explored to resolve TF networks in T cell development.
Our data reveal a highly complex interplay between stage and lineage specific TF and the basal transcription machinery. This is well in line with the observation that several key TF act in a target gene selective manner, both as activators and repressors of transcription (16–18). Although the interactome of several central TF displayed large similarities, IKZF1 displayed a rather unique pattern of interacting factors (Fig. 1B, Supplemental Dataset 1). Several components of the NURD complex were identified, which is well in line with the previously reported findings that IKZF1 exerts important functions in early hematopoiesis via this repressor complex (63). Although most of the T cell factors in our analysis interacted with several ARID proteins and other components of the SWI/SNF family complexes, IKZF1 BioID only identified ARID3a (Supplemental Dataset 1). Although ARID1a and ARID1b are broadly expressed, ARID3a has been reported to be expressed more selectively (83). Although the role of ARID3a in the function of the SWI/SNF complexes remains unclear, it has been suggested that this protein may mediate interactions with chromatin remodeling complexes (84). Hence, one could envision a model where IKZF1 recruitment of certain epigenetic regulators is restricted to cells expressing ARID3a, generating a degree of lineage specific functional capacity. The proximity interactome of the ID proteins was also clearly distinct from the other fusion proteins. In this case, we detected few interactions with TF apart from E-proteins, well known to be targeted by ID mediated inhibition (59). However, the interaction with the anaphase-promoting complex involved in protein ubiquitination could suggest that the ID proteins not only inhibit the DNA binding of E-proteins but also target them for degradation. It is also notable that Arid1b and Arid2 transcription is upregulated in DN4 cells in the absence of functional ARID1a. 
This could be a consequence of functional selection such that the transition from DN3 to DN4 demands a certain level of ARID activity and thus high expressers are selected. However, upregulation of the dysfunctional Arid1a locus could not be explained purely by selection and would rather indicate that functional levels of ARID proteins can be sensed in the developing cell. This adds another level of complexity as the activity of the basal transcription machinery may be under the control of functional feedback loops.
Although the deletion of ARID1a did not appear to have any major impact on the absolute numbers of the earliest T lineage progenitors, we detected a strong developmental block and expansion in the DN3 stage of development. We cannot detect any apparent link to any of the TFs we used for our BioID analysis, but the developmental block is consistent with deficient TCRβ selection. Although the SWI/SNF complex has been implicated in the mediation of chromatin accessibility promoting recombination (74), we detected normal levels of TCRβ VDJ recombination. This would argue against the possibility that accessibility of the β-locus is dramatically impaired or that the high expression of AP1 proteins in DN3 cells from Arid1a−/− mice (Jun and Fos, Supplemental Dataset 2) would cause a disturbance of the recombination process (85). Our data are distinct from those presented in a recent paper in which ARID1a was inactivated in hematopoietic stem cells (26). These authors reported a stem cell defect as well as a developmental block in the DN2 stage of T cell differentiation. This discrepancy is likely explained by a stem cell defect aggravating the impact of reduced levels of ARID1a in the lymphoid compartment.
Although ARID1a has been reported to be mutated in multiple types of human cancers, including hematological malignancies (69–72), we did not detect any signs of leukemia or lymphoma development even in 1-y-old mice. This is likely because ARID1a mutations need to arise in a certain cellular and mutational context not fulfilled in our model system. In summary, we believe that our work highlights the complexity of the interplay between stage and lineage specific TF and the basal transcription machinery required for normal T cell differentiation.
Disclosures
The authors have no financial conflicts of interest.
Footnotes
This work was supported by grants from the Swedish Cancer Society (2017-258), the Swedish Childhood Cancer Foundation (2019-0020), and the Swedish Research Council (2018-02448), including a strategic grant to the Stem Therapy program at Lund University, Knut and Alice Wallenberg’s Foundation (2014-0089), and donations from Henry Hallberg, Lund University, and Linköping University. Work in the B.R. laboratory was funded by the Canadian Institutes of Health Research, the Canada Foundation for Innovation, and the Princess Margaret Cancer Foundation.
The sequences presented in this article have been submitted to the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/) under the accession number GSE131673 and to the Mass Spectrometry Interactive Virtual Environment database under accession number MSV000082188.
The online version of this article contains supplemental material.
Abbreviations used in this article:
- ARID1a: AT-rich, interactive domain-containing protein 1a
- ATAC: assay for transposase accessible chromatin
- ATAC-seq: ATAC sequencing
- BAF: BRG1-associated factor
- BioID: Biotin Identification
- BM: bone marrow
- ChIP: chromatin immunoprecipitation
- ChIP-seq: ChIP sequencing
- DN: double negative
- DP: double-positive
- ETP: early thymic progenitor
- GEO: Gene Expression Omnibus
- MS: mass spectrometry
- MS/MS: tandem MS
- PXI: proximity interactor
- Q-PCR: quantitative PCR
- RNA-seq: RNA sequencing
- SP: single-positive
- TF: transcription factor
Received August 8, 2019.
Accepted June 29, 2020.
Copyright © 2020 by The American Association of Immunologists, Inc.