Abstract
T cell development occurs in the thymus, where uncommitted progenitors are directed into a range of sublineages with distinct functions. The goal is to generate a TCR repertoire diverse enough to recognize potential pathogens while remaining tolerant of self. Decades of intensive research have characterized the transcriptional programs controlling critical differentiation checkpoints at the population level. However, greater precision regarding how and when these programs orchestrate differentiation at the single-cell level is required. Single-cell RNA sequencing approaches are now being brought to bear on this question, to track the identity of cells and analyze their gene expression programs at a resolution not previously possible. In this review, we discuss recent advances in the application of these technologies that have the potential to yield unprecedented insight into T cell development.
Introduction
All blood cells are derived from hematopoietic stem cells in the bone marrow. Although most blood lineages arise in the bone marrow, T cells are unique. The progenitors that are destined to be T cells migrate from bone marrow to recapitulate hematopoietic processes in a separate dedicated primary lymphoid organ: the thymus. The purpose of the thymus is to generate T cells with a range of distinct functional identities and a TCR repertoire diverse enough to recognize potential pathogens while remaining self-tolerant (1).
Studies by many laboratories have characterized in detail the cell surface markers, transcription factors, and other molecules that define the stages in T cell development. In most cases, these appear to be highly conserved across different species (2). The earliest stages of T cell development are termed double-negative (DN) because the cells lack expression of CD4 and CD8. These stages can be further divided into a sequence of four populations (DN1–4) using CD25 and CD44 in mice (3) or CD38, CD34, and CD1a in humans (4, 5). The two principal T cell branches, αβ and γδ, also bifurcate during this early phase.
DN1 stage thymocytes are thought to maintain multilineage potential. Commitment to the T cell lineage then occurs as the cells progress from the DN2a to DN2b stage. This early stage is also when rearrangements of V, D, and J regions of TCRB/G/D genes commence (6, 7). The γδ T cell lineage is thought to branch off at the DN2 to DN3 transition, whereas αβ development first requires that DN3a stage thymocytes undergo selection for a functional TCRβ chain. Upon passing β selection to reach the DN3b stage, the cells then progress onto DN4 and then the CD4+CD8+ double-positive (DP) stage (8, 9).
Rearrangement of the TCRA genes gives rise to the complete αβ TCR complex on DP thymocytes, which must “audition” for selection on appropriate MHC/peptide complexes on thymic epithelial cells (10). Only those DP cells expressing a TCR capable of interacting with MHC/peptide differentiate further into CD4+ helper, CD8+ cytotoxic, mucosal-associated invariant T, NK T, and other αβ T cell lineages (11–13). Thymocytes expressing TCRs that recognize self-MHC/peptide with too high an avidity undergo negative selection.
Despite the immense body of knowledge that the field has acquired, there are still many aspects of T cell development that remain unclear. One important question is how and when the lineage decisions are made. Based on conventional flow cytometric analysis and gene knockout studies, γδ development appears to branch from the early DN3 stage (14, 15), but is this when the lineage decision is actually made (Fig. 1A)? Similarly, is the decision of DP thymocytes to differentiate into a CD4, CD8, or another αβ T cell lineage made at positive selection (Fig. 1B)? In addition, are these lineage decisions made in an instructive manner or stochastic manner (Fig. 1C)?
Fig. 1. Schematic overview of unresolved issues in T cell development. (A) At a global level, all thymocytes appear to proceed from DN1 to DN3, at which point the γδ lineage branches off. It is unclear whether the γδ lineage decision is made at the time of branching or is predetermined prior to the apparent branch point. In contrast, DN3a cells with a functional TCRβ are selected to continue down the αβ lineage. (B) The decision of a DP thymocyte to become a CD4, CD8, or other αβ T cell may simply be determined at the point of positive/negative selection, or there may be specific DP subsets that are already primed to develop into specific αβ lineages. (C) Lineage decisions may be instructive, in which case cell-specific signals direct precursors to become αβ or γδ T cells. Alternatively, the lineage decision may be stochastic, in which case DN thymocytes that successfully rearrange the TCRG and TCRD genes develop into γδ T cells, whereas those that successfully rearrange the TCRB gene progress to the DP stage prior to developing into an αβ T cell.
Secondly, it is often questioned whether the gradual silencing and activation of genes that is associated with lineage specification is indeed a stepwise process that occurs in every cell (16). Are thymocyte precursors truly multipotential, with a multipotent transcriptional program, that can then differentiate into one of many different T cell lineages? Or is there actually a range of biased progenitors present, each of which is the precursor of a distinct lineage, such as different DP thymocytes each giving rise to a distinct lineage (Fig. 1B)? Evidence in other developmental pathways, including the bone marrow, suggests that there is some level of progenitor heterogeneity with biases in their lineage potential (17).
Until recently, it had been difficult to address these issues, in part, because of the difficulty in identifying and analyzing very rare subpopulations. With the advent of single-cell RNA sequencing (scRNA-seq) technologies, it is now possible to determine cellular heterogeneity at single-cell resolution to address these questions and to achieve depth of understanding not previously possible (18, 19).
scRNA-seq
Over the years, genomic analyses have provided insights into molecular mechanisms that drive T cell development (20). These studies have identified transcription factors that play key roles in orchestrating the gene regulatory networks along this developmental pathway (20, 21). Although bulk RNA sequencing has identified many pathways, these approaches cannot distinguish variations in cell type composition of populations from continually changing transcriptome states (22). Hence, we still lack a precise picture of the transcriptional networks that control the asynchronous progression of differentiating thymocyte progenitors. scRNA-seq promises to offer a more precise understanding of thymocyte differentiation.
scRNA-seq represents a group of techniques for analyzing the transcriptomes of individual cells. These techniques have improved significantly of late and have become robust and highly effective research tools (23, 24). With transcriptomic information at the resolution of single cells, cellular heterogeneity, differentiation trajectories, and regulatory mechanisms of lineage decisions can be mapped to advance our understanding of thymocyte differentiation, namely by resolving ambiguities around checkpoints and the identity of individual cells that may have been masked by more traditional population analyses (25). With these approaches, we could address important questions, including whether progenitors develop along single or diverse pathways, whether new intermediates exist, and whether the regulatory dynamics behave in a monotonic or wavelike manner. A range of platforms, experimental designs, and downstream analyses is now available (24, 26). Thus, it is important to select a suitable platform to answer the biological question of interest. scRNA-seq methods can be categorized based on two main features: quantification and capture.
Quantification of scRNA-seq
Quantification refers to the process of sequencing mRNA transcripts, and the two main methods are full-length detection and 3′- or 5′-end tagging (27, 28). Full-length methods reverse transcribe the entire mRNA into full-length cDNA. Thus, in theory, these could provide relatively uniform and in-depth coverage of transcripts, suitable for identification of coding single nucleotide polymorphisms/mutations, discovery of splice variants, and detection of rare transcripts (29). However, these approaches are very expensive, and PCR amplification bias is a major complication. This issue can be overcome by 3′/5′ tag-based methods, which transcribe a short segment from either the 3′ or 5′ end of the mRNA to generate a short cDNA tag combined with a unique molecular identifier (UMI). Being restricted to a short fragment at one end results in lower sequencing coverage. Although these fragments can extend up to 3 kb in length, they do not span the length of all genes. This drawback makes it difficult to distinguish transcript isoforms and map single nucleotide polymorphisms/mutations (24). Nevertheless, 3′/5′-tagging methods are relatively cheap, affording sequencing of a greater number of cells. In addition, the inclusion of UMIs allows for more precise quantification of individual mRNA transcripts (30). UMIs are unique combinations of 4 to 10 nucleotides that are randomly integrated into individual cDNA molecules so that PCR duplicates can be identified and excluded (31, 32). Given the need to resolve potentially heterogeneous stages in T cell development, 3′/5′-tagging methods have been the method of choice for the analysis of thymocyte populations (33–36).
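As a minimal sketch of how UMIs exclude PCR duplicates, reads can be collapsed by the combination of cell barcode, gene, and UMI, so that each distinct combination is counted as one original molecule. The function name `count_molecules`, the read tuples, and the barcode and UMI strings below are hypothetical and chosen only for illustration. Real pipelines also correct sequencing errors within UMIs, which is omitted here.

```python
from collections import defaultdict

def count_molecules(reads):
    """Collapse reads into molecule counts per (cell, gene).

    reads: iterable of (cell_barcode, gene, umi) tuples, one per aligned read.
    Reads that share all three fields are treated as PCR duplicates of a
    single original cDNA molecule and are counted once.
    """
    umis_seen = defaultdict(set)
    for cell, gene, umi in reads:
        umis_seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}

# Illustrative reads (made-up barcodes and UMIs).
reads = [
    ("CELL1", "Bcl11b", "ACGTAC"),
    ("CELL1", "Bcl11b", "ACGTAC"),  # PCR duplicate of the read above
    ("CELL1", "Bcl11b", "TTGCAA"),  # a second molecule in the same cell
    ("CELL2", "Bcl11b", "ACGTAC"),  # same UMI but a different cell
]
counts = count_molecules(reads)
print(counts)
```

Because a UMI of n nucleotides offers at most 4^n distinct tags (256 for n = 4 and 1,048,576 for n = 10), longer UMIs reduce the chance that two different molecules of the same gene in the same cell receive the same tag.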
Capturing of scRNA-seq
Capture refers to both the process of isolating cells of interest and the throughput rate for profiling mRNAs. The number of cells that are required and the sequencing depth are critical considerations when addressing a biological question, informing the optimum platform choice. Currently available platforms most commonly employ microfluidics, droplet, or microwell-based methods for the isolation of cells (Fig. 2) (37–40).
Fig. 2. Comparison of scRNA-seq approaches. When addressing a biological question by scRNA-seq, selection of a suitable platform is crucial. Whether the goal is to identify diverse cell types, analyze rare populations within a heterogeneous sample, perform detailed analyses of individual cells, or have flexibility with library construction, the required target cell number, sequencing depth, transcript coverage, need for UMIs, and cost are all factors for consideration.
Microfluidic platforms are highly integrated systems that use chips containing many individual nanochambers (41). They separate cells by pumping microvolumes of cell suspensions into individual sorting chambers, where the cells undergo lysis and reverse transcription to produce cDNA that is then purified, amplified, and prepared into a library for sequencing (42). Fluidigm’s C1 is one of the most widely used commercial platforms. It mainly uses Smart-seq2 technology to produce full-length cDNA for sequencing (43). Thus, it is useful for very detailed analyses of individual cells (Fig. 2). However, it has a relatively low throughput due to the limited number of wells: 96 or 384 wells per plate (44). It can only capture ∼10% of cells due to the limited number of chambers and is very expensive (45). Therefore, it is best for analyses of specific cell types in great detail, but not ideal for identifying rare cell types within a population.
Droplet-based methods are higher in throughput and work by encapsulating individual cells in nanoliter-scale droplets, each containing a bead with a UMI, barcode, and primer sequences (46). Inside the droplet, the cell is lysed and its mRNA is captured by the bead, producing cDNA with barcodes and UMIs attached to either the 3′ or 5′ end (47). The two main platforms are 10X Genomics Chromium and Drop-seq. 10X Genomics offers high-throughput profiling of either the 5′ or 3′ ends of RNAs. Although it has a lower sequencing depth compared with Fluidigm’s C1, the use of UMIs results in high capture efficiency (48). Consequently, it is a cost-efficient platform and suitable for analyzing rare populations within a heterogeneous biological space or for distinguishing between closely related populations with similar transcriptional profiles (Fig. 2). Drop-seq also utilizes UMIs to increase its capture efficiency. Compared with 10X, Drop-seq is relatively inexpensive and has a higher throughput rate but a lower sequencing depth (37, 49). Therefore, it is useful when identifying diverse cell types within a biological sample (Fig. 2), but its low sequencing depth makes it unreliable for identifying rare cell types or resolving very closely related populations.
Microwell-based methods isolate single cells into microfluidic plates or tubes by micromanipulation (pipetting) or FACS (Fig. 2) (38, 45). Although the capture of cells is time-consuming, it is useful for isolating a specific subgroup of cells first based on surface markers. In addition, it is highly flexible, as it can easily be adapted to perform different types of library construction (50).
Assessing checkpoints in T cell development by scRNA-seq
There is no gold-standard method for analyzing scRNA-seq data. Thus, a comprehensive experimental design tailored to the biological question being addressed is essential to retrieve the maximum information (51). Some of the key methods include clustering approaches that classify cells based on their transcriptional profiles and attempt to reassemble de novo models of thymocyte differentiation (28). The populations identified by clustering can be used to infer a differentiation pathway and determine how the data fit with the established understanding of T cell development (52). Analyses of differential gene expression can then be applied to determine potential regulatory mechanisms involved in lineage decisions at single-cell resolution (53). Recently, there have been several studies that have successfully applied scRNA-seq to address unresolved aspects of T cell development. These articles have revealed just how dynamic the gene expression networks are from early DN stage to later CD4 versus CD8 lineage differentiation.
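The clustering step described above can be sketched as follows, under stated assumptions: counts are scaled to a common library size and log-transformed, then cells are grouped by plain k-means. This is only an illustration of the idea, not the method used in the cited studies, which typically rely on dedicated toolkits and graph-based clustering. The gene names, the toy counts, and the function names `normalize` and `kmeans` are hypothetical.

```python
import math

def normalize(counts, scale=10_000):
    """Scale each cell to the same total count, then log-transform."""
    out = []
    for cell in counts:
        total = sum(cell)
        out.append([math.log1p(scale * c / total) for c in cell])
    return out

def sq_dist(a, b):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(cells, k, n_iter=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [cells[0]]
    while len(centroids) < k:
        # Next centroid: the cell farthest from all centroids chosen so far.
        centroids.append(
            max(cells, key=lambda c: min(sq_dist(c, m) for m in centroids))
        )
    labels = [0] * len(cells)
    for _ in range(n_iter):
        # Assign each cell to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(c, centroids[j])) for c in cells
        ]
        # Recompute each centroid as the mean of its member cells.
        for j in range(k):
            members = [c for c, lab in zip(cells, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy UMI counts per cell for three genes: Il2ra (CD25), Cd4, Cd8a.
raw_counts = [
    [30, 0, 1], [25, 1, 0], [40, 0, 0],     # DN-like cells (made up)
    [0, 20, 22], [1, 35, 30], [0, 15, 18],  # DP-like cells (made up)
]
labels = kmeans(normalize(raw_counts), k=2)
print(labels)
```

In practice, the number of clusters is not known in advance, and the resulting groups would then be compared by differential gene expression and ordered along an inferred trajectory.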
Lineage decisions in T cell development
The current consensus in T cell development is that the γδ lineage branches off at the DN2 to DN3 transition, when the TCRB/G/D gene rearrangements occur. A recent scRNA-seq study on DN1, DN2, DN4, and immature Vγ2+ cells revealed similarities in the expression profile between a subset of DN1 cells and immature Vγ2+ cells (54). Using Fluidigm’s C1 platform, the authors identified the transcription factor SOX13 as a novel putative γδ T cell regulator and showed that its overexpression impairs transition to the αβ lineage, whereas its downregulation reduces γδ T cell development (54). Although the precise function of SOX13 still needs to be evaluated, it was postulated that the IL-17+ γδ subset likely originates from SOX13+ DN1 cells, suggesting that the fate of these γδ cells is determined prior to bifurcation and supporting a precommitment model (54). However, whether this potential precommitted state is a common feature of all γδ T cell subsets remains to be determined.
scRNA-seq has also been recently applied to investigate the CD4 versus CD8 lineage decision (55). This study captured pre- and postselection DP thymocytes and identified TCR signals as a key determinant in the CD4 versus CD8 lineage choice. The authors employed full-length transcript profiling by Smart-seq to reliably capture coreceptor expression of individual cells and were able to capture cells that appear to be selection intermediates. These intermediates could be classified as CD4+CD8a+, CD4+CD8a−, or CD4−CD8a+. Both CD4+CD8a− and CD4−CD8a+ intermediates appeared to express significantly higher levels of TCR signaling-induced activation markers, suggesting that continued TCR engagement is important but contradicting the prevailing kinetic signaling models, which suggest that CD8 lineage commitment is dependent on the termination of TCR signaling (56, 57). The authors concluded that the CD4 versus CD8 decision is made in an instructive manner by TCR signal strength and that signal duration informs the timing of coreceptor gene activity (55).
Heterogeneity within early T cell development
A recent study that employed a combination of Fluidigm’s C1 and 10X 3′ mRNA capture showed that early thymic progenitors, defined as CD4−CD8−c-Kit+CD44+CD25−, display distinct expression patterns compared with DN2a cells, the stage immediately prior to T lineage commitment (58). The downregulation of “non-T” genes and the upregulation of T cell genes in DN2a populations were the key changes found. However, some progenitors already express key T cell genes, raising the issue of heterogeneity within these progenitors and whether they represent progenitors with different lineage potential or different phases (i.e., early versus late progenitors). Similar heterogeneity among early thymic progenitors in the human thymus has also been reported (34). It was found that one of these progenitor populations in the mouse, identified as CD63+Ly6c+, is in fact a discrete committed granulocyte precursor subset (58). In contrast, it has been reported that CD34+CD1a+CD123+ progenitors in humans give rise to plasmacytoid dendritic cells but not T cells, whereas CD44+CD2+ progenitors can generate both T and non-T cells and the other progenitors only generated T cells in vitro (34). These studies have also revealed asynchronous downregulation of progenitor-associated genes with transient waves of gene activation, which were interpreted as reflecting multiple transient regulatory states as progenitors progress toward T lineage commitment at the DN2 stage. Thus, early thymic progenitors are not simply a discrete homogeneous population representing the earliest stages of T cell development in the thymus, but may in fact be a heterogeneous mixture of cells.
Heterogeneity has also been reported for the thymus seeding progenitor. Using 10X scRNA-seq, a recent study identified two putative thymus-seeding progenitors with distinct transcriptional profiles in humans (35). Although they both appeared to support T cell development, only one population correlated with thymic progenitors found in the mouse, whereas the second population appears to contribute to the differentiation of plasmacytoid dendritic cells and B cells. This study further supports the presence of rare and potentially biased precursors that would not have been revealed in bulk population analyses.
Single-cell multiomics sequencing
Multiomics analysis provides multifaceted insight by adding additional layers of information on top of a transcriptional profile and can be applied to single cells (59). Exploring other layers, including genome sequencing, epigenome states, protein quantification, or spatial information, ultimately reveals more sophisticated details. This approach can reveal additional information beyond what scRNA-seq can alone, including genetics and epigenetic drivers, as well as discovery of novel states and cellular interactions (60). This integrative analysis has the potential to yield unprecedented insight into T cell development.
Transcriptome plus epigenome (chromatin accessibility/DNA methylation/histone modifications)
Chromatin and epigenetic regulation, such as by DNA methylation and histone modifications, is central to the establishment of appropriate transcriptional programs during the development of any lineage (61). Signaling from soluble factors or cell–cell interactions leads to the activation of pioneering transcription factors and other chromatin modifiers. These DNA-binding factors, in turn, affect the positioning of nucleosomes and arrange the genome into distinct spatial structures (62). This dynamic alteration of the chromatin landscape then permits other transcription factors to interact with target DNA sequences or forces them to detach from the DNA, thereby altering gene expression profiles (63).
Previously, chromatin immunoprecipitation followed by sequencing has been the standard method for generating genome-wide maps of chromatin modification enrichment and DNA-binding proteins (64). This approach, along with other conventional methods for analyzing bulk populations, has revealed the importance of the epigenome in regulating gene expression in T cell development. A substantial shift in organization of the epigenome has been found with T lineage commitment, in which the switching of chromatin from a repressive to a transcriptionally active state turns on the expression of T cell–committed genes (65–67). Moreover, there are specific active and repressive histone modifications and DNA methylation marks that change during T cell development. For example, prior to the DN2 stage, Bcl11b, a key driver of T cell commitment, appears to be silenced by DNA methylation and the repressive histone H3K27me3 modification (68). These epigenomic marks tighten nucleosome packaging, thus influencing the binding of transcription factors. Then, at the DN2 stage, the Bcl11b gene switches from repressive compartments to transcriptionally active compartments in the chromatin (67, 68). These studies demonstrate the importance of characterizing not only transcription profiles, but also the chromatin interactions and epigenetic states that provide the framework for establishing the transcriptional profile of T cells. However, these bulk population analyses lack the resolution to reveal epigenetic states at the single-cell level and thus cannot probe the potential heterogeneity of T cell/thymocyte populations. Integrative analysis of chromatin accessibility, DNA methylation, or histone modifications with transcriptional information will be important to link gene expression with precise epigenetic states at the single-cell level.
Recently, analysis of chromatin accessibility has been developed for the single-cell level, which has the potential to reveal new insight into cellular heterogeneity as it relates to chromatin states (69). Single-cell assay for transposase-accessible chromatin using sequencing (ATAC-seq) uses a hyperactive Tn5 transposase that accesses open chromatin regions and fragments the DNA. The presence of an accessible site corresponds to an active regulatory region of chromatin (70, 71). Importantly, ATAC-seq information can be linked to transcriptional information. The first method, single-cell combinatorial indexing for chromatin accessibility and mRNA, involves plate-based combinatorial indexing in which the sorted nuclei in a 96-well plate are labeled with well-specific barcodes (72, 73). The labeled nuclei are then pooled and redistributed, and a second barcode is added, so that half the material undergoes ATAC-seq and the other half undergoes RNA-seq (73). A second method, single-nucleus chromatin accessibility and mRNA expression sequencing, uses a microdroplet platform in which isolated nuclei are first treated with transposase to tag open chromatin (74). The “tagmented” nuclei are then captured in droplets containing both barcoded beads and oligonucleotides that label both cDNA and open chromatin fragments (75). They are then used to generate a library of cDNA (transcriptome) and genomic DNA (from chromatin) (76). Although both methods display similar quality of transcript reads, single-nucleus chromatin accessibility and mRNA expression sequencing has been found to reveal a higher level of chromatin complexity than single-cell combinatorial indexing for chromatin accessibility and mRNA (19).
Information on DNA methylation can also be simultaneously obtained with transcriptional information from the same single cell. Cells can be lysed such that the nucleus is separated from RNA (77). The genomic DNA and RNA are then separately amplified to generate DNA methylome and transcriptome data, respectively (78).
There have also been efforts to simultaneously acquire histone modification and transcriptomic information from individual cells. These have mainly employed droplet-based chromatin immunoprecipitation sequencing (79, 80). However, the robustness of single-cell chromatin immunoprecipitation remains poor, achieving low coverage of the genome (81).
The integrated data produced by these methods can reveal valuable layers of information with downstream analysis. In addition to the mapping of cellular heterogeneity and pseudotime analysis for developmental trajectories that can be achieved just with the RNA data, overlaying epigenetic information with RNA can provide information of gene regulation (70). Peak calling, motif analysis, and interaction predictions can be applied to determine similarities in chromatin accessibility among cells (70). Furthermore, gene-regulatory features, such as the position of nucleosomes, and characterization of DNA elements, including promoter and enhancer activities, can be analyzed to generate a high-resolution chromatin profile of cells, and these approaches have been applied to understanding T cell development (82).
A recent study combined scRNA-seq and ATAC-seq to map the developmental trajectories of αβ thymocytes undergoing selection (83). The authors identified novel transcriptomic and epigenomic patterns that allowed them to map gene-regulatory networks of thymocytes that differentiate toward the CD4 or CD8 lineages. This revealed asymmetric differentiation programs that might play an important role in controlling the CD4 versus CD8 decisions. They also found highly conserved differentiation programs between human and mouse αβ thymocytes and confirmed the requirement of Runx3 and ThPOK, which were previously identified as the key drivers of CD8 and CD4 differentiation, respectively (83). Interestingly, although the two lineages exhibit little difference in the expression of effector molecules like cytokines and chemokines, differences in the chromatin status of the cytokine and chemokine genes can be observed between the two lineages (84), thus demonstrating the value of obtaining information beyond just RNA. Moreover, this study revealed distinct and temporally restricted transcriptional and epigenetic mechanisms associated with the development of αβ versus γδ lineages, with greater chromatin accessibility observed at T-bet, Runx3, and AP-1 binding sites during γδ differentiation compared with αβ differentiation (84). Thus, the chromatin landscape changed steadily as immature αβ T cells developed, whereas immature γδ T cells exhibited much more dramatic changes as they developed (84).
Transcriptome plus VDJ analysis
A VDJ enrichment step can be added to generate TCR-enriched libraries for 10X Genomics 5′ library construction to obtain detailed information on TCR gene usage (85, 86). This works by generating cDNA with a cellular barcode that is then divided, in which one portion is processed for 5′ gene expression library, whereas the other is prepared with TCR sequence enrichment and VDJ segment library preparation (87). The two libraries are sequenced separately, and the transcriptome and VDJ data are integrated with downstream analysis (85). This allows for clonal and diversity assessment in addition to the detection of rare VDJ transcripts, clonotypes, and cell type subpopulation (87).
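Because the gene expression and VDJ libraries originate from the same barcoded cDNA, their integration amounts to a join on the shared cell barcode. The sketch below illustrates this under simplifying assumptions: each cell has already been assigned a cluster label from its transcriptome and a single paired α/β CDR3 clonotype from the VDJ library. The function name `clonotype_sizes`, the barcodes, the cluster labels, and the CDR3 strings are made-up placeholders and not real sequences.

```python
from collections import Counter

def clonotype_sizes(cluster_by_barcode, tcr_by_barcode):
    """Join transcriptome and VDJ data on the shared cell barcode.

    cluster_by_barcode: {barcode: cluster label from the expression library}
    tcr_by_barcode: {barcode: (alpha CDR3, beta CDR3) from the VDJ library}
    Returns {cluster: Counter mapping clonotype -> number of cells}.
    Cells without a recovered TCR are skipped.
    """
    sizes = {}
    for barcode, cluster in cluster_by_barcode.items():
        clonotype = tcr_by_barcode.get(barcode)
        if clonotype is None:
            continue
        sizes.setdefault(cluster, Counter())[clonotype] += 1
    return sizes

# Placeholder data: barcodes, labels, and CDR3 strings are illustrative only.
cluster_by_barcode = {
    "AAAC": "CD4-like", "AAAG": "CD4-like",
    "AAAT": "CD8-like", "AACC": "CD8-like",
}
tcr_by_barcode = {
    "AAAC": ("CAVSA", "CASSL"),
    "AAAG": ("CAVSA", "CASSL"),  # same clonotype as AAAC
    "AAAT": ("CALGD", "CASRP"),
    # "AACC" has no TCR recovered in the VDJ library
}
sizes = clonotype_sizes(cluster_by_barcode, tcr_by_barcode)
print(sizes)
```

The per-cluster clonotype counts obtained this way are the starting point for clonal and diversity assessment alongside the expression data.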
Recently, this method was applied to determine the role of αβ chain pairing in determining cell lineage and Ag specificity. This study identified differences in the V and J germline regions, CDR3 charge, and length distribution between CD4+ and CD8+ repertoires (88). Further analysis suggested that αβ pairing plays a role in determining Ag specificities for an individual’s TCR repertoire. Although yet to be applied to T cell development, such an approach could help clarify when the αβ versus γδ decision is made and to what extent TCR repertoire and lineage decisions are determined in a stochastic manner or in a biased instructive manner (Fig. 1C). Furthermore, transcriptome plus VDJ analysis may reveal specific TCR gene rearrangement events that occur prior to the lineage branch point.
Transcriptome plus proteome
Conventional flow cytometry is extremely useful for characterizing cell types or states. However, a limitation is the number of markers that can be simultaneously analyzed, as it is dictated by the number of available fluorophores (89). A modification of scRNA-seq can provide a solution. Cellular indexing of transcriptomes and epitopes by sequencing combines RNA-seq with the labeling of specific proteins with Abs. These Abs are tagged with oligonucleotides that are incorporated into the sequencing library (90). Because RNA does not always correlate with protein, incorporating the capability of protein measurement can provide more precise phenotypic analysis, at least of key features of interest (91, 92). Although this is yet to be used in T cell development, it has been applied in recent clinical studies on colonic CD8+ T cells in ulcerative colitis, in which cell surface protein information derived from cellular indexing of transcriptomes and epitopes by sequencing was overlaid onto the gene expression profiles of colonic CD8+ T cells (93). This led to better phenotyping, particularly for identifying the differentiated dysfunctional CD8+ T cells that occur in ulcerative colitis. This multiplexed method could be applied in T cell development to better delineate the phenotypic dynamics of thymocytes as they develop into the different lineages.
Oligo-tagged Abs can also be applied to cell hashing to multiplex scRNA-seq data (94). For this approach, each sample is first labeled with a unique oligo-tagged Ab. Multiple samples can then be mixed before construction of a single scRNA-seq library. This enables comparisons between individual mice or donors within a single sequencing run, thereby avoiding the technical variations that can occur from library to library and allowing for more precise comparisons between samples (94). Additionally, it allows the identification of doublets, a major concern in microdroplet-based experiments, in which more than one cell can be captured in a single droplet and then appear as a false novel intermediate cell type in downstream analyses (95).
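The logic of hashing-based demultiplexing can be sketched as follows: a cell barcode dominated by one hashtag is assigned to that sample, whereas a barcode with substantial counts from two hashtags is flagged as a doublet. The function name `demultiplex`, the `min_count` and `ratio` thresholds, and the counts are arbitrary assumptions for illustration. Published tools fit statistical models to the hashtag count distributions rather than using fixed cutoffs.

```python
def demultiplex(hashtag_counts, min_count=50, ratio=3.0):
    """Assign each cell barcode to a sample from its hashtag counts.

    hashtag_counts: {barcode: {hashtag name: count}}, with at least two
    hashtags measured per barcode.
    Returns {barcode: hashtag name, "doublet", or "negative"}.
    The thresholds are illustrative assumptions, not recommended values.
    """
    calls = {}
    for barcode, counts in hashtag_counts.items():
        ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
        (top_tag, top), (_, second) = ranked[0], ranked[1]
        if top < min_count:
            calls[barcode] = "negative"   # no hashtag clearly detected
        elif second >= min_count and top < ratio * second:
            calls[barcode] = "doublet"    # two samples in one droplet
        else:
            calls[barcode] = top_tag      # confidently one sample
    return calls

# Made-up hashtag counts for three droplets.
hashtag_counts = {
    "cellA": {"mouse1": 400, "mouse2": 5},
    "cellB": {"mouse1": 300, "mouse2": 250},
    "cellC": {"mouse1": 3, "mouse2": 8},
}
calls = demultiplex(hashtag_counts)
print(calls)
```

Note that this strategy can only detect doublets formed by cells from different samples. Two cells carrying the same hashtag are indistinguishable from a single cell by hashing alone.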
Transcriptome plus genome
During cancer and disease progression, genetic changes including deletions, substitutions, and translocations can result in an altered cell state (96). An understanding of gene mutations that affect T cell development is important for addressing diseases caused by altered T cell development, such as T cell leukemia (12, 97). Direct nuclear tagmentation RNA-seq allows for integration of scRNA-seq data with whole-genome sequence data (98). Direct nuclear tagmentation RNA-seq involves independently amplifying and sequencing mRNAs and genomic DNA that are derived from the cytoplasm and nucleus of the same single cell in separate 384-well plates (96). The genomic DNA is isolated by direct cleavage and tagging of unfragmented DNA, which is then amplified by PCR with barcodes, whereas the full-length cDNA library is generated by Smart-seq2 construction (99). Combining high-sensitivity measurement of genetic variation with cell state at single-cell resolution can potentially address unresolved issues like defining the transcriptional effect of a mutation in heterogeneous samples or nonmalignant mosaicism (96). Similarly, this method could potentially be used to understand the impact of genetic changes on T cell development in both normal and disease states, such as in T acute lymphoblastic leukemia.
Spatial transcriptome
As thymocytes progress along the T cell development pathway, they continually interact with various stromal cells of the thymus. These interactions are thought to play a vital role in determining the fate of developing T cells (100, 101). Previous studies have revealed that Aire-expressing medullary epithelial cells mediate the negative selection of single-positive thymocytes, whereas cortical epithelial cells appear to contribute to positive selection of DP thymocytes in the inner cortex, where they present MHC class I– and MHC class II–bound peptides (12, 102, 103). However, how these cells, along with other thymic stromal cell types, regulate T cell development is poorly understood (104). This is in large part due to limitations in visualizing molecular features within the complex thymic microenvironment. Conventional methods for preparing thymic tissue are problematic because they introduce tissue dissociation bias and fail to retain thymic organization while biological information is obtained (105–107). Spatial transcriptomics, which can provide information on both cell interactions and tissue architecture at single-cell resolution, may be a solution (106).
Although spatial transcriptomics lacks the resolution and sensitivity of fluorescence in situ hybridization–based methods, in combination with scRNA-seq, it enables probabilistic inference of cell-type topography (108, 109). Such approaches can be divided into two main methods. The first is an in situ indexing method that involves hybridizing a barcoded bead array to a permeabilized tissue segment to capture location information, after which a transcriptome library is constructed by scRNA-seq (110). The second is a microfluidics approach, in which two sets of barcodes are flowed sequentially over the sample in orthogonal directions, so that each position is labeled by a unique combination of barcodes. Through integrative analysis, these methods can elucidate cell–cell interactions and reconstruct spatial information alongside single-cell transcriptional information (75). Such sequencing-based spatial transcriptomic methods could potentially be applied to dissect the composition of distinct thymic niches and to identify cellular and molecular mediators of cellular interactions during specific stages in T cell development.
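The logic of the crossflow approach, in which the pair of barcodes carried by a read identifies the tissue position it came from, can be sketched in a few lines of Python. This is a toy illustration rather than any published pipeline: the barcode sequences, the read tuples, and the function names are hypothetical, and steps such as error-tolerant barcode matching and UMI collapsing are deliberately omitted.

```python
from collections import defaultdict


def build_barcode_index(barcodes):
    """Map each barcode sequence to its 0-based channel index."""
    return {seq: i for i, seq in enumerate(barcodes)}


def assign_reads_to_pixels(reads, row_barcodes, col_barcodes):
    """Aggregate reads into a {(row, col): {gene: count}} grid.

    reads: iterable of (barcode_a, barcode_b, gene) tuples, where barcode_a
    comes from the first flow and barcode_b from the orthogonal second flow.
    Reads with an unrecognized barcode in either position are skipped.
    """
    row_index = build_barcode_index(row_barcodes)
    col_index = build_barcode_index(col_barcodes)
    pixels = defaultdict(lambda: defaultdict(int))
    for barcode_a, barcode_b, gene in reads:
        if barcode_a not in row_index or barcode_b not in col_index:
            continue
        pixels[(row_index[barcode_a], col_index[barcode_b])][gene] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {xy: dict(counts) for xy, counts in pixels.items()}


# Toy barcode sets (hypothetical): one barcode per microfluidic channel.
ROW_BARCODES = ["AAAC", "CCGT", "GGTA"]
COL_BARCODES = ["TTGA", "ACGG", "CATC"]

reads = [
    ("AAAC", "TTGA", "Cd4"),
    ("AAAC", "TTGA", "Cd4"),
    ("AAAC", "TTGA", "Cd8a"),
    ("GGTA", "ACGG", "Aire"),
    ("NNNN", "ACGG", "Aire"),  # unrecognized first barcode, skipped
]

pixels = assign_reads_to_pixels(reads, ROW_BARCODES, COL_BARCODES)
for (row, col), counts in sorted(pixels.items()):
    print(row, col, counts)
```

With n channels in each direction, the two flows label n × n positions using only 2n barcodes, which is the main design advantage of the combinatorial scheme.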
Conclusions
scRNA-seq is a powerful technology with the potential to transform our understanding of T cell development. Although there are significant issues that must be taken into account, including the lack of a standardized procedure and batch effects, this technology allows tracking of the identity of individual cells, which may be masked in more traditional population analyses. This approach has already revealed critical new knowledge about T cell development, including the dynamics of asynchronous gene networks within heterogeneous T cell progenitors. This heterogeneity is likely to be responsible for lineage-biased fates and bears on the question of when T cell lineage decisions are actually made. Single-cell sequencing technologies continue to improve and can now be integrated with other omics layers, including proteomes, genomes, epigenomes, and spatial information, ultimately revealing more sophisticated information. Although these integrative analyses have yet to be applied to T cell development, they will almost certainly allow the field to delve deeper into its intricacies, including genetic and epigenetic drivers as well as the discovery of novel states and cellular interactions. Furthermore, a range of other single-cell approaches, including single-cell barcoding, can be applied along with scRNA-seq to resolve the many unknowns that remain about T cell development.
Disclosures
The authors have no financial conflicts of interest.
Footnotes
This work was supported by grants and fellowships from the National Health and Medical Research Council, Australia (1078763, 1090236, 1145888, and 1158024 to D.H.D.G. and 1079586, 1117154, 1122384, and 1122395 to M.M.W.C.), Cancer Council Victoria (1102104 to D.H.D.G.), Diabetes Australia (Y20G-CHOM to M.M.W.C.), U.S. Department of Defense (W81XWH-19-1-0728 to M.M.W.C.), and the Victorian State Government Operational Infrastructure Support and the Independent Research Institutes Infrastructure Support Scheme of the National Health and Medical Research Council, Australia.
Abbreviations used in this article
- ATAC-seq: assay for transposase-accessible chromatin using sequencing
- DN: double-negative
- DP: double-positive
- scRNA-seq: single-cell RNA sequencing
- UMI: unique molecular identifier
Received April 30, 2021.
Accepted May 20, 2021.
Copyright © 2021 by The American Association of Immunologists, Inc.