Abstract
Myasthenia gravis (MG) is a prototypical B cell-mediated autoimmune disease affecting 20–50 people per 100,000. The majority of patients fall into two clinically distinguishable types based on whether they produce autoantibodies targeting the acetylcholine receptor (AChR-MG) or muscle specific kinase (MuSK-MG). The autoantibodies are pathogenic, but whether their generation is associated with broader defects in the B cell repertoire is unknown. To address this question, we performed deep sequencing of the BCR repertoire of AChR-MG, MuSK-MG, and healthy subjects to generate ∼518,000 unique VH and VL sequences from sorted naive and memory B cell populations. AChR-MG and MuSK-MG subjects displayed distinct gene segment usage biases in both VH and VL sequences within the naive and memory compartments. The memory compartment of AChR-MG was further characterized by reduced positive selection of somatic mutations in the VH CDR and altered VH CDR3 physicochemical properties. The VL repertoire of MuSK-MG was specifically characterized by reduced V-J segment distance in recombined sequences, suggesting diminished VL receptor editing during B cell development. Our results identify large-scale abnormalities in both the naive and memory B cell repertoires. Particular abnormalities were unique to either AChR-MG or MuSK-MG, indicating that the repertoires reflect the distinct properties of the subtypes. These repertoire abnormalities are consistent with previously observed defects in B cell tolerance checkpoints in MG, thereby offering additional insight regarding the impact of tolerance defects on peripheral autoimmune repertoires. These collective findings point toward a deformed B cell repertoire as a fundamental component of MG.
Introduction
Myasthenia gravis (MG) is a chronic autoimmune disorder of neuromuscular signal transmission characterized by muscle weakness and fatigability (1). The estimated annual incidence is 1–2 per 100,000 people and the prevalence is as high as 20–50 per 100,000 (2, 3). Recent epidemiological studies indicate that, like other autoimmune diseases, its incidence is rising considerably (4). Approximately 80–85% of MG patients have circulating autoantibodies that target the acetylcholine receptor (AChR) (1) whereas up to 10% harbor autoantibodies that target muscle specific kinase (MuSK) (5). Autoantibodies to lipoprotein-related protein 4 (LRP4) are detected in a fraction (≈18%) of AChR and MuSK autoantibody double-negative MG patients, but also in a small fraction (<6%) that are AChR or MuSK autoantibody positive (6, 7). A minor fraction (<10%) of MG patients do not have detectable autoantibodies directed toward any of these recognized Ags. The majority of MG patients lacking thymomas have been commonly classified into two major distinct subtypes based on whether they test positive for anti-AChR (AChR-MG) or anti-MuSK (MuSK-MG) Abs. Although both AChR-MG and MuSK-MG demonstrate neuromuscular junction pathologies, these two subtypes differ in the degree of gender bias, clinical presentation, response to immunotherapies, and association with specific HLA alleles (1, 8–10). This implies that non-thymoma AChR-MG and MuSK-MG may have different underlying disease mechanisms and genetics, despite broadly similar presentations.
AChR-MG and MuSK-MG are among the few autoimmune diseases for which the molecular immunopathology is well understood. The autoantibodies in AChR-MG, which are primarily IgG1, lead to the loss of AChR through internalization and localized complement-mediated postsynaptic tissue damage (11, 12). The majority of autoantibodies that recognize MuSK are from the IgG4 subclass. They inhibit the agrin-LRP4-MuSK pathway by preventing agrin-activated LRP4 binding to MuSK, leading to dispersal of the AChRs (13, 14). Transfer (both active and passive) of these autoantibodies from humans to animal models affects the disease, demonstrating the direct role these molecules play in its pathology (15–18). Although the role of autoantibodies in both AChR-MG and MuSK-MG pathologies is clear, the immune dysfunction that precedes their production is less well understood.
The role of the thymus in the immunopathology of many AChR MG patients is conspicuous (19), which is further highlighted by a recent clinical trial demonstrating clinical benefit following thymectomy (20). The thymus in a subset of MG patients includes (19) AChR expression by thymic epithelial cells and myoid cells, the presence of proinflammatory cytokines, and defective regulatory T cells. B cells often organize in the hyperplastic thymus within tertiary lymphoid organs, frequently exemplifying many characteristics of germinal centers. To better define the role of B cells in MG, a number of studies have focused on those that characteristically populate the thymus. B cells populating the hyperplastic thymus express markers of activation and display functional signs of activation (21). Limited-throughput Sanger-based sequencing and B cell immortalization studies have shown that the B cells resident in these MG thymi are broadly clonally heterogeneous, lacking a dominant clone(s) among the infiltrate (22, 23). They harbor the characteristics of Ag experience, including somatic hypermutation (24–26) and biased V region gene segment usage, which include over-representation of the VH3 family at the expense of VH4 genes (22). Importantly, among these B cells are plasmablasts and plasma cells that produce autoantibodies directed toward AChR. However, they represent a minor fraction of the total B cell infiltrate, thus do not appear to be highly enriched (27, 28). Although these studies focusing on the thymic tissue remain seminal, they are not applicable to the broader MG population given that the thymus of many AChR-MG patients is not hyperplastic and therefore contains few, if any, disease-associated B cells. Furthermore, the thymus tissue of MuSK-MG patients is consistently normal in morphology and lacks a B cell infiltrate (29). Therefore, to further profile and expand the evaluation of the MG B cell repertoire, sequencing studies of other compartments, namely the periphery, are needed.
Deep sequencing allows for the comprehensive evaluation of the BCR repertoire properties in health and disease and provides the depth necessary to adequately depict the circulating peripheral repertoire, which includes up to 1011 B cells in humans (30). We applied this approach to explore both the naive and memory B cell compartments in AChR-MG and MuSK-MG patients and compared their features to healthy donors (HD). We focused our study on determining whether the AChR-MG or MuSK-MG repertoires included abnormal clonal expansions, skewed gene usage, and distinctive Ag binding region properties.
Materials and Methods
Study subjects and specimens
The study was approved by the Human Research Protection Program at Yale School of Medicine. Peripheral blood was obtained from HD and MG subjects after acquiring informed consent. MG patients were recruited from the Yale Myasthenia Gravis Clinic, the University of Kansas, or the University of Rochester. Study subjects were selected based on their clear diagnosis of MG, conspicuous symptoms, the absence of immunotherapy or other treatment (AChR), and in the case of MuSK-MG, the absence of aggressive immunotherapy. All of the MG patients harbored MG autoantibodies, and had clinical and electrodiagnostic features consistent with MG. Collected clinical data included demographics, duration of disease, immunosuppressive or other medications, AChR autoantibody titer, thymectomy status/thymus pathology, status of acute or chronic infections, and Myasthenia Gravis Foundation of America Clinical Classification (Table I). HDs were matched to MG patients as closely as possible with regard to age and gender. HDs reported no history of autoimmune disease or malignancies and no acute or chronic infections.
Cell staining and sorting
B cell subsets were isolated using flow cytometry (Supplemental Fig. 1A +, CD27−) or memory B cells (CD19+, CD27+) were then sorted on a FACSAria flow cytometer (BD) into PCR tubes and immediately frozen on dry ice.
Library preparation and BCR sequencing
RNA (250 ng) was reverse-transcribed into cDNA using a biotinylated oligo dT primer. An adaptor sequence was added to the 3′ end of all cDNA, which contains a universal priming site and a 17-nucleotide unique molecular identifier (UMI). Products were purified using streptavidin-coated magnetic beads followed by a primary PCR reaction using a pool of primers targeting the IGHA, IGHD, IGHE, IGHG, IGHM, IGKC, and IGLC regions, as well as a primer against the universal adapter sequence. The Ig-specific primers contained tail sequences for the later Illumina adaptor addition. PCR products were then purified using AMPure XP beads. A secondary PCR was then performed to add the Illumina P5 adaptor to the C region end and a sample-indexed P7 adaptor to the universal adaptor end. The number of secondary PCR cycles was tailored to each sample to avoid entering the plateau phase, as judged by a prior quantitative PCR analysis. Final products were purified, quantified with a TapeStation (Agilent Genomics), and pooled in equimolar proportions, followed by high-throughput 2 × 300 bp paired-end sequencing with a 20% PhiX spike on the Illumina MiSeq platform according to the manufacturer’s recommendations, except for performing 325 cycles for read 1 and 275 cycles for read 2. Sequencing data were deposited in the Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) under BioProject accession PRJNA338795; sequencing runs used are denoted A79HP, AAYHL and AB8KB.
Raw read quality control and assembly
Raw reads were filtered in several steps to identify and remove low quality sequences. Conservative thresholds were applied in all cases to increase the reliability of mutation calls and clonal assignments, at the potential expense of excluding some genuine variants. Preprocessing was carried out using pRESTO v0.5.0 (http://presto.readthedocs.io) (31), as follows. 1) Reads with a mean Phred quality score below 20 were removed. 2) Reads without valid C region primer or template-switch sequences were removed, with a maximum primer match error rate of 0.2 and a maximum template-switch error rate of 0.5. Both template-switch additions and C region primer sequences were deleted from the reads. A preliminary isotype assignment for each read was made according to its C region primer match. 3) Reads with identical UMIs were collapsed into a single consensus sequence for each UMI. The UMI read groups with error scores exceeding 0.1 or majority isotype frequency under 0.6 were discarded. In cases where multiple isotypes were identified in a single UMI read group, the consensus sequence was based only upon the subset of reads in the UMI read group assigned to the majority isotype. 4) UMI consensus sequence mate-pairs were first assembled de novo into full-length Ig sequences with a minimum allowed overlap length of 8 bp, a maximum error rate of 0.3, and p value threshold of 1 × 10−5. UMI mate-pairs failing de novo assembly (e.g., due to long CDR3 sequences preventing mate-pair overlap) were assembled by alignment against the August 28, 2014 International Immunogenetics Information System (IMGT) V-segment germlines (32) with a minimum allowed identity of 0.5 and a E-value threshold of 1 × 10−5. 5) A secondary isotype assignment was made by local alignment of the tail (J segment end) of the UMI consensus sequences against a set of sequences expected to be upstream of the C region primer (sequences available on request), which were derived from the IMGT C region and J segment consensus reference sequences (32). Alignment was restricted to a 100 bp window with a maximum allowed error rate of 0.4.
V(D)J gene annotation
Following preprocessing, V(D)J germline segments were assigned with IMGT/HighV-QUEST (http://imgt.org) (32) using the July 7, 2015 version of the IMGT gene database.
Additional quality control and clonal clustering
Postprocessing of IMGT/HighV-QUEST output, additional quality control, and clonal clustering was performed using Change-O v0.3.0 (http://changeo.readthedocs.io) (33), Alakazam v0.2.3 (http://alakazam.readthedocs.io) (33), SHazaM v0.1.2 (http://shazam.readthedocs.io) (33), and custom scripts within the R statistical computing environment (34), as follows. 1) Non-functional sequences were removed from further analysis. 2) To remove sample contamination caused by multiplex indexing errors, sequences with the same UMI, V segment gene, J segment gene, and junction length were removed from all samples except one. Multisample sequences were assigned to the sample with the maximum number of reads for the given UMI read group. 3) Sequences with >10% missing data (N characters) were discarded. 4) Duplicate V(D)J sequences were discarded, with the exception of the retention of duplicate sequences derived from different biological samples, cell sorts, and/or which were assigned to different isotypes by internal C region alignment. Each sequence was assigned a total read count equal to the sum of member UMI counts contributing to the UMI consensus sequences and a preliminary mRNA copy number value, based on the total number of UMIs with an identical sequence. The internal C region alignment was used for all further steps dependent upon isotype assignment. 5) Functional VH V(D)J sequences were assigned to clonal groups by first partitioning sequences based on common IGHV gene annotations, IGHJ gene annotations, and junction region lengths. IGHV and IGHJ gene annotations for each partition were determined by the union of ambiguous gene assignments within each junction length partition having at least one overlapping gene annotation among all gene assignments. Within these larger groups, sequences differing from one another by a distance of <0.25 within the junction region were defined as clones by single-linkage clustering. Distance was measured as the number of nucleotide differences weighted by a human single-nucleotide substitution model derived from the mutation model previously described (33, 35), and normalized by the length of the junction region. The clonal distance threshold was determined by manual inspection to identify the local minima between the two modes of the within-sample bimodal distance-to-nearest histogram (36). 6) Full-length germline sequences were reconstructed for each clonal cluster with D segment and N/P regions masked (replaced with Ns), with any ambiguous gene assignments within clonal groups resolved by majority rule. 7) Sequences with inconsistent V segment, J segment, and/or isotype assignments were removed from further analysis (e.g., IGLV paired with IGKJ). 8) To remove chimeric sequences, sequences with more than five mismatches from the germline reference in any 10 bp window were discarded. 9) Artificial diversity introduced by errors in UMI sequences was corrected using two stages of single-linkage hierarchical clustering with a Hamming distance threshold of 4/17 bp. First, the UMI sequences within sets of identical V(D)J sequences, as determined during the duplicate removal process described above, were clustered to correct inflated mRNA copy numbers. A consensus UMI sequence was determined by majority rule for each UMI cluster, and the mRNA copy number was corrected to the number of UMI clusters for the given sequence. In the second stage of correction, unique variants were removed from clonal clusters, which bore similar consensus UMI and V(D)J sequences. The consensus UMIs for sequence sets belonging to the same cell sort and VH clonal cluster, determined as described above, were clustered using the same approach as the first stage. For each consensus UMI cluster within a clonal group, only the sequence variant with the maximum read count was retained. Second-stage correction for the VL sequences was identical to VH sequences, except that correction was performed within groups with the same cell sort, V segment gene, J segment gene, and junction length, as clonal clustering was not performed for VL sequences. 10) All unique sequences that were not represented by at least two reads were removed from further analysis, where the two-read threshold was either two reads within a single UMI read group or two distinct UMIs, using the UMI and read-count corrections described above.
BCR repertoire analysis: exclusion criteria and sequence filtering
To reduce artifacts and noise due to sample quality and numerical discreteness, all samples with <1900 unique VH sequences were excluded from the final analyses. The 1900 sequence threshold was chosen on the basis of no observed correlation between sequence count and relative clonal abundance. All analyses were performed on the set of unique sequences (with duplicates removed) within each sample, allowing retention of duplicate sequences represented by more than one isotype. UMI copy numbers, estimating the number of mRNA templates, were not considered during abundance analyses, including clonal diversity and gene usage analyses.
BCR repertoire analysis: clonal abundance and diversity
Repertoire diversity analysis was performed using Alakazam (33). Initial relative clonal abundance, prior to inference of the complete clonal abundance distribution, was calculated as the number of unique clonal variants within a clonal cluster divided by the total number of unique variants for the given patient. A minimum bound on the estimate of unseen clones was determined using the Chao1 estimator (37). The complete clonal abundance distribution was inferred using the one-parameter variant of the Chao et al. method (38), with confidence intervals derived via bootstrapping (resampling with replacement) from the inferred complete distribution using 2000 realizations. Repertoire diversity and evenness was characterized using the generalized diversity index proposed by Hill (39). To allow comparison of all subjects on the same scale regardless of sequencing yield, variability in sequencing depth was accounted for using rarefaction to a uniform sequence count of 1959 for each sample, with 2000 realizations of repeated random sampling with replacement from the inferred complete clonal abundance. The diversity index for each subject was determined from the mean value over all resampling realizations, with a 95% confidence interval about the diversity estimate derived via a Z score using the bootstrap distribution variance. Evenness was calculated at diversity order q = 4 as 4D/0D, which heavily up-weights abundant clones.
BCR repertoire analysis: V gene analysis
Individual genotypes were inferred using TIgGER v0.2.3 (http://tigger.readthedocs.io) (33, 40). V gene family usage was calculated as the percentage of unique sequences assigned to a given gene family within the total number of sequences for each isotype in each cellular compartment. V gene families IGHV6, IGHV7, IGKV5 through IGKV7, and IGLV4 through IGLV10 were excluded from gene usage analysis due to low total abundance. The Repertoire Dissimilarity Index (RDI) for each set of samples was calculated as previously described (41), with subsampling and 2000 bootstrap realizations, except that analysis was performed at the level of V gene family and the arcsinh transformation on abundance was excluded. V-J gene proximity was determined by assigning the relative position of each V and J gene using the VH and VL loci annotations from IMGT (42) coupled with additional reported data (43, 44). Positions were fractionally ranked from the most 5′ V gene (position 0.0) to the most 3′ J gene (position 1.0), excluding any non-V/J genes and any V genes annotated as pseudogenes or ORFs from consideration.
BCR repertoire analysis: mutation and selection analysis
Mutational frequencies were calculated as the number of mismatches from the germline sequence, excluding the N/P and D regions, using SHazaM. Lineage trees were constructed for each clone using PHYLIP v3.69 (45) and Alakazam (33, 46). The selection strength of expanded memory compartment lineages was quantified with the BASELINe (47) implementation provided in SHazaM, using the focused test statistic. Each clonal lineage was represented by the most recent common ancestor (MRCA) of the lineage, and the probability density function (PDF) for each status (HD, AChR-MG, MuSK-MG) was determined by unweighted convolution of the PDFs for each subject in the status group. Selection analysis was restricted to lineages represented by at least three unique observed sequences. The MRCA of each lineage was defined as the member of the lineage with the minimum mutational distance from germline; inferred intermediates were allowed as MRCA sequences. For analysis of selection by physicochemical property, a replacement mutation was defined as a mutation that introduces a change in amino acid property class, according to the classes recommended by IMGT (http://www.imgt.org/IMGTeducation/Aide-memoire/_UK/aminoacids/IMGTclasses.html).
BCR repertoire analysis: somatic hypermutation profile analysis
Somatic hypermutation profiles in HD, AChR-MG, and MuSK-MG subjects were analyzed with SHazaM using, separately, memory VH sequences with IgM, IgG, and IgA combined, memory IgK sequences, and memory IgL sequences. To reduce bias from clonal expansion, only one representative sequence was analyzed from each VH clonal group, with the representative sequence defined as the sequence with the highest mutational load within the clonal group. Using the synonymous, 5-mer, functional criteria (35), mutability and substitution rates of 5-mer motifs were estimated for individual subjects. Mutability of each 5-mer was directly measured from the data when at least 200 mutations were observed for a given 5-mer. Comparisons between individual subjects were performed via pairwise Pearson correlation of 5-mer mutability scores between individual subjects. Similarly, comparisons between status groups were performed via pairwise Pearson correlation of the average 5-mer mutability scores for all subjects in a given status. For each pairwise comparison, the set of 5-mers used was restricted to those that were directly measured (non-inferred) in the subjects or statuses under comparison, with renormalization of mutability scores following removal of non-intersecting 5-mers. To confirm comparison results between disease status groups, a background distribution of correlations between statuses was generated by performing all possible 35 permutations of the status labels of the subjects and recalculating the average mutability for each status using the same approach described above. The percentile method was then used to determine the statistical significance of the observed correlations, relative to the background distribution.
BCR repertoire analysis: H chain CDR3 convergence and physicochemical property analysis
Public H chain CDR3 (H-CDR3) sequences were identified by clustering identical length H-CDR3 amino acid sequences using the uclust algorithm (48) with a 90% identity threshold. Clustering was performed on the combined memory and naive population, with public H-CDR3 sequences defined as sequences that clustered with at least one sequence observed in a different subject. Repertoire properties, including gene usage and H-CDR3 physicochemical properties, were calculated using Alakazam. The H-CDR3 grand average of hydrophobicity (GRAVY) was calculated using the Kyte and Doolittle scale (49), net charge was calculated using the EMBOSS pK values (http://emboss.sourceforge.net/apps/cvs/emboss/apps/iep.html) and the Henderson-Hasselbalch equation at pH 7.4 (50), average bulkiness was calculated using the Zimmerman et al. scale (51), average polarity was calculated using the Grantham scale (52), and the aliphatic index was calculated using the Ikai method (53) normalized by the number of informative (non-missing) positions. Acidic (Asp or Glu), basic (Arg, His, or Lys) and aromatic content (His, Phe, Trp, or Tyr) was calculated as the fraction of informative (non-missing) residues in the H-CDR3.
Statistical tests
Statistical tests were performed using ANOVA to determined significance, followed by one-tailed t test for determination of pairwise p values. All confidence intervals are 95%.
Results
We performed deep sequencing of the BCR repertoire in AChR-MG (n = 4), MuSK-MG (n = 5), and HD (n = 4) subjects (Table I). None of the MG subjects had undergone significant immunotherapy treatment, and three of the subjects (AR02, AR03, and MK02) were part of our earlier study (54), which identified B cell tolerance checkpoint defects. Sequencing of the VH and VL from sorted naive and memory B cells from each subject produced a total of 22 million raw reads. After quality control and processing, a high-fidelity data set was generated that consisted of 205,540 unique error-corrected IgM, IgG, and IgA sequences and 311,979 unique error-corrected IgL and IgK sequences (Table II). Within each subject, approximately half of the memory VH repertoire was IgM (49–57%), with the remaining sequences composed of IgG (13–15%) and IgA (22–30%) (Supplemental Fig. 1B). IgL made up 34–40% of the naive compartment and 36–41% of the memory compartment (Supplemental Fig. 1C), with no significant difference observed between MG and HD subjects.
MG patients do not exhibit a public H-CDR3 signature
Given the common Ag target in the MG subtypes (AChR-MG or MuSK-MG), we hypothesized that the subjects’ repertoires would contain evidence of convergent H-CDR3 sequences, which have been observed during infection and vaccination responses (55, 56). Clustering of identical length H-CDR3 aa sequences identified a small number of these convergent, or public, H-CDR3 clusters across subjects (Supplemental Fig. 2A). The total abundance of H-CDR3 clusters public to AChR-MG and/or MuSK-MG subjects was a small fraction of the total H-CDR3 clusters (0.27%) (Supplemental Fig. 2B). When considering H-CDR3 clusters exclusively public across HD and MG subjects as an estimate of the expected public background, we observed a similarly low abundance of public clusters (0.22%) (Supplemental Fig. 2C). Furthermore, in pairwise comparisons between subjects the average frequency of public H-CDR3 clusters was similar in MG (0.32% for AChR-MG and 0.30% for MuSK-MG) and HD (0.33%) subjects. Public sequences were dominated by short H-CDR3s, with an average length of 11.8 aa in public H-CDR3s that were observed exclusively in MG (Supplemental Fig. 2B), and an average length of 10.8 aa for public H-CDR3s shared between MG and HD (Supplemental Fig. 2C). We also investigated whether IGHV4-34 sequences of MG patients showed enrichment for the framework region 1 hydrophobic patch characteristic of autoreactive 9G4+ Abs in systemic lupus erythematosus (SLE) (57), but found no such signature (data not shown). Thus, we found no compelling evidence for enrichment of convergent H-CDR3s in the peripheral naive and memory B cell populations derived from MG patients.
Clonality of the MG memory compartment is similar to that of healthy repertoires
To test whether the B cell repertoires in MG patients were characterized by large clonal expansions, sequences were clustered into clonal groups, with each member of the group representing one unique variant (including variants that differed only by isotype). Relative clonal abundance distributions were calculated as previously described (38). Expanded clones were identified in all MG patients, but even the most abundant clones were <1% of the repertoire, which is consistent with previous observations of peripheral blood in the absence of an active immune response (58). Although a slight upward trend in clone sizes of AChR-MG was observed, this increase was, at most, 0.1% of the subjects’ repertoires (Fig. 1A). To assess the statistical significance of this trend, we compared the evenness of HD and MG subjects. Evenness quantifies the extent to which an abundance distribution deviates from a uniform distribution, with a lower evenness indicating a less uniform distribution. As the shift in clonal abundance was most prevalent in the highly abundant clones, evenness scores were compared using the fourth-order Hill diversity index (39) (q = 4), which strongly up-weights the contribution of abundant clones to the evenness score (Fig. 1B). Although a large decrease in evenness of the AChR-MG repertoires compared with HD was observed, implying greater clonal dominance, this shift was not statistically significant (AChR-MG versus HD t test p = 0.10; MuSK-MG versus HD t test p = 0.38). Overall, no significant evidence was found to support the idea that the peripheral memory compartment of MG patients is more oligoclonal than that of healthy donors.
Memory compartment clonal abundance is comparable between healthy and MG repertoires. (A) The rank-abundance distribution of memory compartment VH clones with clone size (y-axis) as a percent of the repertoire against the size rank of the clone on a log10 scale (x-axis). Each dark line represents the estimated clonal abundance curve for a single subject, with shaded areas representing 95% confidence intervals derived via bootstrap (2000 realizations). Subject status is denoted by HD, AChR (AChR-MG), and MuSK (MuSK-MG). (B) Evenness at diversity order q = 4 of the VH memory compartment clone size distributions. Each point represents the estimated evenness score for a subject from the clonal abundance distributions in (A). The vertical shading represents the SD of the mean evenness scores and the horizontal bar represents the mean of the mean evenness scores. HD versus AChR-MG t test p = 0.100, HD versus MuSK-MG t test p = 0.382.
To assess ongoing recruitment of B cells into the memory compartment, we identified clones spanning the memory and naive compartments. Between 0.3 and 3.9% of memory clones contained at least one member observed in the naive compartment (Supplemental Fig. 2D). The frequency of clones spanning the memory and naive compartments was not significantly related to disease status, but was strongly dependent upon the total yield of unique sequences (Pearson R = 0.78). We also observed a low percentage of clones that contained multiple isotypes (either IgM and IgG, IgM and IgA, or IgG and IgA) with between 0.5% and 3.0% of clones containing sequences with multiple isotypes (Supplemental Fig. 2E). The frequency of clones containing multiple isotypes was also a highly sequencing depth-dependent effect (Pearson R = 0.77). Overall, we found no evidence that circulating MG repertoires were biased due to active clonal expansion or transition of B cells from the naive to memory compartments.
The MG repertoire exhibits biases in V segment usage
Previous studies have demonstrated V segment usage biases in several autoimmune diseases (57, 59). To determine whether similar V segment biases are also present in MG patients, we compared the relative abundance and interstatus variability of each V gene family between MG and HD subjects. In the naive compartment of MG subjects, significant shifts were observed in the usage of IGHV3 and IGHV4 relative to HDs (ANOVA p = 0.021) (Fig. 2A). Specifically, IGHV4 usage increased from 16.2% in HDs to 20.1% in AChR-MG (t test p = 0.027), whereas IGHV3 usage decreased from 56.5% in HDs to 46.7% in AChR-MG (t test p = 0.026) and 42.2% in MuSK-MG (t test p = 0.018). Although not statistically significant, an increase in IGHV4 was also observed in MuSK-MG (t test p = 0.067), as well as an increase in IGHV1 usage in both AChR-MG and MuSK-MG. Significant biases in V family usage of the L chains in the naive compartment of AChR-MG were also observed (ANOVA p = 0.016) (Fig. 2B). Specifically, IGLV2 usage increased from 9.0% in HD to 12.7% in AChR-MG (t test p = 0.006), IGKV1 usage decreased from 31.4 to 25.9% (t test p = 0.021), and IGKV4 decreased from 5.8 to 3.7% (t test p = 0.043). The same V family biases observed in naive AChR-MG repertoires were apparent in the class-switched memory populations, particularly in the case of reduced IGHV3 usage for both IgG (p = 0.006) and IgA (p = 0.052). In the memory compartment we observed significant IGKV/IGLV shifts in MuSK-MG only, with a decrease in IGKV2 usage from 8.20 to 6.26% (p = 0.023) and IGLV1 usage from 12.1 to 9.15% (p = 0.048). Due to variation in the V segment genotypes of the patients, individual V gene usage was not comparable for all patients across all genes (data not shown). However, increases in IGHV1-69, IGHV4-34, and IGLV2-14 abundance comprised a large share of the naive compartment biases observed in MG patients (data not shown).
V family usage is skewed in MG repertoires. (A) VH V family usage for the naive (Naive-IgM) and memory (Memory-IgM, Memory-IgG, and Memory-IgA) compartments. Usage is shown as a percent of the total unique IGHV sequences (y-axis) for HD, AChR-MG (AChR), and MuSK-MG (MuSK). Horizontal bars indicate the mean abundance over all subjects of a given status and vertical shading indicates ± 1 SD about the mean. Significance is denoted by: *p ≤ 0.5. (B) VL V family usage for the naive and memory compartments. Usage is shown as a percent of the combined unique IGLV and IGLK sequences (y-axis). Significance is denoted by: *p ≤ 0.5. (C and D) RDI for naive (Naive-IgM, Naive-IgK, and Naive-IgL) and memory (Memory-IgM, Memory-IgG, Memory-IgA, Memory-IgK, Memory-IgL) V families in (A) and (B). Each point indicates a pairwise RDI score (y-axis) for two subjects, with the mean score for each set of subjects marked by a horizontal bar and a density estimate indicated by the shaded region. RDI scores are grouped by within status comparisons for HD, AChR-MG (AChR), and MuSK-MG (MuSK).
The IGHV usage biases in the naive compartment of AChR-MG and MuSK-MG subjects appeared to be associated with an increase in variability across subjects. To quantify this effect, we compared V family variability using the RDI (41). The RDI scores the aggregate difference in gene usage between any pair of samples, providing a measure of how dissimilar two gene usage distributions are from one another. We observed a significant increase in IGHV RDI within both the naive (ANOVA p = 0.014) and memory (ANOVA p = 6.7 × 10−8) compartments of AChR-MG and MuSK-MG repertoires, with the most pronounced difference in the naive compartment (mean HD RDI = 16.9, AChR-MG RDI = 67.0, MuSK-MG RDI = 87.9) (Fig. 2C). In contrast to IGHV usage, no significant differences in the variability of IGKV or IGLV usage across subjects was found in the naive AChR-MG or MuSK-MG repertoires (Fig. 2D).
Overall, these results show that the naive MG repertoire is abnormal, having biased V segment usage in both the VH and VL, which are mirrored in the class-switched memory compartment. In addition, MG repertoires display considerably more individual variability within the naive compartment compared with HD repertoires, suggesting B cell developmental abnormalities in MG may be partially patient specific.
The MuSK-MG repertoire displays increased proximity of recombined IGLV-IGLJ genes
L chain receptor editing is a primary mechanism by which self-reactive B cells are removed from the naive B cell population (60, 61). In this process, a secondary V-J rearrangement may occur at either the original locus, the alternate chromosome, or through a κ to λ chain switch. Editing results in an increased genomic distance between V-J, as secondary rearrangements at the same locus lead to utilization of V-J genes with greater intervening distance due to deletion of the original V and/or J genes. To investigate whether receptor editing was dysregulated in MG patients, we computed the relative genomic distance between the V gene and J gene for each unique VL sequence in AChR-MG and MuSK-MG repertoires and compared them to HD repertoires. Although no significant differences were found in AChR-MG repertoires, IGLV-IGLJ pair distances in MuSK-MG were reduced within the memory and naive compartments, however, this was statistically significant only for memory (0.42–0.38 with t test p = 0.002 for memory, and 0.41–0.38 with t test p = 0.079 for naive) (Fig. 3). Although the shift in the naive compartment of MuSK-MG was not statistically significant, we note that the mean IGLV-IGLJ distance in the naive and memory compartments of MuSK-MG subjects were equivalent (0.38 naive versus 0.38 memory). Similar to AChR-MG, no significant difference in IGKV-IGKJ distance in MuSK-MG subjects was observed. Thus, MuSK-MG, but not AChR-MG, subjects exhibit reduced IGLV-IGLJ genomic distance consistent with reduction in secondary rearrangement events at the λ locus.
Recombined λ chains of MuSK-MG repertoires show reduced IGLV-IGLJ genomic distance. The mean relative genomic distance between the V and J genes of κ (IgK) and λ (IgL) VL sequences is shown for the naive and memory compartments of HD, AChR-MG (AChR), and MuSK-MG (MuSK). Relative genomic distance is shown as a fractional rank position (y-axis) difference between the J gene and V gene. Significance is denoted by: *p ≤ 0.5, **p ≤ 0.005.
Mutational profiles in MG are similar to that of healthy repertoires
MG autoantibodies are class-switched and contain frequent somatic hypermutations (SHMs) (22, 62). To determine whether SHM was perturbed in MG, we analyzed somatic mutation patterns in the memory compartments of IGHV, IGLV, and IGKV gene sequences. SHM exhibits well-known biases toward particular DNA motifs, leading to mutational hot-spots (such as WRC) and cold-spots (such as SYC), which are dependent upon the microsequence context (35). The intrinsic context biases of SHM, combined with differing sequence compositions of V gene families, lead to significant differences in the average mutational load across families (Fig. 4). In HD subjects, mutational loads ranged from 2.6% (IGHV2) to 3.5% (IGHV3) in IgM, 4.5% (IGHV2) to 6.5% (IGHV1) in IgG, 4.9% (IGHV2) to 6.5% (IGHV1) in IgA, and 2.6% (IGKV2) to 4.2% (IGLV3) in the VL. The mutational loads in MG subjects exhibited similar family-dependent mutational loads and, within each V family, there were no significant differences between AChR-MG, MuSK-MG, and HD subjects (Fig. 4).
Memory compartment mutational load is similar between healthy and MG repertoires. Distribution of mean mutation frequency for VH (A) and VL (B) memory compartment sequences. Mutation frequency for each sequence was calculated as the number of base changes from germline, excluding the N/P and D regions. Mean mutation frequency is shown for each chain, isotype, and gene family. Horizontal bars indicate the mean of the mean mutation rates with the vertical shading indicating ± 1 SD about the mean of means.
To determine if the known SHM microsequence targeting biases were perturbed in MG subjects, and by implication the molecular mechanisms underlying SHM, we calculated SHM mutability profiles using a 5-mer context model (two nucleotides upstream and downstream of the mutated position) as previously described (35). Only synonymous mutations were used in the model to remove the influence of selection. We found that mutability profiles were highly consistent across all individuals (Pearson R 0.83–0.94, 0.67–0.95, and 0.58–0.93 for IGHV, IGKV, and IGLV, respectively) (Supplemental Fig. 3A, 3C, 3E), and the AChR-MG, MuSK-MG, and HD profiles all exhibited the well-known hot-/cold-spot biases of SHM (Supplemental Fig. 3B, 3D, 3F). Furthermore, the full hierarchy of calculated mutabilities across DNA 5-mers was highly consistent across statuses (Pearson R 0.97–0.98, 0.91–0.96, and 0.97–0.98 for IGHV, IGKV, and IGLV, respectively). No evidence was found that mutability profiles within a status groups (AChR-MG, MuSK-MG, and HD) were more highly correlated to each other compared with other statuses (permutation test p value >0.1 for IGHV, IGKV, and IGLV). Overall, both SHM targeting profiles and mutational loads in AChR-MG and MuSK-MG were highly consistent with those of HD subjects and previously reported models (35).
The AChR-MG repertoire exhibits decreased positive selection in CDRs
To determine if affinity maturation was perturbed in MG patients, we quantified selection strength (∑) from the pattern of mutations in the V segment of expanded memory compartment lineages using BASELINe (47). This method compares the observed and expected frequency of non-synonymous mutations with positive ∑ scores indicating an excess of non-synonymous mutations (positive selection) and negative ∑ scores indicating fewer non-synonymous mutations than expected (negative selection). As the majority of positive selection events are shared among members of a B cell clone (and thus appear on the trunk of the lineage tree) (63), we quantified selection pressure on a per-lineage basis using the MRCA of each lineage as the representative sequence. The framework regions are critical to preserve the structure of the Ab receptor and, as expected, these regions showed significant evidence of negative selection in AChR-MG, MuSK-MG, and HD subjects with no difference between status groups (Fig. 5). Furthermore, as expected, AChR-MG, MuSK-MG, and HD subjects all exhibited significant positive selection in the CDRs, which generally contain Ag-contact residues. However, in this case, selection strength in AChR-MG was significantly reduced (∑ = 0.44 ± 0.13 in AChR-MG versus ∑ = 0.78 ± 0.18 in HD) (Fig. 5). To determine whether the observed selection difference was due to a specific physicochemical property, we also used BASELINe to analyze the frequency of somatic mutations that led to changes in side chain hydrophobicity, charge, polarity, and volume classes. All of these properties individually displayed some degree of reduced positive selection in the CDRs of AChR-MG subjects, so that no single physicochemical property appeared primarily responsible for the decrease in selection pressure (Supplemental Fig. 4). Taken together, these results suggest that although the underlying SHM process remains normal within MG subjects, the processes that guide Ag-driven selection (in the VH) are perturbed specifically in AChR-MG subjects.
The AChR-MG memory compartment repertoire displays reduced selective pressure. BASELINe PDFs are shown for HD, AChR-MG (AChR), and MuSK-MG (MuSK) repertoires, with density shown on the y-axis and the selection strength (∑) shown on the x-axis. PDFs for each status were determined via convolution of the individual PDFs for subjects within each status group, resulting in a single aggregate PDF for each status.
The AChR-MG memory repertoire exhibits abnormal CDR3 physicochemical properties
The H-CDR3 encodes the majority of B cell repertoire diversity, and contributes significantly to Ag specificity (64). Previous investigations into autoimmune B cell repertoires have observed an increase in the hydrophobicity (65), positive charge (57) or length (66) of the H-CDR3. To determine if such repertoire-scale biases in H-CDR3 properties were present in MG patients, we analyzed nine aa physicochemical properties: the length of the H-CDR3 region, the mean net charge, the GRAVY, mean side-chain bulkiness, mean polarity, the aliphatic index, and the abundance of acidic, basic, and aromatic residues. Consistent with previous studies (67, 68), significant differences were found between the naive IgM, memory IgM, and class-switched memory populations in HD subjects. Specifically, naive B cells expressed H-CDR3s that were longer (ANOVA p = 0.003) (Fig. 6B), more negatively charged (ANOVA p = 0.188) (Fig. 6C), and had higher hydrophobicity (ANOVA p = 0.020) (Fig. 6D). H-CDR3 properties of MuSK-MG subjects were similar to those of HDs. In contrast, AChR-MG subjects displayed significant abnormalities in the H-CDR3 GRAVY scores (ANOVA p = 0.012). Specifically, the mean GRAVY score of the memory IgA repertoire increased 1.1-fold in AChR-MG compared with HD (from −0.609 to −0.552; t test p = 0.008) (Fig. 6D). Although not statistically significant, we note that the mean net charge and GRAVY scores in the AChR-MG memory IgM repertoire both displayed large fold increases compared with HDs (Fig. 6C). Thus, AChR-MG subjects exhibit repertoire-scale biases in H-CDR3 physicochemical properties across multiple B cell compartments.
The AChR-MG memory compartment repertoire displays abnormal H-CDR3 physicochemical properties. (A) Log2 fold-change of nine H-CDR3 physicochemical properties for AChR-MG (AChR) and MuSK-MG (MuSK), with respect to scores in HD. Shown are the mean scores for each property for both naive (Naive-IgM) and memory IgG (Memory-IgG) repertoires along with a linear fit for each status; correlation is significant only for AChR-MG (R2 = 0.93; p = 3.06 × 10−5). (B) H-CDR3 mean length in amino acid residues, (C) GRAVY index, and (D) mean net charge at pH 7.4 for each subject is shown as a single point for both the naive (Naive-IgM) and memory (Memory-IgM, Memory-IgG, and Memory-IgA) compartments. Horizontal bars indicated the mean of the mean property scores with the vertical shading indicating ± 1 SD about the mean of means. Significance is denoted by: *p ≤ 0.5.
Given the B cell tolerance checkpoint defects we previously identified in MG (54), we sought to assess the extent to which the observed H-CDR3 differences in the memory compartment of AChR-MG repertoires were driven by the naive repertoire. We found that the physicochemical property biases observed in the class-switched memory compartment (AChR-MG versus HD) were significantly correlated with changes that were already apparent in the naive IgM compartment (R2 = 0.93; p = 3.06 × 10−5). In general, the biases within the AChR-MG naive compartment were further amplified within the class-switched memory compartment, with the largest fold-change biases displayed by net charge, GRAVY score, and the aliphatic index (Fig. 6A). Although we also observed a correlation between naive IgM and memory IgG biases in MuSK-MG, this effect was small and not statistically significant (R2 = 0.08; p = 0.469). Taken together, these results show that AChR-MG, but not MuSK-MG, repertoires display aberrant H-CDR3 physicochemical properties, and that these biases are likely initiated in the naive compartment.
Discussion
MG is a prototypical B cell-mediated autoimmune disease and holds historical significance as the first neurologic disease shown to be mediated by autoantibodies. The study of MG has driven discoveries in other autoimmune diseases (69), and it remains both clinically and scientifically relevant. Understanding the shape of the B cell repertoire in MG may lead to more effective therapeutics, and may further serve to elucidate potential pathogenic mechanisms in other B cell-mediated autoimmune diseases. Through the use of high-throughput BCR sequencing we have, to the best of our knowledge, described for the first time the naive and memory B cell repertoire of both AChR-MG and MuSK-MG patients in high resolution. A recent study by our group demonstrated that both AChR-MG and MuSK-MG fail to properly enforce B cell tolerance (54). Given that tolerance checkpoints counter-select a considerable fraction of developing B cells (61), one may reason that distinctive BCR repertoire characteristics would be conspicuous in a naive B cell compartment that is shaped without proper tolerance regulation. We have uncovered defects that materialize in the absence of properly regulated tolerance checkpoints, revealed by marked differences from healthy controls in the naive B cell repertoires of both AChR-MG and MuSK-MG subjects. Furthermore, we reasoned that an abnormally formed naive compartment would produce similar abnormalities in the memory compartment that it supplies. Our test of this hypothesis, shown in this study, indicates that deformations in the naive B cell compartment of AChR-MG and MuSK-MG are transferred to the memory compartment.
The naive repertoire and defective B cell counter selection
We observed significant differences in IGHV gene family usage in MG patients, including an increase in IGHV4 family genes (Fig. 2A). Increases in IGHV4 usage have been previously implicated in several other autoimmune diseases, including SLE, multiple sclerosis, and Wiskott-Aldrich syndrome (59, 70–72). A mechanistic explanation for the relationship between IGHV4-34 and SLE has been partially defined (57), but no clear explanation for the abundance of IGHV4 family genes in other autoimmune diseases is known. In the case of MG, the biases observed in the class-switched memory population are already apparent in the naive compartment. It is possible that the IGHV4 biases found in other autoimmune diseases similarly originate within the naive compartment. Previous studies have observed an increase in IGKV4 usage in SLE, celiac disease, and type 1 diabetes (73), but we did not observe this hallmark in MG VL repertoires. In fact, the naive repertoires of both AChR-MG and MuSK-MG subjects showed a decrease in IGKV4 usage compared with healthy controls.
Healthy naive repertoires show highly consistent IGHV usage in naive IgM+ cells in both our data and previous reports (41). In contrast, we found that naive MG repertoires were highly variable between individuals (Fig. 2C). The increased variability in naive MG repertoires implies a difference in selection—either the introduction of otherwise absent selection or the lack of appropriate selection. We also note that both H-CDR3 charge and hydrophobicity exhibited low variability in naive HD repertoires, and that these properties showed higher variance in naive MG repertoires, further implicating selective forces as the root of the disorderliness in naive MG repertoires, given that nucleotide diversity in the V segment contributes only partially to H-CDR3 physicochemical character.
To evade the development of an immune response against self, central and peripheral tolerance mechanisms counter-select B cells during their development (61, 66). Deficiencies in the integrity of B cell tolerance mechanisms can be demonstrated in a number of autoimmune diseases, including SLE, rheumatoid arthritis, multiple sclerosis (74–76), and recently MG (54). Our sequencing data, derived from naive IgM+ B cells, indicate that both AChR-MG and MuSK-MG subjects display characteristics in their naive B cell repertoire that are conspicuously different from those observed in HDs. We suggest that these findings are consistent with, and may be reflective of, the reported inadequate counter-selection during B cell development in MG (54).
The memory compartment and BCR selection abnormalities
We examined the MG memory B cell repertoire for abnormalities in the abundance or character of B cell clones relative to the HDs. No statistically significant difference in clonal abundance was observed. The analysis of convergent H-CDR3 sequences specific to MG subjects revealed no evidence of public H-CDR3 sequences that could be considered a signature for either AChR-MG or MuSK-MG subjects. Mutational load and SHM hot- and cold-spot targeting preferences of MG repertoires were examined; neither mutational loads nor SHM targeting preferences were perturbed with respect to HDs.
Despite the lack of any significant bias in clonal structure, mutational loads, or SHM targeting preferences, we did observe several abnormalities in the memory compartments of AChR-MG and MuSK-MG subjects related to clonal selection. BASELINe (47) analysis of mutation patterns identified reduced selection pressure in the H-CDR1 and H-CDR2 of AChR-MG subjects. AChR-MG repertoires were also characterized by an increase in mean hydrophobicity of the H-CDR3. Interestingly, the biases observed in the AChR-MG H-CDR3 appear to originate in the naive compartment, and are subsequently amplified in the IgG memory repertoire. In contrast, MuSK-MG repertoires showed no significant difference in H-CDR1 and H-CDR2 selection pressure or H-CDR3 properties compared with HDs. Furthermore, class-switched MuSK-MG repertoires exhibited the same characteristic shifts in H-CDR3 length, net change, and hydrophobicity compared with naive cells that have been demonstrated in previous studies of healthy repertoires (67, 68). Together these results show that the peripheral AChR-MG memory compartment exhibits reduced selection pressure on the VH sequence.
A primary mechanism by which self-reactive B cells are removed from the naive B cell population is through L chain receptor editing (60, 61), wherein a secondary V-J rearrangement may occur at either the original locus, the alternate chromosome, or through a κ to λ chain switch. Previous work has estimated that 20–50% of naive B cells have undergone receptor editing through λ chain replacement (61). In both SLE and RA, subjects have been shown to exhibit elevated levels of auto-reactive naive B cells, and these diseases have also shown biases in V to J gene proximity in the L chain genes used (61). When we examined the proximity of rearranged IGLV and IGLJ genes used, we found that both the naive and memory repertoire of MuSK-MG subjects used more proximal IGLV-IGLJ pairs. Although an increase in IGLV-IGLJ proximity is not a direct indicator of a deficiency in secondary receptor rearrangement, it is an explanation consistent with the observed tolerance defect in MuSK-MG (54).
We have shown that there are significant abnormalities in the memory compartments of both AChR-MG and MuSK-MG patients. However, these abnormalities are already apparent in the naive compartment of MG subjects—a pattern consistent with defective counter-selection during B cell development. Additionally, we have shown that MG memory compartment biases are not consistent between the two MG subtypes. These results suggest fundamental differences in the underlying mechanisms leading to AChR-MG and MuSK-MG autoimmunity, and may underlie the divergent disease characteristics, including IgG subclass usage and response to therapy, especially B cell depletion (9).
Therapeutics and MG B cell repertoire
Biologics have been increasingly used to treat autoimmune disease. In the treatment of MG, biologics, such as anti-CD20, which directly affect depletion of the B cell compartment, are promisingly effective in a subset of patients (9, 10, 77). However, recent studies demonstrate that B cell depletion therapy does not correct tolerance checkpoint defects, which persist after such treatment (78). CD20-mediated B cell depletion targets the naive and memory repertoire for deletion, but it leaves Ab (and presumably autoantibody) producing cells intact. Our study, which describes a deformed naive B cell repertoire, raises questions regarding the durability of such treatment and further suggests the potential need for chronic administration. Recent application of autologous stem cell transplantation has shown promise in MG treatment (79), although the treatment is restricted to a very small subgroup of patients. Nonetheless, it may be that immune ablation is required for lasting disease remission, given the abnormalities that are established in the naive B cell compartment.
Repertoire analysis, coupled with clinical diagnostics, may provide a useful insight into the immunological changes associated with a number of biologic treatment approaches, and may serve as a biomarker for tracking or predicting the response to particular therapeutic approaches. Both the results and approach of this study could be applied to first identify abnormalities in the repertoire that associate with clinical parameters, and then be used to monitor the repopulated B cell compartment along with its associated clinical presentation. In this regard, repertoire sequencing may provide a valuable benchmark for tracking changes in the B cell repertoire in patients with MG that could adjudicate retreatment decisions prior to the onset of potentially harmful clinical relapses.
Limitations and future directions
The number of total sequences in our study was large enough (>0.5 million) to allow conclusions to be drawn from the data; however, the limited set of AChR-MG and MuSK-MG subjects may restrict the generalizability of our findings. This study was specifically designed to determine whether broad repertoire abnormalities are commonly present in the disease. Extended interpretation meant to associate the repertoire with minor disease subclasses, disease severity/variability, and the influence of treatment was not built into the experimental design. Further studies with larger cohorts of clinically heterogeneous MG subjects, including longitudinally acquired samples, will be necessary to accurately associate the abnormal repertoire shape with clinical status, demographics, and the response to treatment.
This study was also not designed to directly associate the abnormalities in the repertoire with production of MG-specific autoantibodies, although all of the MG patients studied expressed such autoantibodies. The number of circulating autoantibody-producing cells present in MG or any autoantibody-mediated disease is likely to be small. This is highlighted by a very limited number of studies that describe the identification of B cells with particular specificity. An example includes a postvaccination study demonstrating that specific memory B cells represent as little as 0.01–0.11% (80) of the compartment. Furthermore, the specific B cell compartment in which these cells reside in MG patients is not completely understood. They may be long-lived plasma cells inhabiting the bone marrow or thymus (as shown in some cases of MG), circulating plasmablasts, or a combination of contributions from these compartments. These MG-specific sequences are thus likely to constitute a minute fraction of the sequencing data collected and analyzed in this study. Consequently, our data present a generalized broader distortion of the whole MG B cell repertoire, likely reflecting underlying immune dysregulation in AChR-MG and MuSK-MG. This leaves open the question as to how the distorted MG repertoire contributes to pathogenic autoantibody production. A number of approaches can be used to address this question, such as single-cell VH:VL pairing coupled with sorting of specific B cells via labeled Ag. However, these technologies still require considerable development to overcome throughput limitations and are accordingly outside the scope of this study. Approaches currently in development will likely enable high-throughput autoantibody characterization at the repertoire scale.
We restricted our investigative scope to consideration of only the major naive (CD20+CD19+CD27−) and memory (CD20+CD19+CD27+) B cell populations. We recognize that a more complex sorting strategy, which includes additional B cell subpopulations, may reveal further details about the nature and origin of MG B cell repertoire abnormalities. In particular, our results show many MG repertoire abnormalities are likely initiated in the naive compartment. As such, separation of the peripheral naive compartment into new emigrant (CD19+CD10+CD27−IgM+) and mature naive (CD19+CD10−CD27−IgM+) populations may reveal how AChR-MG and MuSK-MG abnormalities reflect central or peripheral tolerance checkpoints defects, and whether these defects occur at the same developmental stage in both disease subtypes. Similarly, investigation of the plasmablast (CD27+CD20−CD38+), double negative (CD27−IgM−) and unswitched memory populations may prove informative. However, the low abundance of these subpopulations likely limited their impact on the repertoire scale perturbations we observed in our data sets.
In conclusion, we have shown that AChR-MG and MuSK-MG patients are unable to generate or sustain the B cell repertoire found in healthy individuals. Despite the commonality of a deformed repertoire, AChR-MG and MuSK-MG each exhibit unique abnormalities, which implies fundamental differences in the underlying mechanisms leading to AChR-MG and MuSK-MG autoimmunity. These abnormalities likely originate in the naive compartment due, in part, to defective tolerance checkpoints, which are then subsequently transferred to the memory compartment.
Disclosures
The authors have no financial conflicts of interest.
Acknowledgments
We thank Dr. Kaya Bilguvar and Christopher Castaldi from the Yale Center for Genome Analysis for expert assistance with executing the sequencing on the next-generation platform and Dr. Aditya Kumar for technical expertise with the flow cytometry. The authors also thank the Yale Center for Research Computing (funded by National Institutes of Health Grants RR19895 and RR029676-01) for use of computing resources.
Footnotes
The study was directed by the coprincipal investigators, S.H.K. and K.C.O.; it was designed and initiated by J.A.V.H., S.H.K., F.V., and K.C.O.; the next-generation BCR sequencing was performed using an approach designed by F.V. and executed by T.J.G. and T.J.B. Experiments were performed and data were collected by J.A.V.H., P.S., T.J.G., and T.J.B.; the sequencing data were processed, analyzed, and interpreted by J.A.V.H., J.Q.Z., L.C., and S.H.K. using code authored by J.A.V.H., C.R.B., and S.H.K. The manuscript was written by J.A.V.H., S.H.K., and K.C.O. and edited through contributions by all of the authors. R.J.N., R.J.B., M.M.D., and E.C. recruited study-appropriate subjects, collected clinical specimens, and provided the interpreted clinical data.
This work was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health through a Grant to K.C.O., under award number R01AI114780, by the National Institutes of Health under award number R01AI104739 to S.H.K., by the National Library of Medicine under award number T15 LM07056 to J.A.V.H., and by the Gruber Science Fellowship awarded to J.Q.Z.
The sequences presented in this article have been submitted to the Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) under BioProject accession number PRJNA338795.
The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
The online version of this article contains supplemental material.
Abbreviations used in this article:
- AChR
- acetylcholine receptor
- CDR
- complementarity determining region
- GRAVY
- grand average of hydrophobicity
- H-CDR3
- H chain CDR3
- HD
- healthy donor
- IMGT
- International Immunogenetics Information System
- LRP4
- lipoprotein-related protein 4
- MG
- myasthenia gravis
- MRCA
- most recent common ancestor
- MuSK
- muscle specific kinase
- probability density function
- RDI
- Repertoire Dissimilarity Index
- SHM
- somatic hypermutation
- SLE
- systemic lupus erythematosus
- UMI
- unique molecular identifier.
- Received August 16, 2016.
- Accepted December 13, 2016.
- Copyright © 2017 by The American Association of Immunologists, Inc.