The Journal of Immunology

The Inference of Antigen Selection on Ig Genes

Izidore S. Lossos, Robert Tibshirani, Balasubramanian Narasimhan and Ronald Levy
J Immunol November 1, 2000, 165 (9) 5122-5126; DOI: https://doi.org/10.4049/jimmunol.165.9.5122
Izidore S. Lossos
*Department of Medicine, Division of Oncology, and Department of Health Research and Policy, and
Robert Tibshirani
†Department of Statistics, Stanford University Medical Center, Stanford, CA 94305
Balasubramanian Narasimhan
†Department of Statistics, Stanford University Medical Center, Stanford, CA 94305
Ronald Levy
*Department of Medicine, Division of Oncology, and Department of Health Research and Policy, and

Abstract

Analysis of somatic mutations in V regions of Ig genes is important for understanding various biological processes. It is customary to estimate Ag selection on Ig genes by assessing the excess of replacement (R) over silent (S) mutations in the complementarity-determining regions and of S over R mutations in the framework regions. In the past, such an evaluation was performed using a binomial distribution model, which is inappropriate for Ig genes, in which each mutation falls into one of four categories (R or S, in the complementarity-determining regions or in the framework regions). In the present work, we propose a multinomial distribution model for assessment of Ag selection. Side-by-side application of the multinomial and binomial models to 86 previously established Ig sequences disclosed 8 discrepancies leading to opposite statistical conclusions about Ag selection. We suggest the use of the multinomial model for all future analyses of Ag selection.

Functional Ig genes are created by an ordered process of gene rearrangement. The large diversity of the primary Ab repertoire, which is independent of prior exposure to Ag, is achieved by combinatorial permutation of the Ig heavy and light chain V, D, and J segments and by addition or deletion of short coding sequences at the VD and DJ joints in heavy chains and VJ joints in light chains. Following Ag encounter, the affinity of the Ab for the Ag increases during a process of affinity maturation (1). Affinity maturation results from a combination of somatic hypermutation of the rearranged V segments and Ag selection of mutants with improved binding properties. This process leads to preferential accumulation of replacement (R) as opposed to silent (S) mutations in the complementarity-determining regions (CDRs), which form the Ag binding sites. Concomitantly, S as opposed to R mutations tend to cluster in the framework regions (FRs), which are required to maintain structural integrity. Initial estimates of Ag selection assumed a random pattern of R and S mutations, localizing to each region in proportion to the relative sizes of the CDRs and FRs (2). Therefore, in the case of the Ig heavy chain V region, mutations would localize three times more frequently to the FRs than to the CDRs, and a CDR:FR ratio >0.3 would indicate Ag selection (3). Subsequently, Shlomchik et al. (4) proposed the use of a binomial distribution model for assessment of Ag selection. However, this method failed to consider the intrinsic properties of the CDR and the FR. The codon compositions of the CDR and the FR carry mutational biases, since the CDRs generally consist of codons that are more susceptible to R mutations than those in the FRs.

To account for the inherent susceptibility of the CDR and FR to R mutations, Chang and Casali (5) calculated the relative tendencies of the V regions of individual Ig germline genes to accumulate R mutations (Rf), and used these Rf values to estimate the expected frequency of R and S mutations in a particular CDR or FR for a given total number of mutations. They used the binomial distribution model proposed by Shlomchik et al. (4) to determine the probability that a particular number of R mutations occurred by chance. This model, by definition, is applicable to variables that have only two possible outcomes, whereas Ig mutations have four (R or S mutations in the CDR or the FR of the gene), thus requiring application of a multinomial distribution model (6). Moreover, the previous method did not account for all the ways of obtaining a result at least as extreme as the observed number of R mutations. Its equation consists of a single binomial probability, whereas the correct version should consist of a sum of probabilities over a range of outcomes that includes the observed value.

In the present work, we propose a new method for estimation of Ag selection pressure on Ig genes that corrects the pitfalls mentioned above and apply this method to previously published Ig gene sequences.

Materials and Methods

Multinomial distribution model for estimation of excess or scarcity of R mutations in the Ig gene

The probability that an excess or scarcity of R mutations in VH CDR or FR occurred by chance was calculated by a multinomial distribution model (6). The total number of mutations in each VH gene is denoted by n = r1 + s1 + r2 + s2, in which r1 and r2 are the numbers of R mutations in the FR and CDR, respectively, and s1 and s2 are the corresponding numbers of S mutations in the FR and CDR. The theoretical probabilities for r1, s1, r2, and s2 mutations are denoted by p1, q1, p2, and q2, respectively. These probabilities were calculated using the following equations: p1 = RfFR × LrFR; q1 = (1 − RfFR) × LrFR; p2 = RfCDR × (1 − LrFR); and q2 = (1 − RfCDR) × (1 − LrFR), in which LrFR is the relative size of the FR, and RfFR and RfCDR are the inherent susceptibilities to R mutations of the FRs and CDRs, respectively. RfFR and RfCDR were calculated for each of the identified human germline genes and were based on the chance, in each codon, of an amino acid replacement given any single nucleotide change not resulting in a termination codon.
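The four category probabilities follow directly from the three inputs, and they sum to 1 by construction. A minimal Python sketch is given here for illustration only; the function and class names are hypothetical, and the input values are made-up numbers rather than Rf values of any real germline gene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MutationProbabilities:
    p1: float  # probability that a mutation is an R mutation in the FR
    q1: float  # probability that a mutation is an S mutation in the FR
    p2: float  # probability that a mutation is an R mutation in the CDR
    q2: float  # probability that a mutation is an S mutation in the CDR


def mutation_probabilities(rf_fr: float, rf_cdr: float, lr_fr: float) -> MutationProbabilities:
    """Apply p1 = RfFR x LrFR, q1 = (1 - RfFR) x LrFR, and the CDR analogues."""
    for name, value in (("rf_fr", rf_fr), ("rf_cdr", rf_cdr), ("lr_fr", lr_fr)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    lr_cdr = 1.0 - lr_fr  # relative size of the CDR
    return MutationProbabilities(
        p1=rf_fr * lr_fr,
        q1=(1.0 - rf_fr) * lr_fr,
        p2=rf_cdr * lr_cdr,
        q2=(1.0 - rf_cdr) * lr_cdr,
    )


# Hypothetical inputs, chosen only to illustrate the arithmetic.
probs = mutation_probabilities(rf_fr=0.75, rf_cdr=0.80, lr_fr=0.75)
total = probs.p1 + probs.q1 + probs.p2 + probs.q2  # equals 1 up to rounding
print(probs, total)
```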

The probability of observing r1 or fewer R mutations in FRs is given by the multinomial tail probability:

$$
P(R_1 \le r_1) = \sum_{k=0}^{r_1} \; \sum_{S_1 + R_2 + S_2 = n - k} \frac{n!}{k!\, S_1!\, R_2!\, S_2!}\; p_1^{k}\, q_1^{S_1}\, p_2^{R_2}\, q_2^{S_2}
$$

The sum is taken over values of k ranging from 0 to r1 and all combinations of S1, R2, S2 such that k + S1 + R2 + S2 = n. To compute the P value of an observed number r1, it is customary to split the probability at r1: P = P(R1 < r1) + 0.5 × P(R1 = r1). It should be noted that P(R1 = r1) = P(R1 ≤ r1) − P(R1 ≤ r1 − 1), and P(R1 < r1) = P(R1 ≤ r1) − P(R1 = r1).
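As a purely numerical illustration of this split, suppose r1 = 3 with P(R1 ≤ 3) = 0.20 and P(R1 ≤ 2) = 0.08. Both cumulative values are hypothetical and are not taken from Table I.

```latex
\begin{aligned}
P(R_1 = 3) &= P(R_1 \le 3) - P(R_1 \le 2) = 0.20 - 0.08 = 0.12,\\
P(R_1 < 3) &= P(R_1 \le 3) - P(R_1 = 3) = 0.20 - 0.12 = 0.08,\\
P &= P(R_1 < 3) + 0.5 \times P(R_1 = 3) = 0.08 + 0.06 = 0.14.
\end{aligned}
```

Half of the probability of the observed value is therefore counted toward the tail, which places P between P(R1 < r1) and P(R1 ≤ r1).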

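The probability of observing r2 or more R mutations in CDRs is similarly computed using the following equation:

$$
P(R_2 \ge r_2) = \sum_{k=r_2}^{n} \; \sum_{R_1 + S_1 + S_2 = n - k} \frac{n!}{R_1!\, S_1!\, k!\, S_2!}\; p_1^{R_1}\, q_1^{S_1}\, p_2^{k}\, q_2^{S_2}
$$

The P value is computed using the formula: P = P(R2 > r2) + 0.5 × P(R2 = r2). For both FR and CDR, we used one-sided P values.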
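The two one-sided P values described above can be sketched in Python by enumerating every outcome (r1, s1, r2, s2) with total n and accumulating its multinomial probability. This is an illustrative brute-force sketch, not the authors' applet; the function names, the probabilities, and the mutation counts are all hypothetical.

```python
from math import factorial


def multinomial_pmf(counts, probs):
    """Probability of one exact outcome (r1, s1, r2, s2) under the multinomial model."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)  # each step divides exactly
    prob = float(coeff)
    for c, p in zip(counts, probs):
        prob *= p ** c
    return prob


def count_distribution(n, probs, index):
    """Distribution of the count in category `index`, by enumerating all outcomes with total n."""
    dist = [0.0] * (n + 1)
    for r1 in range(n + 1):
        for s1 in range(n - r1 + 1):
            for r2 in range(n - r1 - s1 + 1):
                counts = (r1, s1, r2, n - r1 - s1 - r2)
                dist[counts[index]] += multinomial_pmf(counts, probs)
    return dist


def fr_p_value(r1, n, probs):
    """Scarcity of R mutations in the FR: P(R1 < r1) + 0.5 * P(R1 = r1)."""
    dist = count_distribution(n, probs, index=0)
    return sum(dist[:r1]) + 0.5 * dist[r1]


def cdr_p_value(r2, n, probs):
    """Excess of R mutations in the CDR: P(R2 > r2) + 0.5 * P(R2 = r2)."""
    dist = count_distribution(n, probs, index=2)
    return sum(dist[r2 + 1:]) + 0.5 * dist[r2]


# Hypothetical probabilities (p1, q1, p2, q2) and hypothetical mutation counts.
probs = (0.5625, 0.1875, 0.20, 0.05)
r1, s1, r2, s2 = 4, 5, 5, 1
n = r1 + s1 + r2 + s2
p_fr = fr_p_value(r1, n, probs)
p_cdr = cdr_p_value(r2, n, probs)
print(f"FR scarcity P = {p_fr:.4f}, CDR excess P = {p_cdr:.4f}")
```

Full enumeration is practical only for the modest mutation counts typical of a single V gene, because the number of outcomes grows roughly with the cube of n.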

There is an approximate method for computing P values for this problem. We calculate the expected number of R mutations in the FR: E = p1 × n. Then we compute the standardized deviation:

$$
Z = \frac{r_1 - E}{\sqrt{E}}
$$

Under the usual Poisson model, this quantity should have approximately a standard normal distribution and can be compared with a normal table. However, we found that this approximation can be quite poor, and hence do not recommend it.
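For readers who want to see what the approximation involves, a minimal Python sketch follows. It assumes that the standardized deviation is (r1 − E)/√E, which is what a Poisson model with mean and variance E implies; the function names and the input values are hypothetical.

```python
from math import erfc, sqrt


def poisson_z(r1: int, n: int, p1: float) -> float:
    """Standardized deviation of the observed FR R count, assuming variance equals the mean E."""
    expected = p1 * n
    if expected <= 0.0:
        raise ValueError("expected count must be positive")
    return (r1 - expected) / sqrt(expected)


def lower_tail_normal(z: float) -> float:
    """P(Z <= z) for a standard normal variable, via the complementary error function."""
    return 0.5 * erfc(-z / sqrt(2.0))


# Hypothetical inputs: 4 R mutations in the FR among 15 mutations, with p1 = 0.5625.
z = poisson_z(r1=4, n=15, p1=0.5625)
approx_p = lower_tail_normal(z)
print(f"z = {z:.3f}, approximate one-sided P = {approx_p:.4f}")
```

Comparing this value with an exact multinomial P value for the same inputs is a simple way to judge how far the approximation deviates in a given case.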

Assessment of the equation

To assess the applicability of this equation and to compare its results with those previously reported using the Chang and Casali equation (5), we evaluated Ag selection pressure on 7 autoantibodies evaluated by Chang and Casali (5), 24 autoantibodies and lymphoma-derived VH genes randomly selected from previously published articles (7, 8, 9, 10, 11, 12), and 55 VH gene sequences derived from diffuse large B cell lymphoma cases established in our laboratory (13). For this comparison, we used the RfFR and RfCDR values implied in these articles. Recalculation of these values resulted in slightly different Rf values for some of the VH genes. The discrepancies most probably result from the use of slightly different germline sequences before the final sequence of the VH gene locus was established and from the use of sequences in which there are polymorphic variations. For future calculation of the Rf values, we suggest using our JAVA applet, available at http://www-stat.stanford.edu/imuunoglobin, which calculates the Rf values for imported germline sequences.
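To make the Rf calculation concrete, the following Python sketch counts, for each codon of an in-frame region, the single-nucleotide changes that give an amino acid replacement among those that do not create a termination codon. It is not the applet's code. It assumes the standard genetic code, equal weight for every nucleotide substitution, and pooling of counts across codons (averaging per-codon fractions would be an alternative reading), and the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks termination codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_changes(codon: str) -> tuple[int, int]:
    """Return (replacement, silent) counts over single-nucleotide changes that avoid a stop codon."""
    original = CODON_TABLE[codon]
    replacement = silent = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
            if mutant == "*":
                continue  # changes resulting in a termination codon are excluded
            if mutant == original:
                silent += 1
            else:
                replacement += 1
    return replacement, silent


def region_rf(sequence: str) -> float:
    """Pooled Rf for an in-frame region: replacement changes / all non-stop changes."""
    sequence = sequence.upper()
    if not sequence or len(sequence) % 3:
        raise ValueError("sequence must be non-empty and a multiple of 3 nucleotides long")
    replacement = silent = 0
    for i in range(0, len(sequence), 3):
        r, s = codon_changes(sequence[i:i + 3])
        replacement += r
        silent += s
    return replacement / (replacement + silent)


# Toy two-codon region (Leu-Gly), not a real germline sequence.
print(region_rf("CTTGGG"))
```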

Results

A total of 86 VH gene sequences were analyzed using the multinomial distribution model equation for the presence of Ag selection as demonstrated by the conservation of the FR sequence and/or excess of R mutations in CDR (Table I). These results were compared with the results obtained by the binomial distribution equation suggested by Chang and Casali (5). A total of eight discrepancies leading to opposite statistical conclusions were observed. These included six VH gene sequences in which an excess of R mutations in the CDR (five sequences) or a scarcity of R mutations in the FR (one sequence) was suggested by the binomial distribution equation but rejected by the multinomial distribution model equation. In another two VH gene sequences, evidence for scarcity of R mutations in the FR was obtained using the multinomial but not the binomial distribution model equation. In the majority of the remaining VH gene sequences, the P values obtained using the two equations differed in magnitude but did not lead to discrepant statistical conclusions. The similarity cannot be explained mathematically, but is quite fortuitous. There is no guarantee in general that the binomial formula will give a good approximation. The multinomial distribution model equation suggested an excess of R mutations in the CDR and/or a scarcity of R mutations in the FR in 13 of the 14 VH gene sequences derived from high-affinity autoantibodies. By contrast, the binomial distribution suggested Ag selection in only 11 of these sequences. One of the autoantibody sequences did not fulfill the statistical criteria for Ag selection by either of the analytical models.

Table I.

Comparison of multinomial and binomial distribution models for estimation of Ag selection on human Ig genes

Table 1A.

Continued.

Table 1B.

Continued.

Discussion

Analysis of somatic mutations in V regions of Ig genes is important for studying the evolution of the Ab response, for assessment of the molecular features of autoantibodies, and for the investigation of lymphoma pathogenesis. Analysis of mutations in VH genes can provide insights regarding the role of Ag before or during lymphoma clonal outgrowth. In the absence of Ag-positive or -negative selective pressure on Ig V regions, a random mutational process would result in an even distribution of R and S mutations throughout the coding sequence. However, Ag-selected Abs demonstrate a higher frequency of R mutations in CDRs than in FRs, whereas preservation of a functional Ig molecule is associated with a higher frequency of S mutations and a scarcity of R mutations in FRs. In the past, such an evaluation was performed using the binomial distribution model equation, as suggested by Shlomchik et al. (4) and further modified by Chang and Casali (5). The necessity for such an evaluation and its wide usage are demonstrated by the fact that the Chang and Casali article was cited 130 times (Web of Science). However, their formula is incorrect, and application of the binomial distribution model to variables (mutations) that have more than two possible outcomes is incorrect. It should be viewed as the application of an improper statistical method for the data analysis, similar, for example, to the use of parametric statistical methods for the analysis of nonparametric variables. Moreover, Chang and Casali (5) considered in their equation a single binomial probability, whereas a correct statistical analysis should consist of a sum of probabilities over a range of outcomes that includes the observed value, as is proposed in the new method presented herein. Consequently, incorrect biological conclusions may have been reached, as indeed happened in eight tested VH gene sequences (Table I).
Fortuitously, the magnitude of the observed difference between the two statistical methods in the present study was relatively small. However, we would argue that a proper statistical method requires the application of a multinomial distribution equation for all future estimates of Ag selection.

In the present work, we propose a new statistical method for estimation of Ag selection. It corrects the pitfalls present in the previous method while still taking into account the inherent susceptibility of the codons of the CDR and the FR to R mutations. One consideration not addressed here is the known propensity of certain positions, termed hot spots, to mutate (14). Our equation assumes that mutations in VH genes occur randomly, thus disregarding the possible contribution from intrinsic biases in the hypermutation mechanisms due to the presence of mutational hot spots. Since the hot spots are located in CDRs but not in FRs, the assumption that mutations in FRs occur randomly is correct. Regarding the CDRs, to consider mutational hot spots, one would need to know all the hot spots in each VH gene sequence and their relative propensity to undergo mutations in comparison to each remaining non-hot spot codon in the sequence. Consideration of these hot spots may require a custom equation for each VH gene sequence, thus precluding its wide applicability. Until the data required to construct a custom equation for each germline VH gene sequence exist, our model can provide a good approximation of the Ag selection on Ig genes.

In conclusion, we suggest the use of the multinomial model for all future analyses of Ag selection. Investigators should compare the tested Ig gene sequence to the most similar germline sequence, with particular attention to the presence of known polymorphic variants. The JAVA applet for computing the multinomial P values and the Rf values of CDRs and FRs is available at http://www-stat.stanford.edu/immunoglobin. Use of this applet will allow uniform analysis of Ig sequences and prevent errors that may occur while calculating the Rf values.

Footnotes

  • 1 This study was supported by Grants CA33399 and CA34233 from the U.S. Public Health Service-National Institutes of Health. I.S.L. is a Harold Dobbs Oncology Fellow. R.L. is an American Cancer Society Clinical Research Professor.

  • 2 Address correspondence and reprint requests to Dr. Ronald Levy, Division of Oncology CCSR 1126, Stanford University School of Medicine, Stanford, CA 94305-5306. E-mail address: levy{at}leland.stanford.edu

  • 3 Abbreviations used in this paper: R, replacement; S, silent; CDR, complementarity-determining region; FR, framework region.

  • Received December 16, 1999.
  • Accepted August 9, 2000.
  • Copyright © 2000 by The American Association of Immunologists

References

  1. Rajewsky, K. 1996. Clonal selection and learning in the antibody system. Nature 381: 751.
  2. Jukes, T. H., J. L. King. 1979. Evolutionary nucleotide replacements in DNA. Nature 281: 605.
  3. Davi, F., K. Maloum, A. Michel, O. Pritsch, C. Magnac, E. Macintyre, F. Salomon-Nguyen, J. L. Binet, G. Dighiero, H. Merle-Beral. 1996. High frequency of somatic mutations in the VH genes expressed in prolymphocytic leukemia. Blood 88: 3953.
  4. Shlomchik, M. J., A. H. Aucoin, D. S. Pisetsky, M. G. Weigert. 1987. Structure and function of anti-DNA antibodies derived from a single autoimmune mouse. Proc. Natl. Acad. Sci. USA 84: 9150.
  5. Chang, B., P. Casali. 1994. The CDR1 sequences of a major proportion of human germline Ig VH genes are inherently susceptible to amino acid replacement. Immunol. Today 15: 367.
  6. Hogg, R., A. Craig. 1978. Introduction to Mathematical Statistics. Macmillan, New York.
  7. Thiede, C., B. Alpen, A. Morgner, M. Schmidt, M. Ritter, G. Ehninger, M. Stolte, E. Bayerdorffer, A. Neubauer. 1998. Ongoing somatic mutations and clonal expansions after cure of Helicobacter pylori infection in gastric mucosa-associated lymphoid tissue B-cell lymphoma [Published erratum appears in 1999 J. Clin. Oncol. 17: 1092]. J. Clin. Oncol. 16: 3822.
  8. Aarts, W. M., R. Willemze, R. J. Bende, C. J. Meijer, S. T. Pals, C. J. van Noesel. 1998. VH gene analysis of primary cutaneous B-cell lymphomas: evidence for ongoing somatic hypermutation and isotype switching. Blood 92: 3857.
  9. Krenn, V., A. Konig, F. Hensel, C. Berek, M. M. Souto Carneiro, W. Haedicke, Y. Wang, H. Vollmers, H. K. Muller-Hermelink. 1999. Molecular analysis of rheumatoid factor (RF)-negative B cell hybridomas from rheumatoid synovial tissue: evidence for an antigen-induced stimulation with selection of high mutated IgVH and low mutated IgVL/λ genes. Clin. Exp. Immunol. 115: 168.
  10. Mockridge, C. I., C. J. Chapman, M. B. Spellerberg, B. Sheth, T. P. Fleming, D. A. Isenberg, F. K. Stevenson. 1998. Sequence analysis of V(4–34)-encoded antibodies from single B cells of two patients with systemic lupus erythematosus (SLE). Clin. Exp. Immunol. 114: 129.
  11. Ravirajan, C. T., M. A. Rahman, L. Papadaki, M. H. Griffiths, J. Kalsi, A. C. Martin, M. R. Ehrenstein, D. S. Latchman, D. A. Isenberg. 1998. Genetic, structural and functional properties of an IgG DNA-binding monoclonal antibody from a lupus patient with nephritis. Eur. J. Immunol. 28: 339.
  12. Taniguchi, M., K. Oka, A. Hiasa, M. Yamaguchi, T. Ohno, K. Kita, H. Shiku. 1998. De novo CD5+ diffuse large B-cell lymphomas express VH genes with somatic mutation. Blood 91: 1145.
  13. Lossos, I., C. Okada, R. Tibshirani, R. Warnke, J. Vose, T. Greiner, R. Levy. 2000. Molecular analysis of immunoglobulin genes in diffuse large B cell lymphomas. Blood 95: 1797.
  14. Betz, A. G., M. S. Neuberger, C. Milstein. 1993. Discriminating intrinsic and antigen-selected mutational hotspots in immunoglobulin V genes. Immunol. Today 14: 405.