The Journal of Immunology, 2001, 166: 892-899.
Copyright © 2001 by The American Association of Immunologists

The Targeting of Somatic Hypermutation Closely Resembles That of Meiotic Mutation1

Mihaela Oprea*, Lindsay G. Cowell† and Thomas B. Kepler2,‡

* Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM 87545; † Department of Immunology, Duke University Medical Center, Durham, NC 27710; and ‡ The Santa Fe Institute, Santa Fe, NM 87501


    Abstract
 
We have compared the microsequence specificity of mutations introduced during somatic hypermutation (SH) and those introduced meiotically during neutral evolution. We have minimized the effects of selection by studying nonproductive (hence unselected) Ig V region genes for somatic mutations and processed pseudogenes for meiotic mutations. We find that the two sets of patterns are very similar: the mutabilities of nucleotide triplets are positively correlated between the somatic and meiotic sets. The major differences that do exist fall into three distinct categories: 1) The mutability is sharply higher at CG dinucleotides under meiotic but not somatic mutation. 2) The complementary triplets AGC and GCT are much more mutable under somatic than under meiotic mutation. 3) Triplets of the form WAN (W = T or A) are uniformly more mutable under somatic than under meiotic mutation. Nevertheless, the relative mutabilities both within this set and within the SAN (S = G or C) triplets are highly correlated with those under meiotic mutation. We also find that the somatic triplet specificity is strongly symmetric under strand exchange for A/T triplets as well as for G/C triplets in spite of the strong predominance of A over T mutations. Thus, we suggest that somatic mutation has at least two distinct components: one that specifically targets AGC/GCT triplets and another that acts as true catalysis of meiotic mutation.


    Introduction
 
During the first 2 wk of infection, the primary Ig repertoire is diversified by a hypermutation mechanism that introduces mutations at a rate ~6 orders of magnitude above background (1). Although some properties of somatic hypermutation (SH)3 have been well characterized, the mechanism by which Ig DNA is modified remains unknown and the molecules involved unidentified. Many different models have been proposed, including those involving gene conversion (2), reverse transcription (3), asymmetric error-prone replication (4, 5), error-prone repair (6), transcription-coupled repair (7), and strand-break repair (8) but none has yet proven convincing. Recent attempts to implicate specific gene products known to be involved in DNA metabolism using knockout mice have produced largely negative results (9, 10, 11, 12, 13) or have shown small effects (14, 15, 16). Similarly, studies involving human patients with identified DNA metabolism deficiencies (17, 18, 19) had negative results (for review, see 20).

Examination of the mutations introduced during SH has led to the formulation of complicated models involving multiple targeting mechanisms, including different mutators for A-T and G-C bp and multiple stages of processing (8, 16, 21, 22, 23, 24, 25, 26).

It has been recognized that SH exhibits microsequence dependence in both its targeting (27) and spectra (25). Similar microsequence dependence of mutation frequency and spectra has been shown to occur during neutral evolution (28, 29). The purpose of the present study was to investigate the relationships between the mechanisms underlying the accumulation of mutations during germline evolution and those accumulated during SH by comparing the characteristics of mutation targeting and spectra under meiotic mutation and under SH. A previous study (30) found differences in the T:A to C:G transition frequency and in the mutability4 of G between SH and meiotic processes and thus concluded that the mechanism introducing somatic mutations is different from that responsible for germline evolution. We have shown previously that the spectra of SH and meiotic mutation are different (25). We are here undertaking a more comprehensive study that might reveal similarities undetectable in previous studies and further characterize the differences.


    Materials and Methods
 
DNA sequence data: SH

We collected a data set comprising 1721 mutations accumulated in nonfunctionally rearranged human Ig genes, murine 3' Ig V-flanking region DNA, and murine J–C intron DNA (31, 32, 33, 34, 35, 36, 37, 38). In all cases, the germline sequence is known; mutations were identified by comparison of each sequence with its corresponding germline sequence. Insertions and deletions were not treated in our analysis. Further details regarding this sequence collection can be found elsewhere (25, 31).

DNA sequence data: meiotic mutations

We collected a set of processed human pseudogenes by searching GenBank, release 111.0. Processed pseudogenes result from reverse transcription of mRNA from functional genes and the integration of the reverse-transcribed DNA into new chromosomal positions. These pseudogenes are usually integrated far from the parent gene and are therefore not transcribed and do not participate in gene conversion events (28, 39, 40). We then used a locally built version of the BLASTALL algorithm from the National Center for Biotechnology Information to search the primate DNA database for sequences with homology to the processed pseudogenes. Only the pseudogenes for which the functional ortholog was unambiguously identified were kept for further analysis. When multiple pseudogenes of the same gene were available, we used only one in the analysis. We searched GenBank (using the BLAST program) for an ortholog of each gene in a species other than Homo sapiens. The accession numbers of the genes in the final data set are given in Table I. Each group of two functional genes and a processed pseudogene was subjected to sequence alignment using the ClustalW program (http://www2.ebi.ac.uk/clustalw). From the obtained alignments, we inferred the state in the ancestor of the human gene and processed pseudogene at each nucleotide position according to the following rules (41): wherever the two human sequences (functional gene and processed pseudogene) agreed, we assumed that they carry the ancestral state; where they did not agree, we turned to the second ortholog. If this ortholog agreed with either of the human sequences, the ancestral state was assumed to be the one carried by two of the three genes. If the nucleotide was different in all three genes, we declared the ancestral state ambiguous and excluded that nucleotide position from the analysis. We also discarded positions where an insertion or deletion was identified in any of the three genes.
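The two-out-of-three rule described above can be illustrated with a minimal Python sketch. The function names, the use of "-" as the gap character, and the representation of the alignment as three equal-length strings are assumptions made for illustration; they are not taken from the original analysis.

```python
from typing import List, Optional

GAP = "-"  # assumed gap symbol in the aligned sequences


def infer_ancestral_state(human_gene: str, pseudogene: str, outgroup: str) -> Optional[str]:
    """Return the inferred ancestral base for one alignment column, or None to exclude it."""
    # Columns with an insertion or deletion in any of the three sequences are discarded.
    if GAP in (human_gene, pseudogene, outgroup):
        return None
    # Where the two human sequences agree, they are taken to carry the ancestral state.
    if human_gene == pseudogene:
        return human_gene
    # Otherwise the second ortholog decides (state carried by two of the three).
    if outgroup in (human_gene, pseudogene):
        return outgroup
    # All three differ: the ancestral state is ambiguous and the position is excluded.
    return None


def infer_ancestor(human_gene: str, pseudogene: str, outgroup: str) -> List[Optional[str]]:
    """Apply the column rule along three aligned, equal-length sequences."""
    return [infer_ancestral_state(h, p, o)
            for h, p, o in zip(human_gene, pseudogene, outgroup)]
```

Excluded positions are returned as None so that downstream counting can skip any triplet that overlaps them.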


Table I. Pseudogenes and related orthologs used in this study1

 
Having identified the ancestral state, we then traversed the alignment and counted the number of occurrences of each of the 64 nucleotide triplets in the ancestral gene, as well as the number of instances in which the pseudogene carried a mutation at the central nucleotide of a triplet.
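As a rough illustration of this counting step, the following Python sketch tallies ancestral triplets and mutations at their central nucleotide. It assumes the ancestor is given as a sequence of bases with None at excluded positions and that the pseudogene is aligned to it position by position; both conventions are hypothetical.

```python
from collections import Counter


def count_triplets(ancestor, pseudogene):
    """Count ancestral triplets and those whose central base differs in the pseudogene.

    ancestor   -- sequence of bases, with None marking excluded positions (assumed convention)
    pseudogene -- aligned pseudogene sequence of the same length
    """
    totals, mutated = Counter(), Counter()
    for i in range(1, len(ancestor) - 1):
        window = tuple(ancestor[i - 1:i + 2])
        if None in window:
            continue  # skip triplets that overlap an excluded position
        triplet = "".join(window)
        totals[triplet] += 1
        if pseudogene[i] != ancestor[i]:
            mutated[triplet] += 1  # mutation at the central nucleotide
    return totals, mutated
```

For example, count_triplets(list("AAGCT"), "AATCT") records one occurrence each of AAG, AGC, and GCT, with a single mutation assigned to AGC.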

A given number of mutations in a triplet in a given pseudogene is the result of its intrinsic propensity to mutate as well as the divergence time between the gene and the pseudogene. A pseudogene may have a high mutation count because it contains highly mutable triplets or because it is very old. To account for these factors, we determined the relative age of the genes and adjusted the total triplet count in each pseudogene by the relative age of the pseudogene (see below).

The pooled mutation and adjusted total counts were used in the study of strand symmetry of the mutational mechanism and of the potential relation between triplet targeting in somatic and meiotic mutation. There were 2,261 mutations in 53,479 triplets.

Statistical models and methods

Our analyses are based on models for the acquisition of mutations in which the mutability of a given nucleotide depends on the microsequence motif that contains it. We consider two motif sizes: singlets and triplets. Models based on singlets account only for the identity of the target nucleotide itself, i.e., whether it is A, G, C, or T. Models based on triplets account for the identity of the target nucleotide and its immediate neighbors. In other words, we consider the mutability of XYZ where the target nucleotide Y is flanked by nucleotide X (5') and nucleotide Z (3').

Every nucleotide in the database is characterized by three factors: the type of mutation to which it has been exposed (somatic or meiotic), the sequence in which it is located, and the motif in which it is found. Each nucleotide, therefore, has probability p_ijk of being mutated, where the indices i, j, and k identify the mutational set, sequence number within the set, and motif, respectively. This probability is modeled as:

(1)
where θ_ij is the effective time of exposure to mutation, or age, of the jth sequence in the ith set and µ_ik is the mutability of the kth motif under the mutational process i. Although the times θ are not of interest to us, it is necessary to include them in the model for consistent comparison among sequences from different sources and for consistent pooling of data from diverse sources. We denote the total nucleotide count in class (i, j, k) by n_ijk and the number of mutations among those by m_ijk. Our analyses are based on the likelihood model given by

(2)
Specific hypotheses within the context of this model are expressed as constraints on the mutabilities µ. For example, the hypothesis that the mutabilities under meiotic and somatic processes are the same is expressed as µ_1k = µ_2k. The parameters θ and µ were estimated by maximizing the log likelihood, Eq. 1, subject to the constraints for the hypothesis under consideration and to the identifiability constraint on θ: Σ_jk θ_ij n_ijk = Σ_jk n_ijk, for both i. This constraint ensures that the mean "time of exposure" is normalized between sets.
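The displayed equations are not reproduced in this copy, so the following Python sketch should be read only as one plausible instance of such a model. It assumes a product form for the mutation probability, p = θ_j µ_k, and a binomial likelihood for the counts within a single mutational set; both are assumptions for illustration, not the confirmed form of Eqs. 1 and 2. The helper normalize_theta applies the identifiability constraint stated in the text.

```python
import math


def log_binom_coeff(n, m):
    """Logarithm of the binomial coefficient C(n, m)."""
    return math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)


def log_likelihood(n, m, theta, mu):
    """Binomial log likelihood for one mutational set.

    n[j][k], m[j][k] -- total and mutated counts for sequence j and motif k
    theta[j], mu[k]  -- exposure time and motif mutability (requires 0 < theta*mu < 1)
    """
    total = 0.0
    for j, theta_j in enumerate(theta):
        for k, mu_k in enumerate(mu):
            if n[j][k] == 0:
                continue
            p = theta_j * mu_k  # ASSUMED product form; the source equation is not shown
            total += (log_binom_coeff(n[j][k], m[j][k])
                      + m[j][k] * math.log(p)
                      + (n[j][k] - m[j][k]) * math.log1p(-p))
    return total


def normalize_theta(n, theta):
    """Rescale theta so that sum_jk theta_j * n_jk equals sum_jk n_jk."""
    weighted = sum(theta[j] * sum(n[j]) for j in range(len(theta)))
    total = sum(sum(row) for row in n)
    return [t * total / weighted for t in theta]
```

Under the assumed product form, rescaling θ by a constant must be matched by the inverse rescaling of µ to leave the probabilities unchanged, which is why a normalization of this kind is needed for the parameters to be identifiable.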

Analyses using contingency tables or correlation tests (where counts over all sequences in a set are needed) were performed using pooled counts derived from the likelihood model and adjusted as follows. The total counts (mutated plus unmutated) for each motif, denoted ñ_i·k, are adjusted for consistent estimation: ñ_i·k = Σ_j θ̂_ij n_ijk, where θ̂_ij is the maximum likelihood estimate for the effective time of exposure, θ_ij.
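The adjustment of pooled counts is a weighted sum over sequences and can be sketched in a few lines of Python. The nested-list layout n[j][k] and the function name are illustrative assumptions.

```python
def adjusted_totals(n, theta_hat):
    """Pooled, age-adjusted total count per motif within one mutational set.

    n[j][k]      -- total count of motif k in sequence j
    theta_hat[j] -- estimated effective exposure time of sequence j
    """
    n_motifs = len(n[0])
    return [sum(theta_hat[j] * n[j][k] for j in range(len(n)))
            for k in range(n_motifs)]
```

With n = [[100, 50], [200, 100]] and exposure estimates [0.5, 1.25], the adjusted totals are 300 and 150, so the grand total of 450 is preserved, as the normalization constraint on the exposure times requires.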

We applied correlation tests designed to infer the correlation coefficient among the binomial parameters (proportions or probabilities) that underlie our count data. The data themselves also have binomial sampling variability, which is not correlated. Therefore, the task is somewhat more complicated than an ordinary (Pearson) correlation test, which, in addition, assumes normality and equality of variances. We have used two types of estimators: those that are designed to diminish the bias induced by the presence of binomial sampling by accounting for the excess variance and those that do not make this correction. The results of hypothesis testing, where the null hypothesis is that the correlation coefficient is zero, do not depend on this choice, but the numerical value of the estimated correlation coefficient does. All estimators use the fact that the triplets with greater total counts provide more reliable estimates of the underlying binomial parameter and must be weighted more heavily than those with few total counts. See Appendix for the formula defining the estimators.

We carried out the hypothesis testing on these estimators by randomly permuting the triplet labels on one of the sets in the paired data and reporting (as p) the quantile of the real estimated correlation coefficient among the estimators obtained using the permuted data.
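The permutation procedure can be sketched as follows. The estimator used here is a plain count-weighted Pearson correlation with one weight per triplet pair, which is a stand-in chosen for illustration and not the estimator defined in the Appendix; the function names, the one-sided p value, and the fixed random seed are likewise assumptions.

```python
import random


def weighted_correlation(x, y, w):
    """Weighted Pearson correlation (stand-in estimator, not the Appendix formula)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / (vx * vy) ** 0.5


def permutation_p_value(x, y, w, n_perm=10000, seed=0):
    """Permute the triplet labels of one set and locate the observed r among the results."""
    rng = random.Random(seed)
    observed = weighted_correlation(x, y, w)
    y_perm = list(y)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # random relabeling of the triplets in the second set
        if weighted_correlation(x, y_perm, w) >= observed:
            at_least += 1
    # One-sided p value, with the observed arrangement counted once.
    return observed, (at_least + 1) / (n_perm + 1)
```

Here x and y would hold the estimated mutabilities of the paired triplets in the two sets, and w a weight per pair that grows with the total counts, so that well-sampled triplets dominate the estimate.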


    Results
 
Mutation targeting: complementation symmetry

To investigate the presence of strand bias in the mechanisms responsible for introducing mutations, we compared the mutabilities of motifs with those of their complements. The first-order model, in which mutability depends on the identity of the base itself but not on its neighbors, shows that the somatic set is highly asymmetric, with mutability at A almost twice that at T (Table II). The G:C ratio is not nearly as high as that for A:T but is also significantly different from 1. The meiotic set does not show any evidence of complementation asymmetry. This result holds even when we exclude from the computation the sites that span CG dinucleotides.


Table II. Mutability ratio of complementary nucleotides

 
For the model in which both neighbors influence the mutability, we performed correlation tests comparing the mutability of a given triplet with that of its complement. Triplets were classified according to their central bases and analyzed separately to remove the clear asymmetry of the single-nucleotide rates.
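The pairing of a triplet with its complement can be made concrete with a short Python sketch. The complement of a triplet is taken to be its reverse complement, the same site read 5' to 3' on the opposite strand, which is consistent with the pairing of AGC with GCT used throughout; the function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_triplet(triplet):
    """Reverse complement: the same site read 5'->3' on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(triplet))


def complement_pairs(center):
    """All 16 triplets with the given central base, each paired with its complement."""
    bases = "ACGT"
    return [(x + center + z, complement_triplet(x + center + z))
            for x in bases for z in bases]
```

Calling complement_pairs("A") yields the 16 XAZ triplets paired with triplets that have T at the center, which is the grouping used when triplets are classified by central base.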

We find that the correlations between triplets and their complements are extremely high under SH (Fig. 1) but not meiotic mutation. Tests of the correlation coefficient bear this out (Table III). Note, however, that if we include triplets spanning CG dinucleotides in the calculation of correlation coefficients for the germline set, we obtain a significant correlation for this set as well. We obtain similar results when we account for the binomial variance, although the values of the correlation coefficients are (as expected) higher: r = 0.83 (p < 10^-4) for the somatic set with AGC/GCT excluded, and r = 0.74 (p = 0.12) for the meiotic set with CG dinucleotide-containing triplets excluded. The correlation becomes significant for the meiotic set as well if we include these motifs.



FIGURE 1. Mutability scatter plots: comparison of the estimated mutability under somatic hypermutation between nucleotide triplets (XYZ) and their complements. The dashed line is the principal line. A, Triplets of the form XAZ, with A mutating compared to their complements with T mutating. For visual clarity, points are labeled with the XAZ member of the pair only. B, Triplets of the form XGZ, with G mutating compared to their complements with C mutating. The correlation apparent here was tested using the method described in Materials and Methods (see Table III).

 

Table III. Symmetry of microsequence specificity of mutation targeting: linear correlation coefficients for triplet-complement mutability pairs

 
Mutation targeting: meiotic/somatic comparison

To compare the microsequence mutability patterns in meiotic and somatic processes, we computed the log-likelihood differences between two models: one is the fully parameterized model in which the mutability for each triplet in each of the two sets is separately estimated, for a total of 128 mutability parameters (plus age parameters; see Materials and Methods). In the second model, all triplet mutabilities are assumed to be identical between the somatic and meiotic sets. The age parameters are still assumed independent and take up any differences in overall mutation rate.

Each nucleotide triplet contributes a term to the log-likelihood difference; the larger the term, the more poorly the assumption of equality between somatic and meiotic data sets accommodates that triplet (Fig. 2). We find that almost three-fourths of the log-likelihood difference is due to the following triplets (or motifs): triplets containing CG dinucleotides, AGC, and its complement GCT, and triplets of the form WAN, where W is T or A, N is any nucleotide. We estimated the contributions of each of these classes by amending the model to recognize the appropriate number of triplet classes. For example, to estimate the contribution of CG dinucleotides, the amended model recognizes two classes of triplets: those containing CG dinucleotides and those that do not. All of the triplets within a class are constrained to have the same ratio of somatic mutability to meiotic mutability. Each of the above classes therefore uses 1 df. The increase in log likelihood produced by the serial inclusion of each of these classes is: NCG/CGN, 115.5; AGC/GCT, 40.8; WAN, 49.7, out of a total likelihood difference (largest minus smallest model) of 291.3 (63 df). In sum, these 3 df (of 63) account for 206 of the total 291.3 log-likelihood difference.
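The class definitions and the arithmetic of the reported log-likelihood gains can be checked with a short Python sketch. The classification function and its class labels are illustrative; the numerical values are those quoted in the text.

```python
def triplet_class(triplet):
    """Assign a triplet to one of the classes named in the text (labels are illustrative)."""
    if "CG" in triplet:
        return "NCG/CGN"          # triplet contains a CG dinucleotide
    if triplet in ("AGC", "GCT"):
        return "AGC/GCT"
    if triplet[1] == "A" and triplet[0] in "AT":
        return "WAN"              # W = A or T, central A, any 3' neighbor
    return "other"


# Serial log-likelihood gains reported in the text, 1 df each.
gains = {"NCG/CGN": 115.5, "AGC/GCT": 40.8, "WAN": 49.7}
total_difference = 291.3          # largest minus smallest model, 63 df

explained = sum(gains.values())            # about 206.0
fraction = explained / total_difference    # about 0.71
```

The three classes cover 8, 2, and 8 of the 64 triplets, respectively, and together account for roughly 71% of the total log-likelihood difference.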



FIGURE 2. Strength of deviation from equality of triplet mutability between somatic and meiotic sets. Triplets with A/T mutating are shown in A; triplets with G/C mutating are shown in B. Contributions to the log-likelihood difference from individual triplet motifs are drawn upward when the mutability is higher in the somatically mutated set and downward when the mutability is higher in the meiotically mutated set.

 
Scatterplot comparisons of triplet mutabilities between somatic and meiotic data sets largely corroborate these results and provide additional insights; the correlation between somatic and meiotic mutabilities stands out quite clearly (Fig. 3). For central nucleotide A, when triplets are grouped as above with WAN and SAN (S = G or C), the within-groups correlation stands out strongly. The observed patterns are further confirmed by computation of the linear correlation coefficients (Table IV). These were performed both for the complete data sets and as modified by the above considerations to remove the effects of those triplets that are clearly involved in processes unique to one set or the other and without taking into account binomial sampling variance. If we account for the binomial sampling, the estimated correlation coefficients become higher, but the p values are similar: r = 0.73 (p = 0.0004) and r = 0.55 (p = 0.01), depending on whether we do or we do not divide the triplets with A as the central nucleotide into WAN and SAN classes. Inspection of Fig. 3 also reveals that, consistent with our findings of complementation symmetry, the triplets NTW, complementary to WAN, also have mutability higher than the triplets NTS. The effect is not as marked for T as it is for A, but this may be due to the smaller number of mutations at T.



FIGURE 3. Mutability scatter plots: comparison of the estimated mutability between triplets in the somatically mutated data set and those in the meiotically mutated data set. A, Triplets with A mutating; triplets with G, T, and C mutating are shown in B, C, and D, respectively. The solid line in each panel is the principal line. In A, the dashed lines correspond to principal lines constructed independently for two groups of triplets: those of the form WAN and those of the form SAN (see text). Correlation coefficients are shown in Table IV. The error bars give the SE due to binomial sampling.

 

Table IV. Similarity between microsequence dependence of mutation targeting under meiotic and somatic processes: linear correlation coefficients for triplet mutabilities1

 
Mutation spectrum: complementation symmetry

We tested the complementation symmetry of the mutation spectrum conditioned only on the identity of the mutating base. For both the somatic and meiotic data, we constructed 2 × 2 × 3 contingency tables with mutating base classified as purine/pyrimidine and weak/strong, and resulting nucleotide as the transition partner, complement or transition partner's complement (31), and tested for independence of the purine/pyrimidine classification and the resulting nucleotide (complementation symmetry). Both χ² tests failed to provide any evidence for departures from complementation symmetry (meiotic: χ² = 7.53, if we do not include mutations at CG dinucleotides and 8.20 if we do; somatic: χ² = 6.14; none of these values is significant at the 0.05 level).
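The classification behind these contingency tables can be written out explicitly. The Python sketch below labels a mutating base as purine or pyrimidine and weak or strong, and labels the resulting nucleotide relative to the mutating base; the function names and string labels are illustrative assumptions.

```python
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_class(base):
    """Classify the mutating base as (purine/pyrimidine, weak/strong)."""
    ring = "purine" if base in "AG" else "pyrimidine"
    strength = "weak" if base in "AT" else "strong"
    return ring, strength


def spectrum_class(original, mutated):
    """Classify the resulting nucleotide relative to the mutating base."""
    if mutated == TRANSITION[original]:
        return "transition"
    if mutated == COMPLEMENT[original]:
        return "complement"
    if mutated == COMPLEMENT[TRANSITION[original]]:
        return "complement of transition"
    raise ValueError("original and mutated bases must differ")
```

Under this relative labeling, a mutation and its image on the opposite strand fall into the same class (for example, A to C and T to G are both "complement of transition"), which is what allows purine and pyrimidine rows to be compared in a test of complementation symmetry.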

The microsequence dependence of the spectrum under somatic hypermutation is symmetric: the estimated common correlation coefficient for the rate of transitions and of transversions to the complement of the mutating base between a triplet and its complement is r = 0.43 (p = 0.001). This result also holds if we do not include the triplets that span CG dinucleotides; these triplets are extremely rare and their mutation counts are also very low. For the meiotic set, the estimated correlation coefficient with CG dinucleotides excluded is r = 0.23 (p = 0.12). Similar to what we observed in mutational targeting, if we include CG dinucleotides, the spectrum becomes symmetric in the meiotic case as well (r = 0.36, p = 0.003).

Mutation spectrum: meiotic/somatic comparison

When represented in terms relative to the mutating base, the mutation spectrum is strikingly consistent regardless of which base is mutating, for both meiotic and somatic processes (Fig. 4). The spectra are not the same between somatic and meiotic processes, however (Fig. 4). A direct test of the spectrum conditional on the mutating base only shows very strong differences between meiotic and somatic mutation (χ² = 14.42 (A), 35.68 (G), 22.02 (T), and 7.82 (C); with the exception of C, all values are significant at the 0.01 level).



FIGURE 4. Comparison of the mutation spectra under somatic and meiotic mutation. Pie charts show the proportion of mutations to each of the corresponding bases. Colors indicate the biochemical relationship of the mutating nucleotide and the nucleotide resulting from the mutation: dark gray, transitions; medium gray, complements; light gray, complement of the transition.

 
The correlation coefficient between somatic and meiotic sets (computed as the combined triplet correlations as above for the symmetry tests) is not significantly different from zero (r = -0.03, p = 0.78).


    Discussion
 
We compared the characteristics of mutations introduced by SH to those introduced meiotically. To ensure that observed characteristics are due to the mutational process itself, we have minimized the effects of selection by choosing, where possible, DNA sequences that are not subject to selection. The SH data are from nonproductively rearranged Ig V genes and from introns flanking rearranged V genes. For the meiotic mutations, we have used processed pseudogenes. For these, we do not completely eliminate selection, since there is uncertainty in assignment of observed nucleotide differences to the pseudogene or its ortholog. We attempt to minimize this uncertainty by considering the state of each nucleotide site in a second ortholog, from a species other than human.

A marked asymmetry between the mutability under SH of thymidine and that of adenine has been noted previously and taken as evidence for strand bias of the hypermutation mechanism (42). We also find a higher mutability at A than at T and that this asymmetry is much greater than any singlet asymmetry under meiotic mutation. But we also find that when this overall mutability difference is factored out, the microsequence specificity at A is very similar to that at T (Fig. 1 and Table III). Similar findings have been reported (23, 24) and used to justify the conclusion that both strands are targeted by SH and that two mechanisms, one strand-unbiased mutating G and C and the other strand-biased acting on A and T, operate. We find, however, that the triplet mutabilities are surprisingly complementation symmetric for both A/T and G/C mutations. In fact, once the single-nucleotide mutabilities have been taken into account, the triplet symmetry is evident for SH. Whether the triplet symmetry appears in meiotic mutation depends strongly on whether the triplets that span CG dinucleotides are included in the calculation of the correlation coefficient. Thus, although we also conclude that there are two distinct components of SH targeting, we find that they share similar strand symmetry.

With certain well-defined exceptions, the sequence specificities of mutational targeting underlying meiotic and somatic mutations are significantly correlated. This is quite remarkable since the time scales over which these changes have accrued differ by about 7 orders of magnitude (about 1 mo for SH and on the order of a million years for meiotic mutation). This would be expected if mutations under SH are introduced by catalytic enhancement of the processes responsible for meiotic mutations. Thus, if a major proportion of mutations introduced during evolution occur at strand breaks, then SH hastens the introduction of these breaks, but they are introduced in the same places. In this sense, the reaction resembles true catalysis.
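The quoted difference in time scales can be checked with a line of arithmetic. The Python sketch below takes the two durations at face value (1 month and 1 million years) and is only an order-of-magnitude illustration; the function name is hypothetical.

```python
import math


def orders_of_magnitude(short_months, long_years):
    """Base-10 logarithm of the ratio between two durations."""
    long_months = long_years * 12
    return math.log10(long_months / short_months)


# About 1 month for SH versus on the order of a million years for meiotic mutation.
difference = orders_of_magnitude(1.0, 1e6)
```

The ratio is 1.2 × 10^7, so the difference is slightly more than 7 orders of magnitude, in line with the figure given in the text.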

The differences in the triplet mutabilities between somatic and meiotic mutation are largely attributable to three effects: 1) The mutability of triplets containing CG dinucleotides is much higher under meiotic mutation than under SH. The mutability of CG dinucleotides is a well-understood consequence of the methylation of such dinucleotides (43). This excess mutability has been seen in studies of pseudogene-ortholog pairs (29) and in surveys of genetic lesions associated with human genetic disease (44). 2) The mutability of the triplet AGC and its complement GCT is considerably higher under SH than under meiotic mutation. This is the well-known serine hot spot (27, 36). 3) The mutability of triplets of the form WAN is higher under SH. The mutabilities of the triplets within each of the two subsets (WAN, SAN) are correlated with those in the meiotic data set. Although the pattern is weaker for T mutating, the complementary triplets NTW also segregate at higher mutability from the triplets NTS and both sets are correlated with the meiotic mutabilities. The overarching similarities between somatic and meiotic mutation targeting, punctuated sharply by specific differences, suggest that two components are involved in the targeting: a "background" mechanism that has recruited and modified components of the DNA repair machinery, and a mechanism, perhaps novel, specific to AGC/GCT triplets (see below).

We also investigated the relationships between the mutation spectra under somatic and meiotic mutation. It was previously suggested that the two processes may be related because both result in an excess of transitions over transversions (22). We find, however, that the proportion of transitions is significantly smaller under SH. The effect of this is that the rate of replacement mutations is higher under SH and, consequently, so is the net rate of diversification. Both of these effects are consistent with diversification under SH being advantageous whereas mutations under meiotic mutation presumably are merely unavoidable.

We have previously shown that the mutation spectrum under SH is microsequence dependent: what a nucleotide mutates to is influenced by what its neighbors are (25). We compared this spectrum to that previously inferred from a set of meiotic mutations and found no correlations. That meiotic data set, however, combined information from triplets and their complements; furthermore, the mutations were inferred by a somewhat different process than the one we use here. The more comprehensive comparison here confirms the previous result: although there are significant effects of neighboring nucleotides on the mutation spectrum in both meiotic and somatic processes, the triplet dependencies are uncorrelated.

The following model is consistent with the findings thus far, though it is certainly not uniquely so. An initial lesion is created in the dsDNA. The targeting at this point is symmetric: sense strand XAZ is affected just as frequently as its sense-strand complement. This occurs naturally if the lesion is a double-strand break, consistent with the findings of Sale and Neuberger (8). In fact, the complementation symmetry of targeting even suggests a staggered cut. In a blunt cut, the complementary nucleotides are not in equivalent states: one is 3' of the break and the other is 5' of it. A staggered cut that also breaks the base pairing leaves the two nucleotides both 5' or both 3' of the break, though now on opposite sides of it. Furthermore, both are unpaired and overhanging. Note that the apparent strand asymmetry can now be viewed as the asymmetry between the DNA 5' and 3' of the break. The probability that religation is mutagenic now depends on which side of the break the purine is on, with the probability of mutagenic repair higher if the purine is on the plus strand. This would result if, for example, purines are more susceptible to excision when overhanging and gaps in the plus strand (or 5' of the double-stranded break) are less likely to be repaired correctly.

Several studies have found reduced mutation rates in mismatch repair-deficient mice (11, 14, 16) and relative enhancement of mutations at the AGC/GCT hot spots (16) or at G and C bases (13, 15). Rada et al. (16) inferred from this observation that the mutator has two components, one that is dependent on the mismatch repair protein MSH-2 and another that is MSH-2 independent. We concur and suggest that MSH-2 is responsible for introducing lesions as described above and leaves the signature of catalytically enhanced meiotic mutation. A second component, as yet unidentified, is targeted specifically at AGC/GCT triplets or at the palindromic quadruplet AGCT (L. G. Cowell and T. B. Kepler, manuscript in preparation), which contains both triplet motifs, and introduces lesions preferentially at these sites. One candidate for the unknown molecule is a modified site-specific methylase. Other groups have hypothesized the presence of a two-component mutator (21, 22, 23, 24), consistent with the observation that G and C are mutated more frequently in the murine cell line 18-81 (26) and the Burkitt lymphoma line Ramos (8). Furthermore, the G·C-targeting component is argued to have arisen first (or been co-opted first by SH) (22), consistent with the observations that AGC/GCT or G and C are preferentially targeted in shark (45) and Xenopus (46).

The identity of the molecules involved in somatic hypermutation will surely be revealed soon, but even after their names are known, it will remain to learn how they do what they do. For this task, careful analysis of the mutation patterns will be essential.


    Appendix 1
 
Estimator for the correlation coefficient among binomial proportions

The model underlying the data analysis is that of two sets of mutabilities which are linearly correlated and which give rise to binomial (count) data. The task is to estimate the linear correlation coefficient. The difficulty is that the binomial sampling variability is independent (i.e., uncorrelated); it is only the indirectly observed mutabilities that are correlated. The estimation is as follows.

The adjusted count of occurrences of motif k in group i is designated n_ik, where i = 1, 2 is the group index (somatic or meiotic; triplet or complement). Similarly, m_ik denotes the number of mutated occurrences of motif k in group i. For each of the four nucleotides, the number of triplets is denoted by K. A dot in place of an index denotes summation over that index.

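The estimators for the correlation coefficients are computed as given in Eq. 3, with the auxiliary quantities defined in Eqs. 4, 5, and 6.

[Equations 3-6 were typeset as images and are not reproduced in this text.]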
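As a rough illustration of the estimation problem described in this appendix, and not a reconstruction of Eqs. 3-6, the Python sketch below applies a generic moment-based correction. It assumes that the binomial sampling noise is independent between the two groups, so that it inflates the variance of the observed proportions m_ik/n_ik but not their covariance; the sampling variance of each proportion is estimated by p(1 - p)/(n - 1) and its average is subtracted from the observed variance. The function names and the example counts are hypothetical.

```python
import math


def _proportions(counts, mutated):
    """Observed mutability m_k / n_k for each motif k."""
    return [m / n for n, m in zip(counts, mutated)]


def _moments(p1, p2):
    """Sample covariance and variances across motifs (K - 1 denominators)."""
    k = len(p1)
    mean1, mean2 = sum(p1) / k, sum(p2) / k
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(p1, p2)) / (k - 1)
    var1 = sum((a - mean1) ** 2 for a in p1) / (k - 1)
    var2 = sum((b - mean2) ** 2 for b in p2) / (k - 1)
    return cov, var1, var2


def _mean_sampling_variance(counts, props):
    """Average of p(1 - p)/(n - 1), an unbiased estimate of p(1 - p)/n."""
    return sum(p * (1 - p) / (n - 1) for n, p in zip(counts, props)) / len(counts)


def naive_correlation(n1, m1, n2, m2):
    """Pearson correlation of the raw proportions (attenuated by noise)."""
    cov, var1, var2 = _moments(_proportions(n1, m1), _proportions(n2, m2))
    return cov / math.sqrt(var1 * var2)


def corrected_correlation(n1, m1, n2, m2):
    """Correlation of the latent mutabilities after removing binomial noise.

    Assumption (ours): noise is independent between groups, so only the
    variances, not the covariance, need correcting.
    """
    p1, p2 = _proportions(n1, m1), _proportions(n2, m2)
    cov, var1, var2 = _moments(p1, p2)
    latent1 = var1 - _mean_sampling_variance(n1, p1)
    latent2 = var2 - _mean_sampling_variance(n2, p2)
    if latent1 <= 0 or latent2 <= 0:
        raise ValueError("sampling noise exceeds the observed variance")
    return cov / math.sqrt(latent1 * latent2)


if __name__ == "__main__":
    # Hypothetical counts for four motifs in two groups (illustration only).
    n1, m1 = [100, 200, 150, 120], [10, 40, 15, 36]
    n2, m2 = [1000, 800, 900, 1100], [50, 80, 45, 110]
    print("naive    :", naive_correlation(n1, m1, n2, m2))
    print("corrected:", corrected_correlation(n1, m1, n2, m2))
```

The corrected value is never smaller in magnitude than the naive one, and with few motifs or small counts it can exceed 1; the two agree when the counts are large enough that sampling noise is negligible.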


    Acknowledgments
 
We thank Claudia Berek and Latham Claflin for sharing unpublished data.


    Footnotes
 
1 Partially supported by National Science Foundation Award MCB 9357637 (to T.B.K.) and by National Institutes of Health Grant AI28433 (to A.P. and M.O.). Portions of the work were done under the auspices of the U.S. Department of Energy. During much of this work T.B.K. and L.G.C. were in the Biomathematics Program, Department of Statistics, North Carolina State University, Raleigh, NC.

2 Address correspondence and reprint requests to Dr. Thomas B. Kepler, The Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501.

3 Abbreviation used in this paper: SH, somatic hypermutation.

4 We use the term "mutability" rather than "mutation rate" to emphasize its role as a property of the DNA sequence itself.

Received for publication December 8, 1999.
    References
 

  1. McKean, D. M., K. Huppi, M. Bell, L. Staudt, W. Gerhard, M. Weigert. 1984. Generation of antibody diversity in the immune response of BALB/c mice to influenza virus hemagglutinin. Proc. Natl. Acad. Sci. USA 81:3180.
  2. Maizels, N. 1989. Might gene conversion be the mechanism of somatic hypermutation of mammalian Ig genes? Trends Genet. 5:4.
  3. Steele, E., J. Pollard. 1987. Hypothesis: somatic mutation by gene conversion via the error prone DNA -> RNA -> DNA information loop. Mol. Immunol. 24:667.
  4. Manser, T. 1990. The efficiency of antibody maturation; can the rate of B cell division be limiting? Immunol. Today 11:305.
  5. Rogerson, B., J. Hackett, A. Peters, D. Haasch, U. Storb. 1991. Mutation pattern of immunoglobulin transgenes is compatible with a model of somatic hypermutation in which targeting of the mutator is linked to the direction of DNA replication. EMBO J. 10:4331.
  6. Gearhart, P. J., D. F. Bogenhagen. 1983. Clusters of point mutations are found exclusively around rearranged antibody variable genes. Proc. Natl. Acad. Sci. USA 80:3439.
  7. Peters, A., U. Storb. 1996. Somatic hypermutation of immunoglobulin genes is linked to transcription initiation. Immunity 4:57.
  8. Sale, J. E., M. Neuberger. 1998. TdT-accessible breaks are scattered over the immunoglobulin V domain in a constitutively hypermutating B cell line. Immunity 9:859.
  9. Shen, H., D. Cheo, E. Friedberg, U. Storb. 1997. The inactivation of the xpc gene does not affect somatic hypermutation or class switch recombination of immunoglobulin genes. Mol. Immunol. 34:527.
  10. Zheng, B., S. Han, E. Spanopoulou, G. Kelsoe. 1998. Immunoglobulin gene hypermutation in germinal centers is independent of the RAG-1 V(D)J recombinase. Immunol. Rev. 162:133.
  11. Winter, D., Q. Phung, A. Umar, S. Baker, R. Tarone, K. Tanaka, R. Liskay, T. Kunkel, V. Bohr, P. Gearhart. 1998. Altered spectra of hypermutation in antibodies from mice deficient for the DNA mismatch repair protein PMS2. Proc. Natl. Acad. Sci. USA 95:6953.
  12. Frey, S., B. Bertocci, F. Delbos, L. Quint, J.-C. Weill, C.-A. Raynaud. 1998. Mismatch repair deficiency interferes with the accumulation of mutations in chronically stimulated B cells and not with the hypermutation process. Immunity 9:127.
  13. Jacobs, H., Y. Fukita, G. van der Horst, J. de Boer, G. Weeda, J. Essers, N. de Wind, B. Engelward, L. Samson, S. Verbeek, et al. 1998. Hypermutation of immunoglobulin genes in memory B cells of DNA repair-deficient mice. J. Exp. Med. 187:1735.
  14. Cascalho, M., J. Wong, C. Steinberg, M. Wabl. 1998. Mismatch repair co-opted by hypermutation. Science 279:1207.
  15. Phung, Q., D. Winter, A. Cranston, R. Tarone, V. Bohr, R. Fishel, P. Gearhart. 1998. Increased hypermutation at G and C nucleotides in immunoglobulin variable genes from mice deficient in the MSH2 mismatch repair protein. J. Exp. Med. 187:1745.
  16. Rada, C., M. M. R. Ehrenstein, M. Neuberger, C. Milstein. 1998. Hot spot focusing of somatic hypermutation in MSH2-deficient mice suggests two stages of mutational targeting. Immunity 9:135.
  17. Sack, S., Y. Liu, J. Germain, N. Green. 1998. Somatic hypermutation of immunoglobulin genes is independent of the Bloom's syndrome DNA helicase. Clin. Exp. Immunol. 112:248.
  18. Wagner, S., M. Neuberger. 1996. Somatic hypermutation of immunoglobulin genes. Annu. Rev. Immunol. 14:441.
  19. Kim, N., K. Kage, F. Matsuda, M. Lefranc, U. Storb. 1997. B lymphocytes of xeroderma pigmentosum or Cockayne syndrome patients with inherited defects in nucleotide excision repair are fully capable of somatic hypermutation of immunoglobulin genes. J. Exp. Med. 186:413.
  20. Harris, R. S., Q. Kong, N. Maizels. 1999. Somatic hypermutation and the three R's: repair, replication and recombination. Mutat. Res. 436:157.
  21. Spencer, J., M. Dunn, D. Dunn-Walters. 1999. Characteristics of sequences around individual nucleotide substitutions in Igvh genes suggest different GC and AT mutators. J. Immunol. 162:6596.
  22. Diaz, M., J. Velez, M. Singh, J. Cerny, M. Flajnik. 1999. Mutational pattern in the nurse shark antigen receptor gene (NAR) is similar to mammalian Ig and to spontaneous mutations in evolution: the translesion synthesis model of somatic hypermutation. Int. Immunol. 11:825.
  23. Dörner, T., S. Foster, N. Farner, P. Lipsky. 1998. Somatic hypermutation of human immunoglobulin heavy chain genes: targeting of RGYW motifs on both DNA strands. Eur. J. Immunol. 28:3384.
  24. Milstein, C., M. Neuberger, R. Staden. 1998. Both DNA strands of antibody genes are hypermutation targets. Proc. Natl. Acad. Sci. USA 95:8791.
  25. Cowell, L., T. Kepler. 1999. The nucleotide-replacement spectrum under somatic hypermutation exhibits microsequence dependence that is strand-symmetric and distinct from that under germline mutation. J. Immunol. 164:1971.
  26. Bachl, J., M. Wabl. 1996. An immunoglobulin mutator that targets G.C base pairs. Proc. Natl. Acad. Sci. USA 93:851.
  27. Rogozin, I., N. Kolchanov. 1992. Somatic hypermutagenesis in immunoglobulin genes. II. Influence of neighboring base sequences on mutagenesis. Biochim. Biophys. Acta 1171:11.
  28. Bains, W. 1992. Local sequence dependence of rate of base replacement in mammals. Mutat. Res. 267:43.
  29. Hess, S., J. Blake, R. Blake. 1994. Wide variations in neighbor-dependent substitution rates. J. Mol. Biol. 236:1022.
  30. Golding, G., P. Gearhart, B. Glickman. 1987. Patterns of somatic mutations in immunoglobulin variable genes. Genetics 115:169.
  31. Cowell, L., H. Kim, T. Humaljoki, C. Berek, T. Kepler. 1999. Enhanced evolvability in immunoglobulin V genes under somatic hypermutation. J. Mol. Evol. 49:23.
  32. Brezinschek, H. P., R. I. Brezinschek, P. Lipsky. 1995. Analysis of the heavy chain repertoire of human peripheral B cells using single-cell polymerase chain reaction. J. Immunol. 155:190.
  33. Weber, J. S., J. Berry, T. Manser, J. L. Claflin. 1994. Mutations in Ig V(D)J genes are distributed asymmetrically and independently of the position of V(D)J. J. Immunol. 153:3594.
  34. Weber, J. S., J. Berry, S. Litwin, J. L. Claflin. 1991. Somatic hypermutation of the JC intron is markedly reduced in unrearranged κ and H alleles and is unevenly distributed in rearranged alleles. J. Immunol. 146:3218.
  35. Wu, P., L. Claflin. 1999. Promoter-associated displacement of hypermutations. Int. Immunol. 10:1131.
  36. Smith, D., G. Creadon, P. Jena, J. Portanova, B. Kotzin, L. Wysocki. 1996. Di- and trinucleotide target preferences in somatic mutagenesis in normal and autoreactive B cells. J. Immunol. 156:2642.
  37. Rickert, R., S. Clarke. 1993. Low frequencies of somatic mutation in two expressed Vκ genes: unequal distribution of mutation in 5' and 3' flanking regions. Int. Immunol. 5:255.
  38. Weber, J. S., J. Berry, T. Manser, J. L. Claflin. 1991. Position of the rearranged Vκ and its 5' flanking sequences determines the location of somatic mutations in the Jκ locus. J. Immunol. 146:3652.
  39. Ophir, R., T. Itoh, D. Graur, T. Gojobori. 1999. A simple method for estimating the intensity of purifying selection in protein-coding genes. Mol. Biol. Evol. 16:49.
  40. Li, W. 1997. Molecular Evolution. Sinauer Associates, Inc., Sunderland, MA.
  41. Li, W., C. Wu, C. Luo. 1984. Nonrandomness of point mutation as reflected in nucleotide substitutions in pseudogenes and its evolutionary implications. J. Mol. Evol. 21:58.
  42. Lebecque, S., P. Gearhart. 1990. Boundaries of somatic mutation in rearranged immunoglobulin genes: 5' boundary is near the promoter, and 3' boundary is approximately 1 kb from V(D)J gene. J. Exp. Med. 172:1717.
  43. Bird, A. 1980. DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Res. 8:1499.
  44. Cooper, D., H. Youssoufian. 1988. The CpG dinucleotide and human genetic disease. Hum. Genet. 78:151.
  45. Wilson, M., E. Hsu, A. Marcuz, L. Courtet, L. Du Pasquier, C. Steinberg. 1992. What limits affinity maturation of antibodies in Xenopus: the rate of somatic mutation or the ability to select mutants? EMBO J. 11:4337.
  46. Hinds-Frey, K., H. Nishikata, R. Litman, G. Litman. 1993. Somatic variation precedes extensive diversification of germline sequences and combinatorial joining in the evolution of immunoglobulin heavy chain diversity. J. Exp. Med. 178:815.



