Edward Jenner Institute for Vaccine Research Compton, Berkshire, United Kingdom
| Introduction |
|---|
Definition of the peptides that are recognized by T cells responding to infections and tumors is of utility for evaluation of immunity during natural responses and also facilitates analysis of T cell responses after vaccination. Identification and manipulation of the peptide epitopes recognized during the natural response to an Ag can also enable the design of peptides that not only mimic, but actually improve upon, the natural immunogen (2). Such heteroclitic peptides are now finding application in vaccine design (3). Conversely, understanding of the peptide-MHC interactions involved in immunopathological T cell responses (e.g., in autoimmunity, allergy, or transplant rejection) can enable the design of altered peptide ligands that either antagonize or block undesirable responses. For example, an analog peptide was shown to inhibit the autoimmune demyelinating disease experimental allergic encephalomyelitis by antagonizing the CD4+ T cell response to a highly encephalitogenic peptide (4, 5), and a blocking peptide was demonstrated to prevent autoimmune insulin-dependent diabetes mellitus by inhibiting the expansion of autoreactive CTL (6), both in murine models.
Although the utility of manipulating MHC-peptide-TCR interactions has been recognized for some time, many studies continue to identify epitopes or design competitor peptides in a random way, by screening peptide libraries, rather than enhancing affinity in a rational way. By using in silico techniques to understand the basis of activity, affinity can, through a cyclic process, be improved by making incremental changes to the peptide structure. In this paper we exemplify an in silico method for the analysis and prediction of peptide-MHC binding affinity that is able to direct affinity enhancement in a rational or guided manner.
A great variety of methods exist for peptide-MHC binding prediction (for a recent review, see Ref. 7). Some involve the identification of so-called binding motifs (8), which characterize the peptide specificity of MHC alleles in terms of dominant anchor positions with strong preferences for a highly restricted amino acid set. It is well known, for example, that the best-understood human class I allele, HLA-A*0201, has anchor residues at peptide positions P2 (accepting leucine and methionine) and P9 (accepting valine and leucine). Motifs are widely exploited, being simple to use and to understand. There are fundamental problems with motif-based epitope prediction methods, however, as they produce significant numbers of both false positives and false negatives and are overly reliant on the choice of anchors. Subsequently, much more sophisticated methods have arisen (7). These have included empirical methods, such as De Groot's EpiVax methodology (9), artificial neural networks (10), hidden Markov models (11), support vector machines (12), and profiles (13).
Recently, we have applied, in a systematic fashion, a novel bioinformatics approach to the problem of affinity prediction based on quantitative structure-activity relationship (QSAR; see footnote 5) methods (14). QSAR analysis is a successful and widely used strategy for designing compounds with desired biological properties and is based on the assumption that the biological activity of a chemical entity, such as a peptide, depends on its structure. Properly used, this strategy can save large amounts of laboratory-based experimental work.
We have developed an additive QSAR method for peptide-MHC binding (15), based on the concept, defined by Free and Wilson (16), that each substituent makes an additive and constant contribution to the biological activity regardless of variation in the rest of the molecule. Parker's hypothesis (17) for the independent binding of side chains (IBS hypothesis) is also based on this concept. According to the additive method, the binding affinity of a peptide can be represented as the sum of amino acid contributions at each position. We extended the classical Free-Wilson model with terms accounting for interactions between amino acid side chains. This method was applied to peptides binding to several class I (18, 19) and class II (20) alleles using data from the literature (www.jenner.ac.uk/JenPep). The derived models are available free on the Internet (www.jenner.ac.uk/MHCPred) and can be used for binding affinity predictions (21).
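The additive idea can be illustrated with a minimal Python sketch. The constant and the contribution values below are made-up placeholders chosen for illustration, not coefficients from the MHCPred models, and the side-chain interaction terms are left out; the names `CONST`, `CONTRIBUTIONS`, and `predict_affinity` are hypothetical.

```python
# Minimal sketch of a Free-Wilson style additive prediction.
# All numbers are hypothetical placeholders, not fitted coefficients.

CONST = 5.0  # hypothetical constant term (log units)

# CONTRIBUTIONS[position][amino_acid] -> additive contribution (log units).
# Positions are 1-based; residues not listed contribute 0.0 in this sketch.
CONTRIBUTIONS = {
    1: {"I": 0.3, "F": 0.2},
    2: {"L": 0.5, "M": 0.4},
    9: {"V": 0.6},
}


def predict_affinity(peptide: str) -> float:
    """Return const + sum of per-position contributions for a nonamer."""
    if len(peptide) != 9:
        raise ValueError("expected a nonamer")
    total = CONST
    for pos, aa in enumerate(peptide, start=1):
        total += CONTRIBUTIONS.get(pos, {}).get(aa, 0.0)
    return total
```

Because every position contributes independently, changing one residue changes the prediction by the difference between two table entries, which is what makes position-by-position optimization straightforward.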
Working solely with data from the literature has a number of disadvantages. Peptides are highly biased in terms of their position-dependent amino acid composition, often favoring hydrophobic sequences. This arises, in part, from preselection processes that result in self-reinforcement. Binding motifs are often used to reduce the experimental cost of epitope identification. Very sparse sequence patterns are matched, and the corresponding subset of peptides tested, with an enormous resulting reduction in sequence diversity. This bias is more prominent at the anchor positions, which usually have extremely restricted sets of amino acid types. In addition, when working solely with literature data it is not possible to test the predicted binding affinities of newly designed peptides.
In this study we determined the binding affinities of a set of 90 nonamer peptides to the MHC class I allele HLA-A*0201 using an in-house, FACS-based, MHC stabilization assay (22). From these data we then derived an additive QSAR model for peptide interaction with HLA-A*0201. HLA-A*0201 is one of the most frequent class I alleles in many different populations (23). Peptides that bind to this allele are 8–11 aa in length and, as noted above, have two main anchor residues, at position 2 and at the C-terminal end (24). Generally speaking, the presence of anchors is deemed to be necessary, but not sufficient, for high affinity binding. Prominent roles for several other positions (1, 3, 7), so-called secondary anchor residues, are also well known (25). We used our QSAR model to reassess the preferred amino acids at each position and to design new A2 binding peptides. The top 10 high binders, as predicted, were tested. All showed extremely high binding affinity, up to 2 orders of magnitude greater than the highest value from the initial training set. Furthermore, the importance of the primary anchors was tested by systematically evaluating the A2 binding affinities of monosubstituted variants of the best newly designed binder.
| Materials and Methods |
|---|
The initial training set included 90 peptides (Table I). Eighty-eight of them had known binding affinities (IC50) in the range from 10⁻⁵ to 10⁻⁹ M and were originally assessed by a quantitative assay based on the inhibition of binding of a radiolabeled standard peptide to detergent-solubilized MHC molecules (25, 26) and presented as −logIC50 (pIC50). Known peptides were selected from JenPep (www.jenner.ac.uk/JenPep) (27, 28). Fifteen of them were low binders (pIC50 <6.301; IC50 >500 nM; Table I; peptides 1–15), 33 were intermediate binders (6.301 < pIC50 < 7.301; 50 nM < IC50 < 500 nM; peptides 16–48), and 40 were high binders (pIC50 >7.301; IC50 <50 nM; peptides 48–88). Two variants of the best binder, peptide 88, were included as well (peptides 89 and 90). All peptides used in the present study were ordered from Mimotopes (Pensby, U.K.). The test set was designed to include the top 10 peptides predicted by the QSAR model to be high binders. Thirty-eight monosubstituted variants of the best newly designed binder were then tested to assess the importance of anchor positions 2 and 9.
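The relation between the IC50 cutoffs and the pIC50 thresholds quoted above follows directly from pIC50 = −log10(IC50 in mol/L). The short Python sketch below shows the conversion; the function names are illustrative, and the handling of values that fall exactly on a threshold is an assumption, since the text does not specify it.

```python
import math


def pic50_from_nM(ic50_nM: float) -> float:
    """Convert an IC50 given in nM to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nM * 1e-9)


def binder_class(pic50: float) -> str:
    """Classify with the 500 nM and 50 nM cutoffs quoted in the text.

    Values exactly on a cutoff fall into the lower class here (assumption).
    """
    if pic50 > pic50_from_nM(50):
        return "high"
    if pic50 > pic50_from_nM(500):
        return "intermediate"
    return "low"


print(round(pic50_from_nM(500), 3))  # 6.301
print(round(pic50_from_nM(50), 3))   # 7.301
```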
Peptide binding to HLA-A2 was assessed using a FACS-based MHC stabilization assay (29) with modifications as described previously (22). Briefly, T2 cells were incubated in 96-well, flat-bottom plates at 2 × 10⁵ cells/well in a 200-µl volume of AIM V medium (Life Technologies, Paisley, U.K.) with human β2-microglobulin at a final concentration of 100 nM (Scipac, Sittingbourne, U.K.) with and without peptides at concentrations between 200 and 0.04 µM for 16 h at 37°C. Cells were then washed, and surface levels of HLA-A2 were assessed by staining with FITC-conjugated, A2.1-specific mAb BB7.2 (BD Biosciences, Oxford, U.K.) or an FITC-conjugated isotype control Ab (BD Biosciences). Cells were fixed at 4°C in 4% paraformaldehyde and analyzed on a FACSCalibur (BD Biosciences) using CellQuest software. Results are expressed as fluorescence index (FI) values, calculated as FI = (test mean fluorescence intensity (MFI) − no-peptide isotype control MFI) / (no-peptide HLA-A2-stained control MFI − no-peptide isotype control MFI). The half-maximal binding level (BL50), which is the peptide concentration yielding the half-maximal FI of the reference peptide in each assay, was calculated and presented as pBL50 (−logBL50). The HLA-A2 high binder FLPSDFFPSV (IC50 = 2.6 nM) (30) was used as a reference peptide.
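The FI and BL50 calculations can be sketched as follows. The FI formula follows the description above; the interpolation scheme (linear in log10 concentration between the two bracketing doses) is an assumption made for illustration, since the text does not state how the BL50 was read off the dose/FI curves, and all function names are hypothetical.

```python
import math


def fluorescence_index(test_mfi, a2_no_peptide_mfi, isotype_no_peptide_mfi):
    """FI = (test - isotype background) / (no-peptide A2 - isotype background)."""
    background = isotype_no_peptide_mfi
    return (test_mfi - background) / (a2_no_peptide_mfi - background)


def bl50(concs_uM, fis, half_max_fi):
    """Concentration (uM) at which the dose/FI curve first rises through
    half_max_fi, interpolated linearly in log10 concentration (assumption).

    Returns None if the threshold is not crossed within the tested range,
    which covers peptides that never reach the half-maximal level.
    """
    points = sorted(zip(concs_uM, fis))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo < half_max_fi <= f_hi:
            frac = (half_max_fi - f_lo) / (f_hi - f_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


def pbl50(bl50_uM):
    """pBL50 = -log10(BL50 in mol/L)."""
    return -math.log10(bl50_uM * 1e-6)
```

In use, `half_max_fi` would be set to half of the maximal FI reached by the reference peptide in the same assay, so that every test peptide is scored against the same threshold.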
Additive method
The additive method for binding affinity prediction was described in detail previously (15). Briefly, the binding affinity of a nonamer is represented by equation 1:
[Equation 1 and its accompanying symbol definitions appeared as images and are not reproduced here.]
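For orientation, a position-only additive model consistent with the description given in this paper (binding affinity as a constant plus the sum of amino acid contributions at each of the nine positions) can be written as below. This is a sketch under that assumption and should not be read as a reproduction of the published equation 1; the symbols const and P_i are illustrative labels, and the side-chain interaction terms mentioned in the Introduction are not shown.

```latex
\mathrm{pBL}_{50} = \mathrm{const} + \sum_{i=1}^{9} P_i
```

Here P_i denotes the contribution of the amino acid found at position i of the nonamer.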
We used PLS as implemented in the QSAR module of SYBYL 6.9 (Tripos, St. Louis, MO). The scaling method was set to none, and column filtering was switched off. The optimal number of components was found by leave-one-out cross-validation. The predictive power of the model was assessed by the cross-validated coefficient q². Outliers with residuals above 1 log unit were excluded, and the model was rederived. The non-cross-validated model was assessed by the explained variance r² and was used to predict the binding affinity of newly designed peptides.
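The two statistics can be illustrated with a small, dependency-free Python sketch. To keep it short, a one-variable least-squares line stands in for the PLS model; this is a deliberate simplification and not the SYBYL procedure. The definitions used, r² = 1 − SS_res/SS_tot and q² = 1 − PRESS/SS_tot with SS_tot taken about the mean of all observations, are the common ones and are assumed here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for the PLS model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def predict_line(model, x):
    a, b = model
    return a + b * x


def r2(xs, ys):
    """Explained variance of the model fitted on all data."""
    model = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - predict_line(model, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_tot."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without observation i, then predict the held-out point.
        model = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - predict_line(model, xs[i])) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - press / ss_tot
```

Because each held-out point is predicted by a model that never saw it, q² cannot exceed r² for a least-squares fit, which is why the two values are reported side by side.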
| Results |
|---|
Using data solely derived from the literature for development of peptide-MHC binding models has significant limitations. One can explore only the role at each particular position of those amino acids already present in the data. Moreover, the models can only be validated using cross-validation, or arbitrarily selected artificial test sets, rather than by evaluating truly independent, or blind, test sets, and the affinity of newly designed high binding peptides cannot be determined. To overcome these limitations, we established a FACS-based assay for measuring peptide binding affinities experimentally. Initially we used this assay to measure the HLA-A*0201 binding affinities of a set of peptides whose binding to HLA-A*0201, as assessed using radiolabeled competition assays, has been reported in the literature. The peptides were chosen to cover a full range of measurable affinities. Dose/FI curves were created for each peptide and presented as semilogarithmic plots (Fig. 1). The high binders have low BL50 values (high pBL50; pBL50 = −logBL50), and the low binders have high BL50 values (low pBL50). Peptides that did not reach 50% of the binding level of the reference peptide were considered nonbinders. There were nine nonbinders in the training set.
The measured BL50 values were used to build an additive QSAR model for peptide binding affinity to the HLA-A*0201 molecule. Of the 180 possible position-specific amino acid terms (20 amino acids at each of nine positions), 41 were absent from the training set, most of them at positions 2 and 9. The initial matrix therefore consisted of 140 columns (one y variable and 139 x variables) and 81 rows (the 90 training peptides minus the nine nonbinders). Peptides 17, 28, and 44 gave residuals of >1 log unit between the experimental pBL50 values and those predicted by leave-one-out cross-validation, and they were excluded from the training set as outliers. The final model had q² = 0.602, r² = 0.954, and six components. This model was used to analyze the amino acid preferences at each position in the peptide sequences and to design a set of high binders.
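The construction of such a matrix can be sketched in Python. Each x variable is a 0/1 indicator for one amino acid at one position, and only the (position, amino acid) pairs actually observed in the data become columns, which is how 180 possible terms reduce to 139 when 41 are absent. The function names and the two-peptide example are illustrative only.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_indicator_matrix(peptides):
    """Return (columns, rows).

    columns: sorted (position, amino_acid) pairs observed in the data.
    rows:    one 0/1 vector per peptide over those columns.
    """
    observed = sorted({(pos, aa) for p in peptides
                       for pos, aa in enumerate(p, start=1)})
    index = {col: j for j, col in enumerate(observed)}
    rows = []
    for p in peptides:
        row = [0] * len(observed)
        for pos, aa in enumerate(p, start=1):
            row[index[(pos, aa)]] = 1
        rows.append(row)
    return observed, rows


def count_absent(peptides, length=9):
    """Number of (position, amino_acid) pairs never seen in the data."""
    columns, _ = build_indicator_matrix(peptides)
    return length * len(AMINO_ACIDS) - len(columns)


# Two peptides named in the text, used here only as a toy data set.
cols, rows = build_indicator_matrix(["ILDPFPVTV", "YLFPGPVTA"])
print(len(cols), count_absent(["ILDPFPVTV", "YLFPGPVTA"]))  # 13 167
```

A term that never occurs in the training data has no column and therefore no estimated contribution, which is the practical reason the model cannot comment on unseen residues at a position.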
The contributions of the amino acids at each of the nine positions according to the additive model are given in Fig. 3. The preferred amino acids for position 1 are Ile and Phe. His is deleterious at this position. Leu and Met are the preferred amino acids for position 2, whereas Thr is deleterious. Asp, Trp, and Phe are favored at position 3, whereas Glu, Ser, Gln, and Met are deleterious in this study. Pro and Asp are well accepted at position 4, whereas Val, Phe, and Ser are not preferred. Phe is the best contributing amino acid at position 5, and Pro, Ile, Leu, Asp, and Arg give small positive contributions to the affinity. At position 6, Pro, Val, and Tyr are preferred, whereas Ile, Ala, Gln, and Leu are deleterious. At position 7, Val and Pro are favored, whereas Thr is disfavored. Glu, Thr, and Asp are well accepted at position 8, whereas Ile, Ala, Val, and Met are not. The only acceptable residue at position 9 is Val.
The derived QSAR model was then used to design peptides with very high HLA-A2 binding affinities. For this purpose we combined the preferred amino acids at each position. For certain positions (1, 2, 3, 4, 5, 9) there were clear leaders, but for other positions (6, 7, 8) a wider range of amino acids was acceptable. It is well known that peptide positions 2 and 9 (C terminal) are primary anchors for binding to the HLA-A*0201 molecule (24). The side chains of these residues occupy pockets B and F, respectively, in the MHC binding groove (31). Positions 1, 3, 6, and 7 are considered secondary anchors (25) that bind to pockets A, D, C, and E, respectively (31). Positions 4 and 8 are named flag positions because of their solvent-exposed orientation and possible interactions with the TCR (31).
We selected Leu for position 2 and Val for position 9 as anchors. For position 1, Ile and Phe were selected; for position 3, Phe, Asp, and Trp were selected; for position 4, Pro and Asp were selected; for position 5, Phe, Leu, and Ile were selected; for position 6, Pro, Val, and Phe were selected; for position 7, Pro, Val, and Ile were selected; and for position 8, Pro, Glu, Thr, Asp, and Ser were selected. The combination of all preferred amino acids generated 1620 peptides. Their affinities were predicted by the additive model, and the affinities of the top 10 predicted high binders were tested experimentally. The test peptides and their predicted and experimental affinities are given in Table II (peptides 91–100). A good correlation between predicted and experimental affinities was found (r_pred = 0.683; Fig. 4). Notably, these 10 peptides all had pBL50 values higher than those of the best peptides in the training set, with the pre-eminent test peptide 93 having a measured binding affinity >2 orders of magnitude greater than that of the best binder from the training set.
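The size of the combinatorial library follows from the number of residues selected at each position (2 × 1 × 3 × 2 × 3 × 3 × 3 × 5 × 1 = 1620). A short Python sketch of the enumeration, using one-letter codes for the residues listed above, is given below; the variable names are illustrative.

```python
from itertools import product

# Preferred residues per position, as listed in the text (one-letter codes).
PREFERRED = [
    "IF",     # position 1: Ile, Phe
    "L",      # position 2: Leu (anchor)
    "FDW",    # position 3: Phe, Asp, Trp
    "PD",     # position 4: Pro, Asp
    "FLI",    # position 5: Phe, Leu, Ile
    "PVF",    # position 6: Pro, Val, Phe
    "PVI",    # position 7: Pro, Val, Ile
    "PETDS",  # position 8: Pro, Glu, Thr, Asp, Ser
    "V",      # position 9: Val (anchor)
]

# Every combination of one preferred residue per position.
library = ["".join(combo) for combo in product(*PREFERRED)]
print(len(library))  # 1620
```

Each candidate in `library` would then be scored with the fitted additive model and the list sorted by predicted affinity to pick the top 10 for testing.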
We reasoned that an optimized high binder might bear nonpreferred amino acids at the anchor positions and retain adequate measurable affinity. To test this hypothesis, a set of variants of the best binding peptide (ILDPFPVTV), monosubstituted at positions 2 and 9, was designed and tested. The experimental pBL50 values are shown in Table III (peptides 101–138). Peptides with pBL50 >5.000 were considered high binders, those with pBL50 between 4.000 and 5.000 intermediate binders, and those with pBL50 <4.000 low binders; peptides with pBL50 <3.000 were considered nonbinders. Among the 19 variants for position 2 there were 11 high binders, four intermediate binders, one low binder, and three nonbinders. The variants at position 9 gave 11 high binders, three intermediate binders, three low binders, and two nonbinders. This analysis showed that high affinity MHC binding can be achieved in the absence of the amino acids normally preferred at anchor positions.
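The design of the variant set and the binding classes can be sketched in Python. Replacing one position with each of the other 19 amino acids gives 19 variants per position and 38 in total, matching the numbers above. The treatment of values lying exactly on a class boundary is an assumption, and the function names are illustrative.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def monosubstituted_variants(peptide, position):
    """All 19 single-residue replacements at a 1-based position."""
    i = position - 1
    return [peptide[:i] + aa + peptide[i + 1:]
            for aa in AMINO_ACIDS if aa != peptide[i]]


def classify(pbl50):
    """Binding classes from the text; None means no measurable pBL50.

    Boundary values (exactly 3.0, 4.0, 5.0) are assigned by assumption.
    """
    if pbl50 is None or pbl50 < 3.0:
        return "nonbinder"
    if pbl50 < 4.0:
        return "low"
    if pbl50 <= 5.0:
        return "intermediate"
    return "high"


best = "ILDPFPVTV"
variants = monosubstituted_variants(best, 2) + monosubstituted_variants(best, 9)
print(len(variants))  # 38
```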
| Discussion |
|---|
The approach we have developed, which we have called the additive method, is an example of a QSAR technique. QSAR procedures are a powerful, if underused, bioinformatic tool for in silico prediction. QSAR has found much application, however, in computational drug design, where it can function as an engine of either interpolation or extrapolation. In interpolation, it can describe the properties of novel or extant molecules, peptides in our case, within a window of measured properties, but it can also be used to explore beyond those boundaries, most often being used to enhance binding affinity. The property of extrapolation into novel property space is the one we have exploited in this study.
In previous papers we have shown that the interpolative powers of our approach work effectively, capturing the essence of predictivity (15, 18, 19, 20). We demonstrate in this study that similar techniques can be used to effectively increase binding affinity in a rational and directed manner, allowing us to design a series of so-called superbinders and, in turn, to use these to explore the effect of systematic substitution of dominant anchor positions. The list of tolerated anchors can be extended to a much larger set than has been commonly envisaged. This has implications for both our understanding of the role anchor residues play in peptide binding to MHCs and the relative effectiveness of binding motif and more sophisticated models of binding, such as in silico prediction devices.
Although our QSAR method has a tendency to underestimate the pBL50 values (i.e., the predicted pBL50 values are lower than the experimental ones), it is able to distinguish accurately the proper amino acid preferences at each position in the peptide. The test peptides have strong amino acid preferences at position 1 (Ile), position 2 (Leu), position 3 (Asp), position 6 (Pro), and position 9 (Val). These positions are well-known primary and secondary anchors. They fit into pockets A, B, D, C, and F, respectively, on the MHC molecule. Variations at positions 4, 5, 7, and 8 are allowed. The amino acids at these positions either do not fit any pocket (positions 4, 5, and 8) or bind a shallow pocket (position 7 corresponds to pocket E). The top three high binders (pBL50 >8.000) exhibit variation only at position 7. All the newly designed peptides are high binders with pBL50 values higher than the highest pBL50 value in the training set. The best binder from the test set is peptide ILDPFPVTV. Its pBL50 value is 8.654, which is >2 log units (i.e., >2 orders of magnitude in affinity) higher than the pBL50 value of the best binder from the training set (peptide YLFPGPVTA, with a pBL50 of 6.305) and may be the highest binding affinity ever cited in the literature.
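The size of this gap can be checked with a short calculation from the two pBL50 values quoted above; the rounded figures on the right-hand sides are approximations computed here, not values reported in the tables.

```latex
\Delta \mathrm{pBL}_{50} = 8.654 - 6.305 = 2.349,
\qquad
\frac{\mathrm{BL}_{50}(\mathrm{YLFPGPVTA})}{\mathrm{BL}_{50}(\mathrm{ILDPFPVTV})}
  = 10^{2.349} \approx 2.2 \times 10^{2}
```

In concentration terms, a pBL50 of 8.654 corresponds to a BL50 of about 2.2 nM, whereas a pBL50 of 6.305 corresponds to about 0.5 µM.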
The rational design of high or superbinding peptides is a technique with wide application in a variety of immunological settings. Our current results are a further vindication of the utility of our approach in the prediction of peptide-MHC binding affinity, the principal prerequisite for proteinaceous epitopes. Peptides presented by HLA-A2, in particular, would be useful from a vaccination standpoint as they would give responses in a high proportion of the HLA-diverse population. Perhaps more important, however, is the demonstrated ability of this approach to engineer epitopes with special properties dependent on enhanced affinity. These might include augmenting the immunogenicity of potential cancer vaccines derived from cancer Ag epitopes or designing high affinity epitopes, responses to which are reported to be less dependent on CD4 help (5). Alternatively, one could design effective and efficacious competitor peptides able to block detrimental responses, as has been done in a murine diabetes model (6).
The experimental pBL50 values of the monosubstituted variants showed that an extremely good binder can tolerate a wide range of amino acids at the anchor positions. Only Glu, Lys, and Arg at position 2 and Asp and Arg at position 9 lead to a total loss of affinity, whereas Gly at position 2, and Glu, His, and Tyr at position 9 give low binding. The most preferred amino acids at positions 2 and 9 are Leu and Val, respectively, followed by Met, Ile, and Val at position 2, and Leu, Ile, Gly, and Ala at position 9. However, many other amino acids, previously thought nonoptimal, are also tolerated. This extensive tolerance at the anchor positions further strengthens our view that peptide-MHC molecule affinity is a consequence of the whole peptide structure, not simply of anchor residues. This is manifest as a complicated ensemble of multiple amino acid interactions with the MHC molecule, arising from all parts of the peptide, not just primary and secondary anchors.
Moreover, a possible synergistic effect may also operate between amino acids at different positions on the peptide, and this synergism may explain the underestimation of the predicted affinities of these high binders. Indeed, these superbinders have much higher affinities than a simple sum of amino acid contributions from different positions might suggest. This phenomenon is an example of enthalpic cooperativity or so-called enthalpy-entropy compensation (32). Generally, where multiple weak noncovalent interactions hold a molecular complex together, the enthalpy of all the individual intermolecular bonding interactions is weakened by extensive intermolecular motion. The noncovalent complex between a peptide and a protein is an excellent example of such a system. As additional interaction sites generate a more strongly bound complex, intermolecular motion is dampened, with all individual interactions becoming more favorable. Experimentally, at least for other systems, the trade-off between intermolecular motion and enthalpic interactions has been shown to account for the way in which entropy and enthalpy compensate for each other.
We have demonstrated that systematic monosubstitution of high binding peptides produces peptides that lack traditional anchors, yet retain high affinity. The relative importance of the anchor residues should thus be rethought. One does not require traditional anchors if the rest of the peptide is sufficiently optimized, either artificially, as in this case, or by chance in naturally occurring epitopes (33, 34, 35). Instead, one should seek more sophisticated and comprehensive models of binding better able to account for all possibilities. This helps to explain why many epitopes are missed when using only anchor motif-based epitope prediction programs. Flexibility as to which amino acids can be tolerated at the anchor positions increases the effective number of peptides that can be presented by a given HLA allele. This augments the chance that a T cell response can be mounted by every individual to each Ag or pathogen. It also has other implications, e.g., if multiple amino acids in an epitope can influence the peptide-HLA interaction, this may increase opportunities for pathogen escape from CD8 responses via alteration of peptide binding to MHC (36).
However, we must strike a minor note of caution in this study. Although we can now undertake the rational manipulation of peptide-MHC affinity, such binding events are, in themselves, only part of the overall process of Ag presentation, albeit ones of paramount importance. A complex pathway is involved (37). Put at its simplest, proteins are synthesized, cleaved by proteolysis within the proteasome, and transported via TAP to the endoplasmic reticulum, before being exported to the cell surface bound to MHCs. However, the process is complicated by the involvement of other proteases, such as tripeptidyl peptidase II (38) in the cytoplasm and ERAAP in the endoplasmic reticulum (39). To properly predict T cell epitopes we will require not only an understanding of binding, but also a complete dynamic model of the cell biology underlying the Ag presentation pathway.
In conclusion, we have shown that our additive method is of utility for peptide-MHC binding affinity prediction and can be used successfully for the design of novel high binding peptides. Indeed, QSAR is a technique able to optimize molecular structure to deliver enhanced, reduced, or otherwise modulated biological properties of any measurable kind. We could, for example, use it to optimize the MHC binding affinity of weak affinity peptides, such as putative cancer vaccines. Further, it is equally appropriate for the analysis and manipulation of peptide-MHC complex interaction with T cell receptors as for determining peptide affinity for MHC. It is thus a tool of general utility to the immunologist, whether they are looking to design or enhance epitopes, nonimmunogenic competitor peptides, or T cell antagonists. These are themes we will explore in later work.
| Footnotes |
|---|
2 Current address: Oxxon Pharmaccines, Oxford, U.K.
3 Current address: Austin Research Institute, A&RMC, Victoria, Australia.
4 Address correspondence and reprint requests to Dr. Darren R. Flower, Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire RG20 7NN, U.K.
5 Abbreviations used in this paper: QSAR, quantitative structure-affinity relationship; BL50, half-maximal binding level; FI, fluorescence index; MFI, mean fluorescence intensity; PLS, partial least squares.
Received for publication December 22, 2003. Accepted for publication March 12, 2004.