
* Division of Translational Immunology and Biodefense, La Jolla Institute for Allergy and Immunology, San Diego, CA 92121; and
Epimmune Inc., San Diego, CA 92121
| Abstract |
|---|
| Introduction |
|---|
10 different major groups, or supertypes (2), characterized by overlapping peptide-binding motifs and repertoires. For example, previous studies have demonstrated that a number of HLA molecules, collectively designated as the A2 supertype, have similar peptide-binding motifs. The most common allelic molecules of the A2 supertype are A*0201, A*0202, A*0203, A*0206, and A*6802 when Caucasian, Black, and Oriental ethnicities are considered (2, 3, 4, 5, 6). Peptide epitopes (7, 8, 9, 10, 11, 12) derived from important pathogens, such as HIV, hepatitis B virus (HBV), hepatitis C virus (HCV), and Plasmodium falciparum, or from cancer Ags, have been identified that bind most or all of these allelic variants (degenerate binders). These supertopes are frequently recognized by exposed, infected, or immune individuals, demonstrating that they are indeed generated during the course of natural infection or cancer and that a repertoire of T cells capable of recognizing the corresponding epitope/HLA complex exists in humans (7, 8, 9, 10, 11, 12, 13, 14). The data obtained in the case of the A2 supertype have been extended to demonstrate that relevant supertopes can also be identified for the A3 and B7 supertypes (2, 4, 5, 7, 8, 15, 16, 17, 18, 19, 20). Finally, recent data suggest that HLA supertypic specificities extend to other primate species, such as chimpanzees, gorillas, and macaques (21, 22, 23, 24).
Previously, we predicted the existence of an HLA B44 supertype that encompasses molecules corresponding to the HLA B18, B40 (B60 and B61), B44, and B45 serological specificities (2, 4, 5). This prediction was based on reported peptide-binding motifs derived from sequencing pooled natural self ligands and sequences of individual self ligands and T cell epitopes. The structure of the B and F pockets of various MHC alleles was also considered in formulating this prediction. The predicted specificity of B44 supertype molecules is for acidic residues (glutamic acid and aspartic acid) in position 2, and hydrophobic or aromatic residues (leucine, isoleucine, valine, methionine, phenylalanine, tryptophan, tyrosine, and alanine) at the C terminus. More recently, similar motif specificities have been reported for the chimpanzee class I allele Patr A*16 (25) and the macaque class I molecule Mamu A*11 (21).
Until now, however, direct demonstration has been lacking that the apparent overlap in peptide-binding motifs of the B44-supertype molecules directly translates into overlapping peptide binding repertoires. This has been due in large part to the lack of quantitative binding assays to measure the binding capacity of potential peptide ligands to the various B44-supertype molecules. Furthermore, despite the availability of molecular binding assays for A2, A3, and B7 supertypes, bioinformatic methods allowing accurate predictions of peptides with supertype degenerate binding capacity have not been available.
In this study, we set out to establish quantitative assays for the most common molecules of the B44 supertype and to examine their associated peptide-binding motifs. We also wanted to determine whether it would be possible to identify peptide ligands capable of broad cross-reactivity with several B44-supertype molecules (supertopes), and if so, we were interested in defining the molecular basis for broad cross-reactivity.
| Materials and Methods |
|---|
Peptides used were synthesized at Epimmune (San Diego, CA) as described elsewhere (26) or purchased as crude material from Mimotopes (Minneapolis, MN and Clayton, Victoria, Australia) or Pepscan (Lelystad, Netherlands). Peptides synthesized at Epimmune were typically purified to >95% homogeneity by reverse phase HPLC (26). Lyophilized peptides were resuspended at 4-20 mg/ml in 100% DMSO and then diluted to the required concentrations in PBS containing 0.05% (v/v) Nonidet P-40 (Fluka Biochemika, Buchs, Switzerland).
MHC purification
The EBV-transformed cell lines DUCAF (A*3002, B*1801), 2F7 (A*6801, B*4001), SWEIG (A*2902, B*4002), WT47 (A*3201, B*4402), PITOUT (A*2902, B*4403), and OMW (A*0201, B*4501) were used as the primary sources of MHC molecules. In some cases, to demonstrate assay specificity, MHC molecules were purified from 721.221 lines transfected with B*1801, B*4001, B*4002, A*0201, A*2902, A*3002, or A*3201 (Pure Protein LLD, Oklahoma City, OK) or an A*6801-transfected C1R line. Cells were maintained in vitro, and HLA molecules were purified from cell lysates as described elsewhere (27). Briefly, class I molecules were captured by repeated passage over protein A-Sepharose beads conjugated with the anti-HLA (A, B, C) Ab W6/32 (27). In some cases, HLA-A molecules were further purified from HLA-B and -C molecules by passage over a B1.23.2 (anti-HLA-B, -C, and some -A) (28, 29, 30, 31) column.
MHC-peptide-binding assays
Quantitative assays to measure the binding of peptides to soluble class I molecules are based on the inhibition of binding of a radiolabeled standard peptide, and were performed as previously described (27). Briefly, 1-10 nM radiolabeled peptide was coincubated at room temperature with 1 nM to 1 µM purified MHC in the presence of 1-3 µM human β2-microglobulin (Scripps Laboratories, San Diego, CA) and a mixture of protease inhibitors. After a 2-day incubation, the percent of MHC-bound radioactivity was determined by size exclusion gel filtration chromatography using a TSK 2000 column. Alternatively, the percent of MHC-bound radioactivity was determined by capturing MHC/peptide complexes on W6/32 and/or B1.23.2 Ab-coated Optiplates (Packard Instrument, Meriden, CT), and measuring bound cpm using the TopCount (Packard Instrument) microscintillation counter.
Peptide 1420.12 (sequence SEIDLILGY), a sequence reported as a self peptide ligand of B44 molecules (32), was used as the radiolabeled probe and standard control for inhibition for the B*1801, B*4402, and B*4403 assays. The average IC50 values of this peptide in the B*1801, B*4402, and B*4403 assays were 3.1, 9.2, and 6.8 nM, respectively. For the B*4001 and B*4002 assays, peptide 1461.04 (sequence YEFLQPILL), a W1>Y analog of a peptide of unknown origin previously reported as a self peptide ligand of B40 (33), was used as the radiolabeled ligand and control for inhibition. The IC50 values of 1461.04 in the respective assays were 1.6 and 1.7 nM. The B*4501 radiolabeled ligand and control for inhibition was 1420.36, an artificial sequence (AEFKYIAAV) representing a consensus from analysis of pools of peptides eluted from B*4006 molecules (34); 1420.36 had an average IC50 of 4.9 nM in B*4501 assays. In the case of competitive assays, the concentration of peptide yielding 50% inhibition of the binding of the radiolabeled peptide was calculated. Peptides were tested at six different concentrations over a 100,000-fold range in three to five independent experiments.
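The text does not state how the 50% inhibition point was derived from the six tested concentrations. As a minimal sketch only, assuming log-linear interpolation between the two concentrations that bracket 50% inhibition, the calculation could look like the following; the function name `estimate_ic50` and the example inhibition values are hypothetical and are not taken from the study.

```python
import math

def estimate_ic50(concs_nM, pct_inhibition):
    """Estimate the concentration (nM) giving 50% inhibition.

    Assumption: log-linear interpolation between the two tested
    concentrations that bracket 50% inhibition. Both sequences must be
    parallel and sorted by increasing concentration. Returns None when
    50% inhibition is never bracketed within the tested range.
    """
    points = list(zip(concs_nM, pct_inhibition))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Hypothetical titration: six concentrations spanning a 100,000-fold range.
concentrations = [1, 10, 100, 1000, 10000, 100000]
inhibition = [5, 20, 40, 60, 85, 95]
ic50 = estimate_ic50(concentrations, inhibition)
```

A peptide that never reaches 50% inhibition returns `None`, which corresponds to reporting its IC50 as greater than the highest concentration tested.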
Biochemical validation of assay specificity
The EBV-transformed cell lines used as the primary sources of purified MHC molecules express comparable amounts of A and B class I molecules. To validate the allelic specificity of the peptide-binding assays developed and to verify that the signals obtained were not due to the presence of contaminating amounts of HLA A molecules, MHC molecules purified from HLA A and/or B single allele-transfected cell lines were used in assay development studies. In these studies, the signal obtained in direct binding assays with a putative B locus ligand was compared with the binding obtained with ligands for the coexpressed A molecule, using MHC purified from the A or B single-allele transfectant, or both when available (data not shown). Cross-inhibition studies were also performed, as well as inhibition of binding analyses using panels of peptides of known HLA-A or -B specificity.
More specifically, B*4001 molecules purified from the EBV-transformed homozygous clone 2F7 (A*6801, B*4001) bound peptide 1461.04 (sequence YEFLQPILL), a W1>Y analog of a peptide of unknown origin previously reported as an endogenously bound B40 ligand (33). The signal obtained could be readily inhibited by an excess amount of an unlabeled peptide. This binding was B*4001 specific because peptide 1461.04 did not bind MHC molecules purified from an A*6801 C1R line. Furthermore, the previously reported A*6801 high affinity binder 941.12 (17) was a poor inhibitor (IC50 > 10,000 nM) of 1461.04 binding (Table I). In a subsequent assay, two peptides reported as naturally processed ligands of B*40 molecules (33) were tested for their capacity to inhibit 1461.04 binding. Also tested were known ligands for other HLA-A and -B specificities (6, 17, 18, 32, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44). As shown in Table I, both peptides described as B*40 ligands inhibited 1461.04 binding with an IC50 of <25 nM. Conversely, none of the nine peptides with non-B44-supertype motifs inhibited 1461.04 binding with an IC50 of <10,000 nM.
HLA B*4403 molecules purified from the EBV-transformed homozygous cell line PITOUT (A*2902, B*4403) bound peptide 1420.12 (sequence SEIDLILGY), an endogenous ligand of B44 molecules (32), and the HLA-B44-restricted epitope HIV gp12019 (sequence AENLWVTVY) (38). These peptides could readily inhibit the binding of each other but only poorly, if at all, the binding of a control A*2902 binding peptide (human J chain 101-109). Conversely, the human J chain 101-109 peptide did not inhibit the binding of either radiolabeled 1420.12 or gp12019. Specificity of the binding of human J chain 101-109 to A*2902 was further verified using A*2902 molecules purified from single MHC allele-transfected cells. As shown in Table I, six of eight peptides previously reported as endogenously bound to or restricted by HLA B*44 molecules bound with IC50 < 50 nM. Conversely, none of the nine ligands of other class I molecules bound with IC50 better than 500 nM.
HLA B*4402 molecules were purified from the EBV-transformed homozygous cell line WT47 (A*3201, B*4402) and, as in the case of B*4403, good binding was obtained using either the HLA-B44 epitope HIV gp12019 or peptide 1420.12, and the signal obtained could be inhibited by excess amounts of the unlabeled version of either peptide. Conversely, peptides previously identified as A*3201 ligands or epitopes could not inhibit binding (Table I) of either HIV gp12019 or peptide 1420.12, and neither HIV gp12019 nor peptide 1420.12 could bind A*3201 molecules purified from single MHC allele-transfected cells.
A B*1801-specific binding assay was developed using class I molecules purified from the EBV-transformed homozygous cell line DUCAF (A*3002, B*1801) and peptide 1420.12. Specificity for B*1801 was confirmed using MHC purified from B*1801 and A*3002 single transfectants. Of 11 peptides reported as ligands for B*40 or B*44 molecules, 6 inhibited 1420.12 binding with an IC50 of 10 nM or less (Table I). Conversely, of the nine peptides known to bind other HLA-A or -B molecules, only the B7-supertype binder B35CON2 (16, 18) also bound B*1801. Additionally, five peptides previously reported as A*3002 ligands or epitopes were tested for their B*1801-binding capacity, and none of them bound with an IC50 of <30,000 nM (data not shown).
Finally, an HLA B*4501-specific binding assay was developed using MHC molecules purified from the EBV-transformed homozygous cell line OMW (A*0201, B*4501). Optimal binding to MHC purified with the B1.23.2 Ab was obtained using peptide 1420.36, an artificial sequence (AEFKYIAAV) representing a consensus from analysis of pools of peptides eluted from B*4006 molecules (34). No binding was detected in the same MHC preparations when HBV core 18-27 F6>Y, previously identified as a high affinity A*0201-binding peptide (3, 6, 26), was used as the radiolabeled ligand. Additional analyses, summarized in Table I, showed that 5 of 11 peptides previously identified as ligands or epitopes for B*40 or B*44 molecules could inhibit 1420.36 binding with IC50 < 500 nM. Conversely, none of the nine peptides with other motifs or known specificities bound B*4501 with an IC50 of <25,000 nM.
Bioinformatic analysis
In all assays, a relative binding value was calculated for each peptide by dividing the IC50 of the positive control for inhibition by the IC50 measured for each specific peptide tested. Standardized relative binding values also allow the calculation of a geometric mean, or average relative binding value (ARB), for all peptides with a particular characteristic (6, 17, 18, 26, 40, 45, 46, 47). Maps of secondary interactions influencing peptide binding to HLA B44-supertype molecules based on ARB were derived as previously described (6, 17, 18, 26, 40, 45, 46, 47). Essentially, all peptides of a given size (9 or 10 amino acids) and with at least 2 tolerated main anchor residues were selected for analysis. The binding capacity of peptides in each size group was analyzed by determining the ARB values for peptides that contain specific amino acid residues in specific positions. For determination of the specificity at main anchor positions, ARB values were standardized relative to the ARB of peptides carrying the residue associated with the best binding. For secondary anchor determinations, ARB values for peptides with a particular residue in a specific position were standardized relative to the ARB of the whole peptide set considered. Because of the rare occurrence of certain amino acids, for some analyses residues were grouped according to individual chemical similarities as previously described (6, 17, 18, 26, 40, 45, 46, 47).
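The relative binding and ARB calculations described above can be sketched as follows. This is an illustrative outline rather than the analysis code used in the study: the function names are hypothetical, and apart from the 3.1 nM control value for peptide 1420.12 in the B*1801 assay, the numbers are invented for the example.

```python
import math

def relative_binding(control_ic50_nM, peptide_ic50_nM):
    """Relative binding: IC50 of the positive control divided by the
    IC50 measured for the test peptide (higher means better binding)."""
    return control_ic50_nM / peptide_ic50_nM

def average_relative_binding(rb_values):
    """ARB: geometric mean of the relative binding values of all
    peptides sharing a given feature (e.g. a residue at a position)."""
    logs = [math.log(v) for v in rb_values]
    return math.exp(sum(logs) / len(logs))

def standardize_to_best(arb_by_residue):
    """Main anchor positions: express each ARB relative to the ARB of
    the residue associated with the best binding."""
    best = max(arb_by_residue.values())
    return {res: arb / best for res, arb in arb_by_residue.items()}

def standardize_to_set(arb_by_residue, arb_whole_set):
    """Secondary anchors: express each ARB relative to the ARB of the
    whole peptide set considered."""
    return {res: arb / arb_whole_set for res, arb in arb_by_residue.items()}

# Control IC50 of 3.1 nM (peptide 1420.12, B*1801 assay); the 31 nM
# test-peptide IC50 is a hypothetical value.
rb = relative_binding(3.1, 31.0)
# Hypothetical relative binding values for peptides sharing one residue.
arb = average_relative_binding([0.1, 1.0, 10.0])
```

The geometric mean is used because binding values span several orders of magnitude; averaging on a log scale keeps a single very strong binder from dominating the ARB.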
This report describes for the first time an algorithm to predict degenerate binders of the B44 supertype. To predict degenerate binders, individual peptide sequences are scored using a coefficient matrix (see Results). As with other polynomial methods, the score for each peptide is calculated as the product of the coefficients corresponding to each peptide residue. On the basis of the product of the corresponding coefficients, peptides may be segregated into 4 categories, designated as positive, intermediate, low, and negative. These categories correspond to scores ≥10, 1-10, 0.1-1, and <0.1, respectively.
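A product-of-coefficients score of this kind can be sketched as below. The coefficient values are invented placeholders (the actual matrix is given in Results and is not reproduced here), the function names are hypothetical, and the handling of scores that fall exactly on a category boundary is an assumption.

```python
from math import prod  # available in Python 3.8+

def matrix_score(peptide, coefficients, default=1.0):
    """Product of the coefficients for each (position, residue) pair.

    `coefficients` maps (1-based position, one-letter residue) to a
    coefficient. Pairs absent from the matrix are treated as neutral
    (1.0), which is an assumption of this sketch.
    """
    return prod(coefficients.get((pos, res), default)
                for pos, res in enumerate(peptide, start=1))

def score_category(score):
    """Assign one of the four score categories.

    Assumption: each category includes its lower boundary.
    """
    if score >= 10:
        return "positive"
    if score >= 1:
        return "intermediate"
    if score >= 0.1:
        return "low"
    return "negative"

# Placeholder coefficients for a 9-mer matrix; NOT the published values.
toy_matrix = {(1, "A"): 2.0, (3, "N"): 5.0, (9, "Y"): 0.5}
score = matrix_score("AENLWVTVY", toy_matrix)
label = score_category(score)
```

In practice a separate matrix would be needed for each peptide length, since position-specific coefficients for 9-mers do not map directly onto 10-mers.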
The performance of predictive algorithms is compared using measures of sensitivity (SENS) and positive predictive value (PPV). SENS is the fraction of total possible positives that are identified using the specific algorithm or algorithm cut-off score, and PPV measures the fraction of peptides that are predicted to be positive that are actually positive. Accuracy, which is the fraction of correct predictions (true positives and true negatives), is also calculated, although this measure can be somewhat misleading if the data set used is skewed, or if the methodologies being compared have not been tested on an identical set of peptides. A further measure that is used for comparison purposes is EFF75, which is the efficiency of prediction (PPV) associated with selection of 75% of the total number of positive peptides.
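The four performance measures can be computed as in the sketch below. The scores and binder labels are invented for illustration, the function names are hypothetical, and the treatment of EFF75 (ranking peptides by score and stopping as soon as 75% of the true positives, rounded up, have been selected) is one reasonable reading of the definition rather than a confirmed implementation.

```python
import math

def confusion_counts(scores, is_binder, cutoff):
    """Count true/false positives and negatives at a score cutoff."""
    tp = fp = tn = fn = 0
    for score, binder in zip(scores, is_binder):
        predicted = score >= cutoff
        if predicted and binder:
            tp += 1
        elif predicted:
            fp += 1
        elif binder:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def sensitivity(tp, fn):
    """SENS: fraction of all true positives that are identified."""
    return tp / (tp + fn)

def positive_predictive_value(tp, fp):
    """PPV: fraction of predicted positives that are true positives."""
    return tp / (tp + fp)

def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)

def eff75(scores, is_binder):
    """PPV at the score rank where 75% of true positives are captured."""
    total_pos = sum(1 for b in is_binder if b)
    if total_pos == 0:
        return None
    needed = math.ceil(0.75 * total_pos)
    ranked = sorted(zip(scores, is_binder), key=lambda p: p[0], reverse=True)
    found = 0
    for n_selected, (_, binder) in enumerate(ranked, start=1):
        if binder:
            found += 1
        if found >= needed:
            return found / n_selected
    return None

# Hypothetical data set of eight scored peptides.
scores = [20, 12, 5, 3, 0.8, 0.5, 0.2, 0.05]
binders = [True, True, True, False, True, False, False, False]
tp, fp, tn, fn = confusion_counts(scores, binders, cutoff=1)
```

Because accuracy counts true negatives, it rises on data sets dominated by nonbinders even when few binders are found, which is why it is reported alongside SENS and PPV rather than on its own.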
| Results |
|---|
Previous studies (2, 4, 5) defined a putative B44 supertype encompassing various allelic variants of the B18, B37, B40, B41, B44, B45, B47, B49, and B50 Ags. The B44 supertype is defined by a shared peptide-binding specificity for peptides with the acidic residue glutamic acid in position 2 and a hydrophobic residue at the C terminus. However, whether these motif similarities correspond to significant overlaps in peptide binding repertoire was not addressed experimentally.
First, we selected a set of B44-supertype molecules representative of the alleles most frequent in major ethnic groups. Population-typing data collated from published sources (48) are summarized in Table II. On the basis of these data, the most frequent allele overall is B*4001. B*4001 is present in 24.6% of the Asian ethnicity and in >10%, on average, across the four major ethnicities studied. Two other alleles, B*4402 and B*4403, also have average frequencies of 10%. B*4402 is notably frequent in the Black population (14.8%), whereas B*4403 is frequent in Caucasians (>16%). Three other alleles, B*1801, B*4501, and B*4002, representative of the B18, B45, and B61 Ags, respectively, are less frequent but still present in at least 7% of one or more major ethnic groups and >3% of the general population. All other alleles are found in <5% of all populations reported and <3% on average. Based on this analysis, a panel of six molecules (B*1801, B*4001, B*4002, B*4402, B*4403, and B*4501) predicted to be members of the HLA B44 supertype was selected for further studies defining their peptide-binding specificity and cross-reactivity.
Initial binding assays (Table I) identified the HIV gp12019 peptide (sequence AENLWVTVY) as having the capacity to bind with high or intermediate affinity all of the molecules selected for study (B*4001, B*4002, B*4402, B*4403, B*1801, and B*4501). To define the primary motif associated with each molecule, a panel of analog peptides representing single amino acid substitutions of HIV gp12019 was synthesized and tested for binding capacity relative to that of the unsubstituted epitope (Fig. 1). As in previous studies (6, 18), in each position preferred residues were defined as those with relative binding (RB) of 0.1 or better. Residues with RB between 0.01 and 0.1 were defined as tolerated, and those with RB <0.01 were considered as nontolerated.
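The classification of single-substitution analogs can be expressed compactly. The sketch below assumes that RB is the IC50 of the unsubstituted epitope divided by the IC50 of the analog; the function names and the IC50 values are hypothetical.

```python
PARENT = "AENLWVTVY"  # unsubstituted epitope sequence from the text

def substitute(parent, position, residue):
    """Return the analog carrying `residue` at the 1-based `position`."""
    index = position - 1
    return parent[:index] + residue + parent[index + 1:]

def classify_rb(rb):
    """Apply the RB thresholds: >=0.1 preferred, 0.01 to <0.1 tolerated,
    <0.01 nontolerated."""
    if rb >= 0.1:
        return "preferred"
    if rb >= 0.01:
        return "tolerated"
    return "nontolerated"

def classify_analog(parent_ic50_nM, analog_ic50_nM):
    """RB of an analog relative to the parent peptide, then its class."""
    return classify_rb(parent_ic50_nM / analog_ic50_nM)

# Hypothetical measurements: parent at 10 nM, an E2>D analog at 5000 nM.
analog = substitute(PARENT, 2, "D")
status = classify_analog(10.0, 5000.0)
```

With these thresholds, a substitution causing a more than 100-fold loss of binding relative to the parent peptide is scored as nontolerated.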
B*4002 Again, at position 2 only glutamic acid was preferred or tolerated. All other substitutions examined were associated with a >100-fold reduction in binding capacity. At the C terminus, a preference for small (alanine and threonine) and aliphatic (leucine, isoleucine, valine, methionine) residues was noted. The aromatic residue phenylalanine, as well as the small residue glycine, was also preferred. Other noncharged substitutions (tyrosine, histidine, glutamine, tryptophan, serine, asparagine, and proline) were tolerated. At position 1, >100-fold reductions in binding capacity were noted in the context of the asparagine, aspartic acid, and proline substitutions. This pattern is similar to that observed in position 1 for other HLA molecules, including a majority of those in the B7 supertype (18), and is consistent with the designation of position 1 as an important B*4002 secondary anchor.
B*4403 For B*4403 it was also found that at position 2 only glutamic acid was preferred, and that all other substitutions were associated with a greater than 100-fold reduction in binding capacity. The C terminus was shown to have a preference for the large aliphatic residues methionine and isoleucine, and the aromatic residues tyrosine and phenylalanine. Other large residues (tryptophan, histidine, glutamine, and leucine), and the small residues alanine, glycine, and threonine were also preferred, but with RB in the 0.1-0.2 range. These data suggest that B*4403 is associated with a previously unappreciated broad specificity at the C terminus. Additional significant influences on binding capacity were only noted in 2 instances at position 1.
B*4402 At position 2, only glutamic acid was preferred or tolerated. At the C terminus a preference for the large aliphatic hydrophobic residues methionine, isoleucine, and leucine and the aromatic residues tyrosine, phenylalanine, and tryptophan was noted. Small (alanine, glycine, serine, threonine, cysteine, and valine), polar (glutamine, asparagine) and basic (histidine, arginine, and lysine) residues were tolerated, with RB between 0.01 and 0.07. Additional effects could be noted at positions 1, 3, and 4, where about one-half of the substitutions were associated with reductions in binding capacity in the 10- to 50-fold range, indicating their potential role as secondary anchors. Overall, the B*4402 motif appears distinct but remarkably similar to the peptide-binding motif defined above for B*4403 molecules.
B*1801 At position 2, only glutamic acid was preferred, but the polar residues serine, glutamine, and tyrosine, the acidic residue aspartic acid, and the aliphatic residue methionine were tolerated. All other substitutions tested were associated with >100-fold reduction in binding capacity. At the C terminus, only analogs carrying the aromatic residues tyrosine and phenylalanine (but not tryptophan) or methionine were preferred. All other substitutions were associated with a 100-fold or greater decrease in binding capacity.
B*4501 As with the other molecules examined, only glutamic acid was preferred at position 2. The aliphatic hydrophobic residue methionine was tolerated, but all other substitutions tested at position 2 were associated with >100-fold reduction in binding capacity. Similar to the case with B*4403, influences associated with >100-fold reductions in peptide-binding capacity could not be noted at the C terminus. However, the small residues alanine and threonine were most preferred, and the next most preferred residues (glycine, serine, and valine) were also small. Interestingly, B*4501 binding capacity was found to also be dependent on the residue in position 1. Of nine substitutions tested, only the alanine residue was preferred.
Taken together, these data demonstrate that B*4001, B*4002, B*4402, B*4403, B*1801 and B*4501 all share a very stringent requirement for glutamic acid in position 2 of their peptide ligands. With the exception of B*1801, other residues are only rarely tolerated, and even then with relatively poor affinity. The presence of an additional primary anchor position was more difficult to identify, but the C terminus in each case appears to play an important role in determining peptide-binding capacity. Interestingly, and in contrast to position 2, in each case the specificity at the C terminus appears to be quite broad, with many residues being either tolerated or preferred. With the exception of position 1, only rarely were significant influences noted at other positions.
Library analysis of B44-supertype binding capacity
Next, a library of 549 peptides derived from various infectious disease and tumor Ags was assembled and tested in parallel for binding to B*1801, B*4001, B*4002, B*4402, B*4403, and B*4501. All of the peptides were between 8 and 13 residues long and had glutamic acid in position 2. At the C terminus, all peptides had phenylalanine, leucine, isoleucine, methionine, valine, alanine, tryptophan, tyrosine, glycine, or threonine, which are the residues noted in the single substitution analysis to be preferred in the context of two or more molecules.
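The selection criteria for the library (length of 8 to 13 residues, glutamic acid in position 2, and one of the ten listed residues at the C terminus) can be written as a simple filter. This is a sketch for illustration; the function names are hypothetical, and the sliding-window scan is an assumed way of applying the filter to a protein sequence, not a description of how the library was actually assembled.

```python
# One-letter codes for F, L, I, M, V, A, W, Y, G, T (the C-terminal
# residues listed in the text).
LIBRARY_C_TERMINI = frozenset("FLIMVAWYGT")

def matches_library_motif(peptide):
    """True if the peptide is 8-13 residues long, has glutamic acid (E)
    in position 2, and ends in one of the allowed C-terminal residues."""
    return (8 <= len(peptide) <= 13
            and peptide[1] == "E"
            and peptide[-1] in LIBRARY_C_TERMINI)

def candidate_peptides(protein, lengths=range(8, 14)):
    """Slide windows of each allowed length along a protein sequence
    and keep those that match the motif."""
    found = []
    for n in lengths:
        for start in range(len(protein) - n + 1):
            window = protein[start:start + n]
            if matches_library_motif(window):
                found.append(window)
    return found

# Sequences quoted in the text all carry the motif.
examples = ["AENLWVTVY", "SEIDLILGY", "YEFLQPILL", "AEFKYIAAV"]
all_match = all(matches_library_motif(p) for p in examples)
```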
Overall, between 22.8 and 46.6%, with an average of 34.6%, of the peptides in the library bound any one molecule. Peptides between 8 and 13 residues long bound with similar frequencies, although those 9-13 residues long were preferred. Examination of cross-reactivity patterns revealed that 37.0% of the peptides were degenerate, having the capacity to bind three or more molecules with an affinity of 500 nM or better. By contrast, in an additional library of 177 peptides with aspartic acid in position 2 and the same range of residues at the C terminus only between 0.6 and 9.0%, with an average of 3.6%, of peptides bound any one molecule. Furthermore, only 1.1% of the D2 peptides were degenerate (data not shown).
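The degeneracy criterion (binding three or more molecules with an affinity of 500 nM or better) translates directly into code. In the sketch below the function names and all IC50 values are hypothetical; `None` stands for a peptide that was not tested or gave no measurable binding on a given molecule.

```python
def alleles_bound(ic50_by_allele, threshold_nM=500.0):
    """Return the alleles bound with an IC50 at or below the threshold."""
    return [allele for allele, ic50 in ic50_by_allele.items()
            if ic50 is not None and ic50 <= threshold_nM]

def is_degenerate(ic50_by_allele, threshold_nM=500.0, min_alleles=3):
    """True if the peptide binds at least `min_alleles` molecules."""
    return len(alleles_bound(ic50_by_allele, threshold_nM)) >= min_alleles

def fraction_degenerate(library):
    """Fraction of peptides in a {peptide: {allele: IC50}} library that
    meet the degeneracy criterion."""
    flags = [is_degenerate(ic50s) for ic50s in library.values()]
    return sum(flags) / len(flags)

# Hypothetical IC50 values (nM) for two peptides on the six molecules.
library = {
    "peptide_1": {"B*1801": 12.0, "B*4001": 450.0, "B*4002": 500.0,
                  "B*4402": 2300.0, "B*4403": None, "B*4501": 30000.0},
    "peptide_2": {"B*1801": 8000.0, "B*4001": 35.0, "B*4002": 120.0,
                  "B*4402": 900.0, "B*4403": 1500.0, "B*4501": None},
}
share = fraction_degenerate(library)
```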
Next, the binding capacity associated with each C-terminal anchor residue was examined. The ARBs associated with peptides that bear a specific residue at the C terminus were calculated (Table III). These data are largely in agreement with patterns ascertained in the single substitution analyses, and indicate that each molecule is associated with a unique pattern of preferences. For example, B*1801 prefers the aromatic residues phenylalanine and tyrosine, B*4001 had a preference for the aliphatic residue leucine, and B*4501 had a preference for the small residues alanine and threonine. When considering the geometric mean of ARB across all six molecules, the aromatic residue phenylalanine and the aliphatic hydrophobic residues isoleucine and leucine had the highest ARB, although no residue was associated with an ARB of <0.24.
≤0.33, or ≥3, are roughly equivalent to 1 SD from the mean. Thus, for the present analyses, these thresholds have been used to indicate significant (positive or deleterious) influences on peptide-binding capacity. Summary maps, showing residue/position pairs with significant influences on binding capacity were also derived (Fig. 2) for 9- and 10-mer ligands.
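Applying the two thresholds to a map of standardized ARB values can be sketched as follows. The function names and the ARB values are hypothetical, and treating values exactly at 0.33 or 3 as significant is an assumption of this sketch.

```python
def secondary_influence(standardized_arb, low=0.33, high=3.0):
    """Label a residue/position pair by its standardized ARB.

    Values at or above `high` are called positive, values at or below
    `low` deleterious, and anything in between is not significant.
    """
    if standardized_arb >= high:
        return "positive"
    if standardized_arb <= low:
        return "deleterious"
    return None

def significant_pairs(arb_map):
    """Keep only the (position, residue) pairs with a significant
    influence, mapped to their label."""
    labelled = {pair: secondary_influence(arb) for pair, arb in arb_map.items()}
    return {pair: label for pair, label in labelled.items() if label}

# Hypothetical standardized ARB values keyed by (position, residue).
arb_map = {(1, "D"): 0.12, (1, "A"): 3.4, (3, "F"): 1.1, (5, "P"): 0.33}
summary = significant_pairs(arb_map)
```

The two thresholds are close to reciprocal (1/3 is approximately 0.33), so they mark roughly a 3-fold increase or decrease in binding capacity.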
Identification of HLA-B44 supertype cross-reactive epitopes (supertopes) and derivation of a detailed B44 supermotif
In the process of developing the peptide library used for the present study, conserved regions of the HBV, HCV, HIV, and P. falciparum genomes were screened to identify peptides with the B44 supermotif. A total of 177 peptides derived from these targets were synthesized and tested for binding. As a result, a number of B44 supertopes derived from important infectious disease Ags, and potentially useful for vaccine development, have been identified (Table IV). The data presented above (and Tables S1-S6 and Fig. 2) demonstrated that there are common patterns among the B44-supertype molecules with respect to secondary influences on peptide-binding capacity. On the basis of these observations, we sought to determine whether these commonalities could be used to predict B44-supertype degeneracy.
≥10, 1-10, 0.1-1, and <0.1, respectively. As shown in Table VI, we found that these categories correlated with B44-supertype degeneracy. Specifically, 86.8% of the peptides with SARB ≥10 were degenerate, compared with 59.8% of the intermediate, 19.8% of the low, and 3.4% of the negative peptides.
A test of the SARB method with an independent (blind) set of peptides that had not been used in any of the library analyses was also performed (Table VI). The blind set of peptides was derived from various human tumor-associated Ags, including carcinoembryonic Ag, MAGE 2, MAGE 3, p53, and Her2/neu. As with the original analysis, it was found that peptides could be distributed along a SARB scoring gradient that was largely reflective of frequency of supertype degeneracy. Specifically, 92.9% of the positive peptides (SARB ≥10) and 82.1% of the intermediate peptides (SARB between 1 and 10) were degenerate, compared with only 27.3% of the negative peptides (SARB <0.1).
The performance of a predictive algorithm can be assessed with several measures. These include PPV, which is the fraction of peptides predicted to be positive that are actually positive, and SENS, which is the fraction of total positives identified using the selected cutoff score. Additional measures that can be used include accuracy, which is the fraction of correct predictions (a measure that can be severely inaccurate if the data set is skewed), and EFF75, which is the efficiency associated with selection of 75% of the total number of positive peptides. These performance measures, when using a SARB score of ≥1 as the cutoff score, for both the analysis and blind sets are summarized in Table VI. As shown, the B44-SARB method performs well on all measures. Overall 86.1% of the predictions for the analysis set and 68.8% for the blind set were accurate. In the analysis set, 72% of the peptides scoring positive were indeed positive (PPV), and 80.6% of all positive peptides were identified (SENS). In the blind set, 85.7% of the positive predictions were true positives (PPV), although the SENS measure was somewhat lower (60%). Finally, when the cutoff score for the analysis library was set to allow identification of 75% of all positives (EFF75), an efficiency of 74.6% was noted. These values are comparable with those observed for algorithms that we and others have developed for predicting single MHC allele-peptide binding (47, 49, 50, 51) and underscore the effectiveness of the SARB method.
| Discussion |
|---|
Comparison of the individual patterns of specificity revealed that for each molecule the dominant primary anchor was the residue in position 2 of peptide ligands, where invariably the only preferred residue was glutamic acid. The only other position at which a minority of the residues tested were preferred, indicative of a dominant influence on specificity, was the C terminus. However, in most cases, specificity at the C terminus was very broad, and only vague patterns of chemical similarity could be detected. In general, aliphatic hydrophobic (isoleucine, leucine, valine, and methionine), aromatic (phenylalanine, tryptophan, and tyrosine), and small (alanine, glycine, and threonine) residues were preferred or tolerated. The breadth of residues allowed at the C terminus suggests that it is the avoidance of certain residues, especially those with positive or negative charges, that confers binding capacity, rather than the dependence on the strength of a specific chemical interaction. This observation is supported by the remarkable frequency with which the presence of glycine at the C terminus was tolerated, or even preferred. In this context, it is reasonable to speculate that the narrow specificity of position 2 is balanced by the very broad specificity of the C terminus. This type of strategy might result in an optimal size of the repertoire of bound peptides and has also been noted for HLA B7-supertype molecules (16, 18), as well as Mamu A*01 (52, 53) and Mamu B*17 (54).
For positions 3 through C-1, most substitutions tested were preferred or tolerated, indicating that these positions in general have only relatively minor secondary influences on peptide-binding capacity. A common pattern of secondary influence was, however, suggested in position 1, where the acidic residue aspartic acid and the small residues valine and proline were not tolerated in most contexts. The negative influence of the presence of aspartic acid and proline on peptide-binding capacity in position 1 has been noted in the case of many other HLA-A and -B molecules (17, 18, 26, 45).
The peptide-binding motifs for B*4402 and B*4403 have been known since 1994 (37), and the motifs for B*4001 (B60) and B*4006 (B61) were reported in 1995 (34). Although noting the similarities in primary anchors, the authors of those studies did not hypothesize that these molecules could have peptide binding repertoires that would overlap with each other. Three other articles from the same period do note the fact that B37, B40 and B44 have similar primary anchor motifs, sharing a preference for peptides with negatively charged residues in position 2 (32, 55, 56). However, these studies did not report ligands that were bound by multiple alleles. More importantly, none of these authors speculated as to whether the peptide binding repertoires of these molecules overlap or whether they would comprise an HLA-supertype, as pointed out by Sidney, Sette and coworkers (2, 4, 5). This underlines the important distinction between demonstrating that molecules have similar primary anchor motifs and demonstrating that they have overlapping peptide-binding repertoires. For example, we have shown that the repertoires of HLA B7-supertype molecules apparently do not overlap with those of HLA C molecules sharing similar primary anchor motifs (16).
Our data fit well with previously described motifs obtained on the basis of elution and sequencing of natural ligands (32, 34, 37, 56). One notable exception is the B18 motif described by Bourgault-Villada et al. (57), which was based on the presence of acidic residues in position 3. We have found that B*1801 preferentially binds its ligands via the presence of glutamic acid in position 2. This motif is congruent with the sequence of a previously reported B18-restricted epitope (58).
This report is the first description of a B*4501 motif. Inclusion of B*4501 into the B44 supertype was originally suggested on the basis of the structure of the B and F pockets of the B*4501 molecule, hypothesized to engage the residue in position 2 and at the C terminus, respectively, of peptide ligands. B*4501 shares the narrow B44 supermotif specificity for glutamic acid in position 2 of its ligands. Also, like other B44-supertype molecules, B*4501 tolerates a range of hydrophobic, aromatic, and small residues at the C terminus, although a distinct preference for the small residues alanine and threonine was noted. B*4501 is prevalent in >10% of the North American Black population (Table II) and appears to be prevalent with similar frequency in several African Black populations (59) (D. Mann, unpublished observations). Thus, the development of a peptide-binding assay and the characterization of the specificity of this molecule are expected to be a valuable aid in the design and development of various vaccines targeting important diseases that are prevalent in sub-Saharan Africa, such as AIDS and malaria.
The detailed motifs we have derived provide coefficients for use in developing polynomial algorithms (8, 47) for predicting high affinity binders for 6 common B44-supertype molecules. Single allele algorithms often have some efficacy in also identifying peptides with supertype-binding capacity, reflecting the phenomenon that ligands with the highest binding affinity are often also the most degenerate binders (6, 7, 8, 15, 16, 17, 18). Until now, however, no algorithm has been described that specifically addresses prediction of degenerate binding. In fact, when development of specific algorithms to predict A2 supertype binders was attempted, it met with limited success (A. Sette and J. Sidney, unpublished observations). This may reflect the fact that in the past, in most cases, only peptides binding with high affinity to the most common "prototype" allele (e.g., A*0201 in the case of the A2 supertype) were tested for cross-reactivity on other supertype molecules. Ultimately, this testing strategy generates data sets that are biased in terms of structural features and might therefore be suboptimal for deriving supertype-specific algorithms. The recent availability of high throughput MHC peptide-binding assays has made it feasible in the present context to test all peptides for binding to all supertype alleles, rather than just selected subsets. The efficacy of the SARB method presented herein in the case of the B44 supertype suggests that the resulting availability of more balanced and unbiased data sets, which include both binders and nonbinders, is an important element for establishing algorithms predicting degenerate binding.
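To illustrate how position-specific coefficients can feed a polynomial algorithm, the following is a minimal sketch in Python, not the implementation used in this study. The allele names, the coefficient values, the penalty for unlisted residues, and the choice of a geometric mean to combine per-allele scores into a supertype-level score are all assumptions made for illustration; only the general idea of multiplying per-position coefficients is taken from the text.

```python
import math

# Hypothetical coefficient tables for 9-mer peptides, keyed by 0-based position.
# Index 1 is position 2; index 8 is the C terminus. Values are placeholders,
# not measured ARB data.
ARB = {
    "allele_1": {1: {"E": 1.0, "D": 0.1}, 8: {"L": 0.8, "F": 1.0, "A": 0.3}},
    "allele_2": {1: {"E": 1.0, "D": 0.05}, 8: {"L": 1.0, "F": 0.4, "A": 0.9}},
}


def polynomial_score(peptide, coeffs, unlisted=0.01):
    """Multiply the coefficients of the residues found at each scored position.

    Positions absent from `coeffs` are treated as neutral (factor 1.0).
    Residues absent from a scored position get the assumed penalty `unlisted`.
    """
    score = 1.0
    for pos, table in coeffs.items():
        score *= table.get(peptide[pos], unlisted)
    return score


def supertype_score(peptide, arb_by_allele):
    """Geometric mean of per-allele scores (an assumed way to pool alleles)."""
    scores = [polynomial_score(peptide, c) for c in arb_by_allele.values()]
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


if __name__ == "__main__":
    # Rank candidate peptides from most to least likely degenerate binder.
    candidates = ["AEAAAAAAL", "ADAAAAAAL", "AKAAAAAAL"]
    for pep in sorted(candidates, key=lambda p: supertype_score(p, ARB), reverse=True):
        print(pep, round(supertype_score(pep, ARB), 4))
```

Pooling with a geometric mean penalizes peptides that score poorly on any single allele, which matches the goal of favoring degenerate binders; an arithmetic mean would be a more permissive alternative.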
In addition to allowing accurate epitope prediction (17, 18, 26, 40, 45, 47, 60, 61), detailed peptide-binding motifs defining both primary and secondary anchor positions allow for the rational design of optimized ligands. For example, suboptimal main anchor residues may be replaced with more preferred residues, generating epitope analogues with increased binding affinity (18, 62, 63) and immunogenicity (62, 63, 64, 65, 66, 67). Evaluation of the efficacy of the detailed B44 supermotif to aid in the design of analogues with increased degeneracy will be the subject of future experiments, although the effectiveness of supermotifs in the design of degenerate binders has been demonstrated in the case of, for example, the B7 supermotif (18).
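The anchor-replacement strategy can be sketched as follows. This is an illustrative Python example under stated assumptions, not a validated design tool: the function names are hypothetical, the tolerated residue sets follow the predicted B44 supermotif (acidic residues in position 2; hydrophobic or aromatic residues at the C terminus), and the default C-terminal substitutions are arbitrary placeholders, since the preferred C-terminal residue differs between alleles.

```python
# Predicted B44 supermotif anchors, as stated in the Introduction.
P2_TOLERATED = set("ED")
P2_PREFERRED = "E"
C_TERM_TOLERATED = set("LIVMFWYA")


def matches_supermotif(peptide):
    """True if position 2 and the C terminus carry supermotif residues."""
    return (
        len(peptide) >= 8
        and peptide[1] in P2_TOLERATED
        and peptide[-1] in C_TERM_TOLERATED
    )


def anchor_analogues(peptide, c_term_choices=("L", "F")):
    """Enumerate analogues differing from `peptide` only at the main anchors.

    Position 2 is either kept or replaced with glutamic acid; the C terminus
    is either kept or replaced with one of `c_term_choices` (hypothetical
    defaults). The unmodified peptide is excluded from the result.
    """
    p2_options = {peptide[1], P2_PREFERRED}
    c_options = {peptide[-1], *c_term_choices}
    analogues = set()
    for p2 in p2_options:
        for c_term in c_options:
            candidate = peptide[0] + p2 + peptide[2:-1] + c_term
            if candidate != peptide:
                analogues.add(candidate)
    return sorted(analogues)


if __name__ == "__main__":
    wild_type = "ADAAAAAAK"  # made-up sequence for illustration only
    for analogue in anchor_analogues(wild_type, c_term_choices=("L",)):
        print(analogue, matches_supermotif(analogue))
```

Any analogue generated this way would still need to be tested experimentally for binding and for recognition by T cells specific for the wild-type epitope.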
In conclusion, the data described herein formally demonstrate that B44-supertype molecules recognize similar features at primary and secondary anchor positions of their peptide ligands and also share largely overlapping peptide-binding repertoires. These results have significant implications for the design of epitope-based vaccine constructs and for monitoring immune responses. Indeed, the immunological relevance of supertype cross-reactivity at the peptide-binding level has been demonstrated in a number of studies in both infectious disease (7, 8, 9, 12, 13, 15, 19, 68) and cancer (10, 11, 14, 20, 69, 70, 71) settings.
| Acknowledgments |
|---|
| Footnotes |
|---|
2 Address correspondence and reprint requests to Dr. Alessandro Sette, Division of Translational Immunology and Biodefense, La Jolla Institute for Allergy and Immunology, 10355 Science Center Drive, San Diego, CA 92121. E-mail address: alex{at}liai.org
3 Abbreviations used in this paper: HBV, hepatitis B virus; HCV, hepatitis C virus; RB, relative binding; ARB, average relative binding; SARB, supertype average relative binding; PPV, positive predictive value; SENS, sensitivity; EFF75, efficiency at 75% sensitivity.
4 The on-line version of this article contains supplemental material.
Received for publication June 23, 2003. Accepted for publication September 17, 2003.
| References |
|---|