* Department of Microbiology and Immunology, University of Oklahoma Health Sciences Center, Oklahoma City, OK 73104; and
Pure Protein LLC, Oklahoma City, OK 73104
| Abstract |
|---|
| Introduction |
|---|
Understanding the breadth of the proteome sampled by class I HLA is vital to a number of immunologic fields. For instance, the development of vaccines that target CTL to infected or neoplastic cells requires the identification of Ags or epitopes that distinguish the unhealthy cell. Are viral Ags that localize to the nucleus less likely to be presented to CTL than a viral protein that resides preferentially in the Golgi apparatus? Will an up-regulated tumor Ag on chromosome 20 be better presented than a putative tumor Ag on chromosome 7? Understanding the range of peptide presentation is also pertinent to clinical transplantation: can we expect minor histocompatibility Ags to be derived from allogeneic polymorphisms throughout the human genome, or are person-to-person differences in select proteins more subject to presentation? Finally, just as studies of nucleotide and amino acid sequences have increased our knowledge of evolution, understanding the breadth of the proteome sampled will further expose the evolutionary pressures that have shaped the immune function of MHC class I molecules.
MHC class I biologists have theorized that no self protein precursors are excluded from the class I peptide-presentation pathway, although this hypothesis has been pieced together from independent peptide-characterization experiments for multiple class I alleles (9). A more complete understanding of the boundaries of peptide sampling has been delayed by the relatively laborious techniques associated with the sequencing of a large number of peptides from any single class I molecule (for examples, see Refs. 10, 11, 12, 13). For instance, the most well-characterized class I molecule, HLA-A*0201, has 110 documented endogenous ligands, most of which were identified in different laboratories using different techniques (14, 15). It cannot be ruled out that the small numbers of peptide-source proteins (i.e., the proteins from which the peptides are derived) identified in each study are a result of methodological differences (cell line, lysis method, affinity purification Ab, peptide-separation method, and ligand-sequencing method). Thus, no individual study has systematically analyzed a large number of endogenously loaded class I peptide ligands to ascertain the actual extent of peptide sampling of the human proteome.
To explore the repertoire of self proteins presented as peptides by MHC class I molecules, we sequenced >200 peptides from HLA-B*1801 (the largest number of peptides reported from a single experiment). A total of 10 mg of the B*1801 protein was isolated from the class I-deficient 721.221 B cell line that has been extensively used for class I protein expression. Peptides were eluted from B*1801, individual peptide ligands were selected for sequencing by mass spectrometry (MS), and proteomic techniques were applied for data analysis. In contrast to the close sequence relatedness of the bound peptides, the source proteins for these B*1801-bound peptides were found in almost every compartment of the cell and represented the spectrum of biological and cellular functions. Statistical analysis of peptide-source proteins in the context of the unbiased human proteome revealed a preference for RNA- and nucleic acid-binding proteins, ribosomal constituents, and cellular chaperones. These data provide the first direct evidence that a single class I molecule can access proteins from a large portion of, if not the entire, human proteome and present them as a sequence-related set of self peptides.
| Materials and Methods |
|---|
Soluble HLA-B*1801 (sHLA-B*1801; see footnote 3) was produced by transfection of the class I-negative EBV-transformed B-lymphoblastoid cell line 721.221 (16) with a PCR-truncated cDNA cloned into the vector pcDNA3.1(-) (Invitrogen, Carlsbad, CA), as previously described (17).
Class I production and purification
sHLA-B*1801-producing transfectants were cultured in a CP-2500 Cell Pharm (Biovest International, Minneapolis, MN), and the sHLA-containing supernatant was collected. Upon completion of a bioreactor run, sHLA complexes were affinity purified from the collected harvests using the Ab W6/32 (18) coupled to a Sepharose 4B matrix (Amersham, Piscataway, NJ). Harvests were applied to the column using a peristaltic pump system (Amersham) at a flow rate of 5 ml/min at 4°C. After the column was extensively washed with PBS, bound sHLA molecules were eluted with 0.1 M glycine (pH 11.0) and immediately neutralized by addition of 1 M Tris-HCl, pH 7.0. Purified molecules were buffer exchanged with PBS at pH 7.2 and concentrated using 10-kDa cutoff Macrosep centrifugal concentrators (Pall Filtron, Northborough, MA).
Intact B*1801 molecules were brought to a final concentration of 10% acetic acid and heated to boiling for 10 min. Peptides were then purified by passage through a 3-kDa Microcon Microconcentrator (Millipore, Bedford, MA) before loading onto a Jupiter Proteo 4-µm C18 reversed-phase HPLC (RP-HPLC) column (Phenomenex, Torrance, CA).
Mass-spectrometric peptide sequencing and analysis
Fractionated peptides were examined on a Q-Star Q-TOF mass spectrometer (PerSceptive Sciex, Foster City, CA) equipped with a NanoSpray nano-ESI source (Protana, Odense, Denmark). Individual peptides were selected at random for tandem MS (MSMS) fragmentation and sequence analysis; however, the most abundant peptides were generally fragmented first and produced the best sequence data. Selected ions were analyzed manually and by automated sequence assignment using the programs BioMultiview and MASCOT (19).
Proteomic data analysis
Peptide sequences were analyzed for their putative derivation through protein-protein and protein-translated database BLAST searches (20). Source proteins for the peptides were assigned the appropriate LocusLink ID number (21) before automated annotation using the program Database for Annotation, Visualization, and Integrated Discovery (DAVID) (22), resulting in standardized nomenclature for each of the peptide-source proteins. Cellular protein locations, molecular functions, and biological functions were assessed with the GOcharts function of DAVID, based upon the Gene Ontology Annotation (GOA) protein hierarchical classifications (23). Statistically over-represented peptide categories were determined using the EASE program (included in the DAVID software package) by comparing the representation of each category among the peptide-source proteins with its representation in the entire human proteome, using Fisher's exact test and the Bonferroni correction for multiple queries. Genomic locations of genes corresponding to peptide-source proteins were compiled using gene location, as determined by LocusLink ID.
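The over-representation test described above can be illustrated with a minimal sketch. It assumes a plain one-sided Fisher's exact test (the upper tail of the hypergeometric distribution) followed by a Bonferroni correction; it is not the EASE implementation, whose exact scoring may differ. The function names and all counts in the example are hypothetical and are not data from this study.

```python
from math import comb


def enrichment_p_value(hits, list_size, category_size, population_size):
    """One-sided Fisher's exact P value for over-representation.

    hits            source proteins annotated to the category
    list_size       all annotated source proteins
    category_size   proteins in the annotated proteome in that category
    population_size all proteins in the annotated proteome
    """
    max_hits = min(list_size, category_size)
    total = comb(population_size, list_size)
    # Sum the hypergeometric probabilities of seeing `hits` or more.
    tail = sum(
        comb(category_size, k)
        * comb(population_size - category_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


def bonferroni(p_values):
    """Multiply each P value by the number of queries, capped at 1."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


# Hypothetical counts, for illustration only:
# name -> (hits in source list, category size in proteome)
POPULATION = 10000   # annotated proteins in the proteome (assumed)
LIST_SIZE = 200      # annotated peptide-source proteins (assumed)
categories = {
    "RNA binding": (30, 500),
    "Golgi apparatus": (6, 300),
}

raw = {
    name: enrichment_p_value(hits, LIST_SIZE, size, POPULATION)
    for name, (hits, size) in categories.items()
}
adjusted = bonferroni(raw)

if __name__ == "__main__":
    for name in categories:
        print(f"{name}: raw P = {raw[name]:.3g}, adjusted P = {adjusted[name]:.3g}")
```

In this toy input, the first category has three times the number of hits expected by chance (30 observed vs 10 expected) and remains significant after correction, whereas the second has exactly the expected number of hits and does not.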
A text file of all LocusLink ID numbers for the peptide-source proteins reported in this work is available in the supplemental material (see footnote 4); these are provided for independent data analyses.
| Results |
|---|
Using sHLA secretion from hollow fiber bioreactors as previously described (17), we produced 50 mg of B*1801, a molecule for which only one peptide sequence had been described (24), from a single 4-wk-long experiment. Approximately 10 mg of B*1801 (roughly 500 µg of final peptide weight) was affinity purified after production in the EBV-transformed B cell line 721.221 (16). The peptide/MHC complexes were then subjected to acid denaturation to release bound peptides. Eluted peptides were purified by ultrafiltration and separated based on hydrophobicity by RP-HPLC to reduce complexity (Fig. 1A). Once separated, individual fractions were sprayed for analysis on a Q-Star Q-TOF mass spectrometer (Fig. 1B). Ions were then selected for sequence derivation through MSMS fragmentation (Fig. 1C) and automated and de novo sequence interpretation. Although peptides were chosen at random, abundant peptides produced better sequence data; more sequences were therefore identified from peptides with high to intermediate copy number than from those present at low quantities.
2 protein) (Table I and supplemental data). Thus, in most cases, multiple peptides were derived from separate areas of the protein, such as three of the four peptides derived from eukaryotic translation elongation factor 1 (Table I and supplemental data).
We next applied proteomic analysis to determine the locations of the peptide-source proteins within the cell. Source proteins were entered into the DAVID program (22), where they were sorted by cellular component according to their GOA classification (23). As expected, most (90%) of the peptide-source proteins were classified as intracellular, the main location for the generation and loading of MHC class I peptides (Fig. 2). Interestingly, many of the source proteins were not derived from the cytoplasm, the cellular locale of peptide generation by proteasomal cleavage (27). Approximately 53% of peptide-source proteins were cytoplasmic proteins, while 43% were nuclear and 18% were membrane bound (note that some of the source proteins could be found in both the cytoplasm and nucleus, resulting in a total higher than 100%). Additionally, many of the peptide-source proteins had more precise location annotation: 8% ribosome, 8% endoplasmic reticulum, 7% mitochondrion, 6% plasma membrane, 3% nucleolus, 2% chromosome, and 2% Golgi. We found no peptides derived from proteins resident in lysosomal or endosomal compartments. Therefore, MHC class I peptides can be derived from proteins resident in almost every compartment in the cell and are not particularly biased toward the cytoplasmic compartment.
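The reason the compartment percentages sum to more than 100% can be shown with a small sketch. It assumes each protein may carry several cellular-component labels and that each percentage is computed over all proteins; the protein names and annotations are hypothetical toy values, not data from this study.

```python
from collections import Counter


def category_percentages(annotations):
    """Percentage of proteins that carry each category label.

    `annotations` maps a protein name to the set of categories assigned
    to it; a protein with several labels is counted once per label.
    """
    n_proteins = len(annotations)
    counts = Counter(
        label for labels in annotations.values() for label in set(labels)
    )
    return {label: 100.0 * count / n_proteins for label, count in counts.items()}


# Hypothetical toy annotations, for illustration only.
toy_annotations = {
    "protein_A": {"cytoplasm"},
    "protein_B": {"nucleus"},
    "protein_C": {"cytoplasm", "nucleus"},  # shuttles between compartments
    "protein_D": {"membrane"},
}

percentages = category_percentages(toy_annotations)
total = sum(percentages.values())

if __name__ == "__main__":
    for label, pct in sorted(percentages.items()):
        print(f"{label}: {pct:.0f}%")
    print(f"sum of percentages: {total:.0f}%")
```

Because protein_C is counted under both cytoplasm and nucleus, the toy percentages add up to 125% even though only four proteins are present.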
B*1801 peptide-source proteins were next evaluated for their biological and molecular functions, again according to GOA classification using the program DAVID. Although source proteins possessed multiple biological functions, a majority of the source proteins fell into two categories: 74% were involved in cellular metabolism, while 38% functioned in cell growth and maintenance (Fig. 3A). Minor categories of source-protein biological function included cell communication (16%), cell stress response (12%), and response to external stimuli (10%). As seen with biological functions, a majority of the peptide-source proteins possessed two major molecular functions: 64% had binding activity, while 46% possessed catalytic activity (Fig. 3B). Aside from these two major molecular functions, source proteins were also involved in transcriptional regulation (11%), signal transduction (8%), molecular or solute transport (9%), and protein chaperoning (6%). Thus, at any time, a majority of the proteins serving as sources for MHC class I peptides are involved in normal cellular metabolism and maintenance through RNA-, DNA-, or protein-binding activities.
Each source protein was next assigned to a chromosomal location according to the location of the gene encoding it (Fig. 4). Multiple peptides were derived from source proteins on each chromosome; the largest number of source proteins (19) was encoded on chromosome 2, while the smallest number (2) was encoded by genes on chromosome 18. Although no peptides were found from the Y chromosome, this is to be expected because the 721.221 cell line is of female origin (16). Thus, no chromosome appears to be excluded from generating products for MHC class I peptides.
To compare the presentation of peptides with representation by the proteome (i.e., is the self repertoire a result of the specificity of class I and its loading pathway or a general result of protein abundance in a particular category), we performed statistical analysis comparing the representation of annotated source proteins of peptides with the representation in the annotated human proteome. Table III denotes the 10 most over-represented categories of B*1801 peptide-source proteins in terms of cellular location (cellular component) and biological and molecular function. As expected, intracellular proteins were significantly over-represented as class I peptides (as were the intracellular categories of cytoplasmic and nuclear). In terms of biological function, a majority of the over-represented proteins were involved in protein and macromolecule biosynthesis. Perhaps most interestingly, in the category of molecular function, RNA-binding proteins were significantly over-represented as peptide-source proteins when compared with the whole proteome. RNA-binding proteins therefore serve as a rich source for MHC class I ligands.
| Discussion |
|---|
Using sHLA as a means to gather a large amount of class I molecules and their cognate peptides, we eluted peptides endogenously bound in the B cell line 721.221 from approximately 10 mg of sHLA-B*1801 complexes and sequenced them randomly by mass spectrometry. Peptides were largely concordant with the previously identified B18 anchor residue of P2 E (25) and also possessed an aromatic C terminus. A majority of the peptides were nonamers, with several longer peptides that possessed canonical anchors also identified. Individual B*1801 peptide ligands sequenced in this study therefore fit with existing B18 knowledge in terms of length and amino acid preferences.
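The ligand features summarized above (E at position 2, an aromatic C terminus, and a predominance of nonamers) can be checked against a peptide list with a short sketch. It assumes that "aromatic" means F, W, or Y; the helper names are hypothetical, and the example sequences are invented for illustration and are not peptides reported in this study.

```python
AROMATIC_RESIDUES = frozenset("FWY")  # assumption: F, W, and Y count as aromatic


def fits_b18_motif(peptide):
    """True if position 2 is E and the C-terminal residue is aromatic."""
    return (
        len(peptide) >= 2
        and peptide[1] == "E"
        and peptide[-1] in AROMATIC_RESIDUES
    )


def summarize_ligands(peptides):
    """Fraction of nonamers and fraction of motif-concordant peptides."""
    n = len(peptides)
    nonamers = sum(1 for p in peptides if len(p) == 9)
    concordant = sum(1 for p in peptides if fits_b18_motif(p))
    return {"nonamer_fraction": nonamers / n, "motif_fraction": concordant / n}


# Invented sequences, for illustration only.
example_peptides = [
    "DEAKLGELF",   # nonamer, P2 E, C-terminal F
    "SEIDTVAKY",   # nonamer, P2 E, C-terminal Y
    "NEVPHRLTLW",  # decamer, P2 E, C-terminal W
    "ALKDGTRSV",   # nonamer, lacks both anchors
]

summary = summarize_ligands(example_peptides)

if __name__ == "__main__":
    print(summary)
```

A longer peptide such as the decamer above still counts as motif concordant, which mirrors the observation that longer ligands retained the canonical anchors.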
The most striking observation obtained through analyses of peptide-source proteins was that peptides sampled by class I are relatively unbiased when compared with the human proteome. Statistically, the single most over-represented category in terms of class I peptide presentation is intracellular proteins, which reflects the main function of class I in the presentation of intracellular peptides. The statistical significance of other proteome categories sampled drops precipitously, except in the case of RNA-binding proteins, which class I molecules appear to be exceptionally adroit at presenting. It remains to be tested whether RNA-binding protein-derived peptides are overabundant due to a propensity to load into class I molecules or simply because they are among the most abundant proteins in the eukaryotic cell (28). In either instance, with the current exception of RNA-binding proteins, these data provide the first experimental evidence that class I molecules purified from B cells bind and present an accurate reflection of the human proteome.
Although the peptides identified were closely allied in amino acid sequence and length, they were encoded by genes dispersed throughout the genome, located on every chromosome without apparent preference. Furthermore, no telomeric or centromeric bias could be detected (data not shown). Immune surveillance mechanisms that review class I-presented self, at least through B*1801, can therefore monitor gene products irrespective of their chromosomal locale. Although they directly sample the proteome, these data indicate that class I molecules indirectly provide an unbiased view of the genome.
Current knowledge dictates that most class I peptides are created in the cytoplasm by the proteasome, although mechanisms for peptide generation and loading outside the cytoplasm have been identified, perhaps most notably in dendritic cells (4, 29, 30, 31, 32, 33). Theoretically, in the B cell line used in this study, peptides could be generated from proteins in cellular compartments accessory to the cytoplasm in a number of ways. First, peptide-source proteins could be retrieved from their cellular compartment for cytosolic degradation as a normal feature of cellular metabolism. Alternatively, source proteins may be degraded in their resident compartment; this may be especially relevant for nuclear protein degradation by the many nuclear proteasomes (29). Peptides generated in cellular compartments such as the nucleus can freely diffuse through nuclear pores and enter the class I-processing pathway in the cytosol (34). Finally, Yewdell and colleagues (35, 36) have proposed that a large portion of newly generated class I products is derived from defective ribosomal products; along the same line, newly synthesized proteins have been identified as the major substrate for TAP, and thus the major source of class I peptides (37). Presentation of newly synthesized normal or defective proteins would occur in the cytoplasm before the egress of source proteins to their ultimate cellular locations. Whatever the case, class I molecules appear to be extraordinarily adapted to present the entire cellular complement of proteins.
The equitable sampling of class I peptides from genetically, functionally, and compartmentally diverse proteins is most likely necessary for class I molecules to reflect comprehensively the collective health of the cell. Genetically, presentation of all chromosomal products may allow NK or CTL detection of newly arising cancerous transformations regardless of location. Functionally, intracellular pathogens modify and usurp a wide range of host metabolic cycles (38, 39); distinct changes in proteins in multiple compartments of the cell may be necessary to report complex host-pathogen interactions to immune surveillance systems. Likewise, presentation of the full spectrum of the proteome as proteins are generated from ribosomes may allow early detection of replicating viral invaders. This comprehensive peptide presentation is a more attractive mechanism of immune supervision than compartmentalized or biased presentation. It will nevertheless be important to compare these findings with those generated in other cell types, such as professional APCs.
In summary, we have demonstrated that B*1801 samples an enormously complex proteome with great efficiency. Transcriptional regulators, chaperones, membrane proteins, and stress-response factors are all available for review by cellular immune mechanisms. As yet, we do not fully understand the mechanisms that enable class I HLA molecules to access such a vast array of protein products, nor do we fully understand the shaping of the innate and adaptive immune responses by this comprehensive view of the host proteome. It is emerging that class I HLA-presented self can both trigger and modulate immune responsiveness (40, 41, 42), and our understanding of the contribution of self protein presentation to human immunity will enhance the detection of disease-influenced changes therein.
| Acknowledgments |
|---|
| Footnotes |
|---|
2 Address correspondence and reprint requests to Dr. William H. Hildebrand, Department of Microbiology and Immunology, 975 NE 10th Street, Oklahoma City, OK 73104. E-mail address: william-hildebrand@ouhsc.edu
3 Abbreviations used in this paper: sHLA, soluble HLA; MS, mass spectrometry; MSMS, tandem MS; RP-HPLC, reversed-phase HPLC.
4 The on-line version of this article contains supplemental material.
Received for publication October 10, 2003. Accepted for publication December 16, 2003.
| References |
|---|
-induced aminopeptidase in the ER, ERAP1, trims precursors to MHC class I-presented peptides. Nat. Immun. 3:1169.
-cell line. Diabetes 45:1761.