Max-Planck-Institut für Biologie, Abt. Immungenetik, Tübingen, Germany
| Abstract |
|---|
|
|
|---|
. The zebrafish class I region
contains representatives of three phylogenetically distinguishable
groups of PSMB genes, X, Y, and
Z. It is proposed that these genes were present in the
ancestral PSMB region before Mhc class I
genes became associated with it.
| Introduction |
|---|
|
|
|---|
ßß
orientation. In
eukaryotes, both the α and the ß rings are composed of seven
unique proteasome component
(PSM)2 subunits
(3). Of these, only three of the ß subunits appear to be
catalytically active, and these interact with the other ß subunits to
form a proteolytic pocket in the center of the ring structure (3, 4).
In mammals, a second 20S proteasome, the immunoproteasome, has been
identified (5). It differs from the "general
housekeeping" 20S proteasome at the three catalytically active ß
subunits. Upon induction by IFN-γ, three additional proteasome
component ß (PSMB) subunits are expressed, PSMB8 (= low molecular
mass protein 7 (LMP7)), PSMB9 (=LMP2), and PSMB10 (=MECL1). These
subunits replace the constitutively expressed housekeeping subunits
PSMB5 (=X, MB1), PSMB6 (=Y, δ), and PSMB7 (=Z, MC14), respectively,
during formation of the newly synthesized proteasome. The resulting
immunoproteasome has been shown to exhibit an altered proteolytic
activity, producing peptides that are similar to those presented by the
MHC (Mhc) class I molecules (6, 7).
The TAP molecule is part of a large superfamily of ATP-binding cassette
(ABC) membrane bound transporters (8). In mammals, TAP
molecules transport antigenic peptides produced in the cytosol,
preferentially those destined for binding to class I molecules, into
the lumen of the rough endoplasmic reticulum (9). The TAP
molecule is comprised of two noncovalently associated subunits, TAP1
and TAP2. The genes coding for both of these subunits are located in
the mammalian Mhc, where they are tightly linked with the
PSMB8 and PSMB9 genes (10). The
PSMB8, TAP1, PSMB9, and
TAP2 genes of the cluster occur in the following
orientation:
. Evidence suggests that the
PSMB9 and TAP1 genes are coregulated from a
shared bidirectional promoter (11, 12).
Proteasome and ABC transporter genes have been used to provide evidence in support of the hypothesis that two rounds of chromosome duplication, presumably preceding the emergence of jawed vertebrates (13, 14, 15, 16, 17), were critical for the appearance of the adaptive immune system (15). PSMB and ABC transporter genes are part of paralogous genomic regions that are central to this hypothesis. In humans, the PSMB/TAP gene cluster is linked to the HLA complex located in the 6p21.3 region. Genes paralogous to several HLA complex genes, including PSMB7, which codes for a subunit that is replaced by the PSMB10 subunit in the immunoproteasome, and the ABC2 gene, are present in the q33-34 region on chromosome 9 (15, 16). Kasahara and coworkers (15) hypothesized that the preduplication ancestor of the 6p21.3 region contained three tandemly arranged PSMB housekeeping genes, pre-PSMB5, pre-PSMB6, and pre-PSMB7. Upon block duplication (presumably chromosomal), the three genes associated with the region destined to become the Mhc evolved into the PSMB8, PSMB9, and PSMB10 genes. The PSMB10 gene was subsequently translocated to another chromosome. In the 9q33-34 paralogous region, the PSMB7 gene is postulated to be the sole remainder of the alternate duplicated PSMB three-gene cluster.
To test the hypothesis of block duplication of the proposed paralogous regions, Hughes (18) conducted a phylogenetic analysis of the genes involved. He found the estimated times of the duplication events to vary considerably among the pairs of paralogous genes. This observation is inconsistent with the block duplication hypothesis. In particular, the divergence of the TAP1/2 and ABC2 genes appears to have occurred before the divergence of eukaryotes from bacteria. As an alternative explanation, he proposed that there may be a selective advantage to the clustering of broadly expressed genes in regions likely to have a wide range of transcriptional activity. According to this hypothesis, the association of the PSMB and ABC transporter genes with the two proposed paralogous regions has occurred purely by chance, but was subsequently maintained due to selective advantage.
The comparative analysis of nonmammalian jawed vertebrate genomes will undoubtedly lead to a greater understanding of the evolution of gene associations. Unfortunately, little is known about the genomic organization of any nonmammalian vertebrate. Most advanced in this regard is the study of the zebrafish, Danio rerio (17). The zebrafish, a representative of the bony fishes (Osteichthyes), is one of the model organisms in a variety of studies, but particularly in developmental biology (19). In this species, four proteasome genes have been identified (20): two from complete cDNA sequences, Dare-PSMB8 and -PSMB6 (formerly Dare-LMP7 and -Y, respectively), and two from partial cDNA sequences, Dare-PSMB9 and -PSMB5 (formerly Dare-LMP2 and -X, respectively). Linkage studies (20) and analysis of zebrafish genomic bacterial artificial chromosome (BAC) clones containing class I genes (21) have shown that the Dare-PSMB8 and -PSMB9 genes are linked to Mhc class I genes. However, in zebrafish, Mhc class I and class II genes are not linked (22). Genes representing four families of the human 6p21.3 and 9q33-34 paralogous groups (PSMB, TAP, RING3, and RXRB-like genes) have been found in zebrafish to be linked to Mhc class I genes (20, 23). All of these and the Mhc class I genes are present on a 500-kb contig of BAC and P1 artificial chromosome (PAC) clones, with the PSMB, TAP, RING3, and a class I gene present on a single BAC clone (B. Murray, V. Michalová, H. Sültmann, and J. Klein, manuscript in preparation).
The aim of the present study was to determine the composition and organization of the Mhc-linked PSMB and TAP2 genes in the zebrafish.
| Materials and Methods |
|---|
|
|
|---|
Genomic regions flanking the previously identified Dare-TAP2 and -PSMB8 genes (20) were targeted for nucleotide sequencing. Various restriction enzymes were used to construct subclone libraries of BAC clones 7 and 716 (two overlapping clones previously shown to contain Mhc class I genes; Ref. 21). Restriction fragments were excised from 1% agarose gels (Carl Roth, Karlsruhe, Germany), extracted via the QIAEX II kit (Qiagen, Hilden, Germany), and cloned into the pGEM-7Zf(+) plasmid vector (Promega, Mannheim, Germany). Subcloned fragments spanning the region of the known Dare-TAP2 and PSMB8 genes were sequenced using the Thermo Sequenase cycle sequencing kit (Amersham Pharmacia Biotech, Braunschweig, Germany), the LI-COR DNA sequencer 4200 (MWG-Biotech, Ebersberg, Germany), and fluorescently labeled primers. Sequences were compared with those in the GenBank through both FASTA nucleotide and BLASTx searches.
Analysis of zebrafish cDNA library
Two rounds of PCR amplification were used to obtain full-length
cDNA sequences of clones in a zebrafish cDNA library constructed from
20 adult individuals (24). In the first round, a
vector-specific and a gene-specific primer were used; in the second
round, the same vector primer and a second gene-specific primer located
just downstream of the first were applied (Table I). PCR amplifications were conducted
using the PTC-100 programmable thermal controller (MJ Research,
Watertown, MA). In each case, 1 µl of the phage suspension was used
as a template and the concentrations of reagents in a 25-µl volume
were as follows: 5 pmol of each primer, 1 U Taq DNA
polymerase (Amersham Pharmacia Biotech), 100 mM NaCl, 10 mM Tris, pH
7.8, and 1.5 mM MgCl2. The thermal profile
consisted of 4 min at 94°C followed by 35 cycles of 94°C for
15 s, 50°C or 55°C for 30 s, 72°C for 2 min, and
completion at 72°C for 8 min. The PCR fragments were cloned into the
pGEM-T vector (Promega).
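As a rough cross-check of the protocol above, the following minimal sketch adds up the programmed hold times of the thermal profile and converts the stated primer amount into a molar concentration. The function names are hypothetical, and ramp times between temperature steps are ignored, which is an assumption of this sketch and not a statement about the instrument.

```python
# Values are taken from the protocol text; ramping between steps is ignored (assumption).
INITIAL_DENATURATION_S = 4 * 60      # 4 min at 94°C
CYCLES = 35
CYCLE_STEPS_S = (15, 30, 2 * 60)     # 94°C, 50°C or 55°C, 72°C
FINAL_EXTENSION_S = 8 * 60           # 8 min at 72°C


def total_hold_time_s(initial_s, cycles, cycle_steps_s, final_s):
    """Sum of all programmed hold times, in seconds."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


def final_concentration_um(amount_pmol, volume_ul):
    """pmol per microlitre is numerically equal to micromolar."""
    return amount_pmol / volume_ul


run_s = total_hold_time_s(
    INITIAL_DENATURATION_S, CYCLES, CYCLE_STEPS_S, FINAL_EXTENSION_S
)
primer_um = final_concentration_um(5, 25)
print(f"programmed hold time: {run_s / 60:.2f} min; primer: {primer_um} µM each")
```

Under these assumptions the programmed hold time is 6495 s, and 5 pmol of primer in 25 µl corresponds to 0.2 µM.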
Sequence alignments were made with the aid of the computer program CLUSTAL W (25) with minor improvements made by eye. Phylogenetic trees were constructed by the neighbor-joining method (26) from pairwise distances estimated from amino acid (AA) alignments using both the programs "neighbor" and "protdist" contained in the PHYLIP, version 3.5c, computer package (27) and the MEGA, version 1.02, computer program (28). In each case, any positions containing indels were excluded from the analysis. The trees were bootstrapped 500 times (29).
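The distance step described above can be illustrated with a minimal sketch that removes every alignment column containing an indel and then computes proportional (p) distances between all pairs of sequences. This is not the PHYLIP or MEGA implementation; the function names and the toy alignment are hypothetical, and tree building and bootstrapping are not shown.

```python
from itertools import combinations

GAP_CHARS = set("-.")


def drop_indel_columns(aligned):
    """Remove every alignment column in which at least one sequence has a gap."""
    names = list(aligned)
    if len({len(aligned[name]) for name in names}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    columns = zip(*(aligned[name] for name in names))
    kept = [col for col in columns if not GAP_CHARS.intersection(col)]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


def p_distance(seq_a, seq_b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b) / len(seq_a)


def distance_matrix(aligned):
    """Pairwise p-distances after excluding all indel-containing positions."""
    clean = drop_indel_columns(aligned)
    return {(m, n): p_distance(clean[m], clean[n]) for m, n in combinations(clean, 2)}


# Toy alignment (hypothetical residues, not taken from Fig. 2)
toy = {"s1": "MT-AGK", "s2": "MTLAGR", "s3": "MSLA-R"}
distances = distance_matrix(toy)
```

In the toy alignment, two of the six columns contain a gap and are discarded, so each distance is computed over the remaining four positions.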
Screening of PAC library
The PAC zebrafish genomic library no. 706 was obtained from the
Resource Center/Primary Database of the German Human Genome Project,
Max-Planck Institute for Molecular Genetics (Berlin-Charlottenburg,
Germany; http://www.rzpd.de). It was screened with PCR-amplified
fragments of the Dare-PSMB9A and Dare-PSMB9B cDNA
pGEM clones. The PCR products were isolated from a low-melting-point
agarose gel, labeled with 50 µCi
[α-32P]dCTP by the random priming method
using the Ready-To-Go kit (Amersham Pharmacia Biotech) and hybridized
to the PAC filters following the suggested protocol (Resource
Center/Primary Database).
Genomic organization
Overlapping DNA sequence fragments generated from the analysis of the subclone libraries were identified and organized with the computer program AssemblyLIGN (Eastman Kodak, Rochester, NY). For each contig, the exon/intron organization was deduced from the known cDNA sequences. Intron-specific PCRs were conducted to join the existing contigs, confirm exon/intron boundaries, and estimate the sizes of introns. The PCR, cloning of amplified fragments, and DNA sequencing were performed as described above (primer sequences available upon request). In all cases, BAC 716 DNA was used as a PCR template.
Analysis of promoter regions
DNA regions extending 700 bp upstream of the initiation codon of each gene were analyzed for possible regulatory motifs. Sequences were compared with the TRANSFAC database (30) with the computer program TFSEARCH, version 1.3 (Y. Akiyama, TFSEARCH: Searching Transcription Factor Binding Sites, http://www.rwcp.or.jp/papia/).
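A simplified version of such a promoter scan is sketched below. It searches both strands of an upstream sequence for an IUPAC consensus string and reports each hit relative to the initiation codon. TFSEARCH scores matches against TRANSFAC weight matrices, so this consensus-based search is only an approximation; the motif `GAAANNGAAA` is an illustrative ISRE-like pattern chosen for this sketch, and the function names and test sequence are hypothetical.

```python
import re

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a DNA string, including IUPAC ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(upstream, motif):
    """Return sorted (position, strand) hits for a consensus motif.

    `upstream` must end immediately before the initiation codon, so that
    position -1 is the base directly preceding it. The position reported
    is that of the leftmost matched base on the given sequence.
    """
    upstream = upstream.upper()
    hits = set()
    for strand, pattern_seq in (("+", motif.upper()), ("-", reverse_complement(motif))):
        # Lookahead allows overlapping matches to be reported.
        pattern = "(?=(" + "".join(IUPAC[base] for base in pattern_seq) + "))"
        for match in re.finditer(pattern, upstream):
            hits.add((match.start() - len(upstream), strand))
    return sorted(hits)


# Hypothetical 16-bp upstream fragment containing one ISRE-like match
print(find_motif("ACGTGAAACCGAAATT", "GAAANNGAAA"))
```

For a real analysis, the 700-bp region upstream of each initiation codon would be passed as `upstream`, and hits would still need to be judged against a scored matrix search.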
Nomenclature
This paper follows the convention for naming Mhc genes (31). Genes with homology to previously named genes in other organisms are given the same name in zebrafish (e.g., Psmb in mouse, PSMB in human and zebrafish). Alleles are designated by an asterisk and a numerical code.
| Results |
|---|
|
|
|---|
To identify new genes in the Mhc class I region of zebrafish, subclone libraries of the overlapping zebrafish BAC clones 7 and 716 (21) were constructed. Through direct sequencing of targeted subcloned fragments, genomic sequences were found that contained exons of new proteasome genes. From a TaqI subclone library, we recovered genomic fragments that contained possible exons with sequence similarity to the mammalian proteasome genes PSMB6 or PSMB9 (LMP2; Refs. 32, 33, 34). In the second subclone library, an XhoI fragment (3.3 kb) was found to contain exons related to the proteasome genes PSMB7 and PSMB10 (35).
Based on the exons detected, primers were designed (Table I) to screen
the zebrafish cDNA library. In addition, primers were designed based on
the partial zebrafish PSMB5 and PSMB9 gene
sequences described previously (20) and on EST sequences
with similarity to PSMB7. Through the screening of the cDNA
library, six full-length cDNA sequences and one partial cDNA sequence were found
(Fig. 1). These included sequences of two
new genes, Dare-PSMB11 and -PSMB12, the
originally described Dare-PSMB5 and Dare-PSMB9
(now called PSMB9A) genes (20), a second
related PSMB9 gene, named Dare-PSMB9B,
and the Dare-PSMB7 gene.
The cDNA sequence of Dare-PSMB11 (Fig. 1) is
based on two overlapping amplified fragments. The first fragment
extends 238 bp from the position of the BM-b18-3R2 primer to the
beginning of the 5' untranslated region (UTR), while the second
fragment extends 720 bp from the position of the BM-a37-7F1 primer to
the beginning of the polyA tail. The sequences are identical in the
84-bp region of overlap. The entire cDNA sequence is 855 bp long,
extends up to the beginning of the polyA tail, and includes 74 bp of
the 5' UTR and 127 bp of the 3' UTR. A possible polyadenylation signal
is found at sites 103-108 of the 3' UTR. The AA sequence of the entire
polypeptide is 217 AA residues long. It is composed of a 16-AA
propeptide and a 201-AA mature protein deduced from similarity to other
PSMB subunits (Fig. 2) and from the
presence of the correct glycine (G)-threonine (T) motif, which is the
site of the cleavage of the N-terminal propeptide (1).
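The reported lengths can be checked against one another with simple arithmetic: the cDNA length should equal the 5' UTR, plus three nucleotides per codon (including the stop codon), plus the 3' UTR. The sketch below uses only the numbers given in the text; the function name is hypothetical.

```python
def cdna_length(utr5_bp, n_residues, utr3_bp, has_stop_codon=True):
    """Expected cDNA length in bp from UTR lengths and polypeptide length."""
    codons = n_residues + (1 if has_stop_codon else 0)
    return utr5_bp + 3 * codons + utr3_bp


# Dare-PSMB11: 16-AA propeptide + 201-AA mature protein = 217 residues
psmb11_residues = 16 + 201
psmb11_bp = cdna_length(74, psmb11_residues, 127)
print(psmb11_residues, psmb11_bp)
```

With a 74-bp 5' UTR, 217 codons plus a stop codon (654 bp), and a 127-bp 3' UTR, the expected total is 855 bp, which matches the length reported for the Dare-PSMB11 cDNA.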
The Dare-PSMB12 cDNA sequence (Fig. 1) is
also based on two overlapping PCR-amplified fragments. The first
fragment extends 462 bp from the Dare*Z1-E5R2
primer to the transcription initiation site. The second fragment
extends 723 bp from the position of the
Dare*Z1-E4F2 primer to the beginning of the polyA
tail. No nucleotide differences are found in the 132-bp overlap. No 5'
UTR is detected in the 1002-bp sequence, while a 161-bp 3' UTR is found
that contains a polyadenylation signal at sites 148-153 and a polyA
tail after site 161. However, analysis of the genomic sequence reveals
the presence of a probable initiation codon (shown in lower case and
italics in Fig. 1) that is lacking in the cDNA sequence. Based on this
initiation codon the deduced mature protein is 237-AA residues long
after the removal of a 44-AA residue long propeptide (assuming cleavage
at the GT motif).
Dare-PSMB9
Based on a fragment of the Dare-PSMB9 gene reported previously (20), primers were designed to characterize the full-length cDNA sequence. Three sequences were detected: two alleles of the original locus renamed Dare-PSMB9A and a second locus identified on the basis of a divergent sequence and denoted Dare-PSMB9B.
The full-length Dare-PSMB9A*01 cDNA sequence (Fig. 1) was
derived from two overlapping PCR fragments. The sequence of the first
fragment is 594 bp long, extending from the 5' UTR to the position of
the Dare*LMP2.R1 primer. The second fragment, which is identical in the
291-bp overlap to the first one, extends from the position of the
Dare*LMP2.F2 primer to the end of the shown 3' UTR. The
Dare-PSMB9A*01 sequence is identical throughout its length
to the previously reported fragment (20) and is most
likely the full-length cDNA clone of this gene. The sequence is 885 bp
long and contains 54 bp of the 5' UTR and 174 bp of the 3' UTR.
Although a polyA tail has not been found, a potential polyadenylation
signal is present at sites 167-172. The sequence codes for a deduced
propeptide of 19-AA residues and a mature protein of 199-AA residues
(assuming cleavage at the GT motif).
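Joining two overlapping PCR fragments into one cDNA sequence, as done for each gene above, can be sketched as follows. The function checks that the fragments are identical across the stated overlap before merging them. The function name and the toy sequences are hypothetical, and the sketch assumes both fragments are given in the same 5'-to-3' orientation.

```python
def merge_fragments(five_prime, three_prime, overlap):
    """Join two fragments that share `overlap` identical bases.

    The last `overlap` bases of `five_prime` must equal the first
    `overlap` bases of `three_prime`; otherwise a ValueError is raised.
    """
    if overlap <= 0 or overlap > min(len(five_prime), len(three_prime)):
        raise ValueError("overlap must be positive and no longer than either fragment")
    if five_prime[-overlap:] != three_prime[:overlap]:
        raise ValueError("fragments differ within the overlap")
    return five_prime + three_prime[overlap:]


# Toy fragments with a 4-bp overlap (not real zebrafish sequence)
merged = merge_fragments("ACGTTGCA", "TGCAGGTT", 4)
print(merged, len(merged))
```

The merged length is the sum of the two fragment lengths minus the overlap, here 8 + 8 - 4 = 12 bp.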
An additional PCR fragment was found with high sequence similarity to the PSMB9A*01 gene. The similarity extends from the position of the Dare*LMP2.R1 primer along the entire length of the 594 bp PSMB9A*01 fragment, but the sequence extends by 263 bp into the 5' UTR. Thirteen substitutions differentiate these two sequences, six in the 5' UTR and seven synonymous substitutions in the coding region. The 3' UTR sequence of this cDNA sequence was not determined. Because of its high similarity to the Dare-PSMB9A*01 gene, the sequence is interpreted as being an allele of this gene and as such it is designated Dare-PSMB9A*02 (not shown).
Another full-length cDNA sequence with similarity to the
Dare-PSMB9A genes was found based on two overlapping PCR
fragments (Fig. 1). The 5' part of the sequence extends 643 bp from the
position of the Dare*LMP2.R3 primer; the 3' part of the sequence
extends 530 bp from the position of the Dare*LMP2.F2 primer. No
differences are found in the 291-bp long overlap. This sequence has a
119-bp long 5' UTR, a 57-bp long 3' UTR, and a potential
polyadenylation signal at sites 33-38, but no polyA tail. A 227-AA
polypeptide is deduced from the coding region, and a 17-AA propeptide
and a 210-AA residue long mature protein are postulated based on the
conserved GT motif. A comparison with the PSMB9A*01 sequence
shows low similarity in the 5' and 3' UTR and a slightly longer coding
region containing 66 nucleotide differences. The sequence divergence
suggests that this sequence represents a second PSMB9 locus,
which we designate Dare-PSMB9B, and this
suggestion is borne out by mapping studies.
Screening of a zebrafish genomic PAC library with fragments of the PSMB9A and PSMB9B cDNA clones has shown the PSMB9A and PSMB9B genes to reside on different Mhc class I gene-containing clones: the PSMB9A gene resides on BAC clones 7 and 716 and the PSMB9B gene on the PAC clone BUSMP706A2470Q3 (B. Murray, V. Michalová, H. Sültmann, and J. Klein, unpublished observations).
Finally, the screening of the PAC clones revealed the presence of a third PSMB9-like locus on a clone (BUSMP706P02172Q3) that does not contain either the PSMB9A or the PSMB9B loci. PCR amplification of this clone with PSMB9-specific primers yielded a product that contained parts of intron 5 and exon 6. The sequence is divergent from both PSMB9A and -B but more closely related to PSMB9B (data not shown). On the basis of its position on the zebrafish Mhc map and its sequence divergence, we interpret the sequence as being derived from a third PSMB9 locus that we designate Dare-PSMB9C. A transcript of PSMB9C was not found in our cDNA library.
Dare-PSMB7
Five partial zebrafish cDNA sequences with similarity to the human
PSMB7 gene were recovered from the GenBank EST library
(AA605681, AA606112, AI331717, AI332003, and AI332014). A consensus
sequence was generated which extends 704 bp 5' of the end of the polyA
tail. The primer Dare*PSMB7.R1 (Table I) was designed to amplify the
remaining 5' end of the cDNA sequences. A 528-bp fragment was recovered
that was identical with the initial consensus in the 248-bp overlap. No
5' UTR or initiation codon is present in the resulting 962-bp
transcript (Fig. 1). The deduced 275-AA polypeptide is composed of a
41-AA propeptide and a 234-AA mature protein (assuming cleavage at the
GT motif). The 3' UTR is 100 bp long up to the start of the polyA tail
and contains a possible polyadenylation signal at sites 90-95.
Dare-PSMB5
The partial sequence of the previously reported
Dare-PSMB5 gene (20) was extended in the 3'
direction to conduct a phylogenetic analysis that included the complete
mature protein of all known zebrafish PSMB subunits. We amplified a
706-bp fragment that extended from the primer Dare*X.F2 (Table I) to
the end of a polyA tail (including a possible polyadenylation signal).
The sequence is identical with the previously reported sequence (AF032391) in the 43-bp
overlap and extends the deduced polypeptide 37 AA up to the stop codon
(Fig. 2). The complete Dare-PSMB5 cDNA sequence is 1296 bp
long (not shown; GenBank accession no. AF155578).
Phylogenetic analysis
Phylogenetic trees were constructed based on distance estimates
from an AA alignment (Fig. 2). Three distance estimates were used, the
first based on simple proportional (p) distances
(28), the second based on the Dayhoff PAM matrix, and the
third based on the categories method (27). All three
distance estimates resulted in the same topology (except for the PSMB6
clade of jawed vertebrates). For this reason, only the tree based on
the p-distances is shown (Fig. 3). Bootstrap values show strong support
(99-100) for the grouping of each of the three types of ß subunits
with one of the ancestral yeast subunits, Y (PRE3, PSMB6, PSMB9), Z
(PUP1, PSMB7, PSMB10), and X (PRE2, PSMB5, PSMB8), and for each of the
PSMB clades (PSMB5-9). The positions of the Dare-PSMB9A and -PSMB9B
subunits are consistent with previous phylogenies (20) and
support the introduced nomenclature. The new Dare-PSMB11 and -PSMB12
subunits are clearly members of the Y (PRE3) and Z (PUP1) clades,
respectively. In both cases, they are only distantly related to the
other subunits and their phylogenetic positions within the clades are
unclear. In the Y clade, PSMB11 is a sister subunit to all other
subunits, while PRE3 is a sister subunit to the PSMB6 clade. In the Z
clade, PSMB12 is a sister subunit to the PSMB7 clade. However, in no
case are these positions supported by the bootstrap analysis (Fig. 3).
Six regions of contiguous DNA sequence (contigs), spanning a
segment of about 26 kb (Fig. 4), were
assembled based on the sequence information derived from the analysis
of subclone libraries and intron-specific PCRs. All contigs are joined
by clones for which the complete sequence was not determined. For each
gene, a detailed map of the intron/exon organization was deduced. In
every case, correct splice signals were found at the intron/exon
boundaries. The Dare-TAP2 gene organization was deduced from
the partial zebrafish cDNA sequence (exons 8-11; Ref. 20)
and from the salmon (Salmo salar) TAP2 gene
(36). Eleven exons of similar sizes and splice site
locations to salmon (36) and human (32)
TAP genes were identified. The exact size of exon 1 is not
known and the given estimate is based on the position of a methionine
codon most similar to that identified as the start of translation in
salmon. Two other possible initiation codons exist. Six exons of the
Dare-PSMB8 were deduced from the existing cDNA sequence
(20). Only a partial sequence of exon 6 was available for
analysis; however, based on the length of the Dare-PSMB8
cDNA and the similarity of the organization to other PSMB8
genes, this is most likely the last exon of the gene. The intron/exon
organization of the Dare-PSMB9A, -PSMB11, and
-PSMB12 genes was deduced from the cDNA sequences reported
herein. The Dare-PSMB9A and -PSMB11 genes have a
very similar organization, each possessing six exons of similar size
and splice site locations. The Dare-PSMB12 gene contains
eight exons. Analysis of the first exon reveals a probable initiation
codon one codon upstream of the end of the reported cDNA sequence.
The analysis of the promoter regions for transcription factor
binding motifs showed many possible transcription factor binding sites
(data not shown). Of interest here is that no SP1 sites, found in the
mammalian PSMB8 and PSMB9 genes (33, 37, 38), were found in any of the proteasome gene promoter regions
searched. Further, in each promoter, a possible CCAAT/enhancer-binding
protein (C/EBP)-ß (NF-IL-6) site is present (Figs. 4 and 5). In each case, the position of the
transcription factor motif is given relative to the initiation codon
(Fig. 5). The C/EBP-ß nuclear factor is an activator of various
acute-phase proteins (39) and is present in the mammalian
PSMB10 promoters (35). Of particular interest
is the presence of the IFN regulatory factor (IRF, also known as ISRE)
motif in the TAP2, PSMB11, and PSMB12 promoters
(Figs. 4 and 5). This motif binds both the activator IRF-1 and
repressor IRF-2 transcription factors (40) and is present
in the mammalian TAP2, PSMB8, and PSMB10
promoters (32, 35, 41, 42), as well as the bidirectional
promoter of the TAP1 and PSMB9 genes (11, 12). The zebrafish PSMB9A and PSMB8 genes
lack this element. The initiation codons of the PSMB11 and
PSMB12 genes are 159 bp apart. The IRF element of the
PSMB11 gene is located in the first intron of the
PSMB12 gene and vice versa.
| Discussion |
|---|
|
|
|---|
(5). For brevity, we refer to these as the c and i types,
respectively. The zebrafish loci are distributed among these groups as
follows: group X (PSMB5, PSMB8), group
Y (PSMB6, PSMB9A, PSMB9B, PSMB9C, PSMB11), and
group Z (PSMB7, PSMB12; Fig. 3
has not been identified as yet), but
a tentative assignment of the types can be made by two
criteria: sequence homology and the presence or absence of relevant
sequence elements (transcription factor binding sites) in the genes'
promoter regions (Fig. 5).
The location of the IRF-binding sites in the zebrafish PSMB
clusters may not be fortuitous. In humans, an IRF site is positioned in
the region between the TAP1 and the PSMB9
(LMP2) genes (32, 43) and is believed to
regulate the bidirectional expression of both these genes, which are
arranged in a head-to-head orientation (11, 12). A similar
head-to-head arrangement exists in zebrafish, except in this case both
the PSMB11 and PSMB12 promoters possess a
separate IRF-binding site. The central position of these promoters may
influence the expression of four or five genes, PSMB11,
PSMB9 (LMP2), and TAP2 in one direction, as well
as PSMB12 and PSMB8 (LMP7) in the other direction
(Fig. 4). However, the TAP2 gene also has an additional
IRF-binding site in its own promoter region.
In humans (and other mammals tested thus far), only two PSMB
loci are present in the Mhc (PSMB8 and
PSMB9) within the class II region (32, 43).
Both loci are of the i type; the third i-type locus
(PSMB10), as well as all the other PSMB loci, are
found outside of the Mhc (5). In the zebrafish,
the situation is more complicated. Here, there are at least six
PSMB loci in the Mhc, but not in the class II
region; instead they are all in the class I region. Because the i-type
PSMB genes are functionally tied to the class I and not to
the class II Mhc genes (5), it can be argued
that the zebrafish arrangement makes more sense, particularly because,
in this species, the class I and class II loci are on different
chromosomes (22). Furthermore, in the zebrafish, the
Mhc-associated loci are presumably all of the i type and
they represent all three PSMB groups (in mammals, the two
loci in the Mhc represent the X and Y
groups; the i-type locus of the Z group is on a different
chromosome). Four of the six Mhc-associated zebrafish
PSMB loci are in a single main cluster; the other two are at
a distance of 60 kb (PSMB9C) and 120 kb
(PSMB9B) from the cluster (B. Murray, V. Michalová, H.
Sültmann, and J. Klein, manuscript in preparation). The
association of i-type PSMB genes with Mhc class I
in zebrafish is in agreement with Hughes' (18) hypothesis
of a selective advantage to the clustering of genes that have similar
broad range expression patterns.
In both humans and zebrafish, the PSMB clusters are associated closely with the TAP loci (TAP1 and TAP2 loci in humans and TAP2 locus in the zebrafish; the TAP1 locus could not be identified in this species thus far) and loosely with the RING3 locus (32). The conservation of a close linkage of the TAP2 gene and the PSMB cluster between bony fish and mammals, despite genomic rearrangement, again suggests that it might have a selective advantage (18).
The degree of sequence divergence between the Dare-PSMB9A and Dare-PSMB9B genes is similar to that reported for Xenopus laevis, Xela-PSMB8A (LMP7A) and Xela-PSMB8B (LMP7B) genes (44). Both sets of genes have highly diverged 5' and 3' UTRs and a similar degree of AA identity in the mature protein sequence (85% for Dare and 90% for Xela). However, segregation studies in Xenopus indicate that the two Xela genes may be allelic (44). In contrast, the Dare genes reside at different loci.
The main PSMB cluster, which extends over 18 kb, consists
of the PSMB9A, PSMB11, PSMB12, and
PSMB8 loci, arranged in this order in the following
orientation
(Fig. 4). The two PSMB loci
outside the main cluster are apparently the result of a duplication of
the PSMB9 locus. Because phylogenetically the
PSMB9B and PSMB9C loci appear to be more closely
related to each other than either of them is to PSMB9A, they
are presumably derived from a common ancestor that had a common
ancestor with the PSMB9A locus. Whether the B and
C loci are functional is unclear at this time; in a cDNA
library only the transcript of the B locus has been found.
However, the PSMB9B locus is apparently present in some
haplotypes and absent in others (B. Murray, V. Michalová, H.
Sültmann, and J. Klein, manuscript in preparation).
The two extra loci in the main zebrafish PSMB cluster,
PSMB11 and PSMB12, are of special interest
because of their location and their phylogenetic relationships. The
PSMB11 locus is a member of the Y group, which
can be divided into two subgroups represented in humans by the
PSMB6 and PSMB9 loci. Because the zebrafish
genome contains close relatives of these two loci
(Dare-PSMB6 and Dare-PSMB9, respectively) and on
the phylogenetic tree the branch leading to Dare-PSMB11
splits off before the PSMB6 and PSMB9 branches
split from each other (Fig. 3), the PSMB11 gene appears
to have arisen before the divergence of the bony fish and
mammalian lineages. What has happened to PSMB11 in mammals
is unclear at this time: it may have been lost or it may be present but
unidentified. However, it is certain that if it is present, it is not
located in the Mhc class II region because the entire region
has now been sequenced and no PSMB11 homologue has been
found. The zebrafish PSMB cluster may have been assembled
from genes that were originally on different chromosomes or it may have
arisen by in situ duplication. Taking into account the closeness of the
loci and their orientation in the cluster, the latter explanation is
the more parsimonious of the two. Therefore, we propose that the
ancient PSMB cluster in the part of the chromosome that
later became the Mhc class I region contained the ancestors
of the PSMB6, PSMB9, and PSMB11
genes.
It probably also contained the ancestor of the PSMB12 gene.
The zebrafish PSMB12 gene clusters with the PSMB7
(Z) genes in a clade that also contains the human PSMB10
(MECL1) gene. The Dare-PSMB12 gene is not the
orthologue of the Hosa-PSMB7 gene because a
Dare-PSMB7 gene exists. It is also unlikely that the
zebrafish PSMB12 gene is an orthologue of the human
PSMB10 gene because the genetic distance between the two
sequences is much greater than that between any human-zebrafish
PSMB orthologues. Further, the inclusion of nurse shark
(Ginglymostoma cirratum) and hagfish (Eptatretus
stouti) PSMB7-like subunits (M. Kasahara, unpublished
observations) to this analysis shifts the position of the
Dare-PSMB12 subunit outside the PSMB7/10 clade
(not shown). Therefore, it is most likely that the
Dare-PSMB12 gene is derived either from a gene that was also
the ancestor of PSMB7 or from a gene that was the ancestor
of both PSMB7 and PSMB10. It is not possible at
present to decide between these two alternatives (the bootstrap support
for the alternative depicted in Fig. 3 is too low to carry any weight).
In either case, the PSMB12 gene appears to have diverged
from PSMB7 or both PSMB7 and PSMB10
before the divergence of the bony fish and mammalian lineages and the
loss of PSMB12 in the latter (if mammals really lack this
gene) must have been a secondary event.
Because the X, Y, and Z groups of the
PSMB genes each contain a yeast gene, the groups must have
separated from one another before the divergence of lineages leading to
fungi and Metazoa (Fig. 3; Ref. 45). The presence of at
least one representative of each of the three groups in the zebrafish
PSMB cluster in the Mhc class I region suggests
that the Ur-Mhc region (before the emergence of class I
genes) contained a set of X, Y, and Z
genes. In this sense, the zebrafish arrangement of PSMB
genes resembles the ancestral arrangement more closely than does the
mammalian organization. The latter is a derived state after the removal
(deletion or translocation) of one or more genes from the ancient
PSMB cluster. To make this proposal compatible with the
genome-wide duplication hypothesis of generating the paralogous genes
(46), one would have to postulate that the duplications
were followed by deletions within the PSMB cluster on the
different chromosomes.
If we assume functionality of all five Y group genes in the zebrafish (PSMB6, PSMB9A, PSMB9B, PSMB9C, and PSMB11) and exchange of subunits encoded in these genes, then at least five distinct 20S proteasomes may exist in this species. Because four of the five genes are located in the Mhc and may possibly be regulated in their expression by lymphokines, it can be speculated that four of the five organelles are immunoproteasomes. These may be involved in the production of different sets of peptides that are loaded onto class I molecules, broadening the range of Ag presentation. The allelic diversity at the Xenopus PSMB8 locus (44) and the rat TAP2 locus (47) has also been interpreted along these lines.
In the zebrafish, the production of a wider range of peptide subsets, combined with differential peptide binding by different families of Mhc class I molecules, may provide greater flexibility in peptide presentation. The various immunoproteasomes could have coevolved with separate Mhc molecules, and the apparent presence of the PSMB9B gene in some haplotypes but not in others may reflect this evolution. In humans, immunoproteasomes have been shown to produce peptides with hydrophobic or basic carboxyl termini (6), which are well suited for binding in the C-terminal anchor pocket of the HLA class I molecules (7). Speculatively, each immunoproteasome may produce a set of peptides with a specific type of C terminus. This hypothesis predicts the existence of corresponding class I molecules specialized for binding the products of different immunoproteasomes.
In an attempt to test the binding specificity of fish class I molecules, Okamura et al. (48) compared the amino acid variation at the C-terminal anchor pocket of carp class I genes with the four conserved residues (Tyr84, Thr143, Lys146, and Trp147) of the mammalian classic class I molecules. The comparison revealed conservation of between one and three of these residues in carp molecules, indicating a possible expansion of peptide binding specificity (48). In the zebrafish, three class I loci have been described (49), all of which possess the same three conserved residues (Thr143, Lys146, and Trp147) as found in the most conserved carp gene. However, the full range of class I genes in zebrafish has yet to be described. Additional zebrafish class I genes have been identified (H. Sültmann, B. Murray, and J. Klein, unpublished observations), and the degree of allelic diversity is being investigated at these loci. The above hypothesis predicts the presence of class I molecules with diverse C-terminal anchor pocket motifs in the zebrafish.
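The residue comparison described above can be sketched as a simple position-by-position check. This is an illustration only: the reference residues are those listed in the text, whereas the function name `conserved_positions` and the example input are hypothetical, and the residue shown at position 84 is a placeholder (`Xaa`), not an observed amino acid.

```python
# Conserved C-terminal anchor pocket residues of mammalian classic class I
# molecules, as listed in the text (position -> three-letter residue code).
MAMMALIAN_ANCHOR = {84: "Tyr", 143: "Thr", 146: "Lys", 147: "Trp"}


def conserved_positions(observed):
    """Return the sorted positions at which `observed` matches the reference."""
    return sorted(
        pos for pos, residue in MAMMALIAN_ANCHOR.items()
        if observed.get(pos) == residue
    )


# Hypothetical input mirroring the pattern reported for the three zebrafish
# loci: Thr143, Lys146, and Trp147 conserved. "Xaa" at position 84 is a
# placeholder for an unspecified, non-conserved residue.
zebrafish_like = {84: "Xaa", 143: "Thr", 146: "Lys", 147: "Trp"}

print(conserved_positions(zebrafish_like))
```

Applied to real alignments, a function of this kind would report one to three conserved positions for the carp molecules described by Okamura et al. (48); no carp residues are encoded here because the text does not list them.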
| Footnotes |
|---|
2 Abbreviations used in this paper: PSM, proteasome component; AA, amino acid; ABC, ATP-binding cassette; BAC, bacterial artificial chromosome; LMP, low molecular mass protein; PAC, P1 artificial chromosome; PSMB, proteasome component ß; UTR, untranslated region; C/EBP, CCAAT/enhancer-binding protein; IRF, IFN regulatory factor.
Received for publication March 24, 1999. Accepted for publication June 11, 1999.
| References |
|---|
-inducible proteasome activator PA28. Immunol. Rev. 163:161.
: structural comparison, chromosomal localization, and analysis of the promoter. J. Immunol. 159:2760.