Department of Structural Biology, and Department of Microbiology and Immunology, School of Medicine, Stanford University, Stanford, CA 94305
| Abstract |
|---|
|
|
|---|
| Introduction |
|---|
|
|
|---|

and 
T cells (2, 3). KIR are type I transmembrane glycoproteins that have two or three extracellular Ig-like domains for binding MHC class I ligands, a stem region, and a signaling domain (the transmembrane region and cytoplasmic tail) for transducing either an inhibitory or activating signal. In humans, inhibitory KIR specific for polymorphic determinants of HLA-A, B, and C molecules function to make NK cells tolerant of self MHC class I and responsive to cells that lack a normal complement of MHC class I molecules (4, 5, 6, 7). Although the functions of the activating KIR are poorly defined, genetic correlations indicate they contribute to antiviral immunity (8, 9, 10), autoimmunity (11, 12, 13, 14, 15, 16, 17), and resistance to pre-eclampsia (18, 19). KIR are encoded by a family of genes in the leukocyte receptor complex on chromosome 19q13.4. They are flanked by FCAR, the gene encoding the FcR for IgA, and the leukocyte Ig-like receptor (LILR) gene family. Human KIR haplotypes are highly variable because of differences in gene number, gene content, and allelic polymorphism (20, 21, 22, 23, 24, 25, 26). Conserved features of the human KIR gene family are the KIR3DL3 and KIR3DL2 genes, which define the centromeric and telomeric ends of the locus, respectively, and KIR3DP1 and KIR2DL4 that are centrally located. These conserved genes define three framework regions and two intervals of variable gene content (25, 26). The inhibitory HLA-C receptors are encoded by genes in the centromeric part of the locus, whereas the inhibitory HLA-A and B receptors are encoded by genes in the telomeric part of the locus. Recently, the novel and divergent KIR3DX1 (KIR3DL0) gene of unknown function has been located to the central part of the LILR gene family (27).
Characterization of KIR cDNA clones, combined with genomic typing of panels of individuals, has shown that the KIR gene family varies considerably between primate species (28, 29, 30, 31, 32), as do the MHC class I genes (33). Notably, the two mouse KIR genes are situated on the X chromosome, not in the leukocyte receptor complex (LRC), and they do not encode NK cell receptors for MHC class I (34, 35). That role is fulfilled by a distinct gene family, Ly49, which encodes type 2 transmembrane glycoproteins having ligand-binding domains like those of C-type lectins (36, 37, 38, 39). Of the human MHC class I genes, the most recently evolved is HLA-C, for which orthologs have been found only in chimpanzee, bonobo, gorilla, and orangutan. Despite its recent origin, HLA-C now encodes the dominant ligands for inhibitory KIR (40). Whereas a minority of HLA-A and HLA-B allotypes are ligands for KIR, all HLA-C allotypes can serve this function. They form two groups: those with asparagine at position 80 (called C1), which are ligands for KIR2DL2/3, and those with lysine at position 80 (called C2), which are ligands for KIR2DL1 (41, 42, 43). Almost all humans have KIR2DL1 and KIR2DL2/3 and can therefore use HLA-C-mediated inhibition to regulate NK cell activity, whereas that is not the case for HLA-A and HLA-B.
In studying the evolution of MHC-C-mediated regulation of NK cells, the orangutan (Pongo pygmaeus) is a potentially informative species. In contrast to humans, where HLA-C is fixed and encodes both C1 and C2 allotypes, the orangutan Popy-C locus is present on ~50% of MHC haplotypes and all its allotypes are of the C1 group (28, 44). A previous study of orangutan KIR showed that this limitation of Popy-C is reflected in the KIR. In humans, C1 specificity of KIR2D is determined by the presence of lysine at position 44, whereas C2 specificity is determined by methionine 44 (43). In the orangutan, we identified several KIR2D with lysine 44, but none with methionine 44 (28). In addition, several orangutan KIR2D with glutamic acid 44 were characterized. These observations suggested a model in which interactions between KIR and MHC-C in the orangutan are at a half-evolved state by comparison to the human. To investigate the genetics underlying this difference, we have determined the complete sequence of an orangutan KIR haplotype.
| Materials and Methods |
|---|
|
|
|---|
An orangutan cosmid library, MPMGc141, at the Deutsches Ressourcenzentrum für Genomforschung (Berlin, Germany), was screened with a probe containing PopyKIR3DLA, PopyKIR2DSB, and PopyKIR2DL4A cDNA (28). Of 31 clones obtained, 27 contained inserts and were analyzed by Southern blot with the probe used to screen the library. Twelve hybridizing clones were further studied to determine regions of overlap and to assign the clones to the two KIR haplotypes. This was accomplished by dot blot and sequence specific oligonucleotide hybridization to determine gene content, restriction mapping to look for regions with shared restriction patterns, and limited sequencing of subclones obtained from regions of identical restriction pattern.
For dot blots, 50 ng of cosmid DNA were mixed with 100 µl of 0.4 M NaOH, 10 mM EDTA followed by incubation at room temperature for 10 min. The DNA was applied to a prewetted filter in a vacuum manifold and the liquid was removed by vacuum. The membrane was washed with 5x SSPE (diluted from 30x SSPE: 4.5 M NaCl, 0.3 M NaH2PO4, 30 mM Na2EDTA, final pH 7.4), cross-linked, and placed in prehybridization solution (6x SSPE, 0.2% polyvinylpyrrolidone, 0.2% Ficoll, 0.2% BSA, 0.5% SDS, and 100 µg/ml denatured salmon sperm DNA). The filters were incubated at 42°C for a minimum of 1 h, after which radiolabeled oligonucleotide probes were added. Filters were hybridized for 1 h, removed, and washed twice in 2x SSPE, 0.1% SDS at room temperature for 10 min. The membranes were exposed to x-ray film for 1 h to estimate the amount of DNA present on the membrane. Following exposure to film, the membranes were washed twice at high stringency in 6x SSPE, 1% SDS for 10 min. The wash temperature was determined from the nucleotide composition of the oligonucleotide probe used (Table I). Following the high-stringency wash, the membranes were again exposed to x-ray film.
As the cosmid library was made from DNA obtained from one orangutan, the cosmids were expected to represent two KIR haplotypes. To distinguish the haplotypes, we identified regions of different cosmids that had identical restriction maps. Separate aliquots of cosmid DNA were digested with EcoRI and BamHI, and the fragments were subcloned into pBluescriptKS+ (Stratagene). Limited end sequencing of identically sized fragments subcloned from different cosmids was used to determine whether the fragments were identical or different in sequence. The results allowed us to identify whether regions with an identical restriction map corresponded to the same or different haplotypes. A set of six overlapping cosmids, representing one complete KIR haplotype, was selected for sequencing. The sequence analysis confirmed the assignment of cosmids to the two haplotypes; the overlapping cosmids used to assemble the complete KIR haplotype had identical sequences in all the regions of overlap. In addition, the patterns of KIR-hybridizing bands obtained from Southern blotting of the selected cosmid clones were consistent with those obtained from similar analysis of genomic DNA from a panel of nine orangutans (see Fig. 1A). The six cosmids that are not parts of the sequenced haplotype represent the second KIR haplotype but do not completely cover it; dot blot analysis revealed the absence of KIR3DL3, the centromeric framework gene, from all six cosmids.
The KIR haplotype sequence was determined by sequencing both strands of the cosmid subclones described above. Analysis of the EcoRI and BamHI subclones confirmed many of the junctions between adjacent subclones. Sequencing was performed on an ABI377 or Beckman Coulter CEQ instrument using the reagents and protocols recommended by the manufacturer. Subclones were first end sequenced. Smaller subclones (<2.5 kb) were sequenced by primer walking; medium length subclones (2.5–10 kb) were sequenced by deletion mutagenesis using the Erase-A-Base kit (Promega); longer subclones (>10 kb) were shotgun sequenced using the TOPO Shotgun Subcloning kit (Invitrogen Life Technologies). When an overlapping subclone to connect two sequenced subclones could not be found, primers were designed to PCR amplify a segment spanning the junction between the subclones. The amplification products were cloned and multiple clones sequenced to confirm the junction. Sequences were assembled using the Staden Package (45) (http://staden.sourceforge.net/). The finished sequence had quality scores >30. The PopyKIR haplotype sequence is deposited in GenBank with accession number EF014479.
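The size-based choice of sequencing strategy described above can be summarized as a small lookup. The following Python sketch is illustrative only: the function name is hypothetical, and assigning the exact boundary values (2.5 and 10 kb) to the middle class is an assumption, since the protocol does not say how boundary cases were handled.

```python
def sequencing_strategy(subclone_kb: float) -> str:
    """Return the sequencing approach for a subclone of the given size in kb.

    Thresholds follow the text: <2.5 kb, 2.5 to 10 kb, >10 kb.
    Placing exactly 2.5 kb and exactly 10 kb in the middle class is an assumption.
    """
    if subclone_kb <= 0:
        raise ValueError("subclone size must be positive")
    if subclone_kb < 2.5:
        return "primer walking"
    if subclone_kb <= 10:
        return "deletion mutagenesis (Erase-A-Base)"
    return "shotgun sequencing (TOPO Shotgun Subcloning)"


# Hypothetical subclone sizes, for illustration only
for size_kb in (1.2, 6.0, 14.5):
    print(size_kb, "kb:", sequencing_strategy(size_kb))
```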
Computational analyses
Phylogenetic analysis.
The data set comprised the orangutan KIR genes obtained here and previously reported genomic sequences for human (AY320039, AC006293, AC011501), chimpanzee (BX842589), and rhesus macaque (BX842590, BX842591) KIR. Sequences of the individual loci were aligned using CLUSTAL X (46) and manually corrected in BIOEDIT (www.mbio.ncsu.edu/BioEdit/bioedit.html). The alignment was then divided into regions, generally following intron-exon boundaries. An exception was intron 6, which was further subdivided into three parts: the first of these (intron 6a) starts at the beginning of the intron and ends at the beginning of the deletion common to MmKIR3DL1 and MmKIR3DL10 (~750 bp), the second (intron 6b) begins at that point and ends at the beginning of the long interspersed nuclear element (LINE) insertion common to KIR3DL2 and PtKIR3DL1/2 (~2.9 kb), and the third (intron 6c) starts after the LINE insertion and ends at the end of the intron (~600 bp). These alignments were used for neighbor-joining (NJ) and parsimony analyses. NJ analysis was performed using MEGA version 3.1 (47) (www.megasoftware.net/) with 1000 replicates, pairwise deletion, midpoint rooting, and the Tamura-Nei method. PAUP*4.0b10 (48) (http://paup.csit.fsu.edu/) and the tree bisection-reconnection branch-swapping algorithm were used for parsimony analyses with 1000 replicates and a heuristic search. As comparison of the resulting trees revealed no differences, only NJ trees are presented in the figures.
Selection analysis.
Selection analysis was performed on the sequences encoding the D1 and D2 domains for the seven orangutan lineage III KIR. Estimation of rate of nonsynonymous substitution (dN)/rate of synonymous substitution (dS) (ω) ratios was conducted by maximum likelihood using PAML version 3.14 (49) (http://abacus.gene.ucl.ac.uk/software/paml.html). A NJ analysis was first performed (as described above) and the likelihood of this tree topology was estimated for four site-specific models in which the selective pressure varied among different sites but the site-specific pattern was identical across all lineages. A likelihood ratio test was then conducted to compare a null model that does not allow ω > 1 in the distribution with an alternative model that does. Likelihood ratio tests were performed for M1a (nearly neutral)/M2a (selection) and M7 (β)/M8 (β and ω). A Bayes Empirical Bayes approach was then used to identify codons belonging to the ω > 1 site class. In all these analyses, the F3×4 model of codon frequencies was used.
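The likelihood ratio tests described above compare nested models through twice the difference in their log-likelihoods. The sketch below shows that calculation in Python under stated assumptions: the function names and the log-likelihood values are hypothetical, and the use of a chi-square distribution with 2 degrees of freedom is the conventional choice for the M1a/M2a and M7/M8 comparisons rather than a detail given in the text.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the log-likelihood difference between two nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def lrt_p_value_df2(statistic: float) -> float:
    """Upper-tail chi-square probability for 2 degrees of freedom.

    For df = 2 the survival function has the closed form exp(-x / 2),
    so no statistics library is needed. df = 2 is assumed here.
    """
    if statistic <= 0:
        return 1.0
    return math.exp(-statistic / 2.0)


# Hypothetical log-likelihoods (not values from this study)
lnl_m1a = -1234.56  # null model: no site class with omega > 1
lnl_m2a = -1228.10  # alternative model: allows omega > 1

stat = lrt_statistic(lnl_m1a, lnl_m2a)
print(f"2*delta(lnL) = {stat:.2f}, p = {lrt_p_value_df2(stat):.4g}")
```

A small p-value favors the model that allows a class of sites with ω > 1; only then is it meaningful to ask which codons fall in that class.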
Divergence time estimation.
The divergence time for the two orangutan lineage III sublineages was estimated using the Bayesian relaxed molecular clock approach with the MULTIDISTRIBUTE program package (50, 51) (http://statgen.ncsu.edu/thorne/multidivtime.html). Genomic sequences were used for this analysis (from exons 2 to 4). All the nonrecombinant orangutan lineage III KIR were included, as well as KIR2DS4 and PtKIR2DS4 (calibration point), MmKIR1D and KIR2DL4 (outgroup). The computer program ESTBRANCHES was used to estimate the branch lengths of the constrained topologies and the corresponding variance-covariance matrices. The F84+Γ model was used with maximum-likelihood parameters estimated previously by PAML. MULTIDIVTIME then used the variance-covariance matrices produced by ESTBRANCHES to run a Markov chain Monte Carlo analysis for estimating mean posterior divergence times on nodes with associated SD and 95% credibility interval. The Markov chain was sampled 10,000 times every 100 cycles and the burn-in stage was set to 100,000 cycles. Priors were set according to the guidelines defined in the MULTIDIVTIME manual. The root of the ingroup tree corresponds to the hominoid-Old World monkey separation and was set to 27 ± 3 million years ago (mya) to cover the range 21–33 mya as described previously (52). The separation of KIR2DS4 and PtKIR2DS4 was used as the internal calibration point. It was constrained to be between 6 and 8 mya, corresponding to the human-chimpanzee split. The lower limit corresponds to the lower estimate for the age of Sahelanthropus tchadensis, the oldest hominid fossil known to date (53), while 8 mya corresponds to the upper 95% credibility interval for the human-chimpanzee divergence as established in a recent analysis (54).
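The run length and root prior implied by these settings are simple arithmetic, sketched below in Python. The function names are hypothetical, and two readings are assumptions rather than statements from the text: that the total chain length is the burn-in plus the number of samples times the sampling interval, and that the range of 21 to 33 mya corresponds to the prior mean plus or minus two SD.

```python
def total_mcmc_cycles(burn_in: int, n_samples: int, sample_every: int) -> int:
    """Total cycles run: burn-in plus one sample every `sample_every` cycles."""
    return burn_in + n_samples * sample_every


def two_sd_range(mean: float, sd: float) -> tuple:
    """Interval spanning the mean plus or minus two standard deviations."""
    return (mean - 2 * sd, mean + 2 * sd)


print(total_mcmc_cycles(burn_in=100_000, n_samples=10_000, sample_every=100))  # 1100000
print(two_sd_range(27, 3))  # (21, 33)
```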
Analysis of repetitive elements. The KIR haplotype sequences used for the analysis of repetitive elements were: human (AY320039, AC006293, AC011501, and AC009892), chimpanzee (BX842589), rhesus macaque (BX842591), orangutan (this analysis), pig (CR450381), rat (NW_047555), and mouse (AL672068 and NT_039385). In addition, the following cattle sequences obtained from Build 2.1 of the Bos taurus genome assembly were used for the intron 3 analysis: NW_937716, NW_938268, NW_938467, NW_938583, NW_939141, NW_941773, NW_941846, NW_942117, NW_943617, NW_946585, and NW_980135. The sequences were submitted to the RepeatMasker server (RepeatMasker Open-3.0; www.repeatmasker.org) and the Censor server (55). Five separate regions were analyzed for the primate KIR haplotypes: the LILR to KIR interval, the region between the pseudogene and 2DL4, the KIR to FCAR interval, intragenic regions, and intergenic regions. Comparison of the primate KIR flanking regions was made with the pig LILR to KIR and KIR to FCAR regions, the rat Gp49B to KIR and KIR to FCAR regions, and the mouse PIR to NCR1 and KIR flanking regions. An alignment of primate, rodent, pig, and cattle sequences including exon 3, intron 3, and exon 4 was constructed. This data set was used to ascertain the repetitive element content of intron 3 as well as for phylogenetic analyses.
| Results |
|---|
|
|
|---|
The complete sequence of an orangutan KIR haplotype was obtained from six overlapping cosmid clones. It contains seven PopyKIR genes arrayed in the head-to-tail configuration seen in other primate species (25, 56). Upstream of the KIR genes is the LILR locus, downstream the FCAR locus (Fig. 1B). The part of the PopyLILR locus defined here is orthologous to the syntenic region of the human and chimpanzee LILR loci (94.6 and 93.7% sequence identity, respectively). It contains all of the PopyLILRP2 pseudogene (also called ILT10) and part of the PopyLILRP1 pseudogene (ILT9) (Fig. 1C). The part of the PopyFCAR gene defined here has 95.7% sequence similarity with the corresponding parts of the human and chimpanzee FCAR genes and 92.5% sequence similarity with the rhesus macaque FCAR gene. Thus, the organization of the PopyKIR locus and its genomic context within the leukocyte receptor complex (LRC) is like that observed in other primate species.
Comparison was also made with the KIR loci of three nonprimate species: pig, rat, and mouse (Fig. 1D). For the pig, which has a single KIR gene (57), the order of the loci is the same as in primates: LILR-KIR-FCAR. Although this order is conserved in the rat, the KIR gene and its accompanying KIR gene fragment are differently arrayed in a tail-to-tail configuration (extracted from NW_047555). Furthermore, the regions that flank the KIR genes are longer in the rat (55 kb upstream and 25 kb downstream) than in the pig (11 and 8 kb) and primates (3–15 and 7–10 kb). More divergent is the mouse, where the LRC lacks both KIR and FCAR genes. The latter appears absent from the mouse genome and the two KIR genes are located on the X chromosome (34) in a head-to-head configuration (extracted from AL672068).
Gene content of the orangutan KIR haplotype
Of the seven genes in the orangutan KIR haplotype, two are novel and five correspond to previously characterized cDNA clones (Fig. 1E) (28). One of the novel genes, PopyKIR3DL3, is the orangutan equivalent of human KIR3DL3, the framework gene of lineage V that is adjacent to the LILR locus (Fig. 1B). The other novel gene, PopyKIRDP, is a pseudogene that lies upstream of PopyKIR2DL4 (lineage I), the position also occupied by a pseudogene in human, chimpanzee, and rhesus KIR haplotypes. The interval between PopyKIR3DL3 and PopyKIRDP contains three lineage III genes (PopyKIR2DSE, PopyKIR2DSFn, and PopyKIR2DLB), an organization like that in the corresponding region of the group A human KIR haplotype. Downstream of PopyKIR2DL4, there is only one additional gene in the orangutan haplotype. PopyKIR3DLD1 is a lineage II gene that flanks the FCAR locus and corresponds to the lineage II framework gene present in other primate species. Overall, the organization of the orangutan KIR haplotype resembles that of the human KIR A haplotype (25) and the chimpanzee KIR haplotype described by Sambrook et al. (56) (Fig. 2).
Repetitive elements in the flanking sequences. The regions that flank the KIR locus contain many repetitive elements. Comparison of the 5' flanking region, which extends from the final LILR exon to the first KIR exon, showed that no repetitive element is conserved between primate and nonprimate species (pig, mouse, and rat) (Fig. 3A). Similar comparison for the 3' flanking region, which extends from the final KIR exon to the first FCAR exon, identified a 332-bp fragment of a LINE that is common to pig and primates and is interrupted by other insertions (Fig. 3B). Whereas this shared L1MA9 fragment flanks the pig FCAR gene, in primates it is at the other end of the region, away from FCAR and close to the KIR genes. That only one small fragment has been preserved from the many repetitive elements likely present in the common ancestor of the pig and primate KIR loci points to the unusually dynamic history of the KIR locus.
Repetitive elements in the intergenic sequences.
Apart from the region between the pseudogene and KIR2DL4, the segments that separate two KIR genes can be divided into three groups based upon the repetitive elements they contain (Fig. 4). Common to all these intergenic regions is a MER2B element and a LINE fragment (L1M5). Group 1 comprises sequences associated with the most telomeric KIR genes (lineage II KIR) (Fig. 4, group 1), and in them the LINE fragment (shown red in Fig. 4) is longer by ~140 bp than in the group 2 and 3 sequences. These two groups have an Alu insertion that is responsible for the further truncation of the LINE fragment. In group 2, the Alu is full length, whereas in group 3 it has been truncated as the result of a subsequent deletion event. Also specific to group 3 is the insertion of an additional Alu element within the MER2B element. There is no orangutan representative in group 3. These relationships indicate that group 1 represents the oldest form of the intergenic sequence and that the group 2 and then group 3 forms evolved from it by successive insertion and deletion events. Analysis of the sequences upstream of the lineage V and KIR2DL4 genes showed that ~350 and ~600 bp, respectively, corresponding to the 3' end of the intergenic sequence, are present.
Repetitive elements in the intragenic sequences. Apart from the recently reported and divergent KIR3DX1 (27), we find that intron 3 of all primate KIR contains an MLT1D element inserted into an LTR33A element. This comparison also confirmed that primate lineage II and III KIR genes are distinguished from other KIR genes by the insertion of an AluSq into the MLT1D/LTR33A structure (56, 58). Extending the analysis to nonprimate species showed that the pig, rat, and mouse KIR genes all have an MLT1D/LTR33A element in intron 3, albeit truncated in rat and mouse. In contrast, analysis of the cattle draft genomic sequences containing exons 3 and 4 revealed that only the cattle KIR related to BtKIR2DL1 (4 of the 18 sequences) contained the MLT1D/LTR33A insertion. The remaining cattle KIR lacked this element and had intronic sequences more closely related to those of KIR3DX1, which from phylogenetic and genomic analyses appears to represent a second KIR lineage that is encoded by an LRC gene away from the region flanked by the LILR and FCAR genes (27). Thus, the complex of MLT1D and LTR33A within intron 3 was likely formed by an insertion in the lineage of KIR flanked by LILR and FCAR, occurring after its separation from the KIR3DX1 lineage.
Analysis of the intronic repeat structure encompassed the entire gene. Previous reports have demonstrated the maintenance of the intronic repeats both within lineages as well as those common to all KIR (56, 58). Our analysis shows that all but the lineage II orangutan KIR maintain the repeat structure common to each of the lineages described previously (56) (data not shown). The lineage II differences are confined to introns 5 and 6. Intron 6 of PopyKIR3DLD lacks the LINE present in human and chimpanzee lineage II KIR. Mm3DL1 and Mm3DL10 also lack this LINE, which was therefore inserted after separation of the human and chimpanzee lineage from the orangutan lineage. Intron 5 of human and chimpanzee lineage II KIR is characterized by two LINE fragments and four Alu elements. Of these the orangutan lineage II KIR lacks the Alu element closest to exon 5, which is also a feature of rhesus lineage II KIR. Thus, insertion of this element into human/chimpanzee lineage II KIR occurred after separation from the orangutan lineage. The orangutan does have the insertion of an AluSp in intron 5 that is common to the hominoid sequences and absent from the rhesus sequences, indicating that this insertion predates the speciation event.
Refined definition for the framework regions of the primate KIR locus
Only the 5' part of KIR3DL3 forms the centromeric framework. Located at the centromeric end of the KIR locus, PopyKIR3DL3 corresponds to human KIR3DL3, chimpanzee PtKIRNewI, and rhesus macaque MmKIRNewI. Distinguishing PopyKIR3DL3 from KIR3DL3 and PtKIRNewI is the presence in PopyKIR3DL3 of an exon 6, which encodes a stem in the predicted protein product, a feature it shares with MmKIRNewI. Domain-by-domain phylogenetic analysis showed that PopyKIR3DL3 groups strongly with KIR3DL3, PtKIRNewI, and MmKIRNewI in exons 1–5 and the intervening introns, forming the lineage V KIR (Fig. 5A). From intron 5 through to the 3' end of the sequence, this lineage affinity is lost and PopyKIR3DL3 groups strongly with the PopyKIR2D sequences, a pattern showing that PopyKIR3DL3 is the product of recombination (Fig. 5, B–D). This result is confirmed by the association of a group 2 intergenic segment with PopyKIR3DL3 (Fig. 4), in contrast to the association of group 3 segments with the lineage V KIR of other primate species.
KIR2DL4 and part of a pseudogene form the central framework. Adjacent to the cluster of orangutan lineage III KIR lies the PopyKIRDP pseudogene, which is at the same position in the KIR locus as human 3DP1, as well as chimpanzee (PtKIR3DP1) and rhesus (MmKIRDP) pseudogenes. The pseudogene sequence ends soon after the end of exon 5 and in all species a region characterized by a number of repetitive elements is found between the end of the pseudogene and the beginning of 2DL4. Phylogenetic analysis showed that each of these pseudogenes is a chimera. In the 5' part, PopyKIRDP associates strongly with the orangutan lineage III KIR, whereas human 3DP1 associates with 2DL5 of lineage I, the chimpanzee pseudogene associates with 2DP1 and the rhesus pseudogene, which lacks sequences corresponding to exons 1, 2, and 3, associates with other rhesus sequences (data not shown). The three hominoid pseudogenes group together in phylogenetic trees based on exon 5 sequences (Fig. 6A), although the grouping is weakly supported.
Domain-by-domain phylogenetic analysis shows that PopyKIR2DL4 groups strongly with human, chimpanzee, and rhesus 2DL4 sequences in all domains. Thus, there is no evidence for recombination with other KIR genes. The PopyKIR2DL4 gene defined represents a different allele (PopyKIR2DL4C) from the two previously characterized cDNA clones (PopyKIR2DL4A and PopyKIR2DL4B), which have mutations preventing expression of a full-length protein. PopyKIR2DL4A has a point mutation in exon 5 that results in premature termination; PopyKIR2DL4B has a single nucleotide deletion that causes a frameshift and premature termination. PopyKIR2DL4C has two intact ITIM motifs in the cytoplasmic tail as well as a charged residue in the transmembrane domain. In contrast, human and chimpanzee 2DL4 have only one ITIM motif and the charged residue in the transmembrane domain.
Taken together, these results show that the central framework of the hominoid KIR locus consists of exon 5 of the pseudogene, the KIR2DL4 gene, and the region of unique sequence that joins them. Although exon 5 of MmDP does not group phylogenetically with exon 5 of the hominoid pseudogenes, the identity of the breakpoint within intron 5 and the homology of the repeat region support a common origin for the primate pseudogenes in this region. Subsequent recombination with another KIR gene could explain the divergence of exon 5 of MmDP.
The telomeric framework is formed by the 3' portion of a lineage II gene. Situated between PopyKIR2DL4 and the FCAR gene is an orangutan lineage II KIR gene that corresponds to the previously characterized PopyKIR3DLD1 cDNA clone (28). In most domains, it is homologous to all other hominoid lineage II KIR. The exceptions to this are the previously described recombination events that group human KIR3DL1 with lineage III KIR from within intron 6 to the 3' end of the gene (32) and place KIR3DL2 outside of the lineage II group in intron 1. In addition, analysis of exon 3 shows that although KIR3DL2 groups with the other lineage II KIR, its branching pattern indicates a greater divergence than expected (data not shown). As discussed above, PopyKIR3DLD1 lacks two of the repetitive elements that characterize chimpanzee and human lineage II KIR, features shared with MmKIR3DL1 and MmKIR3DL10. One is a LINE element insertion in intron 6, the other is an Alu repeat in intron 5.
With the addition of PopyKIR3DLD1 to the KIR data set, the results of phylogenetic analysis show that MmKIR3DL1 and MmKIR3DL10, which were previously considered to be a separate lineage (lineage IV) (32), should be included in the lineage II KIR (Fig. 7). This grouping is supported by the sequence of a region encompassing the 5' end of the gene through to intron 6, excepting the 5' UTR, exon 1, and exon 5 where the deeper branching nodes were not well-resolved. Weak support (values <50 for the bootstrap analysis) was also observed for exon 4 and the parts of intron 6 (6a and 6c) that are retained in the MmKIR3DL1 and MmKIR3DL10 genes: the intron 6b region being absent. From exon 7 through to the 3' end of the sequence, all macaque KIR group together and are separated from the hominoid KIR. The analysis of intergenic repetitive elements (Fig. 4) suggests that this divergence is not a consequence of recombination, but of selection upon exons encoding the signaling domain of the rhesus macaque KIR. Deletion of the intron 6b region, which includes the MER70B/MSTB1 elements present in other hominoid KIR genes, appears to have been an event specific to the macaque genes of this lineage.
Orangutan lineage III KIR comprise two sublineages
Between PopyKIR3DL3 and PopyKIRDP are three lineage III KIR genes. These comprise one 2DL gene and two 2DS genes. The 2DL gene corresponds to the PopyKIR2DLB cDNA. The PopyKIR2DSE and PopyKIR2DSFn genes are related to the PopyKIR2DSC and PopyKIR2DSD cDNA, respectively (Fig. 1A). PopyKIR2DSFn has a single base pair deletion in exon 5 resulting in a frameshift and premature termination, indicating that it is a nonfunctional or null (n) allele. All three orangutan lineage III genes contain a pseudoexon 3, as seen in all lineage III KIR2D genes of other species. Like other pseudoexon 3 sequences, there is a 3-bp deletion relative to expressed exon 3 sequences. The pseudoexon has an intact 5' splice site in two of the three PopyKIR2D genes, the third gene (PopyKIR2DSFn) having an AG to AA change resulting in disruption of the splice site. In all three genes, the 3' splice site is lost through substitution of TT for GT.
The inhibitory PopyKIR2DLB and activating PopyKIR2DSE encode similar extracellular domains but different cytoplasmic tails. From the beginning of the gene through to intron 6, their sequences differ by only 2%, after which the sequence difference increases to 6% in the region encoding the signaling domain. In this region, the short-tailed PopyKIR2DSE and PopyKIR2DSFn differ by only 0.6% of their sequence, but are divergent elsewhere. Like PopyKIR2DLB and PopyKIR2DSE, pairs of human lineage III KIR with similar extracellular domains have opposing signaling function: 2DS1 with 2DL1, 2DS2 with 2DL2.
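Percentage differences of the kind quoted above can be computed from a pairwise alignment by counting mismatched columns. The Python sketch below is a minimal illustration, not the authors' procedure: the function name and the toy sequences are hypothetical, and skipping columns that contain a gap in either sequence is an assumption, as the text does not state how gaps were treated.

```python
def percent_difference(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of mismatched positions between two aligned sequences.

    Columns with a gap in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared


# Toy aligned sequences, for illustration only: 1 mismatch in 10 columns
print(percent_difference("ACGTACGTAC", "ACGTTCGTAC"))  # 10.0
```

Applying such a function separately to the region upstream of intron 6 and to the region encoding the signaling domain would reproduce the kind of region-by-region comparison made in the text.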
Several lineage III PopyKIR genes identified by cDNA analysis are not a part of the sequenced haplotype. To obtain further knowledge of their sequences, we characterized the region encompassing exons 2–4 of the PopyKIR2DSA, PopyKIR2DSB, PopyKIR2DSC, and PopyKIR2DLA genes. Phylogenetic analyses of the combined genomic and cDNA sequence data identified two sublineages for the sequences spanning exons 3–5. PopyKIR2DSA and PopyKIR2DSB comprise one sublineage, PopyKIR2DLA, PopyKIR2DSC, and PopyKIR2DSFn the other. PopyKIR2DLB, PopyKIR2DSD, and PopyKIR2DSE represent interlineage recombinants, having intron 2-intron 4 of the PopyKIR2DLA sublineage (sublineage I) and exon 5 of the PopyKIR2DSA sublineage (sublineage II). Using the sequences spanning exons 2–4, the divergence time of the two sublineages was calculated to be 10.3 ± 1.25 mya, indicating that the two sublineages arose after separation of the orangutan ancestors from human/chimpanzee ancestors, 10–18 mya (59, 60). This time is also later than the 13.5 mya estimated for the origin of the short-tailed hominoid KIR (52). Thus, the short-tailed hominoid KIR ancestor was formed before orangutan speciation and subsequently underwent orangutan-specific diversification, a model supported by phylogenetic analysis showing PopyKIR2D branches at a position orthologous to all the human and chimpanzee KIR with no affinity for any subgroup of them.
There are 29 positions of variation in the D1 and D2 domains of lineage III PopyKIR. Eighteen of these positions define the two lineages and 7 are substitutions unique to a single PopyKIR. At positions 36 and 44, there are three motifs which distinguish sublineage 1 (N36, K44), sublineage 2 (Y36, K44), and the recombinants (H36, E44) (Fig. 8A). The D1 and D2 domains of the PopyKIR2D were examined for evidence of natural selection. Recombinant sequences were removed for the analysis encompassing both domains and replaced when the domains were analyzed separately. Residues 143 and 148 were shown to be positively selected in both analyses, while selective pressure on residue 36 was only indicated in the analysis that included the recombinant sequences (Table II). As the specificity determining residue 44 is identical in the nonrecombinant sequences, there was no evidence for selection in the two-domain analysis. When the domains were analyzed separately, the value obtained for residue 44 still failed to approach significance.
strands (Fig. 8B, upper panel). The residues that distinguish the two sublineages are located on the loops of the D1 domain located opposite to the binding site and on the face of the D2 domain implicated in interdomain contacts (Fig. 8B, middle panel). The sublineage differences might therefore affect the function of PopyKIR2D by altering the angle between the two domains (61, 62) as well as altering the contact with other molecules. Finally, two of the orangutan-specific residues (P71 and T72) are located near the specificity determining residue 44 and may affect binding ability, two (G144 and G145) are located on a loop that is likely involved in interdomain interactions, and the remaining residues are located in loops of the D2 domain that are located opposite to the binding site and may interact with other molecules (Fig. 8B, lower panel).
| Discussion |
|---|
|
|
|---|
Synthesis of our results with the published literature suggests a model for the evolution of the KIR gene family. At an early stage in mammalian evolution, a progenitor KIR3D gene duplicated to produce ancestors for the two modern KIR lineages. These are the KIR3DX1 lineage, which is represented by a single gene in primates (27) but has expanded in cattle, and the lineage of KIR containing an LTR33A/MLT1D element in intron 3, an insertion that took place after the initial duplication but before subsequent expansion of the lineage. The fate of this second lineage has varied. In pigs, it remained a single gene (57), whereas in primates it evolved into an elaborate family of genes placed head to tail (Fig. 2). Although the two rat KIR genes are syntenic to pig and primate KIR, their orientation is tail to tail (extracted from NW_047555), whereas the two mouse KIR genes have moved to the X chromosome (34), where they have head-to-head orientation (extracted from AL672068).
The progenitor of the LTR33A/MLT1D-containing primate KIR was a KIR3D having a group 1 intergenic region on its 3' side and the shared 3' part of the group 2 and 3 intergenic region on its 5' side (Fig. 9, haplotype 1). Duplication of this gene produced two tandem genes in head-to-tail orientation, separated by a group 2 region that could have originated during the duplication by truncation of the LINE element (Fig. 9, haplotype 2). Either this duplication captured the AluSq common to the group 2 and group 3 intergenic sequences, or the element was inserted before successive duplications. The next duplication event resulted in the truncated pseudogene and capture of the repeat region now found between the KIR pseudogene and the lineage Ia KIR (Fig. 9, haplotype 3). Subsequent duplication and recombination events resulted in the formation of the lineage III progenitor in the lineage V-pseudogene interval (Fig. 9, haplotype 4). Also during this period, exon 4 was deleted from the lineage I KIR progenitor and an Alu insertion formed the first group 3 intergenic region (designated 3* in Fig. 8). In this manner, a primordial primate KIR locus containing all three framework regions and all major primate KIR lineages could have evolved (Fig. 9, haplotype 4). An additional duplication of the lineage III KIR and a deletion within the associated group 3 intergenic interval formed the precursor haplotype present before the speciation event leading to the rhesus macaque (Fig. 9, haplotype 5). After the speciation event, species-specific duplication and deletion events occurred, resulting in the defined rhesus haplotype (56) (Fig. 9, haplotype Rhesus).
|
Expansion of lineage III KIR in the orangutan
The defined orangutan KIR haplotype is similar to the hypothetical primordial primate KIR haplotype (Fig. 9, haplotype 4). The main difference is that the single lineage III KIR gene in the ancestral haplotype has duplicated and diverged to give three lineage III KIR genes in the orangutan haplotype. This region in human KIR haplotypes contains genes encoding the inhibitory HLA-C receptors: KIR2DL1, 2, and 3. Thus, the expansion of lineage III KIR has occurred in the same time frame as the emergence of MHC-C.
Previous cDNA analysis identified Popy2DLA and Popy2DLB as the only inhibitory PopyKIR2DL. The sequenced haplotype contains Popy2DLB but not Popy2DLA, raising the possibility that they are alleles. Popy2DLA and Popy2DLB differ at 14 positions (3 in D1, 6 in D2, 5 in the stem, transmembrane, and cytoplasmic regions). Included in these is residue 44, which is lysine in KIR2DLA and glutamic acid in KIR2DLB. In human KIR2DL of lineage III, lysine 44 is characteristic of C1 receptors, whereas glutamic acid 44 has not been observed. Thus, the sequenced haplotype does not encode PopyKIR2DLA, the best candidate for an inhibitory Popy-C receptor (preliminary analysis shows that a PopyKIR2DLA-IgFc fusion protein binds to MHC-C with C1 specificity; data not shown). Of the haplotype's two other lineage III genes, Popy2DSE encodes an activating receptor with glutamate 44 and Popy2DSFn is impaired due to a nucleotide deletion. One possibility is that the sequenced haplotype does indeed lack a gene encoding an inhibitory C1-specific KIR, which would be analogous to the orangutan MHC haplotypes that lack the Popy-C gene. Alternatively, the PopyKIR2DLB receptor could be MHC-C specific, using glutamate at position 44 to interact with Popy-C allotypes.
Previous analysis has shown that the duplication that resulted in the formation of the activating KIR occurred before the speciation event leading to the orangutan. Our analysis identified two distinct lineage III KIR sublineages in the orangutan. Divergence time estimation showed that these two groups were formed early in the orangutan lineage, but after the speciation event leading to that lineage. The mechanism was likely homogenization of the exons encoding the extracellular domains of the lineage III KIR by recombination, followed by divergence resulting in the two sublineages recognizable today. Analysis of the orangutan lineage III KIR provided evidence for selection acting on residues involved in interdomain interaction. These residues may be important in determining the hinge angle and thus the ligand-binding platform of the KIR molecule. Visual inspection of the sequence alignment also shows a cluster of substitutions within the sequence that encodes the face of the D2 domain that interacts with the D1 domain. In the orangutan, the lysine at position 44 is common to both sublineages, indicating that the glutamic acid found in the recombinant sequences arose more recently.
KIR2DL4 is the only KIR gene conserved in higher primates
Analysis of an orangutan KIR haplotype and its comparison to haplotype sequences from other species has further defined the three conserved framework regions of the primate KIR locus. The centrally situated KIR2DL4 (lineage I) is the only gene preserved in its entirety in the orangutan, chimpanzee, human, and rhesus macaque. By contrast, only the 5' part of the lineage V gene at the 5' end of the KIR locus and the 3' part of the lineage II gene at the 3' end are conserved. Although previous analysis of cDNA identified only defective PopyKIR2DL4A and B alleles (28), the PopyKIR2DL4C allele of the sequenced haplotype encodes a full-length protein that is likely functional. Human KIR2DL4 is an HLA-G receptor (63) implicated in regulating NK cell functions during reproduction. Of potential importance is that orangutan and rhesus 2DL4 have two ITIM motifs, whereas human and chimpanzee 2DL4 have only one. Further evidence for functional differences between these species is that rhesus MHC-G is unexpressed, possibly supplanted by the related MHC-AG (64, 65), and that secreted HLA-G isoforms appear specific to the human species (66) and able to stimulate NK cells through endocytosis and signaling from intracellular vesicles (67). Thus, while KIR2DL4 is the most conserved KIR gene in the higher primates, its alleles and functions are continuing to evolve. There is no evidence, so far, for a 2DL4 gene in nonprimate species. Notably, the single pig KIR gene, which is syntenic to the primate KIR locus (57), has no special relationship with primate KIR2DL4. This finding is consistent with the model shown in Fig. 9, in which the KIR2DL4 precursor arises after two rounds of duplication of the progenitor KIR3D gene. The second duplication is coincident with the "capture" of the associated repeat region, which is conserved among the primate species studied, but absent in all others, thus placing this event at a time postdating the speciation event leading to the primate lineage.
| Disclosures |
|---|
|
|
|---|
| Footnotes |
|---|
1 This work was supported by National Institutes of Health Grants AI031168 and AI024258 (to P.P.) and a Ruth L. Kirschstein National Research Service Award (F30) from National Institutes of Health (to A.M.O.A.).
2 Address correspondence and reprint requests to Dr. Peter Parham, Department of Structural Biology, Stanford University, Fairchild D-159, 299 Campus Drive West, Stanford, CA 94305. E-mail address: peropa{at}stanford.edu
3 Abbreviations used in this paper: KIR, killer cell Ig-like receptor; NJ, neighbor joining; mya, million years ago; LRC, leukocyte receptor complex; LINE, long interspersed nuclear element; dN, rate of nonsynonymous substitution; dS, rate of synonymous substitution.
Received for publication January 3, 2007. Accepted for publication April 14, 2007.
| References |
|---|
|
|
|---|

TCR rearrangement express highly diverse killer cell Ig-like receptor patterns. J. Immunol. 166: 3923-3932.