Abstract
This study of a large family of κ L chain clusters in nurse shark completes the characterization of its classical Ig gene content (two H chain isotypes, μ and ω, and four L chain isotypes, κ, λ, σ, and σ-2). The shark κ clusters are minigenes consisting of a simple VL-JL-CL array, where V to J recombination occurs over an ∼500-bp interval, and functional clusters are widely separated by at least 100 kb. Six out of ∼39 κ clusters are prerearranged in the germline (germline joined). Unlike the complex gene organization and multistep assembly process of Ig in mammals, each shark Ig rearrangement, somatic or in the germline, appears to be an independent event localized to the minigene. This study examined the expression of functional, nonproductive, and sterile transcripts of the κ clusters compared with the other three L chain isotypes. κ cluster usage was investigated in young sharks, and a skewed pattern of split gene expression was observed, one similar in functional and nonproductive rearrangements. These results show that the individual activation of the spatially distant κ clusters is nonrandom. Although both split and germline-joined κ genes are expressed, the latter are prominent in young animals and wane with age. We speculate that, in the shark, the differential activation of the multiple isotypes can be advantageously used in receptor editing.
Introduction
Generation of the primary Ab repertoire in mammals is a process the complexity of which has been well documented but is not fully understood, especially with regard to gene usage. During B cell development V(D)J recombination becomes possible when the gene segments are rendered accessible to the RAG recombinase (1–3). Two operationally intertwined series of events are involved, one set recruiting transcription factors and initiating modification of the chromatin (4–7) and the other bringing about a course of DNA looping and contraction wherein the distant gene segments are moved into close proximity with recombinational partners (8–10).
Following successful rearrangement and expression of H and L chains, forming the BCR, the B lymphocyte is further subject to selection, and autoreactivity triggers upregulation of RAG that gives rise to secondary rearrangement events (reviewed in Ref. 11). As such, multiple factors impact on V gene usage in preimmune B cells, attributable both to the molecular pathways in play and to positive and negative receptor selection.
Tetrapod vertebrate Igs are encoded by the IgH locus and by one to three IgL loci, generally organized as a series of V gene segments followed by D and/or J gene segments (12). The gene segments assemble as VDJ for H chain and VJ for L chain and are transcribed with downstream C region exons. The κ gene locus in mice, for instance, consists of 101 functional Vκ and four Jκ spread over more than three megabases (13). Rearrangement of Vκ to Jκ not only involves bringing widely separated elements together through complex DNA looping interactions but also requires complex regulation that equalizes proximal and distal gene segments (10, 14).
In contrast to the spatially spread-out, multigenic tetrapod Ig loci, the IgH and IgL genes in cartilaginous fishes are organized as multiple clusters, respectively VH-D1-D2-JH-CH and VL-JL-CL (see Fig. 1) (15, 16). V(D)J recombination takes place primarily within the Igμ minigene, which is ∼20 kb from leader to the C region transmembrane exons (17, 18). In the nurse shark the H chain clusters are >120 kb apart (18); as such, at any cartilaginous fish Ig minigene there is no apparent need for DNA looping/locus compaction because the rearranging partners of a cluster are separated by only 200–500 bp. The IgH clusters appear to function autonomously, and H chain exclusion results from asynchronous rearrangement at individual genes within a given time window (19, 20). The activation of any shark IgH cluster is discrete, independent of neighboring IgH as well as its allele. In both respects this differs from V(D)J recombination in mouse and humans, which involves a multistep process progressively mobilizing a large region of chromatin (21).
If shark IgL clusters operate independently, the expression of the four L chain isotypes in shark (22) (i.e., the κ and λ orthologs, and the σ and σ-2) ought to correlate with the number of IgL clusters, as it does for IgH. Table I provides the tally on κ genes obtained in the present study together with information on the other three isotypes. Little is known about their function or whether they play particular roles, as have evolved for mammalian κ and λ in editing autoreactive receptors (11, 23). Obtaining their relative expression levels provides, to our knowledge, the first complete picture of usage of Ig isotypes arrayed in clusters. Another aspect, peculiar to cartilaginous fishes, is the presence of prejoined Ig sequences in the genome (24–26), some of which are in-frame and potentially functional because mere transcription of the cluster is sufficient to produce a viable protein (see Fig. 1, VκR7). It has not been clear whether and to what extent germline-joined (GL-joined) genes participate in the Ab repertoire compared with the others (“split”) that must somatically undergo V(D)J recombination to be expressed as protein.
The GL κ L chain gene organization in the nurse shark was characterized. The expression of functional, nonproductive, and sterile transcripts of the κ genes was examined and compared with the other three L chain isotypes. κ gene usage was studied in young animals, and gene representation within and between isotypes was observed to be nonrandom. GL-joined κ transcripts predominated in the tissues of young animals, and the usage of split genes was skewed, appearing to be due to differential cluster activation.
Materials and Methods
Animals
Sharks were captured off the coast of Florida (Dynasty Marine). Some were immunized (shark-JS, -GR) and others were sacrificed on arrival (adult shark-RU, <2 mo-old pups) or after caesarean section (“newborn”) (20). Their organs were harvested immediately on sacrifice and placed in shark PBS (PBS with 350 mM urea and 200 mM NaCl). DNA was extracted from the erythrocyte pellet after centrifugation of blood through Ficoll. The animal protocols were approved by the Institutional Animal Care and Use Committees of the University of Maryland and of State University of New York–Downstate.
Libraries
The shark-33 erythrocyte genomic bacteriophage library (27), shark-JS spleen cDNA library (28), newborn shark spleen and epigonal organ cDNA libraries (29), and the shark-Y bacterial artificial chromosome (BAC) library have been described (30). BAC clones were purchased from Arizona Genomic Institute (http://www.genome.arizona.edu), and their grid positions (plate addresses) are listed in Table I for clones used in this study.
PCR and quantitative PCR
PCR was performed using genomic DNA or cDNA preparations. First-strand cDNA synthesis (SuperScript III, Life Technologies) was primed with oligo(dT), followed by 30–40 cycles of PCR (20). Primers used to amplify the genomic and cDNA sequences of the four L chain isotypes are listed in Supplemental Table I.
Total shark spleen RNA was treated with Turbo DNA-free (Life Technologies) to minimize DNA contamination. After DNase treatment, samples were tested for residual DNA by performing “RNA only” quantitative PCR (qPCR) reactions. Reverse transcription was performed using random hexamer primers and SuperScript III (Life Technologies) as follows: 5 min at 25°C, 1 h at 50°C, 15 min at 70°C. qPCR was performed as follows: 30 s at 95°C, 30 s at 58°C, 30 s at 72°C, carried out for 40 cycles. qPCR was performed with the iQ SYBR Green Supermix dye kit (Bio-Rad), using a CFX96 real-time PCR System (Bio-Rad). Nurse shark nucleoside diphosphate kinase (NDK, GenBank accession number M63964) was chosen as the internal standard. NDK is a housekeeping gene that is stably expressed across various shark tissues (spleen, thymus, epigonal organ, brain, stomach, and kidney; unpublished results). The primers (NDK-F, 5′-GGTAACAAGGAACGAACC-3′; NDK-R2, 5′-AGATCCTTAGGAGCCTGA-3′) targeted two exons (unpublished results). NDK mRNA was used to normalize the amount of total RNA for each qPCR reaction. Ig-specific primers are listed in Supplemental Table I. The qPCR primers were selected to enable an optimal annealing temperature at 58°C, amplification efficiency between 90 and 110%, and PCR products 104–137 bp in length. In all cases except for sterile transcripts the primers detected spliced products. Each primer pair was run in triplicate, and the entire series was performed three separate times.
qPCR data were analyzed using the cycle threshold (Ct) method (31). For each RNA sample the Ct for L chain RNA was normalized to the Ct for NDK mRNA, resulting in a ΔCt reflecting the relative level of that L chain isotype in that sample. 2^−ΔΔCt was then calculated as a measure of the fold increase of L chain RNA.
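The normalization described above can be sketched in a few lines of Python. This is a minimal illustration of the comparative Ct arithmetic, not the authors' analysis script: the function names and the Ct values in the usage lines are hypothetical, and the calculation assumes roughly 100% amplification efficiency (a doubling per cycle).

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """ΔCt: Ct of the target (an L chain isotype) minus Ct of the reference (NDK)."""
    return ct_target - ct_reference


def relative_level(ct_target: float, ct_reference: float) -> float:
    """2^-ΔCt: target level relative to the reference within one sample."""
    return 2.0 ** (-delta_ct(ct_target, ct_reference))


def fold_change(ct_target: float, ct_reference: float,
                ct_target_cal: float, ct_reference_cal: float) -> float:
    """2^-ΔΔCt: fold difference of a sample against a calibrator sample."""
    ddct = delta_ct(ct_target, ct_reference) - delta_ct(ct_target_cal, ct_reference_cal)
    return 2.0 ** (-ddct)


# Hypothetical Ct values, for illustration only.
print(relative_level(20.0, 22.0))           # target crosses threshold 2 cycles before NDK
print(fold_change(20.0, 22.0, 24.0, 22.0))  # sample versus calibrator
```

A lower Ct means more template, so a target that crosses the threshold two cycles before NDK is scored as fourfold the NDK level.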
Probes
Probes specific for the V region of κ (ns4v), λ (ns3v), and σ-2 (ns5v) have been described (19); that for σ V region (sigv) was generated from a GL sequence with the primers in FR1 (SIGF, 5′-TTACAGGTGGACAGTGTC-3′) and CDR3 (SIGVR, 5′-GTAAGTACTAGCTGAGGA-3′).
End-labeling PCR fragments
The technique for end-labeling Ig fragments to observe the spectrum of CDR3 lengths in an Ig pool was described for Xenopus H chain (32) and shark L chain (33). Briefly, purified PCR products were digested with Ava II in a 10-μl vol for 1 h, after which 1 μl from a 2.5 μl mixture of [α-32P]dCTP (15 μCi, PerkinElmer) and Klenow (5 U, New England Biolabs) was added. After 15 min at room temperature, 5 μl STOP buffer was added. The samples were denatured and loaded onto a 4% acrylamide-urea sequencing gel together with a control sequencing reaction using M13mp18 phage to calibrate strand size. The PCR reaction, using the NS4L and JL5A primers (Supplemental Table I), generated fragments of ∼380 bp. Digestion of the Ava II site in FR3 produced fragments of 265 bp and the varying bands of <120 bp containing CDR3.
Results
Igk clusters
To obtain an estimate of κ cluster numbers, a shark genomic library was screened with κ VL probe (ns4v). Thirty-five verified signals were obtained from 200,000 phages. With an average insert size of 16.9 kb, the genome coverage was 0.9 (3.75 × 10^9 bp/nurse shark genome), leading us to estimate 39 κ genes per genome. Table I provides a summary of these results, comparison of the numbers of IgL clusters of the four isotypes, identification of pseudogenes, and relative number of GL-joined genes compared with split genes.
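The arithmetic behind this estimate can be reproduced as follows. The numbers are those quoted in the text; the function names are illustrative, and the calculation assumes that library inserts sample the genome uniformly.

```python
def genome_coverage(n_clones: int, avg_insert_bp: float, genome_bp: float) -> float:
    """Genome equivalents represented by the screened clones."""
    return n_clones * avg_insert_bp / genome_bp


def estimated_gene_number(n_signals: int, coverage: float) -> float:
    """Verified hybridization signals scaled to one genome equivalent."""
    return n_signals / coverage


coverage = genome_coverage(n_clones=200_000, avg_insert_bp=16_900, genome_bp=3.75e9)
kappa_genes = estimated_gene_number(n_signals=35, coverage=coverage)

print(f"coverage = {coverage:.2f} genome equivalents")
print(f"estimated kappa genes per genome = {kappa_genes:.1f}")
```

Rounded, this gives a coverage of 0.9 and about 39 κ genes, matching the figures in the text.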
Organization of κ L chain gene clusters in the nurse shark GL. Top: κ gene clusters in the BAC23 clone (Ba141C21); relative position of psVκ18 was deduced from another linkage relationship. The orientations and distances between clusters are unknown, and the J-C intron distances are as indicated. Leader (L), VL gene segment, JL gene segment, CL exon are labeled. Bottom: κ split gene (left) with RSS shown as triangles, filled for RSS with 12-bp spacer and open for RSS with 23-bp spacer. GL-joined (right) shown with prerecombined VJ.
The nurse shark BAC library was screened with the ns4v probe, and from 124 clones 48 were selected for further analysis, of which 31 (65%) were confirmed to carry κ sequence. Twenty-one BAC clones contained one κ cluster as found by PCR amplification for VL and Southern blotting for VL and CL, and 10 others carried two to three clusters. The BAC clones are listed in Table II (GenBank accession nos. KP893402–KP893421, http://www.ncbi.nlm.nih.gov/genbank/), and included also are seven κ genes previously characterized in the same animal (26). The VL sequences with obvious structural defects, such as disrupted reading frame, splice signal, or the recombination signal sequence (RSS), are prefixed by “ps” for pseudogene. These constitute one third (11 of 30) of all VL gene segments isolated in shark-Y so far.
Although there can be more than one κ gene per BAC clone, in 9 of 10 instances only one of the clusters is apparently functional; in BAC6 the Vκ6b gene segment is linked to a defective C exon. The only exception is BAC23, where there are two adjacent ones, Vκ5 and the GL-joined VκR7 gene (Fig. 1).
Long-range PCR indicated that the JL-CL intron ranges from 8 to 12 kb (not shown). Because the average length of BAC inserts in this library is 100 kb (30), the κ clusters are situated relatively far apart and the functional genes on separate BAC clones are at least 100 kb apart.
Germline-rearranged clusters
The κ sequence was amplified from the BAC clones using universal primers in the leader and JL. The PCR product size of 1.1 kb shows that the V-J intersegmental sequence is ∼500 bp, and its deletion can be detected in the GL-joined VJ (Table II, bold rows); GL-joined VκR7 and VκRE18 were identified previously (26). In shark-Y 24 “split” genes and six GL-joined genes have been cloned. To get a sense of their distribution in the population, GL-joined κ genes of three nurse sharks were amplified from their erythrocyte DNA and 40–50 cloned sequences were analyzed in each animal. Two sharks (LA and EC) carried VκRE18, R18, R4, and a novel gene, VκR20 (GenBank accession no. HM068964); a third animal carried VκRE18, R18, RE19, and R20. By differential restriction enzyme digestion of the PCR product, it could be ascertained that the GL-joined genes in shark-LA were present in equal copy numbers and that VκR7 was absent (data not shown).
Sterile transcripts
When the κ V-J sequences were aligned, it could be seen that a polyadenylation motif (AATAAA) was present in every single case, ∼210 bp downstream of the RSS. Polyadenylation can occur at this site or 3′ of the C in nonrearranged or “sterile” transcripts. In the latter, the transcripts contain leader spliced to VL, the intersegmental region, and JL spliced to C region (Fig. 2, κ).
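A motif scan of this kind can be sketched as below. The helper and the toy sequence are hypothetical and serve only to show how the offset of AATAAA relative to a chosen reference position (for example, the end of the VL RSS) could be tabulated across aligned clusters; no real shark sequence is used.

```python
POLYA_MOTIF = "AATAAA"


def polya_offsets(seq: str, start: int = 0) -> list[int]:
    """Offsets, relative to `start`, of every AATAAA found at or after `start`."""
    seq = seq.upper()
    offsets = []
    i = seq.find(POLYA_MOTIF, start)
    while i != -1:
        offsets.append(i - start)
        i = seq.find(POLYA_MOTIF, i + 1)  # step by one so overlapping motifs are kept
    return offsets


# Toy sequence, for illustration only: motif placed 6 nt after position 4.
toy = "C" * 10 + "AATAAA" + "G" * 5
print(polya_offsets(toy, start=4))
```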
κ, σ, σ-2 sterile transcripts. RNA transcripts from the four L chain isotypes are shown, as labeled. κ, mature mRNA, sterile VL transcript, sterile transcript with VL and JC; σ-2, mature mRNA, sterile transcript with splicing within VL-JL region, sterile transcript without splicing. σ, mature mRNA, sterile transcript with splicing within VL-JL region, sterile transcript with 3′ splice site at the C exon. λ, mature mRNA. The figures drawn with polyA tails have been isolated from cDNA libraries, the others were cloned by RT-PCR. Arrows indicate location of qPCR primers. RSS are shown as filled (12-bp spacer) and open (23-bp spacer) triangles.
PCR products generated using primers in leader and 3′ of the VL RSS, wherein the leader intron has been spliced out, demonstrate that κ sterile transcripts are also present in adult spleen mRNA. Pseudogenes with defective RSS are transcribed as well. Because there is no κ signal from thymus RNA in northerns (not shown) and RAG1 is no longer produced in splenocytes (36, 37), the sterile transcripts appear to be relics of earlier events in B cell differentiation.
L chain isotypes σ and σ-2 gene clusters can rearrange. Two types of σ sterile transcripts have been found, one of which is prominent in spleen (Fig. 2, σ). A sterile transcript of similar characteristics is produced by σ-2 (Fig. 2, σ-2). The positions of qPCR primers detecting mature JC products compared with sterile transcripts are shown in Fig. 2. The generally low levels of GL transcripts (detailed below) mean that the JC products detected in qPCR consist almost entirely of transcripts of rearranged genes.
Comparative L chain isotype expression
Differential L chain isotype expression was initially examined by screening a shark-JS spleen cDNA library. C region probes detected similar numbers of IgM-hybridizing phages (229) as κ (216), in contrast with considerably fewer IgW (8), λ (13), σ (8), and σ-2 (19) phages. Screening of larger numbers of phages yielded comparable results among the non-κ sequences (49 λ, 39 σ, 65 σ-2). The graph in Fig. 3A shows that qPCR performed on shark-JS spleen RNA produced parallel results, with respect to κ (48.8) versus non-κ levels, that is, λ (0.3 + 0.6, total 0.9), σ (2), and σ-2 (3.8 + 0.2, total 4). Only B cells rearrange and express L chains (19), but the cellular composition of the spleen changes with age (36), so that standardization with NDK, which is expressed in the other cell types, enables comparison of L chain expression only within an RNA sample. Despite this caveat, it can be seen that overall L chain levels increase with age, consistent with studies on plasma Ig levels showing rise of the 19S species during neonatal development (38).
Determination of relative L chain transcript levels in individual sharks. DNase-treated RNA samples were analyzed using qPCR detection system. Spleen samples from five individuals were screened for σ-2, κ, λ, and σ genes, specific genes as labeled, and GL-joined genes are designated with asterisks. (A and C) immunized sharks, (B) nonimmunized shark, (D) pup shark, and (E) newborn shark are shown. Graphs show L chain levels normalized to NDK (fold increase to NDK, 2^−ΔΔCt). NDK level is shown in gray. ST, sterile transcript.
There are at least 25 functional κ clusters, four λ, three σ-2, and two σ clusters (Table I). If there were a simple correlation with the number of available functional genes, at least 74% of the L chain mRNA from lymphoid tissue would consist of κ transcripts. Indeed, in the immunized (Fig. 3A, 3C) and nonimmunized (Fig. 3B) animals, κ transcripts are the majority. Although this is also true in young animals, the relative levels are different. In the newborn (Fig. 3E) and the pup (Fig. 3D) of estimated 2 mo, the ratio of two σ-2 rearranging genes, NS5-2/48, to κ is 1:3.3 (1.3:4.3) in AQ and 1:3.4 (6.7:22.8) in LA. These results are consistent with in situ data on splenic secretory cells of young animals (C. Castro and M. Flajnik, unpublished results). In contrast, the σ-2/κ ratios in adults range from 1:14 to 1:31, showing that after an initial bias for σ-2, selection has taken place over time in the adults.
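The expectation and the ratios quoted above follow from simple proportions, sketched here. The cluster counts and normalized levels are taken from the text; the variable names are illustrative, and the "expected" share assumes that every functional cluster contributes equally to the mRNA pool.

```python
# Minimum numbers of functional clusters per isotype, as quoted from Table I.
functional_clusters = {"kappa": 25, "lambda": 4, "sigma-2": 3, "sigma": 2}

total_clusters = sum(functional_clusters.values())
expected_share = {iso: n / total_clusters for iso, n in functional_clusters.items()}
print(f"expected kappa share: {expected_share['kappa']:.1%}")


def kappa_per_sigma2(sigma2_level: float, kappa_level: float) -> float:
    """Express a sigma-2:kappa pair of levels as 1:x and return x."""
    return kappa_level / sigma2_level


# Normalized levels (NS5-2/48 versus kappa) quoted for the two young sharks.
ratio_aq = kappa_per_sigma2(1.3, 4.3)
ratio_la = kappa_per_sigma2(6.7, 22.8)
print(f"AQ 1:{ratio_aq:.1f}, LA 1:{ratio_la:.1f}")
```

Under the equal-contribution assumption κ would account for about 74% of L chain mRNA, and the young-animal ratios come out at roughly 1:3.3 and 1:3.4.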
Primer pairs distinguishing sterile transcripts included one primer either 3′ of the VL RSS or 5′ of the JL RSS (see indicated locations in Fig. 2). The sterile transcript levels were generally 1–10% of the total levels of that isotype (Fig. 3, bars labeled ST). The most abundant sterile transcript tended to be σ, where the GL levels from two clusters could consist of 30% of the total σ pool (0.6 of 2.0, Fig. 3A). Although there are also two split σ-2 clusters (NS5-2/48) that can produce sterile transcripts, there is much less (<0.4%) in all the animals.
GL-joined genes, indicated by an asterisk in Fig. 3, can be compared with split gene expression in the σ-2 isotype, where two clusters rearrange (NS5-2, 48) and one is prejoined (NS5-16). The ratio of the GL joined to split in newborn is 1:2.6 and in the pup 1:6, and from 1:7 to 1:19 in the adults (Fig. 3, σ-2). In summary, the GL-joined σ-2 is expressed at similar frequency as the two split genes in neonates but expression decreases with age. Although the levels of the GL-joined σ-2 are low, they are almost always higher than the single-copy GL-joined λ gene (NS3-8). All λ genes are GL joined, and although there are several such clusters, the expression level of this isotype is the lowest in all individuals, often on a par with sterile transcripts of κ or even σ (Fig. 3, indicated as ST). The GL-joined κ have a very different expression pattern (detailed below).
κ gene usage
The κ sequences were amplified from pup spleen mRNA using 30 cycles of PCR to minimize artifact. Pup Ig is little affected by somatic hypermutation (34, 39), so assignment to a GL gene was not difficult. GL-joined κs constitute 36% of the cloned sequences (Table III, column 1) although they are 16% of the functional κ genes, and they were not preferentially PCR amplified because they are within the size range of rearranging sequences. This supposition is confirmed by the few repeats observed among somatically generated CDR3.
Moreover, there appeared to be a skewing of expression among split genes, in particular Vκ5 and Vκ53 (Table III, column 1, Supplemental Fig. 1). To determine whether the B cell population had undergone selection, nonproductive κ rearrangements were isolated from total spleen RNA. To obtain nonfunctional VJ, we used the same reverse primer but employed a forward PCR primer (NS4LI1) within the leader intron (37). Thirty-eight of 97 of the cloned sequences (Table III, column 2 under shark-LA, Supplemental Fig. 1) were rearranged out-of-frame, and 29 of 38 (76%) were Vκ5 and Vκ53, similar to what was found in total RNA (28 of 42, 67%).
In comparing the 35 Vκ5 cDNA, 32 were identical in VL and CL other than occasional single base differences that were not shared between clones, confirming bias for the Vκ5 cluster. The 22 Vκ53 clones were similarly verified. Their combined 71% frequency is not due to multiple genomic genes because two copies of Vκ53 and none of Vκ5 were observed among 21 functional split Vκ genes isolated from this animal. There also appears to be no great difference in RSS sequences between highly frequent and underrepresented κ clusters.
To ascertain whether skewed κ gene usage is the norm, nonproductive κ rearrangements were amplified from another pup (Table III, shark-EC). Only split genes were tallied, and 34 of 87 were out-of-frame. The distribution of Vκ gene usage is different in the second pup. Among the nonproductive sequences, 19 of 34 (56%) were Vκ2b and Vκ53. A similar distribution was found among the in-frame split genes (36 of 53, 68%). Of the genes prominent in pup LA, Vκ5 was not observed at all (<1 of 87) in pup EC.
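The comparison between nonproductive and other rearrangements rests on the clone counts quoted above, which can be tabulated as follows. This is only a bookkeeping sketch using the counts stated in the text; the dictionary layout and names are illustrative.

```python
# (clones assigned to the two dominant Vκ genes, clones tallied), as quoted in the text.
usage = {
    ("shark-LA", "nonproductive"): (29, 38),  # Vκ5 + Vκ53
    ("shark-LA", "total RNA"): (28, 42),      # Vκ5 + Vκ53
    ("shark-EC", "nonproductive"): (19, 34),  # Vκ2b + Vκ53
    ("shark-EC", "in-frame"): (36, 53),       # Vκ2b + Vκ53
}


def percent(part: int, whole: int) -> int:
    """Share of clones as a whole-number percentage."""
    return round(100 * part / whole)


summary = {key: percent(*counts) for key, counts in usage.items()}
for (shark, category), pct in summary.items():
    print(f"{shark:9s} {category:14s} {pct}%")
```

In each pup the dominant pair accounts for a comparable share of nonproductive and other rearrangements, which is the basis for arguing that the skew precedes selection on the receptor.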
GL-joined cDNA in newborn shark library
Newborn shark cDNA libraries, one constructed from two spleens and another from two epigonal organs, were screened with the ns4v probe. The phage inserts were sequenced and the results are summarized in Table IV. Most κ sequences from both libraries consisted of GL-joined genes. VκRE18, R18, and R6 all contain CDR3 of nine codons; those of the recombined VJ ranged from 7 to 13 codons. The 19 unique somatically recombined sequences consisted of at least 11 Vκ genes, without biased representation of Vκ5 or Vκ53.
κ sequences were amplified from the spleen cDNA of the same newborn animals and from adult PBL. The samples were digested with Ava II, a site present in 13 of the 16 functional κ genes and 4 of 5 GL-joined genes in Table II. Fig. 4 shows a comparison of the neonatal and adult κ populations, the dominant band of nine codons in newborn cDNA confirming the GL-joined character (RE18, R18) of κ L chain pool in these young animals (Table IV).
κ L chain CDR3 lengths of newborn and adult. Top: Flowchart of end-labeling procedure. κ V region PCR products are generated from an RNA pool of two newborn spleens and from adult PBL RNA. The fragments are purified and incubated with Ava II, followed by end-labeling with [32P]dCTP (see Materials and Methods). Bottom: Electrophoresis of denatured fragments. The samples were loaded next to M13mp18 phage (first four lanes labeled GATC) sequenced with −40 primer. The sizes of the shark fragments and their CDR3 were determined by the phage strand size plus labeled nucleotide (101–116 bases plus 1). The other halves of the Ava II–digested fragments are not shown.
Discussion
Although Ig L chain gene isotypes and their cluster organization in cartilaginous fishes have long been established (16, 22), the expression patterns and relative usage of the various isotypes and individual genes have not been investigated in depth. In this study the characterization of the Igk clusters in nurse shark completes classification of its classical Ig genes. All RNA forms of the various L chain isotypes, including sterile transcripts, were studied and their relative expression determined by qPCR. κ gene usage was examined in several animals.
L chain isotype and GL-joined gene expression
Quantitative PCR measurements in adult animals showed that κ was the most abundantly expressed L chain isotype, followed by σ-2, σ, and λ, generally according to gene numbers. However, in young animals the distribution of individual gene expression is distinctly uneven with respect to the few σ-2 genes compared with the many κ genes. It appears that σ-2 genes are at least equally if not more readily accessible to RAG than κ clusters.
Whereas between isotypes one may anticipate cis-regulatory differences affecting transcriptional initiation or efficiency, it is less easy to understand those within one L chain isotype because of the similarity of their upstream sequence. That is, within the κ and the σ-2 isotypes the relative abundance of their GL-joined genes is not proportional to split genes (Fig. 3). The GL-joined κ genes are the major transcript in total spleen RNA in young animals, and this wanes in older animals (26). This observation is a complex issue on several counts. If the GL-joined gene were expressed as part of the BCR, then this cell population is one that decreases dramatically as the animals age. However, what if a GL-joined IgL is transcribed in a B cell but the protein is not used due to incompatibility with the H chain? Unlike the vast majority of nonproductive rearrangements, the in-frame GL-joined transcripts will not be downregulated as a result of RNA surveillance mechanisms (40). As a result, when such genes are activated but are not used as part of the BCR, the transcript levels could be as high as the actual L chain in use.
However, if this were the case, the levels in newborns and adults would not be so very different as found. Our observations can be better explained if certain B cell subpopulations preferentially express the GL-joined genes. In the nurse shark IgM1gj is a unique IgH cluster with in-frame VDJ and three C exons that have diverged extensively from Cμ2–Cμ4, and only the secreted protein has been observed (29). In cells that secrete IgM1gj RAG would not be needed for its expression. However, IgM1gj protein is secreted with L chain, and in the absence of RAG only the GL-joined IgL clusters can produce L chain protein; in the split clusters sterile transcripts would result. Because IgM1gj is prominent only in young animals, the notion of a subpopulation of cells with preference for GL-joined L chain expression is at this time compatible with the qPCR and cloning observations. If κ and σ-2 genes are generally activated earlier than the other two isotypes, then the GL-joined κ and σ-2 genes would also be expressed in IgM1gj cells. With age, this subpopulation recedes, together with the levels of these GL-joined transcripts.
Unlike nurse sharks, the clearnose skate carries many different GL-joined VDJ, and these constitute 10–15% of H chain transcripts in embryos and hatchlings but cannot be detected in adults (41). Because a large portion of these are out-of-frame or carry debilitating mutations, their expression has been attributed to a generalized IgH cluster activation in cells present in the skate lymphoid tissues. Litman and colleagues (41) suggested that “widespread run-off of Ig clusters” enabled Ig expression of limited diversity early in development. Inasmuch as IgM1gj encoded by a unique shark cluster and the many skate GL-joined VDJ sequences have not been shown to produce actual protein, the similarities of age-dependent GL-joined Ig expression in the shark and skate suggest a form of limited protection by this kind of secreted Ig early in development.
κ gene usage
RSS accessibility in mouse Ig and TCR systems has been correlated with chromatin environment, nucleosome density, and participation of cis-regulatory sites (42, 43). However, recombinational efficiency among the V gene segments is not evenly distributed; for instance, 7 mouse Vκ genes out of 101 participate in >40% of its repertoire (44). It is thought that the frequency of Vκ interaction with the intronic and 3′ κ enhancer and the Sis element (45) involved in locus contraction makes the Vκ gene segment accessible for rearrangement to JL, and the frequency is in part correlated with binding of transcription factors and histone modifications. A recent study concluded that the greater the number of nearby transcription factor–binding sites, the more frequent the usage (9).
What has emerged from the present study is that the split κ cluster usage in shark is also not random. In shark-LA, Vκ5 and Vκ53 make up 71% of the κ transcripts derived from somatically rearranged IgL. Because the representation is similar in nonproductive rearrangements as in total RNA, the bias is not due to selection on the BCR. Although most of the sequence upstream of the leader among the Vκ is almost identical (Ref. 25 and unpublished results), even a few differences may affect promoter efficiency (46). Of the known Vκ that are expressed in shark-LA, six of seven carry identical, canonical RSS whereas those that do not have 1–2 bp substitutions in the nonamer. It is possible that more than one IgL is accessible to RAG in a cell, as suggested by the presence of sterile transcripts (47), but unless many Igk are competitively involved it is hard to see how efficiency of RSS binding in itself causes unequal expression levels. In mice there was no clear relationship between the quality of RSS with frequency of usage (44).
Because nothing is known about H or L chain gene promoters or enhancers in the shark, the basis for its preferential Igk cluster activation must remain speculative. Regulatory differences could be due to as-yet-unidentified cis-regulatory elements in the J-C intron or 3′ of the C exon as in the mouse Igk. Additional activating elements may be more thickly distributed among some shark Vκ clusters than others. Among individual animals the most frequently rearranged Vκ genes are not the same. The sharks probably do not carry the same complement of Igk, as shown by the presence of VκR7 in shark-Y but not in shark-LA. However, this cannot explain other observations. For instance, both pups express the Vκ2b cluster but in one animal it becomes recombined at a higher frequency (Table III). This suggests that elements rendering any Igk accessible to RAG are unevenly distributed in the population. Perhaps certain regulatory elements for Vκ2b are not tightly linked to the cluster so that meiotic recombination or other events have severed their influence in shark-LA. Recent studies in chromatin profiling of mouse Ag receptor loci have revealed additional, novel regulatory elements, demonstrating that recombinational potential is not controlled solely by the classical enhancers (48).
Mouse and shark
What makes for the shark L chain diversity? Interestingly, junctional diversity and combinatorial diversity are inversely important in the mouse and shark. In the mouse, with >100 functional Vκ and 4 Jκ there are at least 400 combinatorial possibilities for recombined VJ; in shark there are only as many VJ combinations as there are functional clusters, that is, 25 (Table I). In mouse the Vκ are diverse and classified into 18 families, where 55–80% sequence identity is shared between families (49). During receptor editing nested rearrangements will recruit upstream VL of different families (44). In contrast, in the shark the functional Vκ are all members of one family that share 80–98% identity. Whereas >90% of mouse κ CDR3 are nine codons in length due to downregulation of TdT activity during L chain rearrangement (50, 51), this is not the case in shark because 90% of its rearranging L chains have N region and thus extensive CDR3 diversity (Refs. 22, 35 and Supplemental Fig. 1). Therefore, where in mouse the Vκ gene segment choice is diverse but the CDR3 loop size spectrum is limited, in shark the L chain diversity hinges on the size and sequence of CDR3. Moreover, for the same reasons of cluster organization and low (9–12) minigene number the H chain diversity in nurse shark also centers on CDR3. This being the case, then the specificity and potential autoreactivity of an Ig receptor in developing shark B cells are more likely to lie in the CDR3.
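The contrast in combinatorial potential can be made explicit with a short calculation. The segment and cluster counts are those given in the text (101 functional mouse Vκ, 4 Jκ, 25 functional shark κ clusters); the function names are illustrative, and the mouse figure assumes that any V can in principle pair with any J.

```python
def translocon_vj_combinations(n_v: int, n_j: int) -> int:
    """Translocon locus: any V may pair with any J within the locus."""
    return n_v * n_j


def cluster_vj_combinations(n_functional_clusters: int) -> int:
    """Cluster organization: each minigene pairs only its own V with its own J."""
    return n_functional_clusters


mouse_kappa = translocon_vj_combinations(n_v=101, n_j=4)
shark_kappa = cluster_vj_combinations(n_functional_clusters=25)
print(f"mouse κ VJ combinations: {mouse_kappa}")
print(f"shark κ VJ combinations: {shark_kappa}")
```

The roughly 16-fold gap in VJ pairings is what shifts the burden of shark L chain diversity onto junctional (CDR3) variation.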
Based on the above observations we will make a speculation about L chain isotype function and receptor editing in the shark. Although the shark Ig gene organization and Ig assembly may operate differently from those of tetrapods, one fundamental aspect of adaptive immunity they must share is the generation of autoreactive receptors. The randomness of V(D)J recombination, selected for in evolution, is anticipated to produce a majority of receptors that react to the internal self environment (52). In the mouse, when secondary rearrangements at the κ locus are exhausted, V(D)J recombination goes on to a different L chain isotype, the λ locus (53, 54). If any form of receptor editing exists in shark, the constraints of the cluster organization oblige secondary rearrangement to take place at another IgL cluster. This could permit L chain inclusion, but in surviving B cells the first L chain is either functionally inactivated or is displaced in interaction with H chain.
We suggest that if autoreactivity resides in the CDR3 combination of the H and L chains, then further rearrangement at another IgL of the same family could rectify this. However, if the CDR3 of the initial L chain is not crucial to the self-specificity, then the possible remedy is to express another L chain isotype that will, overall, create a different combining site with the H chain. The initial IgL activated is possibly at σ-2, given its preferred expression in young sharks, whose σ-2/κ ratio was 1:3 (Fig. 3D, 3E). A secondary rearrangement could take place at the second σ-2 cluster or at another isotype entirely. Although clonal deletion is an unanswerable solution for lymphocytes expressing anti–self receptors, a form of receptor editing may nonetheless take place in the shark to allay cell wastage, utilizing advantageously its multiple L chain isotypes.
Disclosures
The authors have no financial conflicts of interest.
Acknowledgments
We thank Amanda Chan for determining the genomic organization of shark NDK and Helen Dooley for sharing sequence information on CD79a.
Footnotes
This work was supported in part by funding from National Institutes of Health Grant GM068095 (to E.H.).
The sequences presented in this article have been submitted to GenBank (http://www.ncbi.nlm.nih.gov/genbank/) under accession numbers KP893402–KP893421.
The online version of this article contains supplemental material.
Abbreviations used in this article:
- BAC, bacterial artificial chromosome
- GL, germline
- NDK, nucleoside diphosphate kinase
- qPCR, quantitative PCR
- RSS, recombination signal sequence
- Received June 24, 2015.
- Accepted August 4, 2015.
- Copyright © 2015 by The American Association of Immunologists, Inc.