Collectins are oligomeric molecules with C-type lectin domains attached to collagen-like regions via α-helical neck regions. They bind nonself glycoconjugates on the surface of microorganisms and inhibit infection by direct neutralization, agglutination, or opsonization. During the characterization of the gene encoding bovine CL-43 (43-kDa collectin), we identified a novel collectin gene. We report the cloning and partial characterization of the novel collectin CL-46. The mRNA comprises 1188 nucleotides encoding a protein of 371 aa with an included leader peptide of 20 residues. CL-46 has two cysteine residues in the N-terminal segment, a potential N-glycosylation site in the collagen region, and an extended hydrophilic loop close to the binding site of the carbohydrate recognition domain. It is expressed in the thymus, liver, mammary gland, and tissues of the digestive system. Recombinant CL-46 corresponding to the α-helical neck region and the C-type lectin domain preferentially binds N-acetyl-d-glucosamine and N-acetyl-d-mannosamine. The gene encoding CL-46 spans ∼10 kb and consists of eight exons, with high structural resemblance to the gene encoding human surfactant protein D. It is located on bovine chromosome 28 at position q1.8 together with the genes encoding conglutinin and CL-43. Several potential thymus-related cis-regulatory elements were identified in the 5′-upstream sequence, indicating that the expression in thymus may be modulated by signals involved in T cell development.
Collectins are a group of proteins containing C-type carbohydrate recognition domains (CRD)4 attached to collagen-like regions via α-helical coiled-coil regions (1). The group includes mannan-binding lectin (MBL), surfactant proteins A and D (SP-A and SP-D), conglutinin, 43-kDa collectin (CL-43), and the recently identified CL-L1 and CL-P1 (2, 3). Conglutinin and CL-43 have so far only been identified in the Bovidae. MBL, conglutinin, and CL-43 are plasma proteins synthesized in the liver. SP-A and SP-D are mainly found in pulmonary surfactant and are synthesized by alveolar type II cells and Clara cells. In addition, epithelial cells of a variety of mucosal surfaces express SP-D, and SP-A is also produced by cells lining the gastrointestinal tract (4, 5). The collectins play an important role in the nonadaptive immune defense, as demonstrated by the finding that SP-A- or SP-D-deficient mice are susceptible to a variety of infections (6, 7). They bind via their CRDs to nonself glycoconjugates on the surface of microorganisms. Binding leads directly to neutralization and aggregation and, furthermore, to opsonization by macrophages using various collectin receptors (8, 9). SP-A- and SP-D-mediated opsonization decreases the production of proinflammatory cytokines and dampens the T cell response, protecting the lung from inflammation and immune-mediated damage (10, 11). MBL is the only collectin that activates the complement system. After binding to microorganisms, the MBL-associated serine proteases cleave and activate C4, C2, and C3 (12). This may lead directly to complement-mediated lysis of the microorganisms or may indirectly increase the opsonization mediated by deposition of C3. Conglutinin binds specifically to the carbohydrate moiety of iC3b and agglutinates cells coated with this complement product (13, 14). Besides binding to carbohydrates, the collectins bind unique phospholipids (15, 16).
The interaction of SP-D with phospholipids influences the surfactant homeostasis in that SP-D-deficient mice accumulate lung surfactant and have a changed morphology of alveolar type II cells and macrophages (17, 18).
Here we report the primary structure of a new member of the collectin family, which we have named collectin 46 (CL-46). CL-46 has a distinct expression pattern, being expressed mainly in the thymus and liver, and structurally it seems to be a hybrid of SP-D and conglutinin.
Materials and Methods
RNA was isolated from 250 mg of bovine thymus using the TRI-reagent kit according to the manufacturer’s instructions (Sigma-Aldrich, St. Louis, MO). cDNA was synthesized from 3 μg of total RNA with Superscript II H− reverse transcriptase (Life Technologies, Grand Island, NY) according to the recommended conditions and with the specific primer 5′-CCATAGGAGGCCTGGCTT-3′.
Two microliters of the 20-μl cDNA reaction was used as template in the following PCR to amplify a full-length CL-46 transcript. In a total volume of 30 μl, a PCR was performed using the primers 5′-GCACTTCAGACTCCAGTACTAGCCTGTG-3′ and 5′-CCATAGGAGGCCTGGCTT-3′ and using Pwo polymerase (Roche, Mannheim, Germany) according to the conditions recommended by the manufacturer. After 2 min initial denaturation at 94°C, 0.5 U of Pwo polymerase was added and 10 cycles of 94°C for 45 s, 69–60°C for 30 s, 72°C for 3 min were performed with a decreasing annealing temperature of 1°C per cycle. Twenty-five cycles at 94°C for 45 s, 59°C for 30 s, 72°C for 3 min followed. Half a unit of Taq polymerase (Life Technologies) and fresh dNTPs (to a final concentration of 1 mM) were then added, followed by a final extension of 6 min at 72°C. The product was analyzed by agarose gel electrophoresis and purified from the excised gel plug with the Qiaex II gel extraction kit (Qiagen, Hilden, Germany). Ten nanograms of the product was ligated into the pCRII vector and heat shock transformed into INVαF′ Escherichia coli using the Original TA cloning kit (Invitrogen, Groningen, The Netherlands). Plasmids were purified using the Wizard plus SV Miniprep purification system (Promega, Madison, WI) and sequenced.
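The touchdown profile described above can be written out cycle by cycle. The following is a minimal sketch in Python, not part of the original protocol; the function name `touchdown_schedule` and its parameters are hypothetical and introduced only for illustration, while the temperatures and cycle numbers are those given in the text.

```python
def touchdown_schedule(start_c, stop_c, hold_c, hold_cycles, step_c=1):
    """Return the annealing temperature (in °C) for each PCR cycle.

    The touchdown phase steps from start_c down to stop_c (inclusive),
    then hold_cycles further cycles run at the fixed temperature hold_c.
    """
    if step_c <= 0 or start_c < stop_c:
        raise ValueError("expected start_c >= stop_c and step_c > 0")
    touchdown = list(range(start_c, stop_c - 1, -step_c))
    return touchdown + [hold_c] * hold_cycles


# Full-length CL-46 amplification: 10 cycles at 69-60 °C, then 25 cycles at 59 °C.
full_length = touchdown_schedule(69, 60, 59, 25)
print(len(full_length), full_length[:10])
```

The same helper would describe the other touchdown reactions in this section by changing the arguments, for example `touchdown_schedule(61, 52, 51, 19)`.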
RNA was isolated from various tissues and cDNA synthesis was conducted as described above using the primers 5′-TGTCATTCCACTTGCCCTCCC-3′ and 5′-TTCTCTTCCATAAACATTCTCA-3′ specific for CL-46 and bovine β-actin, respectively. Between 1 and 3 μg of RNA was used as template in the synthesis of cDNA. A volume of 1–3 μl of the first-strand synthesis was used as template in the following PCRs. β-Actin was amplified by a PCR using the primers 5′-CTGGCCGGGACCTGACA-3′ and 5′-CACGCCGCAGTCCATTTAG-3′ according to standard conditions for the Taq polymerase (Life Technologies). After 2 min of initial denaturation, 0.2 U of Taq was added to a 30-μl reaction. Twenty-five cycles of 94°C for 45 s, 54°C for 30 s, 72°C for 45 s, and a final extension at 72°C for 1.5 min followed. CL-46 was amplified as described above using the primers 5′-CTCAAGCAGCGGGTGACC-3′ and 5′-CTCTGGTTGTCCAGCATTGT-3′. Amplification comprised initial denaturation at 94°C for 2 min, addition of 0.2 U of polymerase, and 10 cycles at 94°C for 45 s, 61–52°C for 30 s, 72°C for 30 s, with a 1°C decreasing annealing temperature per cycle. Nineteen or 27 cycles at 94°C for 45 s, 51°C for 30 s, 72°C for 30 s, and a final extension at 72°C for 1.5 min followed. The products were analyzed by agarose gel electrophoresis.
A CL-43-derived probe, spanning nucleotides 633–1026 of the CL-43 cDNA sequence (GenBank accession number X75912), was obtained by PCR amplification with the primers 5′-GGCCTCCCCACGCTCTTCA-3′ and 5′-CCTTCTGGCCTCATCCTGTGG-3′ and a previously isolated CL-43 cDNA clone as template (19). The PCR consisted of 4 min of initial denaturation at 94°C followed by 30 cycles of 94°C for 45 s, 56°C for 30 s, 72°C for 30 s, and a final extension step at 72°C for 1 min. The product was purified using the Qiaex II gel extraction kit and labeled with [32P]dCTP using an oligolabeling kit (Amersham Pharmacia Biotech, Piscataway, NJ) and random primers according to the recommended procedure. Approximately 2 × 10⁵ plaques of a bovine genomic EMBL3 lambda phage library (Clontech, Palo Alto, CA) were screened with the [32P]-labeled probe. The final high-stringency wash was done in 0.3× SSC at 55°C for 20 min. Positive clones were replated and rescreened twice to uniformity. Phage DNA was isolated, digested with SacI, and analyzed by agarose gel electrophoresis. Fragments not observed in a similar digestion of empty EMBL3 vector were purified using the Qiaex II gel extraction kit. Purified fragments (20–100 ng) were ligated into pBluescript SK+ vector (25 ng), previously digested with SacI and treated with calf intestinal phosphatase, and heat shock transformed into chemically competent XL-10 E. coli. Plasmids were purified and sequenced as above. To assemble the nonoverlapping sequences of the subclones, three PCRs overlapping the junctions were performed with the following primers: 5′-CGGACCAAAGGGAGACACTG-3′ (P1), 5′-GTCCCTGCTGCTCACACCAC-3′ (P1), 5′-AGAGGAAAAGGGCAGTGGGTCAG-3′ (P2), 5′-GGGTGAGGAGAAAGGGGAGGAC-3′ (P2), 5′-GAGCATGAATGACATCTCCAC-3′ (P3), and 5′-AGAAGAGAAAAGGGAAGGATGT-3′ (P3). Fifty nanograms of purified phage DNA was used as template and denatured at 94°C for 4 min.
The amplification comprised 30 cycles of 94°C for 45 s, 54°C (P1)/59°C (P2)/49°C (P3) for 30 s, 72°C for 1 min (P1) or 30 s (P2, P3) and was followed by a final extension step at 72°C for 2 min. Products were analyzed by agarose gel electrophoresis, purified, and sequenced. The 5′-upstream sequence was analyzed for potential cis-regulatory elements according to Quandt et al. (20) and using matrices from the Transfac 4.0 database (Biological Databases, Braunschweig, Germany). The analysis was performed at high stringency, requiring a complete core match and a relative fit of >0.9.
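The two search criteria (a complete core match and a relative fit above 0.9) can be illustrated with a deliberately simplified matrix scan. This sketch is not the algorithm of Quandt et al. (20), which is not reproduced here. It assumes a plain position frequency matrix, defines the fit as the site score rescaled between the worst and best possible scores, and expects uppercase A/C/G/T input. The matrix `TOY_MATRIX`, the core positions, and all function names are hypothetical.

```python
def relative_fit(matrix, site):
    """Score a site against a position frequency matrix, rescaled to 0..1."""
    score = sum(col[base] for col, base in zip(matrix, site))
    best = sum(max(col.values()) for col in matrix)
    worst = sum(min(col.values()) for col in matrix)
    return (score - worst) / (best - worst)


def core_matches(matrix, site, core):
    """True if the site carries the most frequent base at every core position."""
    return all(site[i] == max(matrix[i], key=matrix[i].get) for i in core)


def scan(sequence, matrix, core, threshold=0.9):
    """Yield (offset, site, fit) for every window passing both criteria."""
    width = len(matrix)
    for i in range(len(sequence) - width + 1):
        site = sequence[i:i + width]
        if core_matches(matrix, site, core):
            fit = relative_fit(matrix, site)
            if fit > threshold:
                yield i, site, fit


# Hypothetical 5-position matrix resembling a TATA-like element (illustration only).
TOY_MATRIX = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.8, "C": 0.0, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.0, "T": 0.8},
    {"A": 0.9, "C": 0.0, "G": 0.05, "T": 0.05},
    {"A": 0.5, "C": 0.1, "G": 0.1, "T": 0.3},
]
TOY_CORE = (0, 1, 2, 3)

hits = list(scan("GGTATAAGC", TOY_MATRIX, TOY_CORE))
print(hits)
```

Requiring the core match before computing the fit mirrors the idea that a few highly conserved positions must be present before the flanking positions are weighed.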
Genomic DNA (20 ng) of 90 hamster/cow somatic cell hybrids (21) was used as template in PCRs with the primers 5′-GCAGGTGCTGTAAAATCATATTC-3′ and 5′-GGGATGTTGGCAGCTCA-3′. Amplification comprised a 4-min initial denaturation, 10 cycles at 94°C for 45 s, 61–52°C for 30 s, 72°C for 25 s, with a decreasing annealing temperature of 1°C per cycle, and followed by another amplification of 25 cycles at 94°C for 45 s, 51°C for 30 s, 72°C for 25 s, and a final extension at 72°C for 2 min. Products were analyzed using agarose gel electrophoresis.
Expression of rCL-46 in Pichia pastoris
cDNA corresponding to the α-helical coiled-coil neck region and the CRD of CL-46 was amplified by a PCR using the primers 5′-TACGTAGTCAATGCTCTCAAGCAGCGG-3′ and 5′-CCTAGGTCAAAACTCGCAGATCACAAGGAG-3′ that insert SnaBI and AvrII restriction sites in the 5′- and 3′-end, respectively. Ten nanograms of pCRII plasmid encoding full-length CL-46 cDNA was used as template in a 60-μl PCR with 0.5 U Pwo polymerase. Amplification comprised 2 min of initial denaturation at 94°C, 30 cycles consisting of 94°C for 45 s, 57°C for 30 s, 72°C for 1 min, and a final extension at 72°C for 2 min. The product was isolated using the Qiaex II gel extraction kit as before and ∼300 ng of DNA was digested with SnaBI and AvrII. After agarose gel electrophoresis, the product was purified using the Qiaex II gel extraction kit, and 25 ng was ligated into 100 ng of pPIC9K vector (Invitrogen), previously digested with SnaBI and AvrII and treated with calf intestinal phosphatase. Ligated plasmids were heat shock transformed into XL-10 E. coli as described above and purified from a sequence-verified clone. Eight micrograms of SacI-linearized plasmid was transformed into P. pastoris (GS115) by electroporation (22). Clones were double selected by growth on histidine-deficient plates and plates with increasing concentrations of geneticin according to the manufacturer’s protocol. Expression was done in medium made of 200 mM phosphate buffer, 1.34% yeast nitrogen base (QBIOgene, Middlesex, U.K.), 0.5% methanol, and 4 × 10⁻⁵% d-biotin (pH 6) and spanned 4 days.
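As a quick check on the primer design, the restriction sites can be located in the two primer sequences given above. This is a small illustrative sketch; the helper name `site_positions` is hypothetical. The recognition sequences used are the standard ones for SnaBI (TACGTA) and AvrII (CCTAGG).

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"SnaBI": "TACGTA", "AvrII": "CCTAGG"}

# Primers as given in the text (5'->3').
FORWARD = "TACGTAGTCAATGCTCTCAAGCAGCGG"
REVERSE = "CCTAGGTCAAAACTCGCAGATCACAAGGAG"


def site_positions(primer, site):
    """Return every 0-based offset at which `site` occurs in `primer`."""
    return [i for i in range(len(primer) - len(site) + 1)
            if primer[i:i + len(site)] == site]


print(site_positions(FORWARD, SITES["SnaBI"]))
print(site_positions(REVERSE, SITES["AvrII"]))
```

Each site occurs once, at the very 5′ end of its primer, so digestion of the PCR product leaves the CL-46-specific part of each primer intact.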
Purification of rCL-46
EDTA was added to 1 liter of P. pastoris culture to a final concentration of 3 mM, and the pH was adjusted to 5. The culture was centrifuged at 8000 × g for 10 min at 4°C and the supernatant was filtered through 0.45-μm filters. It was loaded onto an SP-Sepharose column (20 ml; Amersham Pharmacia Biotech). The column was washed with 200 ml of buffer made of 30 mM MES, 0.5 mM EDTA, 0.01% Tween 20 (Tw), pH 5 and eluted with a similar buffer with 800 mM NaCl. Tris and CaCl2 were added to final concentrations of 20 and 4 mM, respectively, and the pH was adjusted to 7.4. The eluate was loaded onto a 20-ml column made of TSK HW/75 (F) (Tosoh, Tokyo, Japan) derivatized with N-acetyl-d-glucosamine (GlcNAc), using divinyl-sulfone as previously described (23). After washing with 200 ml of TBS/Tw/Ca2+ (10 mM Tris, 150 mM NaCl, 1 mM CaCl2, 0.01% Tw, pH 7.4) the column was eluted with TBS/Tw/EDTA (TBS/Tw/Ca2+ with 2 mM EDTA instead of CaCl2).
Preparation of Abs
Rabbits were immunized s.c. with 25 μg of purified rCL-46 in Freund’s complete adjuvant. The following monthly boosts were done with the same Ag amount in Freund’s incomplete adjuvant and antisera were collected 2 wk after a boost. Abs were purified using a 1-ml Protein G fast flow column (Amersham Pharmacia Biotech).
Microtiter wells (Maxisorb; Nalge-Nunc, Roskilde, Denmark) were coated overnight at room temperature with 200 ng of GlcNAc-BSA (Sigma-Aldrich) in 100 μl of buffer made of 15 mM Na2CO3 and 35 mM NaHCO3. Plates were emptied and incubated for 2 h at room temperature in TBS/Tw/Ca2+ with 0.1% (w/v) BSA and washed in TBS/Tw/Ca2+. Dilutions of monosaccharides in 50 μl of TBS/Tw/Ca2+ were added in duplicate to the wells. Negative and positive controls consisting of TBS/Tw/EDTA or TBS/Tw/Ca2+ without monosaccharides were included. Purified rCL-46 at 4 μg/ml TBS/Tw/Ca2+ was then added in duplicate 50-μl volumes. The solutions were mixed on a shaking platform and incubated overnight at 4°C. The saccharides tested comprised mannose (Man), α-methyl-d-Man (αMeMan), d-mannosamine (ManN), N-acetyl-d-mannosamine (ManNAc), d-glucose (Glc), α-methyl-d-glucose (αMeGlc), d-glucosamine (GlcN), GlcNAc, d-galactose (Gal), d-galactosamine (GalN), N-acetyl-d-galactosamine (GalNAc), d-fucose (Fuc), l-fucose (l-Fuc), and maltose (Mal). All were tested at concentrations ranging from 0.1 mM to 100 mM. The wells were washed in TBS/Tw/Ca2+ and incubated at room temperature for 3 h with rabbit-anti-rCL-46 IgG diluted to 2 μg/ml TBS/Tw/Ca2+. Wells were washed in TBS/Tw/Ca2+ and incubated with alkaline phosphatase-conjugated goat anti-rabbit Ig (DAKO, Glostrup, Denmark) diluted to 0.25 μg/ml TBS/Tw/Ca2+. After incubation at room temperature for 3 h, wells were washed with TBS/Tw/Ca2+ and developed with para-nitro-phenyl phosphate. The background was defined as the average signal of three wells incubated with rCL-46 in the presence of EDTA; the maximum signal was that obtained in buffer without monosaccharide. Each saccharide was tested in three different experiments, and various combinations of five different monosaccharides were tested on the same microtiter plate to determine their individual ranking.
SDS-PAGE was done in a discontinuous buffer system on 4–20% gradient gels (24). Proteins were stained with Coomassie brilliant blue R250 and molecular mass was estimated by comparison with prestained marker proteins (Mark 12 from Invitrogen).
PCR products or plasmid preparations were sequenced with the Prism Ready Reaction BigDyeDeoxy Terminator sequencing kit (PE Applied Biosystems, Alleroed, Denmark) using the recommended conditions. Samples were subjected to electrophoresis on an ABI prism 310 Genetic Analyzer, and data was analyzed with the ABI Prism Software version 2.1.1 (PE Applied Biosystems).
Recombinant CL-46 was subjected to SDS-PAGE and electrophoretic transfer at 7.5 V/cm for 10 h to Problot membranes (PE Applied Biosystems) in buffer made of 10 mM 3-cyclohexylamino-1-propanesulfonic acid, 10% (v/v) methanol at pH 11. The protein band was visualized with Ponceau S, excised, and sequenced on a PE Applied Biosystems 470/120A sequencer.
Identification and cloning of a novel bovine collectin (CL-46)
During the screening of a genomic phage library with a probe encoding the CRD of CL-43, we isolated three clones that encoded a putative novel collectin gene with high sequence similarity to bovine SP-D, conglutinin, and CL-43. Sequences obtained from the clones made it possible to establish a specific PCR, which we used in RT-PCR. We found that the new gene was transcribed in the thymus, and using a new pair of primers we amplified the entire open reading frame from thymus RNA. This open reading frame codes for a polypeptide of 371 aa residues. The total cDNA comprised 65 nucleotides of 5′-untranslated sequence, 1116 nucleotides encoding the protein with intact stop and start codons, followed by a 75-nucleotide 3′-untranslated sequence with a complete polyadenylation site (Fig. 1). The deduced amino acid sequence of the cDNA (Fig. 1) revealed a collectin structure composed of a signal peptide of 20 aa residues, an N-terminal segment of 25 aa residues, 57 Gly-Xaa-Yaa repeats, a neck region of 28 aa residues, and a CRD of 127 aa residues. Assignment of the cleavage site of the signal peptide was done by sequence comparison to bovine SP-D and conglutinin. Two cysteine residues were found in the N-terminal segment and a potential N-glycosylation site was found at asparagine 90 in the collagen-like region. Further alignment of the amino acid sequence with the sequence of other collectins showed that the CRD of CL-46 possesses all necessary and conserved residues involved in calcium and carbohydrate binding (Fig. 2). The cDNA sequence of CL-46 has been submitted to GenBank (accession number AF509590).
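The domain lengths reported above can be checked against the length of the polypeptide and of the open reading frame with simple arithmetic. The short sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Domain lengths (amino acid residues) as stated in the text.
domains = {
    "signal peptide": 20,
    "N-terminal segment": 25,
    "collagen-like region": 57 * 3,  # 57 Gly-Xaa-Yaa repeats
    "neck region": 28,
    "CRD": 127,
}

total_residues = sum(domains.values())
orf_nucleotides = (total_residues + 1) * 3  # coding codons plus one stop codon
mature_residues = total_residues - domains["signal peptide"]

print(total_residues, orf_nucleotides, mature_residues)
```

The sum gives 371 residues and an open reading frame of 1116 nucleotides including the stop codon, in agreement with the values reported, and 351 residues remain after removal of the signal peptide.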
RT-PCR analysis of CL-46 expression in various tissues
To examine the expression of CL-46 we used RT-PCRs on RNA isolated from 20 different tissues obtained from a 2.5-year-old cow immediately after slaughter (Fig. 3). We normalized the amount of RNA with the expression of β-actin. Amplified products of β-actin were sequenced and were identical to the partial sequences of bovine β-actin (GenBank accession numbers K00622 and K00623). To exclude contamination with genomic DNA, the CL-46-specific RT-PCR amplified a product spanning exons 7 and 8 (see Genomic characterization). After a touchdown amplification and 19 cycles of PCR, we observed expression of CL-46 mRNA in the thymus and the liver. Another eight rounds of amplification showed additional expression in the mammary gland, the rumen, the reticulum, the omasum, the abomasum, the small intestine, and the large intestine. Sequencing the amplified products verified that they derived from CL-46. Unlike what we observed with many other primers, the chosen primer pair amplified only CL-46 and not transcripts of SP-D, conglutinin, and CL-43.
Using a probe corresponding to the neck and CRD of CL-43 (see Materials and Methods), we isolated five EMBL3 lambda phage clones. Sequencing a product amplified by PCR showed that one clone contained the CL-43 gene. Restriction digestion and subcloning of the others showed that three contained the CL-46 gene and one contained a collectin-like pseudogene with a disrupted reading frame in the exon that otherwise encodes the CRD (data not shown). After restriction analysis, we selected one clone for further analysis. Assembling the sequences of the derived SacI subclones and the overlapping PCR products (Fig. 4B) showed that the clone contained the complete CL-46 gene. We characterized ∼14 kb of the insert in the phage clone and observed that the exons encoding CL-46 mRNA are contained within a genomic sequence comprising 9799 nucleotides. The first translated exon, exon 2, encodes the signal peptide, N-terminal segment, and the first seven Gly-Xaa-Yaa repeats of the collagen-like region. The remaining collagen repeats are encoded by exons 3–6. Exons 7 and 8 encode the α-helical neck region and the CRD, respectively (Fig. 4C). There was 100% identity between the sequence of our cDNA clone and the exonic sequences contained in the genomic clone. All intron boundaries obeyed the GT-AG rule defining donor and acceptor sites for splicing out introns (Fig. 4D). In addition to the transcribed sequence, we characterized 966 nucleotides of the potential promoter region located 5′-upstream of the first exon and 3252 nucleotides located downstream of the 3′-untranslated sequence. Three potential TATA boxes and two corresponding CAAT boxes were found in the beginning of the promoter region. The presence of multiple TATA boxes indicates that transcription may start at different locations.
However, we were not able to amplify any products by RT-PCR using RNA isolated from the thymus and different forward primers located between the first TATA box (−4) and second or third (−859 or −870). Several cis-regulatory elements could be assigned to the sequence using homology search to known matrices (Table I) (20). Potential AP-1 and NF-1 sequences were found in proximity of the TATA boxes and CAAT boxes (25). Many potential binding sites for the transcription factors Ikaros2 and the nuclear factors in activated T cells (NF-ATs) were found (26, 27). We also identified potential sites for NF-κB, c-Rel, and the gut-enriched Krueppel-like factor (28, 29). Besides the sites listed in Table I, several potential sites for the factors Gata1-3, Lmo2, MZF1, and S8 were also identified (30, 31, 32, 33). In the characterized 3′-downstream region we found no open reading frames of known origin. At ∼1.6 kb 3′-downstream, we found a transposon sequence with high homology to the Tn10 IS10 left transposase of Shigella flexneri (GenBank accession number AF162223). Although it also showed homology with transposons found in other species, we encountered no complete match to any bovine transposons. Sequences of the whole gene including introns have been submitted to GenBank (accession number AF509589).
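The GT-AG rule applied to the intron boundaries in this section can be stated as a simple test: on the coding strand, each intron begins with GT (donor site) and ends with AG (acceptor site). The sketch below illustrates the check on a toy sequence. The function names, the exon coordinates, and the sequence itself are hypothetical and are not taken from the CL-46 gene.

```python
def obeys_gt_ag(intron):
    """True if an intron (coding strand, 5'->3') starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def introns_from_exons(gene, exons):
    """Return the sequences lying between consecutive exons.

    exons: sorted list of (start, end) pairs, 0-based, end-exclusive.
    """
    return [gene[prev_end:next_start]
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


# Toy gene for illustration only: three exons separated by two introns.
toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "GGAAAA" + "GTCCCCTTAG" + "TGA"
toy_exons = [(0, 6), (18, 24), (34, 37)]

toy_introns = introns_from_exons(toy_gene, toy_exons)
print([obeys_gt_ag(i) for i in toy_introns])
```

With exon coordinates derived from a cDNA-to-genomic alignment, the same two functions would report any boundary that deviates from the rule.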
Chromosomal localization of CL-46
Screening DNA isolated from hamster/cow somatic hybrids using PCR revealed that 22 of the 90 hybrids contained the gene. No amplification of products was seen in DNA isolated from Chinese hamster ovary cells. We considered the applied PCR to be specific for CL-46 because the sequence of products corresponded to CL-46 and showed no contamination of other products. Based on previous characterizations of the hybrid panel, the CL-46 gene could be linked with a log10 of odds score of 7 to the microsatellite marker ILST099, which is located on chromosome 28 at the distal position q1.8 (Fig. 4A).
Recombinant expression of rCL-46
The α-helical coiled-coil neck region, starting from valine 218 (Fig. 1), and the CRD of CL-46 were expressed in P. pastoris (GS115). A clone capable of growth on histidine-deficient plates and on plates with 1.5 mg of geneticin per ml of agar was chosen for expression. N-terminal analysis showed cleavage of the α-secretion signal at the expected site, leaving the P. pastoris-derived sequence Glu-Asp-Glu-Asp-Tyr at the N terminus of the rCL-46. We purified 1–2 μg of rCL-46 per ml of culture, but the concentration in the medium was at least 10-fold higher, probably due to the presence of incorrectly folded rCL-46 that did not bind to the GlcNAc-Sepharose column. Purified rCL-46 showed a significantly decreased mobility in SDS-PAGE upon reduction, corresponding to an increase in apparent molecular mass from 17 to 20 kDa (Fig. 5A). The shift and its uniformity indicate that purified rCL-46 contains the intrachain disulfide bridges necessary for correct folding.
The potencies of various saccharides to inhibit the binding of purified rCL-46 to coated GlcNAc-BSA are given in Fig. 5B. Concentrations of saccharide resulting in 50% inhibition are the average of three experiments and varied by <6%. Binding to GlcNAc-BSA was stable in the presence of 1 M NaCl and at pH 6.0–9.0. Calcium could not be substituted with Cu2+, Mg2+, Mn2+, or Zn2+ (data not shown). The selectivity of CL-46 is compared with the binding selectivity of other collectins in Table II.
We identified a novel bovine collectin gene while characterizing the gene encoding CL-43. It turned out that the gene encodes a protein with an N-terminal and collagen-like structure very much like that of SP-D and a CRD structure and carbohydrate binding profile resembling that of conglutinin. The new collectin was provisionally named CL-46.
The predicted amino acid structure contains a hydrophobic signal peptide of 20 aa residues, followed by a short N-terminal segment including two cysteine residues involved in higher order oligomerization (Fig. 1). Comparison with other collectins shows highest resemblance with the N-termini of SP-D and conglutinin, and we suspect that CL-46 forms tetramers similar to these molecules. The 171-aa-long collagen-like region is of the same length as that of bovine SP-D and conglutinin, which lack two Gly-Xaa-Yaa repeats compared with SP-D found in other species. The Gly-Xaa-Yaa repeats are uninterrupted, unlike the structures of SP-A and MBL, where a disruption twists and spreads out the collagenous stalks. Conglutinin has a cysteine residue after the third Gly-Xaa-Yaa repeat, but this is not preserved in CL-46, which in turn possesses the otherwise SP-D-characteristic N-glycosylation motif at asparagine 90.
Four hepta-repeats with hydrophobic residues at the first and fourth positions organize the α-helical coiled-coil neck region, which also shows strong identity with the similar region of bovine SP-D. The CRD contains the four cysteine residues and the 14 conserved residues found in all collectins (Fig. 2).
Five residues are known to be responsible for complexing the calcium ion involved in carbohydrate binding. Sequence analysis of C-type CRDs in comparison with monosaccharide specificity has previously revealed that residues corresponding to Glu185 and Asn187 in rat MBL-A are highly conserved in CRDs that bind Man/Glc, unlike Gal-binding C-type CRDs that have Gln and Asp at these critical positions (34). CL-46 possesses Glu335 and Asn337 at the equivalent positions, indicating that CL-46, like MBL, SP-D, conglutinin, and CL-43, has a Man-type monosaccharide specificity (34). We analyzed the specificity of CL-46 in inhibition assays and found that GlcNAc is the best inhibitor, followed by ManNAc, d-mannosamine, and αMeMan. Galactose showed only a weak potential to inhibit the binding, whereas Man, Glc, and l-Fuc had moderate potential as inhibitors. The specificity of CL-46 and conglutinin is strikingly similar. For both molecules, GlcNAc is the best inhibitor of the lectin-carbohydrate interaction. However, there are minor differences, best illustrated by the relative ranking of Mal, Man, and Glc. These relatively small differences might be important and might determine the specificity of in vivo ligands, because the binding relies on multivalent interaction of clustered CRDs with repeating carbohydrates on microbial surfaces. With respect to the binding of conglutinin to iC3b, we do not yet know whether CL-46 mimics the binding or perhaps binds to other deposition products of C3. An extended hydrophilic loop made of three asparagine residues (338–340), corresponding to loop 4 in the MBL and SP-D crystal, is found just next to Asn337 involved in calcium and carbohydrate binding (34, 35).
The extension could play a central role in binding more complex carbohydrates, because it extends from the binding core and introduces potentially positively charged residues that might interact with anionic ligands or might even participate in coordination bonds with additional complexed calcium ions and carbohydrates. Similar extended loops can be found in conglutinin, CL-43, and porcine SP-D, although the nature of the neighboring residues is less striking (36).
CL-46 shows 83% identity with conglutinin, 77% with CL-43, 61% with bovine SP-D, and 67% with human SP-D when the CRD sequences are compared at the protein level (Fig. 6A). Phylogenetic analysis of the same domain shows that CL-46 likely evolved from a common conglutinin/CL-46 ancestral gene, which again evolved from a common SP-D ancestral gene (Fig. 6B). The same analysis shows that CL-43 evolved from the conglutinin branch after CL-46 and conglutinin diverged. The structure of the four collagen-coding exons encoded by 117 bp, 108 bp, 108 bp, and 117 bp, respectively, is unique for bovine SP-D, conglutinin, and CL-46. The corresponding exons in human, rat, and porcine SP-D have a uniform size of 117 bp (Fig. 6C). Thus, it seems that conglutinin, CL-46, and CL-43 evolved by complete or partial gene duplications of the ancestral bovine SP-D gene (and of each other) after the Bovidae separated from other mammals. It is interesting to note that the extended hydrophilic loop of porcine SP-D, conglutinin, CL-43, and CL-46 appears to have evolved in parallel and not from a common ancestral gene, because bovine SP-D lacks the extended loop. The parallel development and preservation in evolution indicates that the extension of the hydrophilic loop is of functional importance.
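The exon sizes quoted above are consistent with the number of Gly-Xaa-Yaa repeats reported for CL-46, as the following arithmetic sketch shows. It uses only figures stated in the text (the exon sizes, the seven repeats encoded by exon 2, and the total of 57 repeats); the function and variable names are illustrative.

```python
REPEAT_BP = 9  # one Gly-Xaa-Yaa repeat = 3 codons = 9 bp

bovine_exons_bp = [117, 108, 108, 117]  # bovine SP-D, conglutinin, and CL-46
other_exons_bp = [117, 117, 117, 117]   # human, rat, and porcine SP-D


def repeats_encoded(exon_sizes_bp):
    """Number of Gly-Xaa-Yaa repeats encoded by a set of collagen exons."""
    total = sum(exon_sizes_bp)
    if total % REPEAT_BP:
        raise ValueError("exon sizes do not sum to whole repeats")
    return total // REPEAT_BP


bovine = repeats_encoded(bovine_exons_bp)
other = repeats_encoded(other_exons_bp)
cl46_total = bovine + 7  # plus the seven repeats encoded by exon 2

print(bovine, other, other - bovine, cl46_total)
```

The four bovine exons encode 50 repeats, which together with the seven repeats of exon 2 give the 57 repeats of CL-46. The two shorter exons account for the two repeats that the bovine proteins lack relative to SP-D of other species.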
The overall structure of the CL-46 gene resembles that of the human SP-D gene, with eight exons encoding the mRNA transcript. The 5′-untranslated exon of CL-46 resembles the second 5′-untranslated exon of the conglutinin gene, which is not found in human SP-D. Using RNA isolated from thymus, we were not able to amplify CL-46 transcripts containing an additional 5′-untranslated exon corresponding to the first untranslated conglutinin exon. It seems likely that transcription of CL-46 in the thymus initiates from the TATA box (−4) adjacent to the assigned start site, although it shows only relatively low identity with the consensus binding motif (TATA A/T A/T) of the transcription factor complex TFIID. A potential CAAT box (−71) and several other auxiliary cis-elements, including sites for AP-1 (−315) and NF-1 (−445), precede the TATA box (−4). A high-stringency matrix search for cis-responsive elements in the 5′-upstream region revealed a large number of potential sites for the transcription factors Ikaros2, NF-AT, NF-κB, c-Rel, and gut-enriched Krueppel-like factor. Strikingly, all of these factors are expressed in thymic cells or cells of endothelial origin, and Ikaros2, NF-AT, NF-κB, and c-Rel regulate development and activation of T cells (37, 38, 39). Our RT-PCR analysis of CL-46 showed a dominant expression in the thymus, which falls in line with multiple thymus-related cis-regulatory elements in the proposed promoter region. We also observed expression in the liver and moderate expression in the mammary gland and in tissues throughout the digestive system. At present we do not know which type of cells expresses CL-46, but we judge from the pattern of expression and the recently reported mucosa-associated expression of SP-D that epithelial cells are good candidates (4).
It is likely that expression of CL-46 in tissues other than the thymus initiates transcription from the TATA boxes (−859 to −870) located further 5′-upstream, resulting in transcripts with an additional 5′-untranslated exon, like the first exon found in the conglutinin gene. The sequences surrounding the TATA box (−859) are very similar to the transcription start sites of conglutinin and human SP-D. Several preceding cis-elements, including the AP-1 site (−942), with a very high match to the consensus motif are also conserved in the conglutinin gene and the SP-D genes found in human and rodent (40). Expression of CL-46 in tissues other than the thymus may be regulated by transcription factors with cis-elements located further 5′-upstream, corresponding to the conglutinin and SP-D promoters.
We localized the CL-46 gene to bovine chromosome 28 in proximity of a microsatellite at the position q1.8. This is the same position as the conglutinin gene, and we also found that the gene encoding CL-43 localized to this position (41, 42). The human collectin gene cluster is located at chromosome 10q21.1-21.4; it seems likely that the bovine collectin genes cluster at BTA 28 q1.8, which is also supported by chromosomal comparative mapping. However, because it is a quite distal position and because the murine MBL-C gene escapes the murine collectin cluster at chromosome 14, it is possible that BTA 28 does not contain all the collectin genes (43).
We find it puzzling that the collectin genes conglutinin, CL-43, and CL-46 evolved in cattle after the divergence from other mammals. With the well-established antimicrobial role of other collectins in mind, it is tempting to speculate that it relates to rumination, which involves a critical symbiosis with many types of microorganisms. CL-46 expression in the digestive organs and the overlapping localization of human SP-D and SP-A support this notion. A recent evolution of additional collectins, due to rumination, draws new attention to the role of SP-D and SP-A in the immune defense of the gut. Because ruminants rely heavily on microbial symbiosis in the gut, the general role of the collectins might not be exclusively antimicrobial, but more of an anti-inflammatory character to sustain the symbiosis and avoid inflammation of the gut. Such a role is in line with the established anti-inflammatory effect of SP-A and SP-D in the lung (10, 11). Although MBL is not expressed in the thymus, it binds to apoptotic thymocytes in vitro, indicating that these cells express potential collectin ligands (44). Whether CL-46 is involved in the clearance of apoptotic thymocytes is not yet known. Judging by the thymus-related cis-elements present in the promoter region, it is even possible that T cell development modulates CL-46 expression. Because the carbohydrate specificity of CL-46 resembles the carbohydrate specificity of conglutinin, potential opsonization of apoptotic cells by CL-46 might take advantage of deposition of iC3b. Deposition of iC3b has been associated with clearance of apoptotic B cells (45).
We conclude from the above results that we have identified a novel member of the collectin family. The N-terminal part of the molecule and the collagen region, which includes the potential N-glycosylation site, show highest homology to SP-D. The CRD region shows highest homology to conglutinin, and so does the carbohydrate binding specificity. Compared with the rest of the known collectins, CL-46 has a unique expression pattern being expressed mainly in the thymus and the liver. This may reflect a yet uncharacterized linkage between innate immunity and the development of the adaptive immune system.
↵1 This work was supported by the Alfred Benzon Foundation, the Frode and Norma Jacobsen’s Foundation, and the Novo Nordisk Foundation.
↵2 Current address: Department of Cell Biology, Duke University Medical Center, Durham, NC 27710.
↵3 Address correspondence and reprint requests to Dr. Uffe Holmskov, Department of Immunology and Microbiology, University of Southern Denmark, Odense, DK-5000 Odense C, Denmark. E-mail address:
↵4 Abbreviations used in this paper: CRD, carbohydrate recognition domain; MBL, mannan-binding lectin; SP-A, surfactant protein A; SP-D, surfactant protein D; CL-43, 43-kDa collectin; CL-46, 46-kDa collectin; GlcNAc, N-acetyl-d-glucosamine; Man, mannose; Tw, Tween 20; αMeMan, α-methyl-d-Man; ManN, d-mannosamine; ManNAc, N-acetyl-d-mannosamine; Glc, d-glucose; αMeGlc, α-methyl-d-glucose; GlcN, d-glucosamine; Gal, d-galactose; GalN, d-galactosamine; GalNAc, N-acetyl-d-galactosamine; Fuc, d-fucose; l-Fuc, l-fucose; Mal, maltose.
- Received May 28, 2002.
- Accepted September 17, 2002.
- Copyright © 2002 by The American Association of Immunologists