Members of the Ikaros family of transcription factors are important for immune system development. Analysis of Ikaros-related genes from a range of species suggests the Ikaros family derived from a primordial gene, possibly related to the present-day protostome Hunchback genes. This duplicated before the divergence of urochordates to produce two distinct lineages: one that generated the Ikaros factor-like (IFL) 2 genes of urochordates/lower vertebrates and the Pegasus genes of higher vertebrates, and one that generated the IFL1 genes of urochordates/lower vertebrates, the IKFL1 and IKFL2 genes of agnathans and the remaining four Ikaros members of higher vertebrates. Expansion of the IFL1 lineage most likely occurred via the two intervening rounds of whole genome duplication. A proposed third whole genome duplication in teleost fish produced a further increase in complexity of the gene family with additional Pegasus and Eos members. These findings question the use of IFL sequences as evidence for the existence of adaptive immunity in early chordates and vertebrates. Instead, this study is consistent with a later emergence of adaptive immunity coincident with the appearance of the definitive lymphoid markers Ikaros, Aiolos, and Helios.
Transcription factors are often called “master regulators” of development because they mediate the complex changes in gene expression required to produce the diverse array of cells needed in a multicellular organism (1). A common motif found in transcription factors is the zinc finger, consisting of a zinc atom in specific coordination with four cysteine and/or histidine residues. Such motifs can be involved in either protein-nucleic acid or protein-protein interactions, with more than one zinc finger commonly found in the same protein, often in the form of tandem arrays (2, 3). Zinc finger transcription factors exhibit broad evolutionary conservation, including those that regulate hematopoiesis. For example, GATA-related proteins regulate blood cell development from Drosophila to man (4, 5).
The Ikaros family of zinc finger transcription factors are important regulators of immune system development (6). This family is characterized by two sets of highly conserved zinc finger motifs: a set of usually four zinc fingers located at the N terminus involved in DNA binding, and a set of two zinc fingers at the C terminus, which enable the Ikaros proteins to dimerize with themselves or other family members (7). The mammalian Ikaros family consists of Ikaros, Helios, Aiolos, Eos, and Pegasus. Ikaros and Aiolos are highly conserved and expressed in lymphoid tissues, such as the spleen and thymus, as well as PBL (6, 8, 9). Loss or mutation of Ikaros results in dramatic decreases in T cells, B cells, NK, and lymphoid-derived dendritic cells, as well as effects on myeloid cell lineages (9, 10, 11). Similarly, loss of Aiolos results in severe depletion of peritoneal and recirculating B cells and, in contrast, the spontaneous production of autoantibodies (12). Deregulation of Ikaros and Aiolos, particularly their splice variants, can result in cellular hyperproliferation and the development of lymphomas and leukemias (7, 9, 10, 13). Helios is also involved in T cell differentiation, at least partly via heterodimerization with Ikaros (7). Constitutive expression of mutant forms of Helios leads to aggressive T cell lymphoma (14, 15). Eos is more broadly expressed, being found in skeletal muscle, heart, and liver in addition to hematopoietic tissues such as thymus and spleen (16, 17). However, its function remains relatively poorly understood. Pegasus is distinct from the other Ikaros members, as it contains only three N-terminal zinc fingers, with divergent sequence and DNA binding specificity. It is also widely expressed but its function remains to be determined (17).
Ikaros family homologues and related genes have been identified in a diverse range of species (17, 18, 19, 20, 21, 22, 23, 24, 25, 26), and their presence has been used as evidence of lymphocyte-like gene regulatory programs in these organisms (19, 20, 22, 23, 27). However, a comprehensive perspective on the origins of this important family has remained elusive. To further our evolutionary understanding, we have explored the repertoire of Ikaros family transcription factors in amphioxus, sea squirt, lamprey, teleost fish, and other species. Detailed bioinformatics analysis has allowed the formulation of an evolutionary model for the genesis of the Ikaros gene family. This suggests that Ikaros-related genes were derived from a primordial gene, possibly related to the present-day Hunchback genes of protostomes, which underwent duplication before the divergence of urochordates/lower vertebrates from higher vertebrates. This yielded a distinct Pegasus-related lineage that has remained largely intact since that time, and another lineage that ultimately produced Ikaros, Aiolos, Helios, and Eos genes by the time of divergence of the cartilaginous fish, likely via the two intervening rounds of whole genome duplication. A further increase in complexity of the family has occurred in teleost fish following a proposed third whole genome duplication in this lineage. These findings have prompted a reassessment of the origins of adaptive immunity.
Materials and Methods
Identification of new Ikaros-related sequences
Whole genomic sequence, high throughput genomic, and expressed sequence tag (EST) databases of amphioxus, sea squirt, lamprey, and teleost fish were searched using known human orthologues of the Ikaros family to identify Ikaros-related sequences. In addition, cDNA clones for ESTs representing zebrafish genes encoding Pegasus (AY394931-full-length), Eos (CK397774 and AL927666-partial), and Helios (DN896784-partial) were obtained and sequenced in full. Raw sequence analysis, contig assembly, and basic sequence manipulation were performed using Sequencher 4.1.4 (GeneCodes) or the web-based basic local alignment search tool (BLAST) 2Sequences (http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi). GenomeScan (http://genes.mit.edu/genomescan.html) was used to predict coding exons from sequences derived solely from whole genomic sequence or genomic scaffolds. The following new sequences were identified: Xenopus tropicalis (western clawed frog) Aiolos (ENSXETP00000033186); Danio rerio (zebrafish) Pegasus (BN001214), Eos (BN001212), and Helios (BN001213); Tetraodon nigroviridis (spotted green pufferfish) Ikaros (BN001216), Aiolos (BN001289), Pegasus1 (BN001218), Pegasus2 (BN001290), Eos1 (BN001215), Eos2 (BN001291), and Helios (BN001217); Takifugu rubripes (fugu pufferfish) Eos1 (BN001285), Eos2 (BN001286), Helios (BN001287), and Aiolos (BN001288); Oryzias latipes (medaka) Eos1 (BN001294), Eos2 (BN001282), Helios (BN001283), and Aiolos (BN001284); and Petromyzon marinus (sea lamprey) Pegasus (Ikaros factor-like (IFL) 2) (GENSCAN00000129115 and GENSCAN00000133626). The latter was identified using pre-ensembl contigs (http://pre.ensembl.org/Petromyzon_marinus/index.html) and the UCSC lamprey genome browser database (http://genome.ucsc.edu/cgi-bin/hgGateway?clade=other&org=Lamprey&db=).
Also retrieved were Ciona intestinalis (sea squirt) IFL1 (BN001219) and IFL2 (BN001220), which have been assigned model identity numbers (144428 and 131901, respectively) in the Ciona intestinalis database version 1.0 (20). In addition, Branchiostoma floridae (amphioxus) IFL1 N terminus (BN001292) and C terminus (BN001293), and IFL2 C terminus (BN001281) were identified from GenBank and JGI Branchiostoma floridae v1.0 genome browser database (http://genome.jgi-psf.org/Brafl1/Brafl1.home.html) searches. Predicted sequences of zinc finger domains were also retrieved for Ikaros members from Callorhinchus milii (elephant shark).
Database interrogation and assembly of known sequences
BLASTN (nucleotide query vs nucleotide database), BLASTX (translated nucleotide query vs protein database), and TBLASTN (protein query vs translated nucleotide database) search algorithms were used on the National Center for Biotechnology Information (NCBI) (www.ncbi.nlm.nih.gov/BLAST/) and Ensembl (www.ensembl.org) websites to identify Ikaros-related sequences from representative organisms. Sequences retrieved from GenBank (http://www.ncbi.nlm.nih.gov) or Ensembl were: Homo sapiens (human) Ikaros AAP88838, Aiolos AAF13493, Helios AAF09441, Eos AAG39221, and Pegasus AAG39220; Mus musculus (mouse) Ikaros NP_033604, Aiolos XP_283022, Helios NP_035900, Eos NP_035902, and Pegasus NP_780324; Gallus gallus (chicken) Ikaros NP_990419, Aiolos NP_990151, Helios NP_989938, and Pegasus NP_001026766; Xenopus tropicalis (western clawed frog) Ikaros NP_001015698, Helios NP_001116930, Eos ENSXETP00000020847, and Pegasus NP_001017355; Danio rerio (zebrafish) Ikaros AAC61763; Takifugu rubripes (fugu pufferfish) Ikaros ENSTRUG00000017779, Pegasus1 ENSTRUG00000001038, and Pegasus2 ENSTRUG00000010762; Oryzias latipes (medaka) Ikaros ENSORLT00000025472, Pegasus1 ENSORLT00000007007, and Pegasus2 ENSORLT00000011938; Raja eglanteria (skate) Ikaros AF163848, Aiolos AF163850, Helios AF163847, and Eos AF163849; Petromyzon marinus (sea lamprey) IKFL1 AAF82350; Lampetra fluviatilis (river lamprey) IKFL2 AAL67304; Myxine glutinosa (hagfish) composite IFL (renamed IKFL1) AAP84653; Oikopleura dioica (urochordate) IFL AAP84655; Strongylocentrotus purpuratus (purple sea urchin) IFL1 CX559654; and Branchiostoma belcheri tsingtauense (amphioxus) IFL1 ABH05923, together with the Hunchback genes Caenorhabditis elegans (nematode) Hunchback-like AAD16170, Drosophila melanogaster (fruit fly) Hunchback NP_731267, Helobdella robusta (leech) Hunchback-like AAY43810, Helobdella triserialis (leech) hbl1 CAA62741 and hbl2 X91395, and Platynereis dumerilii (Dumeril’s clam worm) Hunchback-like CAJ78430.
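The choice between BLAST programs follows from the molecule type of the query and of the database. The lookup below is a minimal sketch of that rule, not part of the original methods; the function name `choose_blast_program` and the type labels are hypothetical, and TBLASTX (translated query vs translated database) is left out for brevity.

```python
# Hypothetical helper: pick a BLAST program from the query and database types.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # query is translated in six frames
    ("protein", "nucleotide"): "tblastn",  # database is translated in six frames
}


def choose_blast_program(query_type: str, database_type: str) -> str:
    """Return the BLAST program matching a query/database molecule type pair."""
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {query_type} query vs {database_type} database"
        ) from None


# A protein query against an unannotated genomic scaffold calls for TBLASTN.
program = choose_blast_program("protein", "nucleotide")
```

Searching a protein query against translated genomic sequence is what allows divergent homologues to be found in genomes that lack gene annotation, because matches are scored at the amino acid level.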
Sequence analysis and nomenclature
For phylogenetic analysis, all known and novel protein sequences were aligned using the ClustalX version 1.83 algorithm and GONNET matrix model (Freeware) (28). This matrix was used to calculate the alignment with penalties applied for opening and extending gaps in the alignment. A phylogenetic tree was created using the bootstrapped Neighbor-Joining algorithm of 100 replicates in PAUP (Freeware). Trees were formatted using njplot (http://pbil.univ-lyon1.fr/software/njplot.html) and viewed in TreeView (Freeware: http://taxonomy.zoology.gla.ac.uk/rod/treeview.html) (29). Additional analysis using maximum parsimony (30) and maximum likelihood (31) algorithms was performed using the PHYLIP version 3.68 (http://evolution.genetics.washington.edu/phylip.html) software package to confirm phylogenetic topologies. Synteny analysis was performed using Ensembl (http://www.ensembl.org) and NCBI Map Viewer (http://www.ncbi.nlm.nih.gov/mapview) using the Ciona intestinalis (JGI v2), Petromyzon marinus (PMAR3), Homo sapiens (NCBI 36/Build 36.2), Xenopus tropicalis (JGI v4.1) and Oryzias latipes (NIG MEDAKA1) databases. Analysis of splicing was performed by comparing the genomic scaffold to the relevant cDNA sequences, applying the GT-AG rule (32).
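The Neighbor-Joining step can be illustrated with a small, self-contained sketch. This is not the PAUP implementation used in the study: it assumes a simple p-distance between aligned sequences, returns only the topology (no branch lengths), and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns shared by the two sequences")
    return sum(x != y for x, y in pairs) / len(pairs)


def neighbor_joining(names, dist):
    """Return a Newick-style topology string (no branch lengths).

    `names` are distinct taxon labels; `dist` maps frozenset({a, b}) to the
    distance between a and b for every pair of labels.
    """
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other current nodes.
        total = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair minimising Q(i, j) = (n - 2) * d(i, j) - total_i - total_j.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - total[p[0]] - total[p[1]],
        )
        new = f"({i},{j})"
        # Distance from the new internal node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - d[frozenset((i, j))]
                )
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    return f"({nodes[0]},{nodes[1]});"
```

Given pairwise distances in which A and B are close, C and D are close, and the two pairs are far apart, the function groups A with B and C with D. In practice the distances would come from a corrected substitution model rather than raw p-distances.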
Results
Identification of Ikaros-related sequences in teleost fish, lamprey, sea squirt, and amphioxus
Exhaustive bioinformatic analysis of zebrafish genomic databases revealed the presence of single homologues for Helios, Eos, and Pegasus, as well as the previously described Ikaros (33), but not the related Aiolos (Fig. 2 and data not shown). The presence of corresponding ESTs verified that all were expressed. A similar analysis of the medaka, fugu, and spotted green pufferfish genomes revealed an identical complement of Ikaros and Helios genes (Figs. 1 and 2), but identified duplications of Pegasus and Eos genes (Figs. 1 and 2 and supplemental Fig. 2). Divergent Aiolos genes were also identified in these species (Figs. 1 and 2). Predicted genomic sequences retrieved from elephant shark also showed a high level of identity to zinc finger domains of Ikaros members (supplementary Fig. 1). Bioinformatic analysis of the lamprey genome identified, in addition to the previously discovered IKFL1 and IKFL2 genes, a Pegasus-like gene, which grouped with the IFL2 subset (Figs. 1 and 2). In sea squirt, two Ikaros-related (IFL) proteins, designated IFL1 and IFL2, with homology to IFL sequences identified in other urochordates and early vertebrates were discovered (Fig. 1 and data not shown). In addition, partial sequences for the N-terminal and C-terminal zinc finger motifs of IFL1 and IFL2 genes in two amphioxus species were also identified (Figs. 1 and 2).
Zinc fingers of Ikaros-related sequences show high conservation and an alteration to the common C2H2 motif
Alignment of the N-terminal and C-terminal arrays of zinc finger (ZF) motifs revealed a high level of identity and conservation of each motif between species, especially among the human and medaka orthologues (Fig. 1). In general, Ikaros, Aiolos, Helios, and Eos proteins aligned more closely with amphioxus and Ciona IFL1, whereas Pegasus proteins aligned more closely with IFL2 (Fig. 1). Pegasus and IFL2 both lack one of the N-terminal zinc fingers, which the alignment suggests is ZF4, rather than ZF1 as suggested by others (17), an assertion supported by phylogenetic analysis of individual ZF3 and ZF4 sequences (data not shown). Medaka Ikaros and Helios, in contrast, were missing ZF1. Hunchback showed consistent homology to the Ikaros-related proteins, particularly in ZF2, ZF3, ZF5, and ZF6 (Fig. 1). Finally, the C2H2 motif was observed in all zinc fingers, except for ZF4 of IFL1 and Aiolos, as well as Eos1 (medaka), which all possessed a variant C2HC motif. However, many of the Ikaros sequences have a cysteine residue immediately before the last histidine in the ZF4 and ZF5 motifs, suggesting the IFL1 and Aiolos variants do not necessarily change the motif dramatically.
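The distinction between C2H2 and variant C2HC fingers can be sketched as a pattern scan over a protein sequence. The spacing used here, C-x(2,4)-C-x(12)-H-x(3,5)-[H/C], is a simplified form of the commonly quoted classical zinc finger consensus and is an assumption of this example, as are the function name and the synthetic test sequence; it is not the alignment-based procedure used in the study.

```python
import re

# Simplified classical zinc finger spacing: C-x(2,4)-C-x(12)-H-x(3,5)-[H or C].
# The final captured residue decides between C2H2 and the variant C2HC.
ZF_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}([HC])")


def classify_zinc_fingers(protein: str):
    """Return (start, motif_type) for each non-overlapping match in the sequence."""
    hits = []
    for match in ZF_PATTERN.finditer(protein.upper()):
        kind = "C2H2" if match.group(1) == "H" else "C2HC"
        hits.append((match.start(), kind))
    return hits


# Synthetic sequence (alanine spacers, not a real protein): one finger of each type.
c2h2_finger = "CAAC" + "A" * 12 + "H" + "AAA" + "H"
c2hc_finger = "CAAC" + "A" * 12 + "H" + "AAA" + "C"
toy_protein = c2h2_finger + "GG" + c2hc_finger
fingers = classify_zinc_fingers(toy_protein)
```

A real sequence carrying both a cysteine and a histidine within the allowed final spacing is ambiguous for a pattern this simple, which mirrors the observation that the variant fingers need not differ greatly from the canonical motif.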
Phylogenetic analysis of Ikaros-related sequences
To investigate potential evolutionary relationships, Ikaros-related sequences were assembled from a range of different species, along with the most closely related zinc finger protein from invertebrate species, Hunchback. Phylogenetic trees were constructed using the Neighbor-Joining algorithm from ClustalX alignments of these sequences (Fig. 2 and supplementary Fig. 2). The robustness of these trees was supported by the relatively high bootstrap values for each branch, and by the maximum parsimony and maximum likelihood algorithms of PHYLIP 3.68, which produced a similar branching structure (data not shown).
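Bootstrap values of the kind referred to above are obtained by resampling alignment columns with replacement and counting how often each grouping reappears. The sketch below shows that resampling under stated assumptions: the alignment is a dict of equal-length strings, and `clades_from_alignment` is a hypothetical caller-supplied function that builds a tree from an alignment and returns its clades as frozensets of sequence names.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping all rows in register."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


def bootstrap_support(alignment, clades_from_alignment, replicates=100, seed=0):
    """Fraction of replicates in which each clade is recovered.

    `clades_from_alignment` is a hypothetical tree-building callback that
    returns an iterable of frozensets, one per clade in the inferred tree.
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(replicates):
        replicate = bootstrap_alignment(alignment, rng)
        for clade in clades_from_alignment(replicate):
            counts[clade] = counts.get(clade, 0) + 1
    return {clade: n / replicates for clade, n in counts.items()}
```

With 100 replicates, as used for the Neighbor-Joining trees here, a support of 0.95 for a clade means that grouping was recovered in 95 of the resampled alignments.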
The trees indicated two distinct clades within the Ikaros family (red boxes). The first clade consisted of IFL2 sequences from amphioxus and Ciona along with Pegasus sequences from various higher vertebrate species, with IFL2 being most divergent. The second clade consisted of IFL/IFL1 sequences from sea urchin, amphioxus, tunicates, and Ciona, IKFL sequences from agnathans, as well as Helios, Eos, Ikaros, and Aiolos sequences from a variety of higher vertebrates. The IFL/IFL1 sequences were clearly divergent from the other sequences, which subsequently split into two branches, one consisting of Helios and Eos along with IKFL2 sequences and the other of all Ikaros and Aiolos along with IKFL1 sequences (blue boxes). Each of these branches split into individual Ikaros, Aiolos, Helios, and Eos clusters (yellow boxes). Finally, the Hunchback proteins grouped closest to the Ikaros-related sequences, a consistent result regardless of the outgroup or algorithm used (supplementary Fig. 2 and data not shown).
Synteny analysis of Ikaros family genes in higher vertebrates
Synteny analysis was performed on Ikaros family members from sea squirt, lamprey, and three representative higher vertebrates: human, frog, and medaka. Genes with conserved synteny were observed for each Ikaros member: HMX3, ACAD-SB, and CUZD1-related genes for Pegasus; ZPBP, FIGNL1-related, GRB10-related, and ERBB1-related genes for Ikaros; IGFBP4, PSMD3-related, ZPBP2, GRB7-related, ERBB2-related, and MLLT6-related genes for Aiolos; STAT2-related, SUOX, RPS26-related, ERBB3-related, and PA2G4-related genes for Eos; and ABCA12-related, BARD1-related, ERBB4-related, ACAD-L, and SPAG16-related genes for Helios (Fig. 3), confirming the orthologous assignment of frog Aiolos and Eos, and medaka Ikaros, Aiolos, Helios, and the duplicated Eos member. However, conserved syntenic relationships were also observed between different Ikaros genes: Ikaros and Aiolos shared synteny with GRB and ZPBP homologues (Ikaros: GRB10 and ZPBP; Aiolos: GRB7 and ZPBP2), while Helios and Eos shared synteny with ERBB homologues (Eos: ERBB3; Helios: ERBB4), confirming the close evolutionary derivation of each gene pair. Ikaros and Aiolos also shared synteny with an ERBB homologue (Ikaros: ERBB1; Aiolos: ERBB2), highlighting a likely common evolutionary ancestry between the gene pairs and also with Ciona IFL1, which has an ERBB gene at its locus. Similarly, Pegasus and Helios shared conserved synteny with ACAD homologues, again suggesting a common evolutionary derivation. Synteny analysis of the duplicated teleost eos and pegasus genes confirmed differential gene loss between species (supplementary Fig. 3).
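The paralogous synteny described above amounts to comparing the gene families that flank each locus. The sketch below encodes the flanking genes named in the text (with the "-related" qualifier dropped) and intersects them by family. Collapsing gene names to families by stripping a trailing number or hyphenated suffix is a crude assumption made for illustration only, and the function names are hypothetical.

```python
import re
from itertools import combinations

# Flanking genes as listed in the text; "-related" qualifiers are omitted.
NEIGHBOURS = {
    "Pegasus": ["HMX3", "ACAD-SB", "CUZD1"],
    "Ikaros": ["ZPBP", "FIGNL1", "GRB10", "ERBB1"],
    "Aiolos": ["IGFBP4", "PSMD3", "ZPBP2", "GRB7", "ERBB2", "MLLT6"],
    "Eos": ["STAT2", "SUOX", "RPS26", "ERBB3", "PA2G4"],
    "Helios": ["ABCA12", "BARD1", "ERBB4", "ACAD-L", "SPAG16"],
}


def family(gene: str) -> str:
    """Collapse a gene name to a family label (assumption: strip trailing digits or -SUFFIX)."""
    return re.sub(r"(-[A-Z]+|\d+)$", "", gene)


def shared_families(neighbours: dict) -> dict:
    """Map each pair of loci to the set of flanking gene families they have in common."""
    fams = {locus: {family(g) for g in genes} for locus, genes in neighbours.items()}
    return {(a, b): fams[a] & fams[b] for a, b in combinations(fams, 2)}


shared = shared_families(NEIGHBOURS)
```

Under these assumptions the Ikaros and Aiolos loci share the ZPBP, GRB, and ERBB families, all four non-Pegasus loci share ERBB, and Pegasus shares only ACAD with Helios, in line with the relationships described in the text.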
Preservation of splice sites within Ikaros-related genes
Comparison of splicing patterns reinforced the presumed evolutionary relationships (Fig. 4). Ikaros, Aiolos, Helios, and Eos had an identical splicing pattern except for an extra exon in zebrafish Eos and medaka Aiolos, together with the absence of zinc finger sequences in teleost Helios. IFL1 and lamprey IKFL genes showed a similar pattern, except for three additional exons in Ciona. In contrast, while Pegasus genes in lamprey and higher vertebrates had their first ZF on a separate exon like the other Ikaros genes, the remainder lay on a single exon like Hunchback. IFL2 showed similarities to both Pegasus and other Ikaros genes.
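Exon boundaries of this kind are typically validated with the GT-AG rule: an intron begins with GT at the donor site and ends with AG at the acceptor site. The check below is a minimal sketch assuming forward-strand sequence and 0-based, end-exclusive exon coordinates; the function name and the toy sequence are hypothetical.

```python
def introns_follow_gt_ag(genomic: str, exons: list[tuple[int, int]]) -> list[bool]:
    """Check each intron between consecutive exons for GT...AG boundaries.

    `exons` holds 0-based, end-exclusive (start, end) coordinates on `genomic`,
    sorted 5' to 3' on the forward strand.
    """
    genomic = genomic.upper()
    results = []
    for (_, donor), (acceptor, _) in zip(exons, exons[1:]):
        intron = genomic[donor:acceptor]
        results.append(
            len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")
        )
    return results


# Toy example: two 6-nt exons separated by a 12-nt intron (GTAAGTTTTCAG).
toy_genomic = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
toy_exons = [(0, 6), (18, 24)]
checks = introns_follow_gt_ag(toy_genomic, toy_exons)
```

When a predicted boundary fails the check, shifting the exon end by a few bases against the cDNA alignment often recovers a canonical site, which is how the rule helps resolve ambiguous alignments at exon edges.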
Discussion
The Ikaros family of zinc finger proteins represents an important class of transcriptional regulators, with key roles identified for some members in adaptive immunity (7). To further our understanding of the evolution of this family, we identified and characterized related genes in teleost fish, agnathans, cephalochordates, and other species. Detailed bioinformatics analysis of these and other Ikaros-related sequences has shed new light on the evolutionary origins of this family with implications for understanding the emergence of adaptive immunity.
Phylogenetic analysis of full-length protein sequences (Fig. 2 and supplemental Fig. 2) highlighted the likely evolutionary relationships between specific IFL genes of urochordates and lower vertebrates and Ikaros family members of higher vertebrates: IFL1 genes clustering with Ikaros, Aiolos, Helios, and Eos, and IFL2 with Pegasus. Within the former cluster, distinct Ikaros/Aiolos and Helios/Eos subclusters were present, typical of a gene family expanded through successive rounds of duplication, with the agnathan IKFL1 and IKFL2 likely representing derivatives of the respective intermediates. These relationships were confirmed and further emphasized by similarities in the sequences of individual zinc fingers (Fig. 1 and supplementary Fig. 1), arrangements of respective gene loci (Fig. 3 and supplementary Fig. 3), and splicing patterns (Fig. 4).
On the basis of these data, we propose a model describing the evolution of Ikaros-related genes (Fig. 5). This suggests that Ikaros members evolved from a primordial gene (“X”). This duplicated before the divergence of cephalochordates and urochordates (34), producing a lineage represented by present-day IFL2 in urochordates and Pegasus in higher vertebrates, and another lineage represented by present-day IFL/IFL1 in urochordates and lower vertebrates. The latter lineage underwent successive duplications to produce intermediate precursors, related to present-day agnathan IKFL1 and IKFL2, and from these the Ikaros/Aiolos and Helios/Eos gene pairs, respectively, by the time of the divergence of cartilaginous fish. Two rounds of whole genome duplication occurred during this time (35, 36), providing a convenient mechanism for this expansion. A third round of whole genome duplication resulted in the further expansion of the Ikaros gene family through duplication of Pegasus and Eos members in the teleost lineage, although the duplicates appear to have been lost in zebrafish. Collectively, these data suggest that Pegasus likely retains a more ancient function compared with other members of the higher vertebrate Ikaros family.
The proposed model of Ikaros gene family evolution has implications for understanding the evolution of the immune system. For example, the presence of IFL/IKFL sequences has been used as evidence for the existence of a lymphocyte-like gene regulatory apparatus in urochordates and early vertebrates (19, 20, 22), by analogy with Ikaros (as well as Aiolos and Helios), which are largely expressed in a lymphoid-specific manner with lymphoid-specific gene regulatory functions (8, 9, 11, 37). However, the evolutionary relationships delineated in this study would suggest that IFL/IKFL genes may not necessarily have immune functions. Thus, IFL2 is likely to reflect a functional homologue of the more divergent and broadly expressed Pegasus, for which expression has been observed in the adult mammalian brain, heart, skeletal muscle, kidney, and liver by Northern blot analysis (17), as well as specifically in the forebrain, hindbrain, eye, and gut during zebrafish embryogenesis as judged by in situ hybridization (ISH) (L. B. John and A. C. Ward, unpublished data). As for IFL1, the amphioxus equivalent was found to be expressed in ovary and gills by ISH (22), while urochordate IFL1 was strongly expressed in the oocyte and filter house as judged by ISH (20) and, significantly, failed to bind the consensus Ikaros site (20). Similarly, lamprey IKFL genes were found to be expressed in intestinal epithelium and strongly in the ovary using ISH, with further expression in liver and gills detected by RT-PCR (19). This collectively suggests that ascribing an immune-related function to the IFL genes may be a dangerous assumption, although general expression patterns do not themselves rule out such a role. 
With this important caveat, our model places the appearance of the definitively lymphoid Ikaros, Aiolos, and Helios proteins coincident with the “emergence” of adaptive immunity around 450 million years ago, as argued by others (35), although this remains controversial.
The protein Hunchback, found in various invertebrate species, has been previously described as having sequence similarity with Ikaros, particularly at the C terminus (38, 39). Despite this, a direct evolutionary relationship has not been considered, in part due to the divergent function of Hunchback, a “gap” gene playing a crucial role in controlling pattern formation and segmentation during development (40, 41, 42, 43, 44), compared with the lymphopoietic function of Ikaros, Helios, and Aiolos (7). However, because Hunchback showed the highest homology of any protostome sequence to Ikaros family members (data not shown), consistently grouping with them in phylogenetic analysis (supplementary Fig. 2 and data not shown) with the same topology of C2H2 type zinc finger motifs (45, 46) that showed reasonable alignment with those of Ikaros family members (Fig. 1), the possibility of shared evolutionary origins from primordial gene X should be entertained.
We thank Clifford Liongue for very helpful discussions regarding phylogenetic analysis and sequence alignments.
The authors have no financial conflict of interest.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
1 This project was supported by a Deakin Central Research Grant. L.B.J. received an Australian Postgraduate Award.
2 Address correspondence and reprint requests to Prof. Alister C. Ward, School of Medicine, Deakin University, Waurn Ponds, Victoria, Australia. E-mail address:
3 Abbreviations used in this paper: EST, expressed sequence tag; IFL/IKFL, Ikaros factor-like; ZF, zinc finger; ISH, in situ hybridization; BLAST, basic local alignment search tool.
4 The online version of this article contains supplemental material.
- Received July 21, 2008.
- Accepted February 4, 2009.
- Copyright © 2009 by The American Association of Immunologists, Inc.