The study of genomic organization and regulatory elements of rRNA genes in metazoan paradigmatic organisms has led to the most accepted model of rRNA gene organization in eukaryotes. Nevertheless, the rRNA genes of microbial eukaryotes have also been studied in considerable detail and their atypical structures have been considered as exceptions. However, it is likely that these organisms have preserved variations in the organization of a versatile gene that may be seen as living records of evolution. Here, we review the organization of the main rRNA transcription unit (rDNA) and the 5S rRNA genes (5S rDNA). These genes are reiterated in the genome of microbial eukaryotes and may be coded alone, in tandem repeats, linked to each other or linked to other genes. They may be found in the chromosome or extrachromosomally in linear or circular units. rDNA coding regions may contain introns, sequence insertions, protein-coding genes or additional spacers. The 5S rDNA can be found in tandem repeats or genetically linked to genes transcribed by RNA polymerases I, II or III. Available information from about a hundred microbial eukaryotes was used to review the unexpected diversity in the genomic organization of rRNA genes.
ribosomal cistron organization
The most recent phylogenetic model for relationships among eukaryotes clusters them into six supergroups, probably monophyletic (Simpson & Roger, 2004; Adl et al., 2005). Microbial eukaryotes are found in all six groups and have considerable morphological, ultrastructural and genetic diversity. Several unique features have been described in these organisms, such as trans-splicing and RNA editing in trypanosomatids (Madison-Antenucci et al., 2002; Haile & Papadopoulou, 2007) as well as DNA splicing and rearrangements in the ciliate Tetrahymena (Prescott, 2000). Microsporidia (Encephalitozoon cuniculi) possess genomes in the size range of bacteria (Keeling & Slamovits, 2004), while the genomes of dinoflagellates lack histones and nucleosomes (Moreno Díaz de la Espina et al., 2005). Cryptomonad and chlorarachniophyte unicellular algae conserve a relict miniaturized nucleus of a formerly independent alga (nucleomorph) (Cavalier-Smith et al., 2002) and specialized infection organelles (rhoptries and micronemes) are present in apicomplexans such as Plasmodium (Kats et al., 2006) and Toxoplasma parasites (Boothroyd & Dubremetz, 2008). Unusual characteristics extend to the organization of rRNA genes, which evidence the peculiarities, diversity and divergence of the genome structure in microbial eukaryotes. An overview of the biology of key microbial eukaryotes is given in Box 1.
We used the most recent and accepted classification of eukaryotes, based on multiple gene molecular phylogenies and structural analyses. This system divides eukaryotes into six supergroups: Amoebozoa, Opisthokonta, Rhizaria, Plantae, Chromalveolata and Excavata (Simpson & Roger, 2004; Adl et al., 2005; Dacks et al., 2008). Here, we describe key microorganisms of each supergroup (Margulis & Schwartz, 2000).
1. AMOEBOZOA: Organisms that show amoeboid locomotion with pseudopodia.
Pelomyxa palustris: Giant anaerobic amoeba that contains three types of bacterial endosymbionts that replace the functions of some missing organelles, such as mitochondria.
Entamoeba histolytica (Entamoebida): Uninucleate amitochondriate amoeba that infects the intestine of animals, causing amoebiasis.
Physarum polycephalum (Eumycetozoa, Myxogastria): Amoeboid cells that can differentiate into fungus-like reproductive structures. During its life cycle, a diploid zygote divides repeatedly to form a multinucleated cytoplasmic mass called the plasmodium. Under dry conditions, the plasmodium may mature into spore-producing organs.
Dictyostelium discoideum (Eumycetozoa, Dictyostelia): Land-dwelling cellular slime mold. Independent amoebas may aggregate into a slimy mass (slug) that eventually transforms into a reproductive body that produces spores.
2. OPISTHOKONTA: Organisms with a single posterior flagellum in at least one stage of the life cycle.
Fungi: Dominant osmotrophs that play crucial roles as decomposers and as symbionts or parasites.
Ascomycota: Possess a microscopic reproductive structure called the ascus.
Pneumocystis carinii (Ascomycota, Taphrinomycotina): Causes fatal pneumonia in immunocompromised humans.
Ciliophora: Unicellular organisms covered with cilia (short undulipodia). They have two types of nuclei: small genetic micronuclei (MIC, containing standard chromosomes) and large transcriptionally active macronuclei (MAC, which develop from the micronuclei).
Apicomplexa: Specialized obligate intracellular parasites named for the ‘apical complex’, which holds structures such as rhoptries and micronemes, the specialized machinery used for invasion (Kats et al., 2006; Boothroyd & Dubremetz, 2008).
Plasmodium (Aconoidasida, Haemosporida): The causative agent of malaria exists in association with an invertebrate host (sexual stage in the mosquito) and a vertebrate host (asexual stage). Plasmodium falciparum and Plasmodium vivax infect human red blood cells, while Plasmodium berghei infects rodents.
Babesia bovis (Aconoidasida, Piroplasmorida).
Theileria parva (Aconoidasida, Piroplasmorida).
Cryptosporidium (Conoidasida, Coccidiasina).
Eimeria (Conoidasida, Coccidiasina).
6. EXCAVATA: Organisms that typically have a suspension-feeding groove and flagella.
Giardia intestinalis (Fornicata, Eopharyngia, Diplomonadida, Giardiinae): A parasite of the small intestine of vertebrates, transmitted through infective cysts. It has two transcriptionally active karyomastigonts (nuclei attached to undulipodia by thin fibers) and lacks mitochondria and the Golgi apparatus.
Trichomonas vaginalis (Parabasalia, Trichomonadida): Amitochondriate parasite that causes trichomoniasis in humans. The organelles known as parabasal bodies are involved in the synthesis, storage and transport of proteins.
Trichomonas tenax (Parabasalia, Trichomonadida): Infects the human mouth.
Tritrichomonas foetus (Parabasalia, Trichomonadida): Infects the urogenital tract of cattle.
Naegleria gruberi (Heterolobosea, Vahlkampfiidae): Soil and freshwater freeliving amoeba that transforms into undulipodiated cells.
Euglena gracilis (Euglenozoa, Euglenida, Euglenea): Unicellular organism living in stagnant water. It can be found with or without chloroplasts.
Kinetoplastea (Euglenozoa): Contain a large mitochondrion called a kinetoplast.
Trypanosoma (Metakinetoplastina, Trypanosomatida): The change of host and some differentiation steps are associated with characteristic movements of the kinetoplast along the cell. Trypanosoma brucei infection (transmitted to humans through the bite of infected tsetse flies) causes sleeping sickness, while Trypanosoma cruzi infection (transmitted through the bite of infected reduviid bugs) leads to Chagas disease in humans.
Leishmania (Metakinetoplastina, Trypanosomatida): Parasite responsible for leishmaniasis. It multiplies within the lysosomes of vertebrate macrophages and within the digestive system of sand-flies.
Bodo saltans (Metakinetoplastina, Eubodonida): Freeliving bi-undulipodiated cell.
Crithidia (Metakinetoplastina, Trypanosomatida).
Trypanoplasma (Metakinetoplastina, Parabodonia).
The typical eukaryotic translation machinery, the ribosome, is composed of two subunits with four rRNA species and >70 proteins. The large subunit (LSU) contains the 28S, the 5.8S and the 5S rRNAs. The small subunit (SSU) contains the 18S rRNA (SSU rRNA). The four rRNA mature molecules are coded in two rRNA genes transcribed by two different RNA polymerases. The 18S, 5.8S and 28S rRNAs are coded in a single transcription unit called a ribosomal cistron or the main transcription unit, transcribed by RNA polymerase I (pol I). The 5S rRNA gene is not usually linked to the ribosomal cistron and is transcribed by pol III (Mandal et al., 1984; Paule & White, 2000).
rRNA genes were among the first genes to be studied in detail due to their highly repetitive nature, ease of manipulation and biological importance (Miller & Beatty, 1969; Long & Dawid, 1980; Sollner-Webb & Mougey, 1991). The thorough study and description of genomic organization and regulatory elements in the rRNA genes of Xenopus, Drosophila and mouse led to the most accepted model of rRNA gene organization in eukaryotes (Long & Dawid, 1980; Mandal, 1984; Sollner-Webb & Mougey, 1991) (Fig. 1). The rRNA genes of microbial eukaryotes have also been intensively studied, although they were considered to be the exception to the rule, as their organization differs from the general models (Long & Dawid, 1980; Mandal et al., 1984). Here, we focus on the rRNA gene organization of microbial eukaryotes where many examples of gene diversity can be found. This work also summarizes the variability of motifs present in the rDNA intergenic region (IGR), which may include general and species-specific elements. For simplicity, in this review, the ribosomal cistron is referred to as rDNA and the term 5S rDNA is used for the 5S rRNA gene.
General organization of the ribosomal main transcription unit (rDNA) and 5S rDNA. (a) Schematic representation of Xenopus laevis rDNA. About 600 units of the ribosomal cistron are encoded in the chromosome in head-to-tail tandem repeats. Each unit contains a coding region (red) and an IGR. (b) A single unit of the X. laevis rDNA. The 18S, 5.8S and 28S rRNA molecules are transcribed as a single RNA precursor that is post-transcriptionally processed to produce the mature rRNA molecules. Transcription regulatory elements for RNA polymerase I are found in the NTS: tandem-repeated sequences (R), spacer promoters (SP), transcription terminators (T) and the promoter (P). The IGR comprises both the NTS and the ETS. (c) Organization of somatic 5S rDNA in X. laevis. The 5S rDNA is organized in tandem head-to-tail repeats that include a coding region (green box) and an intergenic sequence (black line). The 5S rDNA promoter is internal to the coding region (light green box). Arrows represent the transcription start point. ETS, external transcribed spacer; NTS, nontranscribed spacer.
Overview of the eukaryotic rRNA genes
The ribosomal cistron (rDNA)
In most species, the rDNA is present in multiple copies organized as tandem head-to-tail repeats. The rDNA unit is composed of a transcribed region and an IGR (also called the intergenic spacer) consisting of a nontranscribed spacer (NTS) 2–30 kbp long and an external transcribed spacer. The NTS contains most of the regulatory elements for transcription, while the external transcribed spacer is part of the primary transcript (pre-rRNA, 7–14 kb long) (Sollner-Webb & Mougey, 1991) (Fig. 1).
Several regulatory elements may be found in the IGR such as enhancers, spacer promoters, a proximal terminator and the gene promoter. This region may also contain several repetitive sequences that may improve the transcription efficiency, with additive effects (Paule & White, 2000). A schematic representation of the Xenopus laevis rDNA is shown in Fig. 1a and b as an example of the ‘typical’ eukaryotic rDNA organization (Sollner-Webb & Mougey, 1991). The rDNA pol I core promoter and other nonrepeated rDNA regulatory elements have been described and studied in detail in some unicellular eukaryotes such as Trypanosoma cruzi, Acanthamoeba castellanii and yeast (Kownin et al., 1985; Neigeborn & Warner, 1990; Wai et al., 2000; Figueroa-Angulo et al., 2006).
Transcription of the rDNA proceeds from the promoter through the 5′ external transcribed spacer – 18S rRNA – internal transcribed spacer-1 (ITS-1) – 5.8S rRNA – ITS-2 and 28S rRNA, until pol I comes across a transcription termination signal (Long & Dawid, 1980). In most cases, the rDNA primary transcript is post-transcriptionally processed into three mature rRNA molecules (18S, 5.8S and 28S) through the elimination of the external transcribed spacer, ITS-1 and ITS-2 (Fig. 1) (Long & Dawid, 1980). Additional processing of the rRNAs into several smaller molecules has also been described. As most eukaryotic LSU rRNAs (eLSU rRNAs) are fragmented by removal of ITS-2, the eLSU rRNA should be defined as 5.8S+28S rRNA. Because the term LSU rRNA has been used as equivalent to bacterial 23S rRNA, here we refer to the 5.8S+28S rRNA as eLSU rRNA.
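The order of segments in the primary transcript, and the effect of processing, can be sketched as a simple data structure. The following Python fragment is only an illustration: the names `PRE_RRNA_SEGMENTS`, `mature_rrnas` and `elsu_components` are hypothetical, and processing is reduced to dropping the spacer segments, ignoring the enzymatic steps and any lineage-specific fragmentation.

```python
# Illustrative model of the pol I primary transcript (pre-rRNA), 5' to 3'.
# Each entry is (segment name, is_spacer). The segment order follows the text;
# the representation itself is an assumption made for illustration.
PRE_RRNA_SEGMENTS = [
    ("5' ETS", True),
    ("18S rRNA", False),
    ("ITS-1", True),
    ("5.8S rRNA", False),
    ("ITS-2", True),
    ("28S rRNA", False),
]


def mature_rrnas(segments):
    """Return the names of segments that survive processing (non-spacers)."""
    return [name for name, is_spacer in segments if not is_spacer]


def elsu_components(segments):
    """Return the mature molecules that make up the eLSU rRNA (5.8S + 28S)."""
    return [name for name in mature_rrnas(segments) if name != "18S rRNA"]


if __name__ == "__main__":
    print(mature_rrnas(PRE_RRNA_SEGMENTS))
    print(elsu_components(PRE_RRNA_SEGMENTS))
```

Keeping the spacers as explicit entries makes it easy to see that the 5.8S and 28S rRNAs are separated only by ITS-2, which is why the two are treated together as the eLSU rRNA.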
The 5S rRNA gene (5S rDNA)
The 5S rDNA is reiterated in the eukaryotic genome in tandem head-to-tail arrays (Paule & White, 2000) (Fig. 1c). The 5S rDNA promoter (Internal Control Region) is found downstream of the transcription start point and within the transcribed region. Some 5S rDNAs also have upstream regulatory elements that may be necessary for transcription. The Internal Control Region is sufficient for transcription of 5S rRNA in Xenopus (Bogenhagen et al., 1980; Sakonju et al., 1980), whereas in Saccharomyces cerevisiae two upstream regulatory elements (start site element and upstream promoter element) are necessary for its efficient transcription in vivo (Lee et al., 1997).
Redundancy and relative ratio of the rRNA genes
rRNA genes are reiterated in almost every eukaryotic genome studied, and the gene copy number is maintained at a characteristic constant level for each organism. In most organisms, not all of the rDNA copies are transcribed (Conconi et al., 1989), suggesting that the total rDNA copy number is not directly related to the synthesis of rRNA (Kobayashi et al., 1998; Grummt, 2003; Raska et al., 2004). It has been proposed that the rDNA may participate in roles other than transcription, such as maintenance of the nucleolar structure and rDNA stability (Nogi et al., 1991; Oakes et al., 1993).
Considerable variation in the rDNA and 5S rDNA gene copy number exists among eukaryotes (Table 1): the rDNA copy number can range from one and two copies in the ascomycete Pneumocystis carinii and the apicomplexan Theileria parva, respectively, to 4800 copies in the green alga Acetabularia mediterranea and 9000 copies in the ciliate Tetrahymena thermophila. The 5S rDNA copies can range from three in the red alga Cyanidioschyzon merolae to about one million in the ciliate Euplotes eurystomus. In the slime mold Dictyostelium discoideum and in the yeast S. cerevisiae, the rDNA and 5S rDNA genes are present in equal numbers, although a strict relationship between both types of genes is not always observed. For example, Euglena gracilis has 800–4000 copies of rDNA and only 300 copies of the 5S rDNA; in contrast, 110 copies of the rDNA are present in the genome of the kinetoplastid T. cruzi, while the 5S rDNA is repeated 1600 times. The copy number of rRNA genes in some microbial eukaryotes and the rDNA/5S rDNA ratio are given in Table 1. The considerable variability in this ratio suggests that different species may have particular regulatory mechanisms to maintain the 18S, 5.8S, 28S and 5S rRNA homeostasis for the efficient synthesis of ribosomes.
Sources cited in Table 1: van Heerikhuizen et al. (1985); Acker et al. (2008); Peyretaillade et al. (1998); Katinka et al. (2001); Gatehouse & Malone (1998); Huang et al. (2004).
All copy numbers are approximate, mainly based on quantitative hybridization analyses. If the rDNA is chromosomal, the unit size corresponds to the complete unit containing the rDNA coding region and the intergenic spacer.
If the rDNA unit is extrachromosomal, the unit size corresponds to the whole molecule size.
In trichomonads the percentage of the genome sequence that corresponds to the 5S rRNA coding region is indicated.
C, extrachromosomal circle; chr, chromosomal; H, haploid; L, extrachromosomal linear molecule; MAC, macronucleus; MIC, micronucleus; P, palindrome; T, tandem; tel, telomeric; U, unlinked and nontandem.
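The rDNA/5S rDNA ratio discussed above is a simple quotient of copy numbers. As a minimal sketch, the following Python fragment computes it for the two organisms whose copy numbers are quoted in the text; the names `COPY_NUMBERS` and `rdna_to_5s_ratio` are hypothetical, and the figures are the approximate values from the text, which may be rounded differently in Table 1.

```python
# Approximate copy numbers quoted in the text; ranges are given as (low, high).
COPY_NUMBERS = {
    "Euglena gracilis": {"rDNA": (800, 4000), "5S rDNA": (300, 300)},
    "Trypanosoma cruzi": {"rDNA": (110, 110), "5S rDNA": (1600, 1600)},
}


def rdna_to_5s_ratio(entry):
    """Return the (lowest, highest) rDNA/5S rDNA ratio for one organism."""
    rdna_low, rdna_high = entry["rDNA"]
    s5_low, s5_high = entry["5S rDNA"]
    # The smallest ratio pairs the fewest rDNA copies with the most 5S copies.
    return rdna_low / s5_high, rdna_high / s5_low


if __name__ == "__main__":
    for organism, entry in COPY_NUMBERS.items():
        low, high = rdna_to_5s_ratio(entry)
        print(f"{organism}: rDNA/5S rDNA ratio {low:.2f} to {high:.2f}")
```

With these numbers the ratio spans roughly 2.7 to 13.3 for E. gracilis and is about 0.07 for T. cruzi, a difference of about two orders of magnitude between the two organisms.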
The ribosomal cistron: localization, gene linkage and IGR
The typical rDNA organization
Chromosomally localized tandem head-to-tail repeats of rDNA units containing a coding region and an IGR represent the typical rDNA organization in eukaryotes. Some microbial organisms of various phylogenetic branches share this general organization, as shown in Fig. 2 and Table 2. Tandem rDNA units may be located in a single chromosome and locus (e.g. Kluyveromyces lactis and Trichomonas vaginalis), or in various chromosomes and loci (e.g. T. cruzi). Atypical tail-to-tail and head-to-head rDNA repeats (interspersed with typical tandem head-to-tail repeats) are observed in Acetabularia exigua (Berger et al., 1978; Spring et al., 1978). In various yeast species such as K. lactis and S. cerevisiae, tandem rDNA units are genetically linked to the 5S rDNA. In these cases, the 5S rDNA can be coded either in a sense or an antisense orientation relative to the rDNA coding strand (Table 2, Fig. 2b and c).
Different organization of tandem head-to-tail rDNA repeats in microbial eukaryotes. (a) Eimeria tenella (Apicomplexa) exemplifies microbial eukaryotes with the typical rDNA organization. (b, c) rDNA copies in Saccharomyces cerevisiae (Ascomycota) and Toxoplasma gondii (Apicomplexa) are linked to the 5S rDNA (green), but in opposite polarities. Intergenic short direct repeats present in S. cerevisiae are shown as colored bars (see also Table 3). (d) In Giardia intestinalis (Diplomonadida), a 32-kDa antigenic protein (dark blue arrow) is coded in the complementary rDNA strand. (e) In Acanthamoeba castellanii (Acanthamoebidae), the mature eLSU rRNA is fragmented into three molecules: 5.8S, 26Sa (2.4 kb) and 26Sb (2 kb); the IGR contains six repeats of a 140-bp element (R, aqua boxes). (f, g) In Trypanosoma cruzi and Leishmania major (kinetoplastids), the eLSU rRNA is fragmented into seven molecules. The T. cruzi IGR contains a 172-bp repeated sequence (orange boxes). In Leishmania spp., the IGR is characterized by the presence of multiple repeated units (yellow). In Leishmania major Friedlin, the ɛ region is duplicated once. Drawings are not to scale. The size of rDNA units is shown in Table 1. Arrows show the polarity of transcription.
Two blocks of highly repetitive DNA bracket the transcribed region.
Polymorphic locus with a 65-bp internal inverted repeated sequence.
ND, not determined; UBF, upstream binding factor; IGS, intergenic spacer.
Unlinked and heterogeneous rDNA
rDNAs with heterogeneous coding and intergenic sequences are characteristic of the Apicomplexa group and may be found unlinked (located in nonadjacent loci) and in low copy number (Table 4). For example, Plasmodium spp. may have four to eight rDNA copies per haploid genome. Plasmodium falciparum and Plasmodium berghei have two types of rDNA units (Waters et al., 1989; Waters et al., 1994): A-type and S-type (also known as C-type in P. berghei) code for different SSU and eLSU rRNAs that correlate with the production of ribosomes with different GTPase activity (Rogers et al., 1996; Velichutina et al., 1998). The expression of rDNA genes is tightly linked to the progression of the Plasmodium life cycle: the A-type rRNA is expressed predominantly in the vertebrate host (asexual development), whereas the S-type rRNA is expressed in the mosquito stage (sexual development) (Mercereau-Puijalon et al., 2002) (Fig. 3). During the transfer of Plasmodium from the vertebrate host to the mosquito, drastic changes in glucose concentration and temperature are involved in regulating the expression of A- and S-type rDNA genes from different promoter elements (Mack et al., 1979; Fang & McCutchan, 2002; Fang et al., 2004). A third type, the O-type (oocyst) rDNA, has been described in the human malaria parasite Plasmodium vivax; its rRNA is synthesized in ookinetes inside the mosquito's gut (Li et al., 1997). Comprehensive reviews of rRNA genes from Plasmodium describe in detail the characteristic organization and function of these genes (Waters, 1994; McCutchan et al., 1995). Other Apicomplexa species with a similar rDNA organization are described in Table 4, as well as the non-Apicomplexa red alga C. merolae. This organism has only three unlinked rDNA units in two different chromosomes, with similar rRNA coding sequences (Matsuzaki et al., 2004).
rDNA organization in Plasmodium berghei. Four unlinked rDNA copies are coded in the telomeres of P. berghei. Type-A rDNAs contain a mature eLSU rRNA fragmented into three molecules (5.8S, 28Sa and 28Sb) and are expressed during the asexual stage in vertebrate hosts. Type-C rDNAs (with a nonfragmented 28S rRNA) are expressed during the sexual development in mosquitoes. The differential expression of rDNAs is regulated by specific promoter sequences (purple and pink boxes). Drawings are not to scale. The size of rDNA units is shown in Table 1.
The microsporidian obligate intracellular parasite E. cuniculi has 22 rDNA units located as single copies in all telomeres of its 11 chromosomes (Brugère et al., 2000). Candida albicans rDNA is found in two subtelomeric loci (Dujon et al., 2004), while Yarrowia lipolytica and Giardia intestinalis rDNA units are positioned in seven and six subtelomeric loci, respectively (Le Blancq et al., 1991; Dujon et al., 2004). The G. intestinalis chromosome I varies 5–20% in size due to subtelomeric rearrangements including variations in rDNA copy number and size (Hou et al., 1995), while some subtelomeric rDNA copies are linked to transcriptional gene units, including protein-coding genes such as ankyrin. Some of these regions may also hold incomplete rDNA sequences (Upcroft et al., 2005). Interestingly, fragments of the rDNA unit are found in all chromosomal ends in D. discoideum. These regions encode complex repeated sequences (transposable elements–rDNA junctions) that generate novel telomeric structures (Eichinger et al., 2005).
The subtelomeric localization of rDNA sequences, such as those found in E. cuniculi, D. discoideum and G. intestinalis, suggests a physiological role for these elements. Telomeres have an ordered structure in the nucleus and can be clustered or associated with the nuclear matrix, at least in some stage during the life cycle (Pryde et al., 1997). Telomeres are regions of great plasticity within a heterochromatic context, with dynamics that allow for the amplification and/or variation in the number of telomeric genes and repeated sequences. It is not known whether subtelomeric rDNA is involved in the maintenance of the characteristic telomeric structure or whether the rDNA exploits this particular structure to regulate its expression and to maintain the sequence and copy number (Pryde et al., 1997).
rDNA may be located extrachromosomally
Extrachromosomal rDNA has been found in ciliates, cellular and plasmodial slime molds and in yeasts (Table 5). The polyploid somatic macronucleus of the ciliate T. thermophila contains about 9000 copies of a palindromic self-replicating linear minichromosome, which codes for two rDNA units (Fig. 4a and Table 5). The IGR of this palindrome contains six types of repeated sequences (Fig. 4a and Table 3). The rDNA organization in Tetrahymena pyriformis is similar to the T. thermophila rDNA palindrome, with variations in the intergenic repeated motifs (Tables 3 and 5).
Linear extrachromosomal rDNA units in microbial eukaryotes. (a) rDNA is coded in extrachromosomal linear palindromes in Tetrahymena thermophila. Two head-to-head rDNA units are coded in a macronuclear minichromosome that contains typical telomeric sequences (black dots). Upstream and downstream IGRs contain various types of repeated sequences (colored bars; see also Table 3). A group I intron (blue square) is found within the eLSU rRNA. (b) The rDNA and 5S rDNA in Dictyostelium discoideum are linked and coded in extrachromosomal linear palindromes containing telomeres (black dots). (c) rDNA in Physarum polycephalum is coded in palindromic head-to-head units in a minichromosome with various repeated sequences both in the 5′ and 3′ IGRs (see also Table 3). The eLSU rRNA is interrupted by two group I introns (blue squares), and a third group I intron that includes an HE gene (purple square). (d) In Didymium iridis the rDNA is coded in linear minichromosomes containing one rDNA unit. The SSU rRNA contains a twintron (purple box) and two group I introns are found within the 28S rRNA (blue boxes). Drawings are not to scale. The size of rDNA units is shown in Table 1. Arrows show the polarity of transcription.
Dictyostelium discoideum and Physarum polycephalum rDNA is also encoded in palindromic extrachromosomal molecules (Fig. 4 and Table 5). Both rDNA minichromosomes contain several repeated sequence elements (Table 3). In the D. discoideum rDNA palindrome, two 5S rDNA copies are present near the telomeric ends, in the same polarity as the rDNA unit (Fig. 4b) (Cockburn et al., 1978). Additionally, a single rDNA palindrome is located in chromosome IV (Sucgang et al., 2003). Even though D. discoideum has six chromosomes, a seventh ‘chromosome’ can be observed in some chromosomal spreads. This additional ‘chromosome’ corresponds to a chromosome-sized cluster of palindromic rDNA minichromosomes (Sucgang et al., 2003), which suggests a physical interaction of the extrachromosomal rDNA. This particular organization may play a role in the expression and segregation of mitotic rDNA.
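In the molecular sense used here, a palindromic molecule is one whose sequence equals its own reverse complement, so that its two arms carry inverted (head-to-head) copies of the same unit. The following Python sketch illustrates this property on a made-up toy sequence; the function names and the sequence are hypothetical and are not taken from any of the organisms described, and real rDNA palindromes may include a short nonpalindromic center that this sketch ignores.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (uppercase ACGT)."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence reads the same 5' to 3' on both strands."""
    return seq == reverse_complement(seq)


def make_palindrome(arm):
    """Join the reverse complement of an arm to the arm itself.

    The result mimics two inverted (head-to-head) copies of one unit.
    """
    return reverse_complement(arm) + arm


if __name__ == "__main__":
    toy_unit = "GATTACAGG"  # hypothetical stand-in for one rDNA unit
    molecule = make_palindrome(toy_unit)
    print(molecule, is_palindromic(molecule))
```

A single unit on its own fails the check, whereas the joined molecule passes it, which is the distinction between the one-unit linear molecules and the two-unit palindromes.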
Extrachromosomal linear molecules containing one rDNA unit are found in Didymium iridis (Fig. 4 and Table 5). Ciliates such as Euplotes crassus and Glaucoma chattoni have extrachromosomal rDNA copies in single gene-sized linear molecules within the macronucleus with characteristic intergenic repeated elements (Tables 3 and 5). Finally, tandem rDNA genes in Paramecium tetraurelia can be found both in circular and in linear extrachromosomal molecules (Table 5), which can contain >13 rDNA copies.
rDNA units coded in extrachromosomal circular plasmids may be found in Amoebozoa (Entamoeba histolytica) and Excavata (E. gracilis and Naegleria gruberi) (Fig. 5 and Table 5). The most-studied E. histolytica isolate HM-1:IMSS lacks an rDNA chromosomal copy (Bagchi et al., 1999) but possesses about 200 copies of an extrachromosomal circular molecule, with two inverted rDNA units and repeated sequences in the IGR (Table 3 and Fig. 5a). This molecule starts replication at multiple sites; the primary replication origins are located near the pol I promoters, but other replication origins found all the way through the circle are activated under stress conditions (Ghosh et al., 2003). Interestingly, a 0.7-kb RNA of unknown function is encoded in the upstream region of one rDNA unit (Bhattacharya et al., 1998) (Fig. 5a). Depending on the isolate, variations are found in the size and organization of the rDNA circular molecules: the 200:NIH E. histolytica isolate has a palindromic circular organization (25.9 kb), while the HK-9 (15.3 kb) and the Rahman (18.3 kb) isolates possess single rDNA units in their circular extrachromosomal molecules (Sehgal et al., 1994; Bhattacharya et al., 1998).
Circular extrachromosomal rDNA units in microbial eukaryotes. (a) In Entamoeba histolytica, extrachromosomal self-replicating molecules encode two palindromic rDNA units (red). The upstream and downstream IGRs contain several repeated sequences (colored boxes, detailed in Table 3). The 5′ IGR also encodes a 0.7-kb mRNA (grey arrow). In the HM1 strain, four hemolysin virulence proteins (HLYs) are coded within the rRNA coding region, in antisense orientation relative to the rRNA coding strand (navy blue arrows). (b) In Euglena gracilis, the rDNA is coded in circular plasmids and the eLSU is fragmented into 14 segments. (c) The Naegleria gruberi rDNA plasmid encodes one rDNA unit containing one twintron in the 18S rRNA (purple box) and three group I introns in the eLSU rRNA (blue boxes). The IGR contains two ORFs (dark blue arrows). Black arrows show the polarity of transcription.
Most E. gracilis rDNA is found in extrachromosomal circular molecules that code for a single rDNA unit (Fig. 5b and Table 5). The whole E. gracilis rDNA circle is transcribed, suggesting a read-around transcription without the need for transcriptional terminators (Greenwood et al., 2001). The Naegleria gruberi rDNA plasmid contains two ORFs: a large one downstream of the 28S rRNA (similar to a homing endonuclease gene, HE gene) and a short one that codes for a hypothetical protein (Maruyama & Nozaki, 2007) (Fig. 5c and Table 5).
The yeast C. albicans possesses both chromosomal and extrachromosomal rDNA. About 200 copies in tandem, varying in size, are present in chromosome R while roughly 100 copies are found in an ∼1.2 Mbp autonomously replicating circle. Some rDNA sequences are also found in 50–150-kbp linear molecules (Huber & Rustchenko, 2001).
In S. cerevisiae, rDNA plasmids have been observed only in old cell cultures. During the aging process, the tandem rDNA copies are excised from the chromosome and replicate autonomously. The accumulation of rDNA circles leads to yeast sterility and shortening of the life span. An association between rDNA locus instability and loss of epigenetic silencing has also been observed (Sinclair & Guarente, 1997).
The ribosomal cistron: the coding region
The typical eukaryotic rDNA coding region is composed of the 18S, 5.8S and 28S rRNA coding sequences separated by ITS-1 and ITS-2. The rDNA coding sequence consists of a common core of domains that may be interspersed with a distinct set of variable regions (also called expansion segments; Dover et al., 1988). Ten and 18 variable regions have been identified in the SSU and LSU rRNAs, respectively, of all organisms (Raué et al., 1988). Three types of sequence insertions have been found within these variable regions: (1) expansion segments, encoding RNA sequences conserved in the mature molecule; (2) group I introns, located within highly conserved regions and removed after transcription; and (3) transcribed spacers, sequences removed from the mature rRNA, thus producing fragmented eLSU rRNA molecules (Clark et al., 1984). Babesia bovis (Dalrymple et al., 1992), Cryptosporidium parvum (Le Blancq et al., 1997), D. discoideum (Frankel et al., 1977), E. histolytica (Huber et al., 1989), G. intestinalis (Healey et al., 1990), K. lactis (Verbeet et al., 1984), S. cerevisiae (Bell et al., 1977), Toxoplasma gondii (Gagnon et al., 1996) and T. vaginalis (López-Villaseñor et al., 2004) are microbial eukaryotes with the typical rDNA organization of the coding region (Fig. 2a–e).
Insertions of expansion segments in the SSU
The average length of eukaryotic SSU rRNA is 2 kb. Unusually long SSU rRNAs have been found in Pelobionta (Pelomyxa palustris), Foraminifera (Hemisphaerammina bradyi) and Euglenozoa (Distigma sennii) (Table 6). The longest SSU rRNA known, at >4.5 kb, is found in the euglenid D. sennii. In most cases, the insertions are found in the SSU rRNA variable regions V2, V4 and V7. The only exceptions are an extended V5 region in A. castellanii and an expansion in a nonvariable region in P. palustris (Gunderson & Sogin, 1986; Milyutina et al., 2001). The rRNA variable regions are located on the surface of the mature ribosome, and their evolutionary implications are unknown (Katz & Bhattacharya, 2006).
Variations in the size of the SSU rRNA due to insertions (Table 6; surviving column header: SSU size, kb). Sources cited in the table: Busse & Preisfeld (2002, 2003); Gunderson & Sogin (1986); Gast et al. (1994); Schroeder-Diedrich et al. (1998); Milyutina et al. (2001); Hinkle et al. (1994); Loftus et al. (2005); Dávila-Aponte et al. (1991); Katz & Bhattacharya (2006).
Group I introns and twintrons
Group I introns can be found as insertions in the SSU and eLSU rRNA coding regions that are removed from the mature molecule by means of a self-splicing reaction (Einvik et al., 1998), generating a completely functional rRNA molecule (Mandal, 1984). The ribozymes encoded in group I introns have conserved secondary structures of 10 base-paired segments, as well as some additional paired segments depending on the intron subclass (Michel & Westhof, 1990). The splicing reaction initiates with a nucleophilic attack of a guanosine cofactor at the 5′ splice site and, after two sequential transesterification reactions, the exons are ligated and the RNA intron is removed (Einvik et al., 1998).
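The two sequential transesterification reactions can be followed step by step in a toy string model. The Python sketch below is an assumption-laden illustration only: the function name `splice_group_i` and the sequences are hypothetical, and the model ignores RNA folding, splice-site recognition and catalysis, keeping only the bookkeeping of which pieces are joined and released.

```python
def splice_group_i(exon1, intron, exon2, cofactor="G"):
    """Toy string model of group I intron self-splicing.

    Step 1: the guanosine cofactor attacks the 5' splice site, releasing
            exon 1 and attaching itself to the 5' end of the intron.
    Step 2: the freed 3' end of exon 1 attacks the 3' splice site,
            ligating the exons and releasing the intron.
    """
    # First transesterification: cleave at the exon 1 / intron boundary.
    free_exon1 = exon1
    intermediate = cofactor + intron + exon2

    # Second transesterification: cleave at the intron / exon 2 boundary.
    boundary = len(cofactor) + len(intron)
    released_intron = intermediate[:boundary]
    ligated_exons = free_exon1 + intermediate[boundary:]
    return ligated_exons, released_intron


if __name__ == "__main__":
    # Hypothetical sequences; the intron is lowercase only for readability.
    ligated, intron_out = splice_group_i("AAGCU", "augcaug", "GGUCA")
    print(ligated)     # AAGCUGGUCA
    print(intron_out)  # Gaugcaug
```

The released intron carries the guanosine cofactor at its 5′ end, while the ligated exons contain no trace of either the intron or the cofactor.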
Group I introns are widely distributed in nature and can be found in bacteria, mitochondrial and chloroplast genomes, and in the eukaryotic nucleus (Johansen et al., 2007). Group I introns may interrupt the SSU rRNA coding sequence in 40 distinct conserved sites of several microbial eukaryotes (Jackson et al., 2002) such as Acanthamoeba griffini and the green alga Characium saccatum. These introns may also be present in the eLSU rRNA, as is the case for P. falciparum A-type eLSU rRNA and the 26S rRNA of some Tetrahymena isolates (Fig. 4a, Table 7). It is interesting that some organisms may have both the SSU and the eLSU rRNAs interrupted by group I introns (e.g. P. carinii, Chlorella ellipsoidea and D. iridis, Fig. 4d). Table 7 describes some of the introns found in the rDNA of several microbial eukaryotes.
Twintrons are more complex insertions in the rDNA that consist of two group I introns (ribozymes) and an ORF encoding an HE (Einvik et al., 1998; Johansen et al., 2007). The D. iridis and N. gruberi SSU twintrons contain a small ribozyme (GIR1), followed by the HE ORF inserted into a second ribozyme (GIR2). Two isolates of D. iridis carry two types of HE-containing introns, in which the HE gene lies in either polarity relative to the SSU rRNA gene (Fig. 6) (Johansen et al., 2007). In the twintron, the HE ORF (I-DirI) has the same polarity as the 18S rRNA coding region. GIR2 is a self-splicing ribozyme that releases the HE transcript, whereas GIR1 modifies the 5′ end of the HE transcript to form a 2′5′ cap that increases its translational efficiency (Fig. 6a) (Einvik et al., 1998; Johansen et al., 2007). In contrast, intron II contains an HE gene (I-DirII) in opposite polarity relative to the SSU rRNA and ribozyme-coding sequences (Johansen et al., 2006). Transcription of I-DirII is driven by a pol II-like promoter located immediately upstream of the HE gene (Fig. 6b) (Johansen et al., 2006). Both D. iridis HE transcripts are processed by the nuclear spliceosome to remove a 50-nt noncoding spliceosomal intron found within the HE coding sequence, and both are polyadenylated (Vader et al., 1999; Johansen et al., 2007).
Group I introns that contain an HE gene. (a) Twintron present in Didymium iridis: the DiGIR2 intron (purple) is encoded in the SSU rRNA and transcribed by pol I as part of the pre-rRNA; it self-splices to generate the HE pre-mRNA (splicing sites are represented as black bars). Subsequently, DiGIR1 intron (blue) self-splices and processes the I-Dir I HE pre-mRNA in the 5′ side, producing a 2′5′-cap. The I-DirI HE pre-mRNA is additionally processed by the removal of spliceosomal intron SI (white box) and polyadenylation of the 3′ side to generate a functional I-Dir I HE mRNA (yellow region). (b) Intron II present in D. iridis: the I-Dir II HE RNA found within the DiGIR2 intron is coded in antisense orientation and is transcribed from a pol II promoter. The HE pre-mRNA is processed by pol II-associated factors to generate a typical 5′-cap and a 3′ polyadenylated tail. The spliceosomal intron SI is removed by the spliceosome machinery.
Physarum polycephalum eLSU rDNA contains an optional group I intron holding an HE gene (Ruoff et al., 1992). The full-length RNA intron can be excised or alternatively processed (immediately downstream of the HE gene) to produce a smaller transcript. Only the full-length RNA intron (lacking a 5′cap and a poly-A tail) is translated into the HE I-PpoI protein (Ruoff et al., 1992). The cleavage of this transcript in the internal processing site seems to downregulate HE I-PpoI expression by decreasing the stability of the transcript in yeast transintegrated introns (Johansen et al., 2007). Table 7 summarizes rDNA group I introns and HE gene insertions.
The ITS-1 and -2
The rDNA transcript is generally processed post-transcriptionally into three mature rRNA molecules (18S, 5.8S and 28S rRNAs) that result from elimination of the ETS, ITS-1 and ITS-2 from the precursor transcript (Fig. 1). In microbial eukaryotes, ITS-1 ranges from 100 to 400 bp, while ITS-2 is 200–500 bp. Unusually long ITSs are found in the red alga C. merolae (Maruyama et al., 2004), where ITS-1 and ITS-2 average sizes are 862 and 1738 bp, respectively. Euglena gracilis has the largest known ITS-1, 1188 bp in length (Schnare et al., 1990). The dinoflagellate Cochlodinium polykrikoides ITS-1 contains a 101-bp sequence in six tandem repeats, resulting in an ITS-1 length of 813 bp (Ki & Han, 2007). Yarrowia and Giardia have the shortest known ITSs in microbial eukaryotes: the sum of ITS-1 and ITS-2 lengths in Y. lipolytica is only 150 bp (van Heerikhuizen et al., 1985), while the G. intestinalis ITS-1 and ITS-2 are 37 and 52 bp in length, respectively (Boothroyd et al., 1987). Some Microsporidia species completely lack the ITS-2 (Fig. 7) (Vossbrinck & Woese, 1986), as discussed below. The biological relevance of ITS length and of internal repeats is currently unknown, although ITS sequences have been useful in molecular phylogenetic studies of closely related species.
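The lengths quoted in this paragraph can be compared against the typical ranges with a few lines of arithmetic. The sketch below uses only the values stated above; the names `TYPICAL_RANGE_BP`, `REPORTED_BP`, `classify` and `repeat_fraction` are hypothetical helpers introduced for illustration. The Y. lipolytica value is omitted because only the sum of both spacers is given.

```python
# Typical ITS size ranges in microbial eukaryotes (bp), as stated in the text.
TYPICAL_RANGE_BP = {"ITS-1": (100, 400), "ITS-2": (200, 500)}

# Lengths (bp) quoted in the text for individual spacers.
REPORTED_BP = {
    ("C. merolae", "ITS-1"): 862,
    ("C. merolae", "ITS-2"): 1738,
    ("E. gracilis", "ITS-1"): 1188,
    ("C. polykrikoides", "ITS-1"): 813,
    ("G. intestinalis", "ITS-1"): 37,
    ("G. intestinalis", "ITS-2"): 52,
}


def classify(spacer: str, length_bp: int) -> str:
    """Label a spacer as short, typical or long relative to the typical range."""
    low, high = TYPICAL_RANGE_BP[spacer]
    if length_bp < low:
        return "short"
    if length_bp > high:
        return "long"
    return "typical"


def repeat_fraction(unit_bp: int, copies: int, total_bp: int) -> float:
    """Fraction of a spacer occupied by a tandemly repeated unit."""
    return unit_bp * copies / total_bp


for (organism, spacer), length_bp in REPORTED_BP.items():
    print(f"{organism} {spacer}: {length_bp} bp ({classify(spacer, length_bp)})")

# C. polykrikoides ITS-1: six copies of a 101-bp unit within 813 bp.
print(f"repeat share: {repeat_fraction(101, 6, 813):.2f}")
```

For C. polykrikoides, the six repeats account for 606 of the 813 bp, so roughly three quarters of the spacer is repeat sequence and about 207 bp is nonrepeated.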
Unusual rDNA organization in Microsporidia. Microsporidia lack a 5.8S rRNA mature molecule and the typical 5.8S rRNA sequence is fused to the 23S rRNA. (a) Single telomeric rDNA units are surrounded by different repeated sequences in Encephalitozoon cuniculi (see also Table 3) and the rDNA lacks ITS-2. (b) Some Nosema species have an atypical rDNA coding organization, with the LSU rRNA coded upstream of the 16S rRNA. The typical 5.8S rRNA sequence is fused to the 23S rRNA and the 5S rDNA is linked to the rDNA unit.
Additional ITSs generate fragmented eLSU rRNA
Some microbial eukaryotes process the pre-rRNA into more than three mature molecules due to the presence of additional ITSs. Well-known examples of fragmented rRNA are found among kinetoplastids, in which the eLSU rRNA is fragmented into seven molecules. The nomenclature of these rRNAs varies according to the organism, the size of the rRNA molecule and the position in the coding region. In Leishmania spp., the seven elements are cotranscribed in the pre-rRNA and processed by exo- and endonucleolytic activities to produce the functional eLSU fragments: 5.8S, LSUα, γ, LSUβ, δ, ζ and ɛ (Martínez-Calvillo et al., 2001) (Fig. 2g). Trypanosoma cruzi and Crithidia fasciculata also code for an eLSU rRNA fragmented into seven elements: 5.8S, 24Sα, S1, 24Sβ, S2, S6 and S4 in T. cruzi (Fig. 2f) (Hernández et al., 1988), and rRNAs 5.8S, c, d, e, f, g and j in C. fasciculata (Spencer et al., 1987).
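The effect of additional transcribed spacers on the number of mature molecules can be sketched by treating the pre-rRNA as an ordered list of segments. This is a simplified model with several assumptions: the helper names `build_precursor` and `mature_molecules` are hypothetical, a single spacer is assumed between each pair of adjacent coding regions, the spacer labels beyond ITS-2 are placeholders, and the Leishmania fragments are assumed to lie in the order in which they are listed above.

```python
from typing import List, Tuple

Segment = Tuple[str, bool]  # (name, is_spacer)


def build_precursor(rrna_names: List[str]) -> List[Segment]:
    """Interleave rRNA coding regions with transcribed spacers.

    The precursor starts with an ETS, and each following coding region is
    preceded by a numbered ITS (placeholder labels beyond ITS-2).
    """
    segments: List[Segment] = [("ETS", True)]
    for i, name in enumerate(rrna_names):
        if i > 0:
            segments.append((f"ITS-{i}", True))
        segments.append((name, False))
    return segments


def mature_molecules(pre_rrna: List[Segment]) -> List[str]:
    """Remove every spacer and return the mature rRNA molecules in order."""
    return [name for name, is_spacer in pre_rrna if not is_spacer]


typical = build_precursor(["18S", "5.8S", "28S"])
# Assumed order, taken from the list of Leishmania eLSU fragments in the text.
leishmania = build_precursor(["SSU", "5.8S", "LSUα", "γ", "LSUβ", "δ", "ζ", "ɛ"])

print(len(mature_molecules(typical)))
print(len(mature_molecules(leishmania)))
```

Under these assumptions, the typical precursor yields three mature molecules, whereas the Leishmania-like precursor yields eight: the SSU rRNA plus the seven eLSU fragments.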
Protein-coding regions within the rDNA coding region
A correlation has been observed between the virulence of E. histolytica isolates and the sequence composition of the rDNA circular molecule described above (Clark & Diamond, 1991; Zindrou et al., 2001). Virulence associates with the striking presence of genes encoding hemolysins (proposed as virulence factors) within and overlapping the rRNA coding sequence, but in opposite polarity. Three hemolysins overlap with the eLSU coding region, while the fourth (HLY4) is coded in the ITS-1 between the SSU and 5.8S rRNAs (Jansson et al., 1994) (Fig. 5a). In G. intestinalis, a gene coding for a 32-kDa flagellum antigen has been identified in the rDNA IGR that overlaps the 3′ region of the 28S rRNA (Fig. 2d). The motif that directs transcription of this gene seems to be a hybrid pol II/pol III promoter (Upcroft et al., 1990).
Unusual rDNA coding regions
Microsporidia are obligate intracellular eukaryotes that possess many prokaryotic characteristics in their rRNA genes (Weiss et al., 2001). The rDNA units are smaller than the standard eukaryotic size and lack the ITS-2; consequently, the 5.8S rRNA is fused to the 5′ region of the 28S rRNA, as is found in bacteria (Vossbrinck & Woese, 1986) (Fig. 7). Microsporidia (e.g. E. cuniculi and Vairimorpha necatrix) are the only eukaryotes known to lack an individual 5.8S rRNA molecule (Vossbrinck & Woese, 1986; Peyretaillade et al., 1998). The relevance of this eukaryotic 5.8S–28S rRNA fusion is unknown. In addition to these characteristics, Nosema bombycis and Nosema spodopterae have an unusual rDNA gene organization (Huang et al., 2004; Iiyama et al., 2004; Tsai et al., 2005) because the LSU rRNA is coded and transcribed upstream of the SSU rRNA (Fig. 7b), in contrast to the almost universal order of the rRNA coding regions (Fig. 1).
Different rDNA genes may be found within an organism
As mentioned above, the rRNA genes within one organism are generally conserved, with little variation in the coding sequences and occasional sequence variation in the IGRs. Sequence variability in the IGRs may result from sequence divergence or from differences in the number of repeated sequences, which are involved in both up- and downregulation of rDNA transcription. Therefore, the heterogeneous composition of rDNA units may influence rDNA expression. Sequence divergence in the coding region and/or IGR within the same organism has led to a classification of rDNA units. For example, different types of rDNA may be found in Paramecium, Y. lipolytica and the Apicomplexa group. A detailed description of this variability is included in Table 8.
The organization of the 5S rDNA is simpler than that of the rDNA. Most 5S rDNAs are found in tandem head-to-tail repeats consisting of a conserved ∼120-bp coding region and an IGR of variable size and sequence. An internal pol III promoter is present in all 5S rDNA studied to date (Schramm & Hernandez, 2002) (Fig. 1c).
The 5S rDNAs are found as tandem head-to-tail repeats
The 5S rDNA in T. cruzi, Trypanosoma brucei, T. vaginalis, Trichomonas tenax, C. fasciculata, Eimeria tenella and C. parvum is typically organized in tandem head-to-tail repeats. Tritrichomonas foetus has two types of 5S rDNAs, while P. falciparum has only three 5S rDNA copies in tandem, differing in the length of the IGRs. The main characteristics of the 5S rDNA tandem head-to-tail repeats of several organisms are described in Table 9.
Some Trypanosoma species such as Trypanosoma vivax and Trypanosoma rangeli have the 5S rDNA copies linked to the spliced-leader (SL) tandem repeated genes, transcribed by pol II. SL transcripts are necessary to process the mRNAs in kinetoplastids by a trans-splicing reaction (Simpson et al., 2006). A similar linkage has been found in other Euglenozoa such as Diplonema papillatum and Bodo caudatus. In Trypanoplasma borreli and Trypanosoma avium, the 5S rDNA is coded in opposite polarity relative to the SL gene (Table 10). Interestingly, the T. borreli SL can also be linked to 5S rRNA pseudogenes (with a truncated 5′ end) (Maslov et al., 1993). Some 5S rDNA units in T. pyriformis and T. foetus are associated with ubiquitin genes transcribed by pol II (Fig. 8b). Table 10 describes the relative polarity of the 5S rDNA linked to the genes transcribed by pol II.
The 5S rDNA may be linked to pol II or pol III transcribed genes. (a) In Tritrichomonas foetus the 5S rDNA is linked to a multigenic ubiquitin family. (b) In Euglena gracilis the 5S rDNA is linked to the SL gene. (c) In Yarrowia lipolytica dicistronic genes consisting of a tRNA gene (pink) and a 5S rDNA (green) are dispersed in the genome. One tricistronic gene: Lys(CTT) tRNA–Glu(CTC) tRNA–5S rDNA is also found. These genes are transcribed from the pol III promoter of the tRNA gene. Dispersed, single 5S rDNAs are also found (green).
Some 5S rDNA copies in E. histolytica and one copy in Leishmania tarentolae are linked to tRNA genes (Shi et al., 1994; Clark et al., 2006), also transcribed by pol III. Interestingly, 48 of the 108 5S rDNA copies of Y. lipolytica produce pol III dicistronic transcripts: tRNA–5S rRNA hybrid molecules. The synthesis of an ∼200-nt transcript is driven by the tRNA pol III promoter, resulting in a transcription independent of the 5S rDNA-specific transcription factor, TFIIIA. The dicistronic transcripts, as well as a unique tricistronic transcript [Lys(CTT) tRNA–Glu(CTC) tRNA–5S rDNA] are post-transcriptionally processed to generate the typical mature RNA molecules: tRNAs and 5S rRNA (Acker et al., 2008) (Fig. 8c).
Nontandem 5S rDNA copies are found dispersed throughout the genome of some microbial eukaryotes. Some examples are A. castellanii (Zwick et al., 1991), Y. lipolytica and Schizosaccharomyces pombe (Tabata, 1981; Dujon et al., 2004). The 5S rDNA may also be found in extrachromosomal DNA. Notably, ciliates such as Oxytricha fallax have single 5S rDNA copies coded in macronuclear extrachromosomal molecules (Rae & Spear, 1978; Roberson et al., 1989). Moreover, about one million copies of the 5S rDNA are coded in linear minichromosomes flanked by telomeres in E. eurystomus (Roberson et al., 1989).
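The three chromosomal arrangements of the 5S rDNA described in this section (tandem head-to-tail repeats, linkage to another gene in the same or opposite polarity, and dispersed single copies) can be told apart from a simple list of annotated genes. The sketch below is a minimal illustration, not a published method: the `Gene` record, the function `classify_5s`, the coordinates and the 1000-bp `max_gap` threshold are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Gene:
    name: str    # e.g. "5S", "SL", "tRNA", "ubiquitin"
    start: int   # start coordinate on the contig
    end: int     # end coordinate on the contig
    strand: str  # "+" or "-"


def classify_5s(genes: List[Gene], index: int, max_gap: int = 1000) -> str:
    """Classify the 5S gene at genes[index]; genes must be sorted by start.

    max_gap (bp) is an arbitrary threshold for calling two genes neighbours.
    """
    gene = genes[index]
    if gene.name != "5S":
        raise ValueError("genes[index] is not a 5S gene")

    # Collect the upstream and downstream neighbours that lie within max_gap.
    neighbours = []
    if index > 0 and gene.start - genes[index - 1].end <= max_gap:
        neighbours.append(genes[index - 1])
    if index + 1 < len(genes) and genes[index + 1].start - gene.end <= max_gap:
        neighbours.append(genes[index + 1])

    # A neighbouring 5S gene on the same strand indicates a head-to-tail repeat.
    if any(n.name == "5S" and n.strand == gene.strand for n in neighbours):
        return "tandem head-to-tail"

    # Otherwise report linkage to the first neighbouring non-5S gene, if any.
    others = [n for n in neighbours if n.name != "5S"]
    if others:
        polarity = "same" if others[0].strand == gene.strand else "opposite"
        return f"linked to {others[0].name} ({polarity} polarity)"
    return "dispersed"


tandem = [Gene("5S", 1, 120, "+"), Gene("5S", 500, 619, "+"), Gene("5S", 1000, 1119, "+")]
linked = [Gene("SL", 1, 140, "+"), Gene("5S", 400, 519, "-")]
lone = [Gene("5S", 5000, 5119, "+")]

print(classify_5s(tandem, 1))
print(classify_5s(linked, 1))
print(classify_5s(lone, 0))
```

Extrachromosomal copies are not covered by this model, since distinguishing them requires knowing the molecule on which each gene resides rather than its neighbours.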
Ribosomes are complex organelles that require the intricate collaboration of three types of RNAs (rRNA, mRNA and tRNA) and >70 proteins for the synthesis of proteins. rRNAs must maintain their convoluted structural motifs in order to be functional. It is therefore not surprising that their sequence is highly conserved among related organisms and this similarity is gradually lost as organisms diverge. For this reason, sequence comparison of the SSU rRNA has been widely used in the field of molecular phylogeny (Van de Peer et al., 2000).
The ‘typical’ eukaryotic rDNA genomic organization was proposed >30 years ago, based on the analysis of the rDNA in higher eukaryotes (Long & Dawid, 1980). The tandemly repeated head-to-tail organization has been considered the standard for eukaryotic rDNA. Surprisingly, analyses of the genomic organization of ribosomal genes in microbial eukaryotes demonstrate that although some organisms do hold the typical rDNA configuration, the majority reveal unusual characteristics. As shown in this review, the eukaryotic rDNA may be arranged in a wide variety of genomic configurations, suggesting the existence of several regulatory mechanisms (probably species-specific) within a conserved rDNA regulatory context.
Reiteration is one of the most conserved rDNA characteristics. The rDNA copy number is extremely variable and appears to be highly regulated within species. Nevertheless, the total number of rDNA repeats does not always correlate with the rate of rRNA synthesis (French et al., 2003), implying that individual rDNA units may hold different epigenetic marks that result in variable transcriptional rates (Grummt et al., 2007). rDNA structure and transcription are also important in the establishment of the nucleolar structure, which also plays regulatory roles at the cellular level (Carmo-Fonseca et al., 2000). Dictyostelium discoideum, T. gondii and some yeast species have equal numbers of rDNA and 5S rDNA copies. This organization was considered coherent, as the pool for rRNA molecules was supposed to hold equimolar amounts of 18S, 5.8S, 28S and 5S rRNA mature molecules for the efficient synthesis of ribosomes (Prokopowich et al., 2003). However, rDNA and 5S rDNA are transcribed by different RNA polymerases with dissimilar transcription rates and number of transcriptionally open rDNA units. Therefore, the rDNA/5S rDNA dosage is not directly related to the stoichiometry of the total rRNA pool and expression process. The rDNA/5S rDNA dosage varies widely throughout evolution. The processes that allow the maintenance of the pool of rRNA mature molecules in appropriate stoichiometry must be a complex network including epigenetic, transcriptional, post-transcriptional and structural mechanisms that may vary according to the rDNA/5S rDNA dosage. Additionally, the location of the rDNA and 5S rDNA in the genome may be related to its expression and physiology. The chromosomal context, together with different chromatin environments, may be involved in the maintenance of gene copy number, recombination frequency, sequence conservation and transcription regulation of rDNA.
The organization of rDNA in extrachromosomal molecules may be associated with the cellular need for quick changes in the rDNA copy number under stress conditions. Some organisms have most of their rDNA in self-replicating extrachromosomal molecules, but retain additional copies in the chromosome probably as a backup. Interestingly, some organisms hold the totality of their rDNA in extrachromosomal molecules. It has been shown that the accumulation of extrachromosomal rDNA copies in old S. cerevisiae cultures affects cells' health. Therefore, cells that hold most or all rDNA copies extrachromosomally may have special mechanisms to allow for the accumulation of large rDNA minichromosomes without affecting the cell fitness. However, it is possible that yeasts lack this mechanism, resulting in cell damage when episomes accumulate.
Post-transcriptional processing of ITS-1 and ITS-2 is observed in almost all eukaryotes. However, some organisms possess additional ITS sequences in the 28S rRNA that generate fragmented rRNA molecules that maintain the core rRNA active elements in the mature ribosome. It has been found that some organisms possess additional sequences in variable regions internal to the SSU rRNA coding sequence that remain in the mature molecule, thus generating unusually large SSU rRNAs. It is interesting to note that the SSU rRNA has not been found as a fragmented molecule in nuclear genomes; in contrast, the fragmented eLSU rRNA could be regarded as 28S molecules that have processed their variable regions. The structure and functionality of these rRNAs in the ribosome may help to understand the importance of variable regions and the differences/restrictions between subunits.
The rDNA coding region of several microbial eukaryotes is interrupted by group I introns. Different transcriptional and post-transcriptional mechanisms are involved in the processing of introns and HE transcripts. Some C. albicans strains have heterogeneous rDNA populations with both intron-containing and intron-less rDNA units. Because no function related to rDNA expression has been proposed for group I introns, C. albicans may provide a good model to study the role (if any) of these introns in rRNA expression, processing and stability.
The linkage of 5S rDNA to a variety of tandem repeated families may be the result of homogenizing mechanisms responsible for concerted evolution (Drouin & de Sá, 1995). The finding that the 5S rDNA can be linked to genes transcribed by all three RNA polymerases, coded alone in different chromosomal loci or coded in extrachromosomal molecules underscores the possibility that various mechanisms regulate the expression of different types of 5S rDNAs. The simultaneous expression of pol I, pol II and pol III transcribed genes in a particular locus may alter the chromatin context as well as the local availability of transcription factors. Nevertheless, the significance of the linkage between multigenic families and 5S rDNA has not been studied. The presence of unlinked 5S rDNA copies is also interesting because the chromosomal context of each gene may affect its regulation.
Widespread rDNA characteristics present among eukaryotic supergroups as well as particular features predominating in some eukaryotic subgroups reflect the complexity of evolution. The typical tandem head-to-tail organization of the rDNA and 5S rDNA is found in all eukaryotic supergroups (Fig. 9), suggesting that the eukaryotic common ancestor held this organization. Later on, the evolutionary process probably led to a specialization and divergence of the rDNA structure, resulting in the different variants described here. Other features, such as the group I introns, were acquired by horizontal transfer and are therefore widespread among microbial eukaryotes (Fig. 9) (Sogin et al., 1986; Van Oppen et al., 1993).
Schematic phylogenetic tree describing the prevailing rDNA structures in subgroups of microbial eukaryotes. Phylogeny of eukaryotes is based on the six supergroup classification (Box 1). Conserved rDNA structures that predominate in eukaryotic subgroups are boxed in color. Some of the described characteristics may be shared by more than one subgroup.
Some particular rDNA characteristics are conserved among related species, suggesting that the common ancestor for each group held these traits before current speciation. Examples of this can be found in the unlinked differentially expressed rDNA units in Apicomplexa, the extralong SSU rRNAs in Amoebozoa and Foraminifera, the extrachromosomal rDNA and 5S rDNA in ciliates and Amoebozoa, the 5.8S–23S rRNA fusion in Microsporidia and both the SL–5S rDNA linkage and the eLSU fragmentation in Euglenozoa. The 5S rDNA nontandem organization as well as the linkage between 5S rDNA with the ribosomal cistron can also be seen as predominant in the Fungi group (Fig. 9). Particular traits such as the 28S rRNA fragmentation could have appeared more than once and independently, leading to non-Euglenozoa organisms containing fragmented rRNAs such as A. castellanii and Plasmodium. Distinctive characteristics shared among non-closely related species may represent phylogenetic evidence of yet unknown linkages among eukaryotic subgroups and species. Nevertheless, a thorough and integrated comparative characterization of rRNA genes in poorly or nonstudied eukaryotes may help to understand the diversity and relationship among different forms of life.
Many unanswered questions regarding the regulation of rRNA gene expression still remain, for example, the mechanism(s) that determine which rRNA gene copies will be transcriptionally and/or epigenetically active (Lawrence et al., 2004; Grummt, 2007) or the relevance of the genomic context that surrounds the rRNA genes. Finally, it should be pointed out that the rDNA organization is only one fundamental step in its regulation, because its expression is interrelated with most, if not all, of the cell's regulation levels (Paule & White, 2000; Schramm & Hernandez, 2002; Grummt et al., 2003).
This work was supported by grants IN214006 from Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica (PAPIIT), Universidad Nacional Autónoma de México (UNAM) and P45037-Q from Consejo Nacional de Ciencia y Tecnología (CONACYT), Mexico. A.L.T.-M. was supported by a scholarship from CONACYT Mexico.
(1977) Ribosomal RNA genes of Saccharomyces cerevisiae. I. Physical map of the repeating unit and location of the regions coding for 5 S, 5.8 S, 18 S, and 25 S ribosomal RNAs. J Biol Chem 252: 8118–8125.
(2001) Description of Perkinsus andrewsi n. sp. isolated from the Baltic clam (Macoma balthica) by characterization of the ribosomal RNA locus, and development of a species-specific PCR-based diagnostic assay. J Eukaryot Microbiol 48: 52–61.
(2003) In exponentially growing Saccharomyces cerevisiae cells, rRNA synthesis is determined by the summed RNA polymerase I loading rate rather than by the number of active genes. Mol Cell Biol 23: 1558–1568.
(2007) Cryptic long internal repeat sequences in the ribosomal DNA ITS1 gene of the dinoflagellate Cochlodinium polykrikoides (dinophyceae): a 101 nucleotide six-repeat track with a palindrome-like structure. Genes Genet Syst 82: 161–166.
(1998) Expansion and contraction of ribosomal DNA repeats in Saccharomyces cerevisiae: requirement of replication fork blocking (Fob1) protein and the role of RNA polymerase I. Genes Dev 12: 3821–3830.
(2001) The unusually long small subunit ribosomal RNA gene found in amitochondriate amoeboflagellate Pelomyxa palustris: its rRNA predicted secondary structure and phylogenetic implication. Gene 272: 131–139.
(2002) Intergenic and external transcribed spacers of ribosomal RNA genes in lizard-infecting Leishmania: molecular structure and phylogenetic relationship to mammal-infecting Leishmania in the subgenus Leishmania (Leishmania). Mem I Oswaldo Cruz 97: 695–701.
(1998) Microsporidian Encephalitozoon cuniculi, a unicellular eukaryote with an unusual chromosomal dispersion of ribosomal genes and a LSU rRNA reduced to the universal core. Nucleic Acids Res 26: 3513–3520.
(2002) Characterization of the rRNA locus of Pfiesteria piscicida and development of standard and quantitative PCR-based detection assays targeted to the nontranscribed spacer. Appl Environ Microb 68: 5394–5407.
(2000) The 28S–18S rDNA intergenic spacer from Crithidia fasciculata: repeated sequences, length heterogeneity, putative processing sites and potential interactions between U3 small nucleolar RNA and the ribosomal RNA precursor. Nucleic Acids Res 28: 3452–3461.
(1984) Evolution of yeast ribosomal DNA: molecular cloning of the rDNA units of Kluyveromyces lactis and Hansenula wingei and their comparison with the rDNA units of other Saccharomycetoideae. Mol Gen Genet 195: 116–125.
(2000) Complete deletion of yeast chromosomal rDNA repeats and integration of a new rDNA repeat: use of rDNA deletion strains for functional analysis of rDNA promoter elements in vivo. Nucleic Acids Res 28: 3524–3534.