Ribosomal RNA genes in eukaryotic microorganisms: witnesses of phylogeny?

Ana Lilia Torres-Machorro, Roberto Hernández, Ana María Cevallos, Imelda López-Villaseñor
DOI: http://dx.doi.org/10.1111/j.1574-6976.2009.00196.x; pp. 59-86; first published online: 1 January 2010


The study of genomic organization and regulatory elements of rRNA genes in metazoan paradigmatic organisms has led to the most accepted model of rRNA gene organization in eukaryotes. Nevertheless, the rRNA genes of microbial eukaryotes have also been studied in considerable detail and their atypical structures have been considered as exceptions. However, it is likely that these organisms have preserved variations in the organization of a versatile gene that may be seen as living records of evolution. Here, we review the organization of the main rRNA transcription unit (rDNA) and the 5S rRNA genes (5S rDNA). These genes are reiterated in the genome of microbial eukaryotes and may be coded alone, in tandem repeats, linked to each other or linked to other genes. They may be found in the chromosome or extrachromosomally in linear or circular units. rDNA coding regions may contain introns, sequence insertions, protein-coding genes or additional spacers. The 5S rDNA can be found in tandem repeats or genetically linked to genes transcribed by RNA polymerases I, II or III. Available information from about a hundred microbial eukaryotes was used to review the unexpected diversity in the genomic organization of rRNA genes.

  • rRNA
  • 5S rRNA
  • unicellular eukaryote
  • ribosomal cistron organization
  • extrachromosomal gene
  • repeated sequence


The most recent phylogenetic model for relationships among eukaryotes clusters them into six supergroups, probably monophyletic (Simpson & Roger, 2004; Adl et al., 2005). Microbial eukaryotes are found in all six groups and have considerable morphological, ultrastructural and genetic diversity. Several unique features have been described in these organisms, such as trans-splicing and RNA editing in trypanosomatids (Madison-Antenucci et al., 2002; Haile & Papadopoulou, 2007) as well as DNA splicing and rearrangements in the ciliate Tetrahymena (Prescott, 2000). Microsporidia (Encephalitozoon cuniculi) possess genomes in the size range of bacteria (Keeling & Slamovits, 2004), while the genomes of dinoflagellates lack histones and nucleosomes (Moreno Díaz de la Espina et al., 2005). Cryptomonad and chlorarachniophyte unicellular algae conserve a relict miniaturized nucleus of a formerly independent alga (nucleomorph) (Cavalier-Smith et al., 2002) and specialized infection organelles (rhoptries and micronemes) are present in apicomplexans such as Plasmodium (Kats et al., 2006) and Toxoplasma parasites (Boothroyd & Dubremetz, 2008). Unusual characteristics extend to the organization of rRNA genes, which evidence the peculiarities, diversity and divergence of the genome structure in microbial eukaryotes. An overview of the biology of key microbial eukaryotes is given in Box 1.

Box 1.

Characteristics of some microbial eukaryotes

We used the most recent and accepted classification of eukaryotes, based on multiple gene molecular phylogenies and structural analyses. This system divides eukaryotes into six supergroups: Amoebozoa, Opisthokonta, Rhizaria, Plantae, Chromalveolata and Excavata (Simpson & Roger, 2004; Adl et al., 2005; Dacks et al., 2008). Here, we describe key microorganisms of each supergroup (Margulis & Schwartz, 2000).
1. AMOEBOZOA: Organisms that show amoeboid locomotion with pseudopodia.
Pelomyxa palustris: Giant anaerobic amoeba that contains three types of bacterial endosymbionts, which take over the functions of some missing organelles such as mitochondria.
Acanthamoeba castellanii (Acanthamoebidae): Free-living soil amoeba.
Entamoeba histolytica (Entamoebida): Uninucleate amitochondriate amoeba that infects the intestine of animals, causing amoebiasis.
Physarum polycephalum (Eumycetozoa, Myxogastria): Amoeboid cells that can differentiate into fungus-like reproductive structures. During its life cycle, a diploid zygote divides repeatedly to form a multinucleated cytoplasmic mass called the plasmodium. Under dry conditions, the plasmodium may mature into spore-producing organs.
Didymium iridis (Eumycetozoa, Myxogastria): Plasmodial slime mold.
Dictyostelium discoideum (Eumycetozoa, Dictyostelia): Land-dwelling cellular slime mold. Independent amoebas may aggregate into a slimy mass (slug) that eventually transforms into a reproductive body that produces spores.
2. OPISTHOKONTA: Organisms with a single posterior flagellum in at least one stage of the life cycle.
Fungi: Dominant osmotrophs that play crucial roles as decomposers and as symbionts or parasites.
Ascomycota: Possess a microscopic reproductive structure called the ascus.
Pneumocystis carinii (Ascomycota, Taphrinomycotina): Causes fatal pneumonia in immunocompromised humans.
Schizosaccharomyces pombe (Ascomycota, Taphrinomycotina, Schizosaccharomycetes): Fission yeast.
Saccharomyces cerevisiae (Ascomycota, Saccharomycetes): Budding yeast that ferments sugars to ethyl alcohol.
Candida albicans (Ascomycota, Saccharomycetes): Causes infections in humans.
Microsporidia: Intracellular asexual parasites that lack mitochondria. The microsporan resting stages are the chitinous spores, which contain a polar filament and an infective body.
Encephalitozoon cuniculi: Parasite of warm-blooded vertebrates that holds one of the smallest known eukaryotic genomes (2.9 Mbp) (Biderre et al., 1997).
Nosema bombycis: Causes disease in insects.
3. RHIZARIA: Organisms with pseudopodia of various types.
Foraminifera: Planktonic or benthic free-swimming organisms that have pore-studded shells. They show nuclear dimorphism and complex life cycles.
4. PLANTAE (ARCHAEPLASTIDA): Organisms that hold a photosynthetic plastid derived from a primary endosymbiosis with a cyanobacterium.
Rhodophyceae (Red algae): Mostly marine organisms that hold rhodoplasts (red plastids).
Cyanidioschyzon merolae: Unicellular organisms that inhabit sulfate-rich hot springs.
Chloroplastida: Organisms that hold green chloroplasts.
Acetabularia mediterranea (Chlorophyta, Ulvophyceae): Syncytial green algae.
Chlorella (Chlorophyta, Trebouxiophyceae).
Chlamydomonas (Chlorophyta, Chlorophyceae).
5. CHROMALVEOLATA: Organisms that contain a plastid that comes from a secondary endosymbiosis with an ancestral archaeplastid.
Bacillariophyta (Stramenopiles): Single cells or colonies covered by an elaborate, symmetrical two-part shell.
Dinoflagellata (Dinozoa): Mostly unicellular marine plankton, holding two undulipodia and complex rigid walls (tests). Some species produce toxins.
Pfiesteria piscicida (Dinophyceae, Peridinophyceae).
Ciliophora: Unicellular organisms covered with cilia (short undulipodia). They have two types of nuclei: small genetic micronuclei (MIC, containing standard chromosomes) and large transcriptionally active macronuclei (MAC, which develop from the micronuclei).
Euplotes (Intramacronucleata, Spirotrichea, Hypotrichia).
Paramecium (Intramacronucleata, Oligohymenophorea, Peniculia).
Tetrahymena (Intramacronucleata, Oligohymenophorea, Hymenostomatia).
Apicomplexa: Specialized obligate intracellular parasites named for the ‘apical complex’, which holds structures such as rhoptries and micronemes, the specialized machinery used for invasion (Kats et al., 2006; Boothroyd & Dubremetz, 2008).
Plasmodium (Aconoidasida, Haemosporida): The causative agent of malaria exists in association with an invertebrate host (sexual stage in the mosquito) and a vertebrate host (asexual stage). Plasmodium falciparum and Plasmodium vivax infect human red blood cells, while Plasmodium berghei infects rodents.
Babesia bovis (Aconoidasida, Piroplasmorida).
Theileria parva (Aconoidasida, Piroplasmorida).
Cryptosporidium (Conoidasida, Coccidiasina).
Eimeria (Conoidasida, Coccidiasina).
6. EXCAVATA: Organisms that typically have a suspension-feeding groove and flagella.
Giardia intestinalis (Fornicata, Eopharyngia, Diplomonadida, Giardiinae): A parasite of the small intestine of vertebrates, transmitted through infective cysts. It has two transcriptionally active karyomastigonts (nuclei attached to undulipodia by thin fibers), and lacks mitochondria and the Golgi apparatus.
Trichomonas vaginalis (Parabasalia, Trichomonadida): Amitochondriate parasite that causes trichomoniasis in humans. The organelles known as parabasal bodies are involved in the synthesis, storage and transport of proteins.
Trichomonas tenax (Parabasalia, Trichomonadida): Infects the human mouth.
Tritrichomonas foetus (Parabasalia, Trichomonadida): Infects the urogenital tract of cattle.
Naegleria gruberi (Heterolobosea, Vahlkampfiidae): Soil and freshwater free-living amoeba that transforms into undulipodiated cells.
Euglena gracilis (Euglenozoa, Euglenida, Euglenea): Unicellular organism living in stagnant water. It can be found with or without chloroplasts.
Kinetoplastea (Euglenozoa): Contain a large mitochondrion called a kinetoplast.
Trypanosoma (Metakinetoplastina, Trypanosomatida): The change of host and some differentiation steps are associated with characteristic movements of the kinetoplast along the cell. Trypanosoma brucei infection (transmitted to humans through the bite of infected tsetse flies) causes sleeping sickness, while Trypanosoma cruzi infection (transmitted through the bite of infected reduviid bugs) leads to Chagas disease in humans.
Leishmania (Metakinetoplastina, Trypanosomatida): Parasite responsible for leishmaniasis. It multiplies within the lysosomes of vertebrate macrophages and within the digestive system of sand flies.
Bodo saltans (Metakinetoplastina, Eubodonida): Free-living bi-undulipodiated cell.
Crithidia (Metakinetoplastina, Trypanosomatida).
Trypanoplasma (Metakinetoplastina, Parabodonia).

The typical eukaryotic translation machinery, the ribosome, is composed of two subunits with four rRNA species and >70 proteins. The large subunit (LSU) contains the 28S, the 5.8S and the 5S rRNAs. The small subunit (SSU) contains the 18S rRNA (SSU rRNA). The four rRNA mature molecules are coded in two rRNA genes transcribed by two different RNA polymerases. The 18S, 5.8S and 28S rRNAs are coded in a single transcription unit called a ribosomal cistron or the main transcription unit, transcribed by RNA polymerase I (pol I). The 5S rRNA gene is not usually linked to the ribosomal cistron and is transcribed by pol III (Mandal et al., 1984; Paule & White, 2000).

rRNA genes were among the first genes to be studied in detail due to their highly repetitive nature, ease of manipulation and biological importance (Miller & Beatty, 1969; Long & Dawid, 1980; Sollner-Webb & Mougey, 1991). The thorough study and description of genomic organization and regulatory elements in the rRNA genes of Xenopus, Drosophila and mouse led to the most accepted model of rRNA gene organization in eukaryotes (Long & Dawid, 1980; Mandal, 1984; Sollner-Webb & Mougey, 1991) (Fig. 1). The rRNA genes of microbial eukaryotes have also been intensively studied, although they were considered to be the exception to the rule, as their organization differs from the general models (Long & Dawid, 1980; Mandal et al., 1984). Here, we focus on the rRNA gene organization of microbial eukaryotes where many examples of gene diversity can be found. This work also summarizes the variability of motifs present in the rDNA intergenic region (IGR), which may include general and species-specific elements. For simplicity, in this review, the ribosomal cistron is referred to as rDNA and the term 5S rDNA is used for the 5S rRNA gene.

Figure 1

General organization of the ribosomal main transcription unit (rDNA) and 5S rDNA. (a) Schematic representation of Xenopus laevis rDNA. About 600 units of the ribosomal cistron are encoded in the chromosome in head-to-tail tandem repeats. Each unit contains a coding region (red) and an IGR. (b) A single unit of the X. laevis rDNA. The 18S, 5.8S and 28S rRNA molecules are transcribed as a single RNA precursor that is post-transcriptionally processed to produce the mature rRNA molecules. Transcription regulatory elements for RNA polymerase I are found in the NTS: tandem-repeated sequences (R), spacer promoters (SP), transcription terminators (T) and the promoter (P). The IGR comprises both the NTS and the ETS. (c) Organization of somatic 5S rDNA in X. laevis. The 5S rDNA is organized in tandem head-to-tail repeats that include a coding region (green box) and an intergenic sequence (black line). The 5S rDNA promoter is internal to the coding region (light green box). Arrows represent the transcription start point. ETS, external transcribed spacer; NTS, nontranscribed spacer.

Overview of the eukaryotic rRNA genes

The ribosomal cistron (rDNA)

In most species, the rDNA is present in multiple copies organized as tandem head-to-tail repeats. The rDNA unit is composed of a transcribed region and an IGR (also called the intergenic spacer) consisting of a nontranscribed spacer (NTS) 2–30 kbp long and an external transcribed spacer. The NTS contains most of the regulatory elements for transcription, while the external transcribed spacer is part of the primary transcript (pre-rRNA, 7–14 kb long) (Sollner-Webb & Mougey, 1991) (Fig. 1).

Several regulatory elements may be found in the IGR such as enhancers, spacer promoters, a proximal terminator and the gene promoter. This region may also contain several repetitive sequences that may improve the transcription efficiency, with additive effects (Paule & White, 2000). A schematic representation of the Xenopus laevis rDNA is shown in Fig. 1a and b as an example of the ‘typical’ eukaryotic rDNA organization (Sollner-Webb & Mougey, 1991). The rDNA pol I core promoter and other nonrepeated rDNA regulatory elements have been described and studied in detail in some unicellular eukaryotes such as Trypanosoma cruzi, Acanthamoeba castellanii and yeast (Kownin et al., 1985; Neigeborn & Warner, 1990; Wai et al., 2000; Figueroa-Angulo et al., 2006).

Transcription of the rDNA proceeds from the promoter through the 5′ external transcribed spacer – 18S rRNA – internal transcribed spacer-1 (ITS-1) – 5.8S rRNA – ITS-2 and 28S rRNA, until pol I comes across a transcription termination signal (Long & Dawid, 1980). In most cases, the rDNA primary transcript is post-transcriptionally processed into three mature rRNA molecules (18S, 5.8S and 28S) by elimination of the external transcribed spacer, ITS-1 and ITS-2 (Fig. 1) (Long & Dawid, 1980). Additional processing of the rRNAs into several smaller molecules has also been described. As most eukaryotic LSU rRNAs (eLSU rRNAs) are fragmented by removal of ITS-2, the eLSU rRNA should be defined as 5.8S+28S rRNA. Because the term LSU rRNA has been used as equivalent to bacterial 23S rRNA, here we refer to the 5.8S+28S rRNA as eLSU rRNA.
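The order of segments in the primary transcript and the outcome of processing can be summarized in a short sketch. It is an illustrative model only: the names PRE_RRNA_SEGMENTS, mature_rrnas and group_by_subunit are hypothetical, and the sketch ignores the additional fragmentation of rRNAs reported for some species.

```python
# Illustrative model (not from the review): the pol I primary transcript as an
# ordered list of (segment name, kind), in the 5' to 3' order given in the text.
PRE_RRNA_SEGMENTS = [
    ("5' ETS", "spacer"),
    ("18S", "rRNA"),
    ("ITS-1", "spacer"),
    ("5.8S", "rRNA"),
    ("ITS-2", "spacer"),
    ("28S", "rRNA"),
]


def mature_rrnas(segments):
    """Drop the transcribed spacers and keep the mature rRNAs in order."""
    return [name for name, kind in segments if kind == "rRNA"]


def group_by_subunit(rrnas):
    """Assign mature rRNAs to the SSU (18S) or to the eLSU rRNA (5.8S+28S)."""
    groups = {"SSU": [], "eLSU": []}
    for name in rrnas:
        groups["SSU" if name == "18S" else "eLSU"].append(name)
    return groups


processed = mature_rrnas(PRE_RRNA_SEGMENTS)
print(processed)
print(group_by_subunit(processed))
```

The 5S rRNA is absent from this model because it is encoded by a separate gene transcribed by pol III.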

The 5S rRNA gene (5S rDNA)

The 5S rDNA is reiterated in the eukaryotic genome in tandem head-to-tail arrays (Paule & White, 2000) (Fig. 1c). The 5S rDNA promoter (Internal Control Region) is found downstream of the transcription start point and within the transcribed region. Upstream regulatory elements can also be found in some 5S rDNAs that may be necessary for transcription. The Internal Control Region is sufficient for transcription of 5S rRNA in Xenopus (Bogenhagen et al., 1980; Sakonju et al., 1980) whereas in Saccharomyces cerevisiae two upstream regulatory elements (start site element and upstream promoter element) are necessary for its efficient transcription in vivo (Lee et al., 1997).

Redundancy and relative ratio of the rRNA genes

rRNA genes are reiterated in almost every eukaryotic genome studied, and the gene copy number is maintained at a characteristic constant level for each organism. In most organisms, not all of the rDNA copies are transcribed (Conconi et al., 1989), suggesting that the total rDNA copy number is not directly related to the synthesis of rRNA (Kobayashi et al., 1998; Grummt, 2003; Raska et al., 2004). It has been proposed that the rDNA may participate in roles other than transcription, such as maintenance of the nucleolar structure and rDNA stability (Nogi et al., 1991; Oakes et al., 1993).

The rDNA and 5S rDNA copy numbers vary considerably among eukaryotes (Table 1): the rDNA copy number ranges from one copy in the ascomycete Pneumocystis carinii and two in the apicomplexan Theileria parva to 4800 copies in the green alga Acetabularia mediterranea and 9000 copies in the ciliate Tetrahymena thermophila. The 5S rDNA copies can range from three in the red alga Cyanidioschyzon merolae to about one million in the ciliate Euplotes eurystomus. In the slime mold Dictyostelium discoideum and in the yeast S. cerevisiae, the rDNA and 5S rDNA genes are present in equal numbers, although a strict relationship between both types of genes is not always observed. For example, Euglena gracilis has 800–4000 copies of rDNA and only 300 copies of the 5S rDNA; in contrast, 110 copies of the rDNA are present in the genome of the kinetoplastid T. cruzi, while the 5S rDNA is repeated 1600 times. The copy number of rRNA genes in some microbial eukaryotes and the rDNA/5S rDNA ratio are given in Table 1. The considerable variability in this ratio suggests that different species may have particular regulatory mechanisms to maintain the 18S, 5.8S, 28S and 5S rRNA homeostasis for the efficient synthesis of ribosomes.
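The rDNA/5S rDNA ratio is obtained by dividing the rDNA copy number by the 5S rDNA copy number. The sketch below reproduces a few of the ratios; the function name copy_ratio is hypothetical, and the inputs are the approximate copy numbers as read from Table 1.

```python
def copy_ratio(rdna_low, rdna_high, five_s_copies):
    """Return the (low, high) rDNA/5S rDNA copy ratio, rounded to two decimals."""
    return (
        round(rdna_low / five_s_copies, 2),
        round(rdna_high / five_s_copies, 2),
    )


# Approximate copy numbers as read from Table 1:
# (lowest rDNA estimate, highest rDNA estimate, 5S rDNA copies).
examples = {
    "Euglena gracilis": (800, 4000, 330),
    "Trypanosoma cruzi": (110, 110, 1600),
    "Trypanosoma brucei": (56, 56, 1500),
    "Schizosaccharomyces pombe": (100, 120, 30),
}

for organism, (low, high, five_s) in examples.items():
    print(organism, copy_ratio(low, high, five_s))
```

The computed values agree with the ratios listed in Table 1 for these organisms (2.42–12.12, 0.07, 0.04 and 3.33–4, respectively).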

Table 1

rRNA genes: copy number, unit size and organization

Organism | rDNA copies | rDNA unit size (kbp) and organization | 5S rDNA copies | 5S rDNA unit size (kbp) and organization | rDNA/5S rDNA copy ratio | References
Crithidia fasciculata11–12 (T)250–3000.23 (T)Köck & Cornelissen (1990), Schnare et al. (2000)
Diplonema papillatum0.68 and 1.7 (T)Sturm et al. (2001)
Euglena gracilis800–4000 (C), 4(chr)11.5 (C)3300.6 (T)2.42–12.12Ravel-Chapuis (1988), Keller et al. (1992)
Herpetomonas0.6 (T)Aksoy et al. (1992)
Leishmania donovani16612.5 (T)Yan et al. (1999), León et al. (1978)
Leishmania major6314 (T)Martínez-Calvillo et al. (2001), Ivens et al. (2005)
Trypanosoma brucei56(T)15000.750.04Hasan et al. (1984), Berriman et al. (2005)
Trypanosoma cruzi11030 (T)16000.48 (T)0.07Castro et al. (1981), Hernández-Rivas et al. (1992), Hernández et al. (1993)
Trypanosoma rangeli3300.9Aksoy et al. (1992)
Trypanosoma vivax0.72Roditi et al. (1992)
Naegleria gruberi3000–5000 (C)14 (C)Clark & Cross (1988)
Giardia intestinalis60 (H), 3005.6 (T)Le Blancq et al. (1991)
Trichomonas tenax(T)0.01%0.307 and 0.316 (T)Torres-Machorro et al. (2009)
Trichomonas vaginalis6 (T)0.1%0.334 and 0.335 (T)López-Villaseñor et al. (2004), Torres-Machorro et al. (2006)
Tritrichomonas foetus126 (T)0.04%0.86 and ∼1.3 (T)Chakrabarti et al. (1992), Torres-Machorro et al. (2009)
Trypanoplasma borreli0.59Maslov et al. (1993)
Babesia bigemina310.65, 10.8 and 13.35 (U)Reddy et al. (1991)
Babesia bovis37Dalrymple et al. (1990)
Babesia canis4Dalrymple et al. (1992)
Cryptosporidium parvum5 (H)6.5 (T)60.55 and 0.79, 3 (T), 3(U)0.83Taghi-Kilani et al. (1994), Le Blancq et al. (1997)
Eimeria tenella140(T)5000.73 (T)0.28Stucki et al. (1993), Shirley et al. (2000)
Plasmodium berghei4(U)3(T)1.33Dame & McCutchan (1983), Waters et al. (1994)
Plasmodium falciparum5–8(U)31.67–2.67Shippen-Lentz & Vezza (1988), Gardner et al. (2002)
Plasmodium lophurae6(U)Unnasch & Wirth (1983)
Plasmodium vivax7(U)Li et al. (1997)
Theileria parva2(U)30.67Kibe et al. (1994), Gardner et al. (2005)
Toxoplasma gondii1107.5 (T)1101Guay et al. (1992)
Perkinsus andrewsi7.7–7.8Pecher et al. (2004)
Euplotes crassus7 (L)Erbeznik et al. (1999)
Euplotes eurystomus1 000 0000.93 (L)Roberson et al. (1989)
Glaucoma chattoni9.3 (L)Challoner et al. (1985)
Oxytricha fallax7.49 (L)0.69 (L)Rae & Spear (1978), Swanton et al. (1982)
Paramecium tetraurelia9 (T,L,C)Preer et al. (1999)
Tetrahymena pyriformis200 MAC (H), 1 MIC350 MAC, 350 MIC (H)0.280.29Kimmel & Gorovsky (1976), Kimmel & Gorovsky (1978)
Tetrahymena thermophila9000 MAC, 1MIC21 (L,P)150 MAC, 150 MIC (H)0.25–0.2930Yao & Gall (1977), Allen et al. (1984), Eisen et al. (2006)
Acanthamoeba castellanii24(H), 60012 (T)480(U)1.25Zwick et al. (1991), Yang et al. (1994)
Dictyostelium discoideum180 (L), 1 (chr)88 (L,P)18088 (L,P)1Cockburn et al. (1978), Hofmann et al. (1993)
Didymium iridis20 (L)Johansen et al. (1992)
Entamoeba histolytica (HM-1:IMSS)200 (C), 0 (chr)24.5 (C)Huber et al. (1989), Bagchi et al. (1999)
Physarum polycephalum1 × 1011>60 (L,P)0.68Campbell et al. (1979)
Acetabularia mediterranea3500–4800(T)Spring et al. (1978)
Cyanidioschyzon merolae3(U)3(U)1Maruyama et al. (2004), Matsuzaki et al. (2004)
Candida albicans100 (C), 200 (chr)11.6–12.5 (T)Huber & Rustchenko (2001)
Candida glabrata>115(T)>2300.5Maleszka & Clark-Walker (1993), Bergeron & Drouin (2008)
Hansenula polymorpha50–608 (T)50–601Ramezani-Rad et al. (2003)
Kluyveromyces lactis608.6 (T)601Verbeet et al. (1984)
Pneumocystis carinii1Tang et al. (1998), Fischer et al. (2006)
Saccharomyces cerevisiae100–2009.1 (T)100–2001Rubin & Sulston (1973), Rustchenko & Sherman (1994)
Schizosaccharomyces pombe100–12010.4 (T)30(U)3.33–4Wood et al. (2002)
Yarrowia lipolytica1007.7 and 8.7 (T)108(U)0.93van Heerikhuizen et al. (1985), Acker et al. (2008)
Encephalitozoon cuniculi228.9 (tel)3(U)7.33Peyretaillade et al. (1998), Katinka et al. (2001)
Nosema apis18 (T)Gatehouse & Malone (1998)
Nosema bombycis4.3 (T)Huang et al. (2004)
  • All copy numbers are approximate, mainly based on quantitative hybridization analyses. If the rDNA is chromosomal, the unit size corresponds to the complete unit containing the rDNA coding region and the intergenic spacer.

  • If the rDNA unit is extrachromosomal, the unit size corresponds to the whole molecule size.

  • * In trichomonads the percentage of the genome sequence that corresponds to the 5S rRNA coding region is indicated.

  • C, extrachromosomal circle; chr, chromosomal; H, haploid; L, extrachromosomal linear molecule; MAC, macronucleus; MIC, micronucleus; P, palindrome; T, tandem; tel, telomeric; U, unlinked and nontandem.

The ribosomal cistron: localization, gene linkage and IGR

The typical rDNA organization

Chromosomally localized tandem head-to-tail repeats of rDNA units containing a coding region and an IGR represent the typical rDNA organization in eukaryotes. Some microbial organisms of various phylogenetic branches share this general organization, as shown in Fig. 2 and Table 2. Tandem rDNA units may be located in a single chromosome and locus (e.g. Kluyveromyces lactis and Trichomonas vaginalis), or in various chromosomes and loci (e.g. T. cruzi). Atypical tail-to-tail and head-to-head rDNA repeats (interspersed with typical tandem head-to-tail repeats) are observed in Acetabularia exigua (Berger et al., 1978; Spring et al., 1978). In various yeast species such as K. lactis and S. cerevisiae, tandem rDNA units are genetically linked to the 5S rDNA. In these cases, the 5S rDNA can be coded either in a sense or an antisense orientation relative to the rDNA coding strand (Table 2, Fig. 2b and c).

Figure 2

Different organization of tandem head-to-tail rDNA repeats in microbial eukaryotes. (a) Eimeria tenella (Apicomplexa) exemplifies microbial eukaryotes with the typical rDNA organization. (b, c) rDNA copies in Saccharomyces cerevisiae (Ascomycota) and Toxoplasma gondii (Apicomplexa) are linked to the 5S rDNA (green), but in opposite polarities. Intergenic short direct repeats present in S. cerevisiae are shown as colored bars (see also Table 3). (d) In Giardia intestinalis (Diplomonadida), a 32-kDa antigenic protein (dark blue arrow) is coded in the complementary rDNA strand. (e) In Acanthamoeba castellanii (Acanthamoebidae), the mature eLSU rRNA is fragmented into three molecules: 5.8S, 26Sa (2.4 kb) and 26Sb (2 kb); the IGR contains six repeats of a 140-bp element (R, aqua boxes). (f, g) In Trypanosoma cruzi and Leishmania major (kinetoplastids), the eLSU rRNA is fragmented into seven molecules. The T. cruzi IGR contains a 172-bp repeated sequence (orange boxes). In Leishmania spp., the IGR is characterized by the presence of multiple repeated units (yellow). In Leishmania major Friedlin, the ɛ region is duplicated once. Drawings are not to scale. The size of rDNA units is shown in Table 1. Arrows show the polarity of transcription.

Table 2

Organisms with typical rDNA organization

Organism | Localization | IGS repeated elements | 5S linkage | References
Crithidia fasciculata | | Yes | X | Schnare et al. (2000)
Leishmania spp. | 1Ch, 1L | Yes | X | Uliana et al. (1996), Yan et al. (1999), Martínez-Calvillo et al. (2001), Orlando et al. (2002), de Andrade Stempliuk & Floeter-Winter (2002)
Trypanosoma brucei | 4Ch | X | X | Hasan et al. (1984), Melville et al. (1998)
Trypanosoma cruzi | ≥2Ch | Yes | X | Hernández et al. (1993)
Giardia spp. | 6L (intestinalis) | intestinalis and muris | X | Edlind & Chakraborty (1987), Boothroyd et al. (1987), van Keulen et al. (1992), Upcroft et al. (1994)
Trichomonas tenax | | X | X | Torres-Machorro et al. (2009)
Trichomonas vaginalis | 1Ch, 1L | X | X | López-Villaseñor et al. (2004), Torres-Machorro et al. (2009)
Tritrichomonas foetus | 1Ch, 1L | X | X | Torres-Machorro et al. (2009)
Eimeria tenella | 1Ch, 1L | X | X | Shirley et al. (2000)
Toxoplasma gondii | | X | Sense | Guay et al. (1992)
Acanthamoeba castellanii | | Yes | X | D'Alessio et al. (1981), Yang et al. (1994)
Hansenula polymorpha | | X | Linked | Klabunde et al. (2002)
Kluyveromyces lactis | 1Ch, 1L | X | Antisense | Verbeet et al. (1984)
Saccharomyces cerevisiae | 1Ch, 1L | Yes | Antisense | Rubin & Sulston (1973), Skryabin et al. (1984), Srivastava & Schlessinger (1991), Dujon et al. (2004), Kim et al. (2006)
Schizosaccharomyces pombe | 1Ch, 2L | X | X | Schaak et al. (1982), Wood et al. (2002)
Torulopsis utilis | | X | Antisense | Tabata et al. (1980)
Yarrowia lipolytica | 7L | Yes | X | van Heerikhuizen et al. (1985)
Nosema apis | | X | Sense | Gatehouse & Malone (1998), Iiyama et al. (2004)
  • IGS, intergenic spacer; X, not identified or not present; Ch, chromosome; L, locus or loci.

Depending on the species (and isolate) studied, several tandemly repeated sequences may be found in the IGR with variations in size, sequence and number (Table 3). For example, the IGR of Leishmania species contains a 60–64 bp repeated element reiterated between 16 and 275 times, causing length variations in the IGR that range from 4 to 12 kbp. The size, number and sequence of these motifs are species- and isolate-specific in Leishmania spp. (Gay et al., 1996; Uliana et al., 1996; Yan et al., 1999; Martínez-Calvillo et al., 2001; Orlando et al., 2002) (Table 3 and Fig. 2g).
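Counting how many times a motif is reiterated in tandem can be sketched as follows. This is a minimal illustration under stated assumptions: the function longest_tandem_run, the motif and the toy sequence are all hypothetical and do not correspond to any real IGR, and only exact copies are matched, whereas the repeated sequences found in IGRs may not be identical.

```python
import re


def longest_tandem_run(sequence, motif):
    """Return the largest number of consecutive exact copies of motif in sequence."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(match.group(0)) // len(motif) for match in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical toy data: a 6-bp motif repeated five times, then twice,
# separated by unrelated bases.
toy_motif = "ACGTTG"
toy_igr = "TTAA" + toy_motif * 5 + "GGC" + toy_motif * 2 + "AT"

print(longest_tandem_run(toy_igr, toy_motif))
```

A comparison of real isolates would need to tolerate mismatches between repeat copies, which this sketch does not attempt.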

Table 3

Repeated sequences in the ribosomal cistron intergenic spacer

Organism | Size of repeated sequences | Number of repeated sequences | Function | Intergenic region size (kb) | References
Crithidia fasciculata19 bp28Schnare et al. (2000)
55 bp43.5
Euglena gracilis14 bp, imperfect repeat6Greenwood et al. (2001)
30 bp imperfect palindromes2
Leishmania major63 bp16–2754–12Martínez-Calvillo et al. (2001)
Leishmania amazoniensis60 bp35–70Uliana et al. (1996)
Leishmania chagasi64 bp9Enhancer-likeGay et al. (1996)
Leishmania infantum63 bp404–12Requena et al. (1997)
LiR3 348 bpTranscription termination?
Leishmania donovani64 bp395.8Yan et al. (1999)
63 bp124
Leishmania hoogstraali63 bp405.5Orlando et al. (2002)
Trypanosoma cruzi172 bp2–8Pulido et al. (1996)
Giardia intestinalis78 bp2 in GS strain, 3 in GK strainRecombination?Upcroft et al. (1994)
Giardia muris73 bpVaries, 6 minvan Keulen et al. (1992)
Euplotes crassusI. ∼30 bpErbeznik et al. (1999)
IV. ∼17 bp
Glaucoma chattoniI. 32 bp (5′)3Spacer promoter?Challoner et al. (1985)
II. 14–24 bp (5′)7Enhancer?
III. 18 bp (5′)1
IV. 17 bp (3′)9Termination?
V. 130 bp (3′)3Gene packing?
Tetrahymena thermophilaI. 32 bp (5′)4Spacer promoter?1.9Challoner et al. (1985)
II.20–21 bp (5′)13Enhancer?
III. 20 bp (5′)7
430 bp (5′)4 (2 tandem in each chromosome half)Replication, include type I and III repeats
IV. 17 bp (3′)5Termination?
V. 130 bp (3′)4Gene packing?
Tetrahymena pyriformisI. 33 bp (5′)3Spacer promoter?Challoner et al. (1985)
II. 10–24 bp (5′)11Enhancer?
III. 14–21 bp (5′)5
IV. 17 bp (3′)ND
V. 130 bp (3′)ND
Perkinsus andrewsi15 bp approximately8Coss et al. (2001)
Pfiesteria piscicida9 bpSaito et al. (2002)
Acanthamoeba castellanii106–174 bp6UBF binding2.33Yang et al. (1994)
Dictyostelium discoideum29 bp (3′)4Sucgang et al. (2003)
Entamoeba histolyticaDraI 170 bp (3′)10ARS-like sequences9.2 (5′)Huber et al. (1989), Mittal et al. (1992), Bhattacharya et al. (1998)
ScaI 144 bp (3′)73.5 (3′)
ScaI 144 bp (5′)6
PvuI 145 bp (5′)11Pathogenic-specific sequence
HinfI 653 bp (5′)2 and 4Recombination
74 bp (5′)2
AvaII 153 bp (5′)5
1408 bp (5′)2
Physarum polycephalumI. 130 bp (5′)≥22Hattori et al. (1984)
I′. 130 bp (3′)
II. 50 bp (5′)16
II′. 52 bp (3′)6
Saccharomyces cerevisiaeI. 6 bp (5′)31.1 (5′)Skryabin et al. (1984)
II. 16 bp (5′)21.25 (3′)
III. 8 bp (5′)3
IV. 11 bp (3′)2
V. 9 bp (3′)2
VI. 9 bp (3′)2
Yarrowia lipolytica140–150 bpVariesvan Heerikhuizen et al. (1985), Fournier et al. (1986)
11 bp14
Encephalitozoon cuniculi29 bp (5′)2Peyretaillade et al. (1998)
19 bp (5′)8
51 bp (3′)
43 bp (3′)
  • Repeated sequences may not be identical.

  • * The 55-bp repeat has an internal repeated inverted sequence.

  • Type I and III repetitions are within a 430-bp repeated segment involved in the replication of the minichromosome.

  • Inverted repeat.

  • § Two blocks of highly repetitive DNA bracket the transcribed region.

  • Polymorphic locus with a 65-bp internal inverted repeated sequence.

  • ND, not determined; UBF, upstream binding factor; IGS, intergenic spacer.

Unlinked and heterogeneous rDNA

rDNAs with heterogeneous coding and intergenic sequences are characteristic of the Apicomplexa group and may be found unlinked (located in nonadjacent loci) and in low copy number (Table 4). For example, Plasmodium spp. may have four to eight rDNA copies per haploid genome. Plasmodium falciparum and Plasmodium berghei have two types of rDNA units (Waters et al., 1989; Waters et al., 1994): A-type and S-type (also known as C-type in P. berghei) code for different SSU and eLSU rRNAs that correlate with the production of ribosomes with different GTPase activity (Rogers et al., 1996; Velichutina et al., 1998). The expression of rDNA genes is tightly linked to the progression of the Plasmodium life cycle: the A-type rRNA is expressed predominantly in the vertebrate host (asexual development), whereas the S-type rRNA is expressed in the mosquito stage (sexual development) (Mercereau-Puijalon et al., 2002) (Fig. 3). During the transfer of Plasmodium from the vertebrate host to the mosquito, drastic changes in glucose concentration and temperature are involved in regulating the expression of A- and S-type rDNA genes from different promoter elements (Mack et al., 1979; Fang & McCutchan, 2002; Fang et al., 2004). A third type, the O-type (oocyst) rDNA, has been described in the human malaria parasite Plasmodium vivax; its rRNA is synthesized in ookinetes inside the mosquito's gut (Li et al., 1997). Comprehensive reviews of rRNA genes from Plasmodium describe in detail the characteristic organization and function of these genes (Waters, 1994; McCutchan et al., 1995). Other Apicomplexa species with a similar rDNA organization are listed in Table 4, together with the red alga C. merolae, which is not an apicomplexan. This organism has only three unlinked rDNA units in two different chromosomes, with similar rRNA coding sequences (Matsuzaki et al., 2004).

Table 4

Organisms with unlinked rDNA units

Organism | Copy number | Chromosomal localization and rDNA types | References
Babesia bigemina | 3 | | Reddy et al. (1991)
Babesia bovis | 3 | Two in chr III and one in chr IV | Dalrymple et al. (1990), Brayton et al. (2007)
Babesia canis | 4 | Two in one chr and two in the other chr | Dalrymple et al. (1992)
Cryptosporidium parvum | 5 | 3 tandem, 2 alone in different chr | Le Blancq et al. (1997)
Theileria parva | 2 | A and B units in different chr | Kibe et al. (1994), Bishop et al. (2000)
Plasmodium berghei | 4 | Type A in chr XII and VII; Type C in chr VI and V | Waters et al. (1994)
Plasmodium falciparum | 5–8 | Subtelomeric: Type A in chr V and VII; Type S in chr XI and XIII | Langsley et al. (1983), Mercereau-Puijalon et al. (2002), Gardner et al. (2002)
Plasmodium lophurae | 6 (7–9) | | Unnasch & Wirth (1983)
Plasmodium vivax | 7 | A, S and O rDNA types | van Spaendonk et al. (2000)
Cyanidioschyzon merolae | 3 | 3 different loci, two in chr XVII and one in chr XVIII | Maruyama et al. (2004)
  • Chr, chromosome.

Figure 3

rDNA organization in Plasmodium berghei. Four unlinked rDNA copies are coded in the telomeres of P. berghei. Type-A rDNAs contain a mature eLSU rRNA fragmented into three molecules (5.8S, 28Sa and 28Sb) and are expressed during the asexual stage in vertebrate hosts. Type-C rDNAs (with a nonfragmented 28S rRNA) are expressed during the sexual development in mosquitoes. The differential expression of rDNAs is regulated by specific promoter sequences (purple and pink boxes). Drawings are not to scale. The size of rDNA units is shown in Table 1.

Telomeric rDNA

The microsporidian obligate intracellular parasite E. cuniculi has 22 rDNA units located as single copies in all telomeres of its 11 chromosomes (Brugère et al., 2000). Candida albicans rDNA is found in two subtelomeric loci (Dujon et al., 2004), while Yarrowia lipolytica and Giardia intestinalis rDNA units are positioned in seven and six subtelomeric loci, respectively (Le Blancq et al., 1991; Dujon et al., 2004). The G. intestinalis chromosome I varies 5–20% in size due to subtelomeric rearrangements including variations in rDNA copy number and size (Hou et al., 1995), while some subtelomeric rDNA copies are linked to transcriptional gene units, including protein-coding genes such as ankyrin. Some of these regions may also hold incomplete rDNA sequences (Upcroft et al., 2005). Interestingly, fragments of the rDNA unit are found in all chromosomal ends in D. discoideum. These regions encode complex repeated sequences (transposable elements–rDNA junctions) that generate novel telomeric structures (Eichinger et al., 2005).

The subtelomeric localization of rDNA sequences such as those found in E. cuniculi, D. discoideum and G. intestinalis suggests a physiological role for these elements. Telomeres have an ordered structure in the nucleus and can be clustered or associated with the nuclear matrix, at least at some stage of the life cycle (Pryde et al., 1997). Telomeres are regions of great plasticity within a heterochromatic context, with dynamics that allow for the amplification and/or variation in the number of telomeric genes and repeated sequences. It is not known whether subtelomeric rDNA is involved in the maintenance of the characteristic telomeric structure or whether the rDNA exploits this particular structure to regulate its expression and to maintain the sequence and copy number (Pryde et al., 1997).

rDNA may be located extrachromosomally

Extrachromosomal rDNA has been found in ciliates, cellular and plasmodial slime molds and in yeasts (Table 5). The polyploid somatic macronucleus of the ciliate T. thermophila contains about 9000 copies of a palindromic self-replicating linear minichromosome, which codes for two rDNA units (Fig. 4a and Table 5). The IGR of this palindrome contains six types of repeated sequences (Fig. 4a and Table 3). The rDNA organization in Tetrahymena pyriformis is similar to the T. thermophila rDNA palindrome, with variations in the intergenic repeated motifs (Tables 3 and 5).
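
The palindromic arrangement described above can be illustrated with a short sketch. In a palindromic minichromosome the two rDNA units lie head-to-head, so the sequence of one arm is the reverse complement of the other. The Python below is a toy model under stated assumptions: the arm sequence and the function names are hypothetical, any nonpalindromic central region is ignored, and only the copy number (about 9000 molecules, two rDNA units each) comes from the text.

```python
# Toy model of a palindromic (head-to-head) rDNA minichromosome.
# The arm sequence is invented for illustration; it is not real rDNA.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def make_palindrome(arm: str) -> str:
    """Place an arm and its reverse complement head-to-head."""
    return reverse_complement(arm) + arm


def is_palindromic(seq: str) -> bool:
    """A DNA palindrome reads the same 5' to 3' on both strands."""
    return seq == reverse_complement(seq)


toy_arm = "ATGGCCTTAC"  # hypothetical stand-in for one rDNA unit
minichromosome = make_palindrome(toy_arm)

# Approximate figures quoted in the text for T. thermophila.
MOLECULES_PER_MACRONUCLEUS = 9000
UNITS_PER_MOLECULE = 2
total_units = MOLECULES_PER_MACRONUCLEUS * UNITS_PER_MOLECULE

print(is_palindromic(minichromosome))  # True
print(total_units)                     # 18000
```

Under these figures the macronucleus would carry on the order of 18 000 rDNA units, all of them extrachromosomal.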

Table 5

Extrachromosomal rDNA units

Organism | Linear/circular | Size | Organization | Copy number | Additional copies in chromosome | References
Euglena gracilis | C | 11.5 kbp | Single | 800–4000 | 4 | Ravel-Chapuis et al. (1988), Schnare et al. (1990)
Naegleria gruberi | C | 14 kbp | Single | 300–5000 | None | Clark & Cross (1987)
Euplotes crassus | L | 7 kbp | Single gene |  | MIC | Erbeznik et al. (1999)
Glaucoma chattoni | L | 9.3 kbp | Single gene |  | MIC | Katzen et al. (1981), Challoner et al. (1985)
Nyctotherus ovalis | L |  | Single gene |  | MIC | Ricard et al. (2008)
Oxytricha fallax | L | 7.49 kbp | Single gene |  | MIC | Rae & Spear (1978), Swanton et al. (1982)
Oxytricha nova | L | 7.49 kbp | Single gene |  | MIC | Swanton et al. (1982)
Paramecium tetraurelia | C and L |  | Tandem |  | MIC | Findly & Gall (1978)
Stylonychia mytilus | L |  | Single gene |  | MIC | Lipps & Steinbrück (1978)
Tetrahymena pyriformis | L |  | Palindrome | 200/haploid MAC | 1 | Engberg et al. (1976), Yao & Gall (1977), Niles et al. (1981)
Tetrahymena thermophila | L | 21 kbp | Palindrome | 9000 MAC |  | Engberg et al. (1985), Eisen et al. (2006)
Dictyostelium discoideum | L | 88 kbp | Palindrome | 90 | 1 Palindrome | Cockburn et al. (1978), Hofmann et al. (1993), Eichinger et al. (2005)
Didymium iridis | L | 20 kbp | Single |  |  | Johansen et al. (1992)
Entamoeba histolytica HM-1:IMSS | C | 24.5 kbp | Palindrome | 200/haploid | None | Huber et al. (1989), Bhattacharya et al. (1998), Bagchi et al. (1999)
Physarum polycephalum | L | >60 kbp | Palindrome | 1 × 1011 |  | Vogt & Braun (1976), Campbell et al. (1979)
Candida albicans | C and L | 1.2 Mbp | Tandem | 100 | 200 | Huber & Rustchenko (2001)
  • * The size depends on the number of rDNA repeats.

  • MAC, macronucleus; MIC micronucleus.

Figure 4

Linear extrachromosomal rDNA units in microbial eukaryotes. (a) rDNA is coded in extrachromosomal linear palindromes in Tetrahymena thermophila. Two head-to-head rDNA units are coded in a macronuclear minichromosome that contains typical telomeric sequences (black dots). Upstream and downstream IGRs contain various types of repeated sequences (colored bars; see also Table 3). A group I intron (blue square) is found within the eLSU rRNA. (b) The rDNA and 5S rDNA in Dictyostelium discoideum are linked and coded in extrachromosomal linear palindromes containing telomeres (black dots). (c) rDNA in Physarum polycephalum is coded in palindromic head-to-head units in a minichromosome with various repeated sequences both in the 5′ and 3′ IGRs (see also Table 3). The eLSU rRNA is interrupted by two group I introns (blue squares), and a third group I intron that includes an HE gene (purple square). (d) In Didymium iridis the rDNA is coded in linear minichromosomes containing one rDNA unit. The SSU rRNA contains a twintron (purple box) and two group I introns are found within the 28S rRNA (blue boxes). Drawings are not to scale. The size of rDNA units is shown in Table 1. Arrows show the polarity of transcription.

Dictyostelium discoideum and Physarum polycephalum rDNA is also encoded in palindromic extrachromosomal molecules (Fig. 4 and Table 5). Both rDNA minichromosomes contain several repeated sequence elements (Table 3). In the D. discoideum rDNA palindrome, two 5S rDNA copies are present near the telomeric ends, in the same polarity as the rDNA unit (Fig. 4b) (Cockburn et al., 1978). Additionally, a single rDNA palindrome is located in chromosome IV (Sucgang et al., 2003). Even though D. discoideum has six chromosomes, a seventh ‘chromosome’ can be observed in some chromosomal spreads. This additional ‘chromosome’ corresponds to a chromosome-sized cluster of palindromic rDNA minichromosomes (Sucgang et al., 2003), which suggests a physical interaction of the extrachromosomal rDNA. This particular organization may play a role in the expression and segregation of mitotic rDNA.

Extrachromosomal linear molecules containing one rDNA unit are found in Didymium iridis (Fig. 4 and Table 5). Ciliates such as Euplotes crassus and Glaucoma chattoni have extrachromosomal rDNA copies in single gene-sized linear molecules within the macronucleus with characteristic intergenic repeated elements (Tables 3 and 5). Finally, tandem rDNA genes in Paramecium tetraurelia can be found both in circular and in linear extrachromosomal molecules (Table 5), which can contain >13 rDNA copies.

rDNA units coded in extrachromosomal circular plasmids may be found in Amoebozoa (Entamoeba histolytica) and Excavata (E. gracilis and Naegleria gruberi) (Fig. 5 and Table 5). The most-studied E. histolytica isolate HM-1:IMSS lacks an rDNA chromosomal copy (Bagchi et al., 1999) but possesses about 200 copies of an extrachromosomal circular molecule, with two inverted rDNA units and repeated sequences in the IGR (Table 3 and Fig. 5a). This molecule starts replication at multiple sites; the primary replication origins are located near the pol I promoters, but other replication origins found all the way through the circle are activated under stress conditions (Ghosh et al., 2003). Interestingly, a 0.7-kb RNA of unknown function is encoded in the upstream region of one rDNA unit (Bhattacharya et al., 1998) (Fig. 5a). Depending on the isolate, variations are found in the size and organization of the rDNA circular molecules: the 200:NIH E. histolytica isolate has a palindromic circular organization (25.9 kb), while the HK-9 (15.3 kb) and the Rahman (18.3 kb) isolates possess single rDNA units in their circular extrachromosomal molecules (Sehgal et al., 1994; Bhattacharya et al., 1998).

Figure 5

Circular extrachromosomal rDNA units in microbial eukaryotes. (a) In Entamoeba histolytica extrachromosomal self-replicating molecules encode two palindromic rDNA units (red). The upstream and downstream IGRs contain several repeated sequences (colored boxes, detailed in Table 3). The 5′ IGR also encodes a 0.7-kb mRNA (grey arrow). In the HM1 strain, four hemolysin virulence proteins (HLYs) are coded within the rRNA coding region, in antisense orientation relative to the rRNA coding strand (navy blue arrows). (b) In Euglena gracilis the rDNA is coded in circular plasmids and the eLSU is fragmented into 14 segments. (c) The Naegleria gruberi rDNA plasmid encodes one rDNA unit containing one twintron in the 18S rRNA (purple box) and three group I introns in the eLSU rRNA (blue boxes). The IGR contains two ORFs (dark blue arrows). Black arrows show the polarity of transcription.

Most E. gracilis rDNA is found in extrachromosomal circular molecules that code for a single rDNA unit (Fig. 5b and Table 5). The whole E. gracilis rDNA circle is transcribed, suggesting a read-around transcription without the need for transcriptional terminators (Greenwood et al., 2001). The Naegleria gruberi rDNA plasmid contains two ORFs: a large one downstream of the 28S rRNA (similar to a homing endonuclease gene, HE gene) and a short one that codes for a hypothetical protein (Maruyama & Nozaki, 2007) (Fig. 5c and Table 5).

The yeast C. albicans possesses both chromosomal and extrachromosomal rDNA. About 200 copies in tandem, varying in size, are present in chromosome R while roughly 100 copies are found in an ∼1.2 Mbp autonomously replicating circle. Some rDNA sequences are also found in 50–150-kbp linear molecules (Huber & Rustchenko, 2001).

rDNA plasmids have only been observed in old S. cerevisiae cell cultures. During the aging process, the tandem rDNA copies are excised from the chromosome and replicate autonomously. The accumulation of rDNA circles leads to yeast sterility and shortening of the life span. An association between rDNA locus instability and loss of epigenetic silencing has also been observed (Sinclair & Guarente, 1997).

The ribosomal cistron: the coding region

The typical eukaryotic rDNA coding region is composed of the 18S, 5.8S and 28S rRNA coding sequences separated by ITS-1 and ITS-2. The rDNA coding sequence consists of a common core of domains that may be interspersed with a distinct set of variable regions (also called expansion segments; Dover et al., 1988). Ten and 18 variable regions have been identified in the SSU and LSU rRNAs, respectively, of all organisms (Raué et al., 1988). Three types of sequence insertions have been found within these variable regions: (1) expansion segments, encoding RNA sequences conserved in the mature molecule; (2) group I introns, located within highly conserved regions and removed after transcription; and (3) transcribed spacers, sequences removed from the mature rRNA, thus producing fragmented eLSU rRNA molecules (Clark et al., 1984). Babesia bovis (Dalrymple et al., 1992), Cryptosporidium parvum (Le Blancq et al., 1997), D. discoideum (Frankel et al., 1977), E. histolytica (Huber et al., 1989), G. intestinalis (Healey et al., 1990), K. lactis (Verbeet et al., 1984), S. cerevisiae (Bell et al., 1977), Toxoplasma gondii (Gagnon et al., 1996) and T. vaginalis (López-Villaseñor et al., 2004) are microbial eukaryotes with the typical rDNA organization of the coding region (Fig. 2a–e).

Insertions of expansion segments in the SSU

The average length of eukaryotic SSU rRNA is 2 kb. Unusually long SSU rRNAs have been found in Pelobionta (Pelomyxa palustris), Foraminifera (Hemisphaerammina bradyi) and Euglenozoa (Distigma sennii) (Table 6). The longest SSU rRNA known is found in the Euglenid D. sennii, comprising >4.5 kb. In most cases, the insertions are found in the SSU rRNA variable regions V2, V4 and V7. The only exceptions are an extended V5 region in A. castellanii and an expansion in a nonvariable region in P. palustris (Gunderson & Sogin, 1986; Milyutina et al., 2001). The rRNA variable regions are located in the mature ribosome surface and their evolutionary implications are unknown (Katz & Bhattacharya, 2006).

Table 6

Variations in the size of the SSU rRNA due to insertions

Organism | SSU size (kb) | References
Astasia curvata | 2.56 | Busse & Preisfeld (2002)
Astasia torta | 2.9 | Busse & Preisfeld (2002)
Distigma curvatum | 3.4–3.7 | Busse & Preisfeld (2002)
Distigma elegans | 3.9 | Busse & Preisfeld (2002)
Distigma sennii | 4.5 | Busse & Preisfeld (2002)
Ploeotia costata | 2.4 | Busse & Preisfeld (2003)
Euglena gracilis | 2.3 | Gunderson & Sogin (1986)
Acanthamoeba castellanii | 2.3 | Gunderson & Sogin (1986)
Acanthamoeba griffini | 2.9 | Gast et al. (1994)
Acanthamoeba lenticulata | 3 | Schroeder-Diedrich et al. (1998)
Pelomyxa palustris | 3.5 | Milyutina et al. (2001)
Phreatamoeba balamuthi | 2.74 | Hinkle et al. (1994)
Entamoeba histolytica | 2.3 | Loftus et al. (2005)
Ankistrodesmus stipitatus | 2.2 | Dávila-Aponte et al. (1991)
Foraminifera | 2.3–4 | Katz & Bhattacharya (2006)

Group I introns and twintrons

Group I introns can be found as insertions in the SSU and eLSU rRNA coding regions that are removed from the mature molecule by means of a self-splicing reaction (Einvik et al., 1998), generating a completely functional rRNA molecule (Mandal, 1984). The ribozymes encoded in group I introns have conserved secondary structures of 10 base-paired segments, as well as some additional paired segments depending on the intron subclass (Michel & Westhof, 1990). The splicing reaction initiates with a nucleophilic attack of a guanosine cofactor at the 5′ splice site and, after two sequential transesterification reactions, the exons are ligated and the RNA intron is removed (Einvik et al., 1998).
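
The two-step self-splicing pathway described above can be followed at the level of sequence bookkeeping. The sketch below is a simplified illustration, not a model of the chemistry: the toy RNA, the coordinates and the function name are hypothetical, and the only mechanistic points taken from the text are that a guanosine cofactor attacks the 5′ splice site and that a second transesterification ligates the exons and releases the intron.

```python
# String-level bookkeeping for group I intron self-splicing.
# The RNA sequence and coordinates are invented for illustration.

def splice_group_i(pre_rna: str, intron_start: int, intron_end: int):
    """Return (ligated_exons, released_intron).

    The intron occupies pre_rna[intron_start:intron_end].
    """
    if not 0 < intron_start < intron_end < len(pre_rna):
        raise ValueError("intron must lie strictly inside the precursor")

    # Step 1: the guanosine cofactor attacks the 5' splice site.
    # The 5' exon is released and the G becomes joined to the intron 5' end.
    exon5 = pre_rna[:intron_start]
    g_intron_exon3 = "G" + pre_rna[intron_start:]

    # Step 2: the free 3' end of the 5' exon attacks the 3' splice site.
    # The exons are ligated and the intron (carrying the extra G) is released.
    intron_len = intron_end - intron_start
    released_intron = g_intron_exon3[: 1 + intron_len]
    exon3 = g_intron_exon3[1 + intron_len:]
    return exon5 + exon3, released_intron


# Exons in upper case, intron in lower case (toy sequence).
mature, intron = splice_group_i("AAUUgggcccCCGG", 4, 10)
print(mature)  # AAUUCCGG
print(intron)  # Ggggccc
```

The released intron is one nucleotide longer than the encoded intron because it retains the guanosine cofactor at its 5′ end.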

Group I introns are widely distributed in nature and can be found in bacteria, mitochondrial and chloroplast genomes, and in the eukaryotic nucleus (Johansen et al., 2007). Group I introns may interrupt the SSU rRNA coding sequence in 40 distinct conserved sites of several microbial eukaryotes (Jackson et al., 2002) such as Acanthamoeba griffini and the green alga Characium saccatum. These introns may also be present in the eLSU rRNA, as is the case for P. falciparum A-type eLSU rRNA and the 26S rRNA of some Tetrahymena isolates (Fig. 4a, Table 7). It is interesting that some organisms may have both the SSU and the eLSU rRNAs interrupted by group I introns (e.g. P. carinii, Chlorella ellipsoidea and D. iridis, Fig. 4d). Table 7 describes some of the introns found in the rDNA of several microbial eukaryotes.

Table 7

Group I rDNA introns

Organism | Location | Number of introns | Size | HE gene | References
Ploeotia costata | SSU rRNA | 1 | 494 bp |  | Busse & Preisfeld (2003)
Naegleria gruberi | SSU rRNA | 1 |  | I-NgrI | Wikmark et al. (2006)
 | LSU rRNA | 3 |  |  | Einvik et al. (1998)
Plasmodium falciparum | LSU rRNA | 1 |  |  | Langsley et al. (1983)
Tetrahymena pigmentosa | LSU rRNA | 1 | 400 bp |  | Wild & Gall (1979)
Tetrahymena thermophila | LSU rRNA | 1 | 370–410 bp |  | Sogin et al. (1986)
Acanthamoeba griffini | SSU rRNA | 1 | 519 bp |  | Gast et al. (1994)
Acanthamoeba lenticulata | SSU rRNA | 1 | 656 bp |  | Gast et al. (1994)
Didymium iridis | SSU rRNA | 1 | 1.43 kbp | I-DirI and I-DirII | Johansen & Vogt (1994), Johansen et al. (2006), Johansen et al. (2007)
 | LSU rRNA | 2 | 688 and 573 bp |  | Johansen et al. (1992)
Physarum polycephalum | LSU rRNA | 3 | 0.7, 0.6 and 0.94 kbp | I-PpoI | Ruoff et al. (1992), Johansen et al. (2007)
Ankistrodesmus stipitatus | SSU rRNA | 1 | 334 bp |  | Dávila-Aponte et al. (1991)
Characium saccatum | SSU rRNA | 1 | 477 bp |  | Wilcox et al. (1992)
Chlorella ellipsoidea | SSU rRNA | 1 | 442 bp |  | Aimi et al. (1994)
 | LSU rRNA | 1 | 445 bp |  | Aimi et al. (1993)
Dunaliella parva | SSU rRNA | 2 | 381 and 419 bp |  | Van Oppen et al. (1993), Wilcox et al. (1992)
Dunaliella salina | SSU rRNA | 1 | 397/8 bp |  | Van Oppen et al. (1993), Wilcox et al. (1992)
Candida albicans | LSU rRNA | 1 | 379 bp |  | Miletti-González & Leibowitz (2008)
Pneumocystis carinii | SSU rRNA | 1 | 390 bp |  | Sogin & Edman (1989)
 | LSU rRNA | 1 |  |  | Lin et al. (1992), Liu et al. (1992)
Histoplasma capsulatum | SSU rRNA | 1 | 403–425 bp |  | Okeke et al. (1998), Lasker et al. (1998)
Nosema bombycis | SSU rRNA | 1 |  |  | Iiyama et al. (2004)
  • * Only in one of the eight LSU rRNA copies.

  • Depends on the isolate.

  • Coded in the 0.94-kbp intron.

  • § Can have two types of intron differing in sequence.

Twintrons are more complex insertions in the rDNA that consist of two group I introns (ribozymes) and an ORF encoding an HE (Einvik et al., 1998; Johansen et al., 2007). The D. iridis and N. gruberi SSU twintrons contain a small ribozyme (GIR1), followed by the HE ORF inserted into a second ribozyme (GIR2). Two different isolates of D. iridis have two types of introns, containing an HE gene in both polarities relative to the SSU rRNA gene (Fig. 6) (Johansen et al., 2007). The twintron contains the HE ORF (I-DirI) in the same polarity as the 18S rRNA coding region. GIR2 is a self-splicing ribozyme that releases the HE transcript. A second intron encoding a ribozyme (GIR1) is also found within the twintron. GIR1 modifies the 5′ end of the HE transcript to form a 2′5′cap that increases its translational efficiency (Fig. 6a) (Einvik et al., 1998; Johansen et al., 2007). In contrast, the intron II contains an HE gene (I-DirII) in opposite polarity relative to the SSU rRNA and ribozyme-coding sequences (Johansen et al., 2006). Transcription of I-DirII is established from a pol II-like promoter located immediately upstream of the HE gene (Fig. 6b) (Johansen et al., 2006). Both D. iridis HE transcripts are processed through the nuclear spliceosomal complex to remove a 50-nt noncoding spliceosomal intron, found within the HE coding sequences, and are polyadenylated (Vader et al., 1999; Johansen et al., 2007).

Figure 6

Group I introns that contain an HE gene. (a) Twintron present in Didymium iridis: the DiGIR2 intron (purple) is encoded in the SSU rRNA and transcribed by pol I as part of the pre-rRNA; it self-splices to generate the HE pre-mRNA (splicing sites are represented as black bars). Subsequently, DiGIR1 intron (blue) self-splices and processes the I-Dir I HE pre-mRNA in the 5′ side, producing a 2′5′-cap. The I-DirI HE pre-mRNA is additionally processed by the removal of spliceosomal intron SI (white box) and polyadenylation of the 3′ side to generate a functional I-Dir I HE mRNA (yellow region). (b) Intron II present in D. iridis: the I-Dir II HE RNA found within the DiGIR2 intron is coded in antisense orientation and is transcribed from a pol II promoter. The HE pre-mRNA is processed by pol II-associated factors to generate a typical 5′-cap and a 3′ polyadenylated tail. The spliceosomal intron SI is removed by the spliceosome machinery.

Physarum polycephalum eLSU rDNA contains an optional group I intron holding an HE gene (Ruoff et al., 1992). The full-length RNA intron can be excised or alternatively processed (immediately downstream of the HE gene) to produce a smaller transcript. Only the full-length RNA intron (lacking a 5′cap and a poly-A tail) is translated into the HE I-PpoI protein (Ruoff et al., 1992). The cleavage of this transcript in the internal processing site seems to downregulate HE I-PpoI expression by decreasing the stability of the transcript in yeast transintegrated introns (Johansen et al., 2007). Table 7 summarizes rDNA group I introns and HE gene insertions.

The ITS-1 and -2

The rDNA transcript is generally processed post-transcriptionally into three mature rRNA molecules: the 18S, 5.8S and 28S rRNAs, which result from the elimination of ETS, ITS-1 and ITS-2 from the precursor transcript (Fig. 1). In microbial eukaryotes, ITS-1 ranges from 100 to 400 bp, while ITS-2 is 200–500 bp. Unusually long ITSs are found in the red alga C. merolae (Maruyama et al., 2004), where ITS-1 and ITS-2 average sizes are 862 and 1738 bp, respectively. Euglena gracilis has the largest known ITS-1, 1188 bp in length (Schnare et al., 1990). The dinoflagellate Cochlodinium polykrikoides ITS-1 contains a 101-bp sequence in six tandem repeats, resulting in an ITS-1 length of 813 bp (Ki & Han, 2007). Yarrowia and Giardia have the shortest known ITSs in microbial eukaryotes: the sum of ITS-1 and ITS-2 lengths in Y. lipolytica is only 150 bp (van Heerikhuizen et al., 1985), while the G. intestinalis ITS-1 and ITS-2 are 37 and 52 bp in length, respectively (Boothroyd et al., 1987). Some Microsporidia species completely lack the ITS-2 (Fig. 7) (Vossbrinck & Woese, 1986), as discussed below. The biological relevance of the ITSs' length and the presence of internal repeats are currently unknown, although their sequence has been useful in molecular phylogenetic studies of closely related species.
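
As a quick check on the ITS figures quoted above, the arithmetic can be laid out explicitly. The lengths below are taken from the text; the variable names are illustrative, and the derived quantities (for example, the share of the C. polykrikoides ITS-1 occupied by the tandem repeats) are simple calculations, not values reported by the cited studies.

```python
# Back-of-the-envelope arithmetic on ITS lengths quoted in the text (bp).

# Cochlodinium polykrikoides ITS-1: six tandem copies of a 101-bp sequence.
REPEAT_UNIT_BP = 101
REPEAT_COPIES = 6
ITS1_TOTAL_BP = 813

repeat_bp = REPEAT_UNIT_BP * REPEAT_COPIES    # bp contributed by the repeats
non_repeat_bp = ITS1_TOTAL_BP - repeat_bp     # bp outside the repeats
repeat_fraction = repeat_bp / ITS1_TOTAL_BP

# Shortest ITSs: Giardia intestinalis (ITS-1 + ITS-2) versus Yarrowia lipolytica.
giardia_its_sum = 37 + 52
yarrowia_its_sum = 150

# Longest combined ITSs mentioned: Cyanidioschyzon merolae (average sizes).
merolae_its_sum = 862 + 1738

print(repeat_bp, non_repeat_bp)      # 606 207
print(f"{repeat_fraction:.0%}")      # 75%
print(giardia_its_sum, yarrowia_its_sum, merolae_its_sum)  # 89 150 2600
```

On these numbers, roughly three quarters of the C. polykrikoides ITS-1 consists of the repeated element, and the combined ITSs of C. merolae are about 29 times longer than those of G. intestinalis.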

Figure 7

Unusual rDNA organization in Microsporidia. Microsporidia lack a 5.8S rRNA mature molecule and the typical 5.8S rRNA sequence is fused to the 23S rRNA. (a) Single telomeric rDNA units are surrounded by different repeated sequences in Encephalitozoon cuniculi (see also Table 3) and the rDNA lacks ITS-2. (b) Some Nosema species have an atypical rDNA coding organization, with the LSU rRNA coded upstream of the 16S rRNA. The typical 5.8S rRNA sequence is fused to the 23S rRNA and the 5S rDNA is linked to the rDNA unit.

Additional ITSs generate fragmented eLSU rRNA

Some microbial eukaryotes process the pre-rRNA into more than three mature molecules due to the presence of additional ITSs. Well-known examples of fragmented rRNA are found among kinetoplastids, with the eLSU rRNA fragmented into seven molecules. The nomenclature of these rRNAs varies according to the organism, the size of the rRNA molecule and the position in the coding region. The eLSU rRNA of Leishmania spp. is fragmented into seven elements, which are cotranscribed in the pre-rRNA and processed by exo- and endonucleolytic activities to produce the functional eLSU fragments: 5.8S, LSUα, γ, LSUβ, δ, ζ and ɛ (Martínez-Calvillo et al., 2001) (Fig. 2g). Trypanosoma cruzi and Crithidia fasciculata also code for an eLSU rRNA fragmented into seven elements: 5.8S, 24Sα, S1, 24Sβ, S2, S6 and S4 in T. cruzi (Fig. 2f) (Hernández et al., 1988), and rRNAs 5.8S, c, d, e, f, g and j in C. fasciculata (Spencer et al., 1987).

Processing of the eLSU rRNA into several fragments has also been found in nonkinetoplastid eukaryotes (e.g. P. berghei and Plasmodium chabaudi blood stages and A. castellanii) (D'Alessio et al., 1981; da Silveira & Mercereau-Puijalon, 1983; Johansen et al., 1992) (Figs 2e and 3). Euglena gracilis has the most fragmented eLSU rRNA currently known with 14 mature molecules that result from the processing of 14 ITSs (ITS-1 to ITS-14, Fig. 5b) (Schnare et al., 1990).
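
The relationship between the number of ITSs and the number of mature rRNA molecules can be made explicit with a small model. The sketch below assumes a precursor of the form ETS, SSU, then each eLSU fragment preceded by one ITS; the function names are hypothetical, the T. cruzi fragment order simply follows the listing in the text, and the E. gracilis fragment labels (LSU-1 to LSU-14) are placeholders, not the published nomenclature.

```python
# Toy model of pre-rRNA processing: spacers are removed, coding segments remain.

def build_precursor(ssu: str, elsu_fragments: list[str]) -> list[tuple[str, str]]:
    """Lay out ETS, the SSU and each eLSU fragment preceded by a numbered ITS."""
    segments = [("ETS", "spacer"), (ssu, "coding")]
    for number, fragment in enumerate(elsu_fragments, start=1):
        segments.append((f"ITS-{number}", "spacer"))
        segments.append((fragment, "coding"))
    return segments


def mature_molecules(precursor: list[tuple[str, str]]) -> list[str]:
    """Return the coding segments left after all spacers are eliminated."""
    return [name for name, kind in precursor if kind == "coding"]


def count_its(precursor: list[tuple[str, str]]) -> int:
    """Count internal transcribed spacers (the ETS is not counted)."""
    return sum(1 for name, _ in precursor if name.startswith("ITS-"))


typical = build_precursor("18S", ["5.8S", "28S"])
t_cruzi = build_precursor("18S", ["5.8S", "24Sα", "S1", "24Sβ", "S2", "S6", "S4"])
# Placeholder labels for the 14 E. gracilis eLSU fragments (hypothetical names).
euglena = build_precursor("SSU", [f"LSU-{i}" for i in range(1, 15)])

for label, precursor in [("typical", typical), ("T. cruzi", t_cruzi), ("E. gracilis", euglena)]:
    print(label, count_its(precursor), len(mature_molecules(precursor)))
# typical 2 3
# T. cruzi 7 8
# E. gracilis 14 15
```

Under this layout an eLSU split into n fragments implies n ITSs in total, which agrees with the 14 ITSs and 14 eLSU fragments reported for E. gracilis.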

Protein-coding regions within the rDNA coding region

A correlation has been observed between the virulence of E. histolytica isolates and the sequence composition of the rDNA circular molecule described above (Clark & Diamond, 1991; Zindrou et al., 2001). Virulence associates with the striking presence of genes encoding hemolysins (proposed as virulence factors) within and overlapping the rRNA coding sequence, but in opposite polarity. Three hemolysins overlap with the eLSU coding region, while the fourth (HLY4) is coded in the ITS-1 between the SSU and 5.8S rRNAs (Jansson et al., 1994) (Fig. 5a). In G. intestinalis, a gene coding for a 32-kDa flagellum antigen has been identified in the rDNA IGR that overlaps the 3′ region of the 28S rRNA (Fig. 2d). The motif that directs transcription of this gene seems to be a hybrid pol II/pol III promoter (Upcroft et al., 1990).

Unusual rDNA coding regions

Microsporidia are obligate intracellular eukaryotes that possess many prokaryotic characteristics in their rRNA genes (Weiss et al., 2001). The rDNA units are smaller than the standard eukaryotic size and lack the ITS-2; consequently, the 5.8S rRNA is fused to the 5′ region of the 28S rRNA, as is found in bacteria (Vossbrinck & Woese, 1986) (Fig. 7). Microsporidia are the only eukaryotes known to lack an individual 5.8S rRNA molecule (e.g. E. cuniculi and Vairimorpha necatrix) (Vossbrinck & Woese, 1986; Peyretaillade et al., 1998). The relevance of this eukaryotic 5.8S–28S rRNA fusion is unknown. In addition to these characteristics, Nosema bombycis and Nosema spodopterae have an unusual rDNA gene organization (Huang et al., 2004; Iiyama et al., 2004; Tsai et al., 2005) because the LSU rRNA is coded and transcribed upstream of the SSU rRNA (Fig. 7b), in contrast to the almost universal order of the rRNA coding regions (Fig. 1).

Different rDNA genes may be found within an organism

As mentioned above, the rRNA genes within one organism are generally conserved, with little variation in the coding sequences and occasional sequence variation in the IGRs. Sequence variability in the IGRs may result from sequence divergence or from disparity in the number of repeated sequences, which are involved in both up- and downregulation of rDNA transcription. Therefore, the heterogeneous composition of rDNA units may influence rDNA expression. Sequence divergence in the coding region and/or IGR within the same organism has led to a classification of rDNA units. For example, different types of rDNA may be found in Paramecium, Y. lipolytica and the Apicomplexa group. A detailed description of this variability is included in Table 8.

Table 8

Variability found in the rDNA

Organism | rDNA types | Localization of variability | References
Babesia bigemina | 2 | IGR and SSU coding region | Reddy et al. (1991), Dalrymple et al. (1992)
Babesia bovis | A, B and C | ITS and SSU coding region | Laughery et al. (2009)
Cryptosporidium parvum | At least 2 | ITS and coding region | Le Blancq et al. (1997), Spano & Crisanti (2000)
Plasmodium berghei | A and C | IGR, promoter and coding regions | Waters et al. (1994)
Plasmodium falciparum | A and S | IGR, promoter and coding regions | Waters et al. (1994)
Plasmodium vivax | A, S and O | IGR, promoter and coding regions | Li et al. (1997)
Theileria parva | 2 | ITS and LSU coding region | Bishop et al. (2000)
Toxoplasma gondii |  | IGR and SSU coding region | Fazaeli et al. (2000)
Oxytricha fallax | 2 | LSU coding region | Doak et al. (2003)
Paramecium tetraurelia | 6 MAC/4 MIC | IGR | Preer et al. (1999)
Perkinsus andrewsi | A and B | IGR and coding regions | Pecher et al. (2004)
Dunaliella salina | 2 | Group I introns | Wilcox et al. (1992)
Saccharomyces cerevisiae |  | IGR | Skryabin et al. (1984), Jemtland et al. (1986)
Candida albicans | 2 | Intron-containing and intronless LSU | Miletti-González & Leibowitz (2008)
Yarrowia lipolytica | 2–5 | IGR | van Heerikhuizen et al. (1985), Fournier et al. (1986), Clare et al. (1986)
Nosema apis | At least 3 | IGR | Gatehouse & Malone (1998)
Nosema bombi | At least 2 | ITS | O'Mahony et al. (2007)
  • * The number or names of the rDNA types for each species are shown.

  • IGR, intergenic region; MAC, macronucleus; MIC, micronucleus.

The 5S rDNA

The organization of the 5S rDNA is simpler than that of the rDNA. Most 5S rDNAs are found in tandem head-to-tail repeats consisting of a conserved ∼120-bp coding region and an IGR of variable size and sequence. An internal pol III promoter is present in all 5S rDNA studied to date (Schramm & Hernandez, 2002) (Fig. 1c).

The 5S rDNAs are found as tandem head-to-tail repeats

The 5S rDNA in T. cruzi, Trypanosoma brucei, T. vaginalis, Trichomonas tenax, C. fasciculata, Eimeria tenella and C. parvum is typically organized in tandem head-to-tail repeats. Tritrichomonas foetus has two types of 5S rDNAs, while P. falciparum has only three 5S rDNA copies in tandem, differing in the length of the IGRs. The main characteristics of the 5S rDNA tandem head-to-tail repeats of several organisms are described in Table 9.

Table 9

Typical 5S rDNA organization

Organism | 5S rDNA types | Localization of variability | Other | References
Trichomonas tenax | A and B | IGR and coding region | Type B IGR palindrome | Torres-Machorro et al. (2009)
Trichomonas vaginalis | A and B | IGR | IGS 10-bp palindrome | Torres-Machorro et al. (2006)
Tritrichomonas foetus | A and B | IGR, repeated sequences vs. ubiquitin gene |  | Torres-Machorro et al. (2009)
Trypanosoma brucei |  |  |  | Lenardo et al. (1985)
Trypanosoma cruzi |  |  | 6 Sp1 binding sites in IGR | Hernández-Rivas et al. (1992)
Cryptosporidium parvum |  | IGR |  | Taghi-Kilani et al. (1994)
Eimeria tenella |  |  |  | Stucki et al. (1993)
Plasmodium falciparum | 3 | IGR lengths |  | Shippen-Lentz & Vezza (1988)
Tetrahymena pyriformis |  | IGR | Some linked to ubiquitin genes | Guerreiro et al. (1993)
Tetrahymena thermophila |  | IGR lengths | IGR 12 and 16-bp palindromes | Allen et al. (1984)
  • * The number or names of the 5S rDNA types for each species are shown.

  • IGR, intergenic region.

The 5S rDNA may be interspersed with genes transcribed by any of the three RNA polymerases

The 5S rDNA has been found linked to the rDNA (transcribed by pol I) in an alternate distribution in T. gondii, Hansenula polymorpha, Perkinsus andrewsi and various Nosema species (Guay et al., 1992; Coss et al., 2001; Klabunde et al., 2002; Huang et al., 2004; Iiyama et al., 2004; Tsai et al., 2005; Liu et al., 2008) (Figs 2c and 7b). Two tandem 5S rDNA copies are linked to each repeated rDNA unit in Candida glabrata (Dujon et al., 2004). In contrast, the 5S rDNA is linked to the rDNA in opposite polarity in various yeast species, such as Torulopsis utilis, K. lactis and S. cerevisiae (Fig. 2b, Table 2). Two copies of the 5S rDNA are coded in the extrachromosomal DNA molecule in D. discoideum, in the same polarity as the two rDNA copies (Hofmann et al., 1993) (Fig. 4b). Finally, a S. cerevisiae 5S rDNA variant is found in five repeats of 3.6 kbp, located next to the rDNA tandem cluster locus, in the centromere-distal side (McMahon et al., 1984).

Some Trypanosoma species such as Trypanosoma vivax and Trypanosoma rangeli have the 5S rDNA copies linked to the spliced-leader (SL) tandem repeated genes, transcribed by pol II. SL transcripts are necessary to process the mRNAs in kinetoplastids by a trans-splicing reaction (Simpson et al., 2006). A similar linkage has been found in other Euglenozoa such as Diplonema papillatum and Bodo caudatus. In Trypanoplasma borreli and Trypanosoma avium, the 5S rDNA is coded in opposite polarity relative to the SL gene (Table 10). Interestingly, the T. borreli SL can also be linked to 5S rRNA pseudogenes (with a truncated 5′ end) (Maslov et al., 1993). Some 5S rDNA units in T. pyriformis and T. foetus are associated with ubiquitin genes transcribed by pol II (Fig. 8a). Table 10 describes the relative polarity of the 5S rDNA linked to the genes transcribed by pol II.

Table 10

5S rRNA gene linkage to pol II transcribed genes

| Organism | Pol II gene | Orientation | References |
|---|---|---|---|
| Trypanosoma vivax | Spliced leader | Sense | Roditi et al. (1992) |
| Trypanosoma rangeli | Spliced leader | Sense | Aksoy et al. (1992) |
| Bodo saltans | Spliced leader | Sense | Santana et al. (2001) |
| Bodo caudatus | Spliced leader | Sense | Campbell et al. (1992) |
| Diplonema papillatum | Spliced leader | Sense | Sturm et al. (2001) |
| Herpetomonas spp. | Spliced leader | Sense | Aksoy et al. (1992) |
| Trypanoplasma borreli | Spliced leader | Antisense | Maslov et al. (1993) |
| Trypanosoma avium | Spliced leader | Antisense | Santana et al. (2001) |
| Euglena gracilis | Spliced leader | Sense | Keller et al. (1992) |
| Tritrichomonas foetus | Ubiquitin | Sense | Torres-Machorro et al. (2009) |
| Tetrahymena pyriformis | Ubiquitin | Sense | Guerreiro et al. (1993) |
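
For readers who want to work with Table 10 programmatically, the following minimal Python sketch encodes its rows as records and tallies them by linked gene and orientation. The names `Linkage`, `TABLE_10`, `tally` and `organisms_with` are hypothetical and introduced here only for illustration; the data are copied from the table, with the references omitted.

```python
from collections import Counter
from typing import NamedTuple


class Linkage(NamedTuple):
    """One row of Table 10 (hypothetical record type, for illustration)."""
    organism: str
    pol2_gene: str    # pol II transcribed gene linked to the 5S rDNA
    orientation: str  # 5S rDNA polarity relative to that gene


# Rows copied from Table 10; the References column is left out.
TABLE_10 = [
    Linkage("Trypanosoma vivax", "Spliced leader", "Sense"),
    Linkage("Trypanosoma rangeli", "Spliced leader", "Sense"),
    Linkage("Bodo saltans", "Spliced leader", "Sense"),
    Linkage("Bodo caudatus", "Spliced leader", "Sense"),
    Linkage("Diplonema papillatum", "Spliced leader", "Sense"),
    Linkage("Herpetomonas spp.", "Spliced leader", "Sense"),
    Linkage("Trypanoplasma borreli", "Spliced leader", "Antisense"),
    Linkage("Trypanosoma avium", "Spliced leader", "Antisense"),
    Linkage("Euglena gracilis", "Spliced leader", "Sense"),
    Linkage("Tritrichomonas foetus", "Ubiquitin", "Sense"),
    Linkage("Tetrahymena pyriformis", "Ubiquitin", "Sense"),
]


def tally(rows):
    """Count rows for each (pol II gene, orientation) pair."""
    return Counter((row.pol2_gene, row.orientation) for row in rows)


def organisms_with(rows, orientation):
    """List organisms whose 5S rDNA has the given orientation."""
    return [row.organism for row in rows if row.orientation == orientation]


if __name__ == "__main__":
    for (gene, orientation), count in sorted(tally(TABLE_10).items()):
        print(f"{gene:15s} {orientation:10s} {count}")
```

Run on the table as listed, the tally gives seven sense and two antisense SL linkages and two sense ubiquitin linkages, which matches the text: only T. borreli and T. avium carry the 5S rDNA in opposite polarity.
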
Figure 8

The 5S rDNA may be linked to pol II or pol III transcribed genes. (a) In Tritrichomonas foetus the 5S rDNA is linked to a multigenic ubiquitin family. (b) In Euglena gracilis the 5S rDNA is linked to the SL gene. (c) In Yarrowia lipolytica dicistronic genes consisting of a tRNA gene (pink) and a 5S rDNA (green) are dispersed in the genome. One tricistronic gene: Lys(CTT) tRNA–Glu(CTC) tRNA–5S rDNA is also found. These genes are transcribed from the pol III promoter of the tRNA gene. Dispersed, single 5S rDNAs are also found (green).

Some 5S rDNA copies in E. histolytica and one copy in Leishmania tarentolae are linked to tRNA genes (Shi et al., 1994; Clark et al., 2006), also transcribed by pol III. Interestingly, 48 of the 108 5S rDNA copies of Y. lipolytica produce pol III dicistronic transcripts: tRNA–5S rRNA hybrid molecules. The synthesis of an ∼200-nt transcript is driven by the tRNA pol III promoter, resulting in a transcription independent of the 5S rDNA-specific transcription factor, TFIIIA. The dicistronic transcripts, as well as a unique tricistronic transcript [Lys(CTT) tRNA–Glu(CTC) tRNA–5S rDNA] are post-transcriptionally processed to generate the typical mature RNA molecules: tRNAs and 5S rRNA (Acker et al., 2008) (Fig. 8c).

Nontandem 5S rDNA copies are found dispersed throughout the genome of some microbial eukaryotes. Some examples are A. castellanii (Zwick et al., 1991), Y. lipolytica and Schizosaccharomyces pombe (Tabata, 1981; Dujon et al., 2004). The 5S rDNA may also be found in extrachromosomal DNA. Notably, ciliates such as Oxytricha fallax have single 5S rDNA copies coded in macronuclear extrachromosomal molecules (Rae & Spear, 1978; Roberson et al., 1989). Moreover, about one million copies of the 5S rDNA are coded in linear minichromosomes flanked by telomeres in E. eurystomus (Roberson et al., 1989).

Concluding remarks

Ribosomes are complex organelles that require the intricate collaboration of three types of RNAs (rRNA, mRNA and tRNA) and >70 proteins for the synthesis of proteins. rRNAs must maintain their convoluted structural motifs in order to be functional. It is therefore not surprising that their sequence is highly conserved among related organisms and this similarity is gradually lost as organisms diverge. For this reason, sequence comparison of the SSU rRNA has been widely used in the field of molecular phylogeny (Van de Peer et al., 2000).

The ‘typical’ eukaryotic rDNA genomic organization was proposed >30 years ago, based on the analysis of the rDNA in higher eukaryotes (Long & Dawid, 1980). The tandemly repeated head-to-tail organization has been considered the standard for eukaryotic rDNA. Surprisingly, analyses of the genomic organization of ribosomal genes in microbial eukaryotes demonstrate that although some organisms do hold the typical rDNA configuration, the majority reveal unusual characteristics. As shown in this review, the eukaryotic rDNA may be arranged in a wide variety of genomic configurations, suggesting the existence of several regulatory mechanisms (probably species-specific) within a conserved rDNA regulatory context.

Reiteration is one of the most conserved rDNA characteristics. The rDNA copy number is extremely variable among species but appears to be highly regulated within each species. Nevertheless, the total number of rDNA repeats does not always correlate with the rate of rRNA synthesis (French et al., 2003), implying that individual rDNA units may hold different epigenetic marks that result in variable transcriptional rates (Grummt et al., 2007). rDNA structure and transcription are also important in the establishment of the nucleolar structure, which also plays regulatory roles at the cellular level (Carmo-Fonseca et al., 2000). Dictyostelium discoideum, T. gondii and some yeast species have equal numbers of rDNA and 5S rDNA copies. This organization was considered coherent, as the pool of rRNA molecules was presumed to hold equimolar amounts of mature 18S, 5.8S, 28S and 5S rRNA molecules for the efficient synthesis of ribosomes (Prokopowich et al., 2003). However, rDNA and 5S rDNA are transcribed by different RNA polymerases, with dissimilar transcription rates and different numbers of transcriptionally open units. Therefore, the rDNA/5S rDNA dosage is not directly related to the stoichiometry of the total rRNA pool or to the expression process. The rDNA/5S rDNA dosage varies widely throughout evolution. The processes that maintain the pool of mature rRNA molecules in appropriate stoichiometry must form a complex network including epigenetic, transcriptional, post-transcriptional and structural mechanisms that may vary according to the rDNA/5S rDNA dosage. Additionally, the location of the rDNA and 5S rDNA in the genome may be related to their expression and physiology. The chromosomal context, together with different chromatin environments, may be involved in the maintenance of gene copy number, recombination frequency, sequence conservation and transcription regulation of rDNA.
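
The dosage argument in the preceding paragraph can be made explicit with a deliberately simplified relation. The symbols are assumptions introduced here for illustration only and are not taken from the cited studies: N is the gene copy number, f the fraction of copies that are transcriptionally open, and r the mean transcript output per open copy. rRNA processing and turnover are ignored. Because each rDNA transcript yields one 18S, one 5.8S and one 28S rRNA, and each 5S rDNA transcript yields one 5S rRNA, equimolar production would require:

```latex
\[
  N_{\mathrm{rDNA}}\, f_{\mathrm{rDNA}}\, r_{\mathrm{pol\,I}}
  \;=\;
  N_{\mathrm{5S}}\, f_{\mathrm{5S}}\, r_{\mathrm{pol\,III}}
  \quad\Longrightarrow\quad
  \frac{N_{\mathrm{rDNA}}}{N_{\mathrm{5S}}}
  \;=\;
  \frac{f_{\mathrm{5S}}\, r_{\mathrm{pol\,III}}}{f_{\mathrm{rDNA}}\, r_{\mathrm{pol\,I}}}
\]
```

Under this sketch, equal copy numbers give equimolar rRNAs only if the product of f and r happens to be the same for both gene classes. Otherwise, a dosage ratio far from 1 can still be compatible with a balanced rRNA pool.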

The organization of rDNA in extrachromosomal molecules may be associated with the cellular need for quick changes in the rDNA copy number under stress conditions. Some organisms have most of their rDNA in self-replicating extrachromosomal molecules, but retain additional copies in the chromosome, probably as a backup. Interestingly, some organisms hold the totality of their rDNA in extrachromosomal molecules. It has been shown that the accumulation of extrachromosomal rDNA copies in old S. cerevisiae cultures affects cell health. Therefore, cells that hold most or all rDNA copies extrachromosomally may have special mechanisms that allow the accumulation of large rDNA minichromosomes without affecting cell fitness. However, it is possible that yeasts lack such a mechanism, resulting in cell damage when episomes accumulate.

Post-transcriptional processing of ITS-1 and ITS-2 is observed in almost all eukaryotes. However, some organisms possess additional ITS sequences in the 28S rRNA that generate fragmented rRNA molecules that maintain the core rRNA active elements in the mature ribosome. It has been found that some organisms possess additional sequences in variable regions internal to the SSU rRNA coding sequence that remain in the mature molecule, thus generating unusually large SSU rRNAs. It is interesting to note that the SSU rRNA has not been found as a fragmented molecule in nuclear genomes; in contrast, the fragmented eLSU rRNA could be regarded as 28S molecules that have processed their variable regions. The structure and functionality of these rRNAs in the ribosome may help to understand the importance of variable regions and the differences/restrictions between subunits.

The rDNA coding region of several microbial eukaryotes is interrupted by group I introns. Different transcriptional and post-transcriptional mechanisms are involved in the processing of introns and HE transcripts. Some C. albicans strains have heterogeneous rDNA populations with both intron-containing and intron-less rDNA units. Because no function related to rDNA expression has been proposed for group I introns, C. albicans may provide a good model to study the role (if any) of these introns in rRNA expression, processing and stability.

The linkage of 5S rDNA to a variety of tandem repeated families may be the result of homogenizing mechanisms responsible for concerted evolution (Drouin & de Sá, 1995). The finding that the 5S rDNA can be linked to all polymerase transcribed genes, coded alone in different chromosomal loci or coded in extrachromosomal molecules underscores the possibility of various mechanisms acting to regulate the expression of different types of 5S rDNAs. The simultaneous expression of pol I, pol II and pol III transcribed genes in a particular locus may alter the chromatin context as well as the availability of transcription factors in the proximity. Nevertheless, the significance of the linkage between multigenic families and 5S rDNA has not been studied. The presence of unlinked 5S rDNA copies is also interesting because the chromosomal context for each gene may affect its regulation.

Widespread rDNA characteristics present among eukaryotic supergroups as well as particular features predominating in some eukaryotic subgroups reflect the complexity of evolution. The typical tandem head-to-tail organization of the rDNA and 5S rDNA is found in all eukaryotic supergroups (Fig. 9), suggesting that the eukaryotic common ancestor held this organization. Later on, the evolutionary process probably led to a specialization and divergence of the rDNA structure, resulting in the different variants described here. Other features, such as the group I introns, were acquired by horizontal transfer and are therefore widespread among microbial eukaryotes (Fig. 9) (Sogin et al., 1986; Van Oppen et al., 1993).

Figure 9

Schematic phylogenetic tree describing the prevailing rDNA structures in subgroups of microbial eukaryotes. Phylogeny of eukaryotes is based on the six supergroup classification (Box 1). Conserved rDNA structures that predominate in eukaryotic subgroups are boxed in color. Some of the described characteristics may be shared by more than one subgroup.

Some particular rDNA characteristics are conserved among related species, suggesting that the common ancestor for each group held these traits before current speciation. Examples of this can be found in the unlinked differentially expressed rDNA units in Apicomplexa, the extralong SSU rRNAs in Amoebozoa and Foraminifera, the extrachromosomal rDNA and 5S rDNA in ciliates and Amoebozoa, the 5.8S–23S rRNA fusion in Microsporidia and both the SL–5S rDNA linkage and the eLSU fragmentation in Euglenozoa. The nontandem organization of the 5S rDNA, as well as its linkage to the ribosomal cistron, can also be seen as predominant in the Fungi group (Fig. 9). Particular traits such as the 28S rRNA fragmentation could have appeared more than once and independently, leading to non-Euglenozoa organisms containing fragmented rRNAs such as A. castellanii and Plasmodium. Distinctive characteristics shared among distantly related species may represent phylogenetic evidence of yet unknown linkages among eukaryotic subgroups and species. Nevertheless, a thorough and integrated comparative characterization of rRNA genes in poorly studied or unstudied eukaryotes may help to understand the diversity of, and relationships among, different forms of life.

Many unanswered questions regarding the regulation of rRNA gene expression still remain, for example, the mechanism(s) that determine which rRNA gene copies will be transcriptionally and/or epigenetically active (Lawrence et al., 2004; Grummt, 2007) or the relevance of the genomic context that surrounds the rRNA genes. Finally, it should be pointed out that the rDNA organization is only one fundamental step in its regulation, because its expression is interrelated with most, if not all, of the cell's regulation levels (Paule & White, 2000; Schramm & Hernandez, 2002; Grummt et al., 2003).


This work was supported by grants IN214006 from Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica (PAPIIT), Universidad Nacional Autónoma de México (UNAM) and P45037-Q from Consejo Nacional de Ciencia y Tecnología (CONACYT), Mexico. A.L.T.-M. was supported by a scholarship from CONACYT Mexico.


  • Editor: Colin Berry

