Cyanobacterial secondary metabolites have attracted increasing scientific interest due to the bioactivity of many compounds in various test systems. Among the known structures, oligopeptides are often found with many congeners sharing conserved substructures, while being highly variable in others. A major part of known oligopeptides are of non-ribosomal origin and can be grouped into classes with conserved structural properties. Thus, the overall structural diversity of cyanobacterial oligopeptides only seemingly suggests an equally high diversity of biosynthetic pathways and respective genes. For each class of peptides, some of which have been found in all major branches of the cyanobacterial evolutionary tree, homologous synthetases and genes can be inferred. This implies that non-ribosomal peptide synthetase genes are a very ancient part of the cyanobacterial genome and presumably have evolved by recombination and duplication events to reach the present structural diversity of cyanobacterial oligopeptides. In addition, peptide synthetases would appear to be an essential part of the cyanobacterial evolution and physiology. The present review gives an overview of the biosynthesis of cyanobacterial peptides and the corresponding gene clusters, the diversity of structural types and the structural variation within peptide classes, and the implications for the evolution and plasticity of biosynthetic genes and the potential function of cyanobacterial peptides.
non-ribosomal peptide synthetase
In the last two decades, a high number of cyanobacterial metabolites has been isolated and characterized from cultured strains and field samples. So far, more than 600 peptides or peptidic metabolites have been described from various taxa. The continuous and rising interest stems both from the surveillance of aquatic systems, especially where toxic compounds in mass developments — so-called blooms — are rising concerns of public health, and from various and diverse bioactivities of unique structures with potential pharmacological implications.
Cyanobacterial secondary metabolites represent a vast diversity of structures (Moore, 1996; Burja, 2001; Staunton & Weissman, 2001; Harrigan & Goetz, 2002) isolated from a variety of taxa and geographic origins. The occurrence and structures of secondary metabolites among the subsections have been evaluated recently by applying multivariate statistical analyses (Guyot, 2004). More than 80 structural archetypes of compounds have been defined, occurring in more than 30 genera of all five subsections (Boone & Castenholz, 2001) (corresponding to orders in other taxonomic schemes). To date, most compounds have been isolated from Oscillatoriales and Nostocales, followed by Chroococcales and Stigonematales, whereas very few metabolites are yet known from Pleurocapsales. However, this distribution reflects the availability of strains and exploitable biomass from natural habitats rather than the actual potential of genera in the respective subsections to synthesize secondary metabolites. For example, Lyngbya sp. (Oscillatoriales) and Microcystis sp. (Chroococcales) are easily collected or cultured in amounts that allow the isolation of compounds in the ppm range, whereas for Pleurocapsa this would be much more laborious and time consuming.
A major part of cyanobacterial secondary metabolites are peptides or possess peptidic substructures. The majority of these oligopeptides are assumed to be synthesized by NRPS (non-ribosomal peptide synthetase) or NRPS/PKS (polyketide synthase) hybrid pathways on the basis of particular structures that are not achievable by ribosomal peptide synthesis. Recently, however, a biosynthetic pathway for a cyclic peptide with thiazole moieties, patellamides A–C, has been shown to start with a ribosomally synthesized peptide that is modified post-translationally (Schmidt, 2005). A similar biosynthetic pathway might produce other types of cyanobacterial peptides, as will be discussed below.
The present review presents a short introduction to NRPS principles, followed by an overview of currently known genes and gene clusters for peptide biosynthesis in cyanobacteria. In the second part, an overview is given of the structural diversity of cyanobacterial peptides that will be grouped in biologically meaningful classes. Thirdly, we review the data on peptide distribution in cyanobacterial taxa and in diverse habitats and discuss the hypotheses on the function of these metabolites.
Non-ribosomal peptide synthetases — a short introduction
The non-ribosomal peptide biosynthetic system operates nucleic acid-free at the protein level (Finking & Marahiel, 2004; Sieber & Marahiel, 2005). The specific condensation of amino acids and related carboxyl containing compounds is directed by protein templates, and each biosynthetic step generally requires a protein module (Weber & Marahiel, 2001; Schwarzer, 2003; Finking & Marahiel, 2004). A module is composed of catalytic domains with a minimal module consisting of an adenylation domain for amino acid activation, a thiolation domain for transfer of activated intermediates, and a condensation domain (von Döhren, 1997; Marahiel, 1997; Stachelhaus, 1998). Domains and modules are assembled at the gene level into complex structures, gene clusters, which are translated into multifunctional proteins or multi-enzyme complexes, and post-translationally modified by addition of 4′-phosphopantetheine to the thiolation domain.
Non-ribosomal peptide synthetase genes generally encode multi-module proteins, but genes encoding single modules or domains can be found as well. For a single minimal module, some 3–3.5 kbp (kilobase pairs) of genetic sequence is required, thus making some multi-module NRPS genes the largest known genes (Finking & Marahiel, 2004).
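The size estimate above can be turned into a quick back-of-the-envelope calculation. The following is a minimal sketch that assumes only the 3–3.5 kbp per minimal module quoted in the text; the function name and defaults are illustrative, and integrated modifying domains (which add further sequence) are not counted.

```python
def nrps_gene_size_kbp(n_modules, kbp_per_module=(3.0, 3.5)):
    """Return a (low, high) estimate of coding sequence length in kbp.

    Assumes every module is a minimal module (C-A-T); modifying domains
    such as epimerization or N-methyltransferase inserts are ignored.
    """
    if n_modules < 1:
        raise ValueError("n_modules must be at least 1")
    low, high = kbp_per_module
    return n_modules * low, n_modules * high


# Example: a synthetase assembling a heptapeptide needs seven modules.
low, high = nrps_gene_size_kbp(7)
print(f"7 modules: roughly {low:.1f}-{high:.1f} kbp of coding sequence")
```

For a heptapeptide this gives a range of 21 to 24.5 kbp, which may be spread over one or several genes.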
Domains are generally easy to recognize in silico by a BLAST search, and can be characterized further by assigning the conserved, domain-specific core motifs in the protein sequences (Konz & Marahiel, 1999).
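As an illustration of motif-based domain assignment, the sketch below scans a protein sequence for short patterns with regular expressions. The three patterns are simplified stand-ins chosen for this example and should be treated as assumptions, not as the published core motifs; a real analysis would use the full motif set of Konz & Marahiel (1999) or profile-based searches.

```python
import re

# Simplified, illustrative patterns (assumptions for this sketch only).
CORE_MOTIFS = {
    "adenylation": re.compile(r"Y[RK]TGD"),
    "thiolation": re.compile(r"GG[HD]S[LI]"),   # contains the Ser that carries the cofactor
    "condensation": re.compile(r"HH.{3}DG"),
}


def scan_core_motifs(protein_seq):
    """Return sorted (0-based start, domain type, matched text) tuples."""
    seq = protein_seq.upper()
    hits = []
    for domain, pattern in CORE_MOTIFS.items():
        for match in pattern.finditer(seq):
            hits.append((match.start(), domain, match.group()))
    return sorted(hits)


# Toy sequence constructed for the example, not a real protein.
toy = "MKAAYRTGDLVRWAAADFFELGGHSLKAAA"
for start, domain, text in scan_core_motifs(toy):
    print(start, domain, text)
```

The order of hits along the sequence gives a first impression of the domain arrangement, which can then be checked against a database search.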
Main functional domains
Adenylation domains catalyze the specific activation of carboxyl groups of amino acids, imino acids or hydroxy acids, as well as various carboxylic acids. Adenylation domains are the primary specification step for the amino acid sequence of the completed peptide. This is achieved by the geometry of a binding pocket in the enzyme that only allows a specific amino acid to enter into the catalytic site. An analysis of the phenylalanine binding pocket of the activating domain of the first module of gramicidin S synthetase has led to an amino acid contact residue code permitting the prediction of substrates in NRPS adenylation domains (Stachelhaus, 1999; Challis, 2000; Lautru & Challis, 2004). This specificity conferring code has been confirmed in a variety of correlations of NRPS genes with known peptide product structures, and may make it possible to predict unknown products (Challis & Ravel, 2000). Although this non-ribosomal code is a good predictor of substrate selection, it is not the only mechanism of control. Thiolation, as well as condensation domains, are also involved in the specific formation of a particular amino acid sequence (von Döhren, 1999; Lautru & Challis, 2004). Further support for this has been provided by studies on heterologously expressed adenylation domains on which amino acid specific adenylation can be tested by an ATP-PPi-exchange assay (Dieckmann, 1995). Examples are BarD, which incorporates l-leucine but activates 3-chloro-leucine and valine as well (Chang, 2002). The leucine specific adenylation domain of McyB of Microcystis aeruginosa activates isoleucine and valine as well, but these have never been observed in microcystins (Sielaff, 2004). Likewise, the first adenylation domain of NosA activates Val, Ile and Leu when it is expressed in Escherichia coli, but Leu is not found in nostopeptolide (Hoffmann, 2003).
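The use of the specificity-conferring code for substrate prediction can be sketched as a simple signature comparison. The example assumes that the contact residues have already been extracted from an aligned adenylation domain as a ten-letter string; the reference table is a set of illustrative placeholder entries and would have to be replaced by a curated collection of published signatures. All names are hypothetical.

```python
def rank_substrates(signature, reference):
    """Rank reference substrates by the fraction of identical signature positions."""
    scores = []
    for substrate, ref_sig in reference.items():
        if len(ref_sig) != len(signature):
            raise ValueError("signatures must have the same length")
        matches = sum(a == b for a, b in zip(signature, ref_sig))
        scores.append((matches / len(signature), substrate))
    # Best score first; ties are broken alphabetically by substrate name.
    return sorted(scores, key=lambda item: (-item[0], item[1]))


# Placeholder reference table for illustration only (assumption).
TOY_REFERENCE = {
    "Phe": "DAWTIAAICK",
    "Leu": "DAWFLGNVVK",
    "Val": "DAFWIGGTFK",
}

query = "DAWFLGNVAK"  # hypothetical signature from an uncharacterized domain
for score, substrate in rank_substrates(query, TOY_REFERENCE):
    print(f"{substrate}: {score:.1f}")
```

Because adenylation domains such as those of McyB or NosA activate amino acids that never appear in the product, the top-ranked substrate is best read as a hypothesis to be tested, for instance by an ATP-PPi-exchange assay.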
In cyanobacteria, about 200 adenylation domains have been identified in nucleotide sequences so far. They are generally integrated in NRPS systems or represent acyl-CoA synthetases. Upon alignment, 10 core motifs (A1–A10) can be easily identified in most cyanobacterial adenylation domains representing consensus sequences that can also be found in fungal systems (Konz & Marahiel, 1999).
The key role of thiolation domains is in the transport of intermediates, which requires specific interaction with the activating adenylation domain and the corresponding condensation domains for aminoacyl and peptidyl elongation cycles. In cases of intermediate modifications, the transport also requires interactions with epimerization domains, methyltransferase domains, oxidation domains, reduction domains, or thioesterase domains in terminating cyclization reactions (Weber, 2000).
Aminoacylation or acylation of the ‘swinging arm’ cofactor 4′-phosphopantetheine is considered the covalent transport principle in NRPSs and PKSs.
Thiolation domains are generally identified by the conserved 4′-phosphopantetheine attachment site as signature sequence, which is post-translationally modified by protein-phosphopantetheinyl transferases (see below).
The condensation domain of about 450 amino acids has been functionally characterized in the gramicidin S/tyrocidine synthetase systems (Stachelhaus, 1998). The current functional interpretation proposes, by analogy to the ribosomal system, that an aminoacyl and a peptidyl site (A-site and P-site) receive the activated intermediates (von Döhren, 1999). The aminoacylated carrier proteins (thiolation domains) resemble charged tRNAs, and the condensing site, the peptidyl transferase region.
As a prototype of a C-domain, the crystal structure of an isolated C-domain of the vibriobactin biosynthetic system, VibH, has been determined (Keating, 2002). The VibH-structure revealed a novel topology, and is a monomer consisting of two subdomains. Alignments confirm the structure to be representative of the NRPS condensation domains, the related epimerization domains, and cyclocondensation domains. The downstream carrier, which transports the initiating acyl residue or the peptidyl intermediate, will bind to the C-terminal face of this domain with the pantetheinyl arm extending into the solvent channel. The upstream carrier with the acceptor compound, usually an aminoacyl residue generally binding in trans to the condensation domain, would approach from the opposing open end of the domain, and both pantetheinyl arms would extend into the solvent channel to facilitate peptide bond formation (Keating, 2002).
A survey of about 160 cyanobacterial condensation domains reveals that their core sequences are very similar to those derived from Bacillus domains. Upon alignment with Clustal, domains group into functionally related types, and not into subsections or genera. This has been observed before and correlates with similar analysis of adenylation and thiolation domains (von Döhren, 1999). Obvious clusters are the related heterocyclization and epimerization domains and functionally related domains of systems producing homologous peptides.
Heterocyclization domains catalyze peptide bond formation and the cyclization of cysteine, serine or threonine side chains to the respective heterocycles. This cyclodehydration reaction requires either the N-acyl-aminoacyl or the respective peptidyl intermediate. This domain type was first identified in the bacitracin system (Konz, 1997) and the reactions have been studied in detail in the vibriobactin, pyochelin and epothilone systems (Patel, 2003). Peptides containing heterocycles are fairly common among cyanobacteria, e.g. in various Cys containing cyclopeptides, barbamide, curacins or cyclamides. The respective domains from the barbamide and curacin systems are known, as well as similar domains in several orphan biosynthetic clusters of Anabaena PCC 7120, Nostoc punctiforme ATCC 73102 and Crocosphaera watsonii. Thiazole formation, however, is not restricted to NRPS pathways as has been shown for patellamides (Schmidt, 2005).
Bacterial thioesterase domains with a size of about 280 amino acids catalyze termination steps of the biosynthetic process, cyclizations involving peptide or ester bonds, or release by hydrolysis (Shaw-Reid, 1999; Trauger, 2000; Kohli, 2002; Sieber & Marahiel, 2005). The termination reaction is catalyzed upon binding of a carrier protein with the final peptide intermediate, attached as thioester, by a conserved active site Ser residue (Bruner, 2002; Tseng, 2002).
Almost all cyanobacterial systems characterized so far, including PKS systems, contain this terminating domain. A comparative multiple sequence alignment analysis (with Clustal) of the currently available domains reveals that microcystin (McyC) and nodularin (NdaB)-linked domains are a special group, presumably due to the unusual cyclization reaction catalyzed between Adda and the last amino acid in the linear peptide sequence (data not shown).
Integrated modifying domains
Epimerization domains and amino acid racemases
Epimerization domains largely resemble condensation domains and can be identified by the slightly different signature sequences (Konz & Marahiel, 1999). Their functions are to epimerize aminoacyl and peptidyl intermediates at the thioester stage. This reaction is reversible, and these intermediates are thus in an equilibrium state between both isomers. The following reaction, usually a condensation reaction, is involved in the control of stereospecificity to select the d-isomer (Stachelhaus & Walsh, 2000; Luo, 2001).
Not all d-configured amino acids, however, are transformed by this reaction. Some adenylation domains specifically accept d-residues, which have to be supplied by corresponding amino acid racemases. This is illustrated in microcystin biosynthesis, where d-Glu is supplied as a direct precursor, whereas Ala is epimerized by the respective module (Tillett, 2000; Sielaff, 2003).
A formylation domain was first identified according to sequence comparison in the anabaenopeptilide biosynthetic cluster (Rouhiainen, 2000). The N-terminal region of ApdA shows similarities to co-substrate formyl tetrahydrofolate-dependent methionyl-tRNA formyltransferases. The protein region following the circa 400 amino acids shows similarities to condensation domains and is linked to the first adenylation domain. Other formylated non-ribosomal peptides include linear gramicidin (Kessler, 2004).
N-methylated peptide bonds in non-ribosomally formed peptides originate from N-methyl transfer to thiol-attached amino acids by N-methyl transferase domains. This was first demonstrated in fungal systems by sequencing the enniatin synthetase gene. N-methyl-transferase domains are integrated in the adenylation domain between the core motifs A8 and A9 (Haese, 1993; Patel & Walsh, 2001). This domain with a size of about 450 amino acid residues (55 kDa) shares some sequence similarities with a heterologous family of S-adenosyl-l-methionine (SAM)-dependent methyltransferases, including DNA methyltransferases (Velkov & Lawen, 2003). N-methylation does not seem to be obligatory for the following condensation reaction (Glinski, 2001).
A comparative analysis of adenylation domains containing N-methyl-transferase inserts with homologous domains without inserts reveals a high sequence identity, also in the regions adjacent to the insert between core sequences A8 and A9. This implies that N-methylation can be regarded as a function to be gained by domain insertion, or lost as well (Schauwecker, 2000).
Oxidation domains of about 200 amino acids with homology to NAD-binding proteins are inserted in adenylation domains between the core motifs A8 and A9. Examples include myxobacterial systems forming epothilone (Polyangium cellulosum, EpoB; Julien, 2000), myxothiazol (Stigmatella aurantiaca, MtaC and MtaD; Silakowski, 1999) or tubulysin (Angiococcus disciformis, TubB; Sandmann, 2004). Respective homologous sequences can be found in two orphan NRPS genes in Anabaena PCC 7120 and Crocosphaera watsonii.
In epothilone biosynthesis, this domain catalyzes the oxidation of the methylthiazolinyl-intermediate to the methylthiazolylcarboxy-intermediate (Chen, 2001). In the barbamide biosynthetic system, no oxidation domain is present in the respective adenylation domain of BarG, although the peptide contains a terminal thiazole. It has been suspected that BarI and BarJ are involved in oxidative decarboxylation and conversion of thiazoline (Chang, 2002).
Various non-ribosomal peptides have been known to contain a reduced C-terminal carboxyl group, and respective terminal alcohol functions are also found in polyketide structures. These originate by a two-step reduction via the aldehyde catalyzed by an NADPH/NADH dependent catalytic domain, thus releasing the final carrier-bound thioester intermediate (Gaitatzis, 2001; Schracke, 2005).
Reductase domains of about 400 amino acids in size show significant similarity to several related proteins, such as nucleoside-diphosphate-sugar epimerases, flavonol reductase/cinnamoyl-CoA reductase and other NADPH dependent enzymes. In nostocyclopeptides, the final peptidyl intermediate is reduced to a linear aldehyde cyclizing with the N-terminal tyrosine to form a stable imine bond (Becker, 2004).
Phosphopantetheine-protein transferases (PPTs) are well known in bacterial systems, and their genes are often contained within biosynthetic clusters (Lambalot, 1996; Walsh, 1997). Sfp, a PPT located in the surfactin cluster of Bacillus subtilis (Quadri, 1998), is well characterized. The 26-kDa protein modifies a variety of carrier proteins and domains, including acyl carrier domains and aryl carrier domains (Reuter, 1999). It is thus possible to charge CoA-thioesters directly onto apo-enzymes to investigate, for example, the specificity of modification and condensation reactions or to generate new products in vitro (Weinreb, 1998; Belshaw, 1999; Sieber, 2003).
A BLAST survey of cyanobacteria based on the Sfp-structure shows PPTs in NRPS containing strains (Anabaena PCC7120, Anabaena variabilis ATCC 29413, C. watsonii WH 8501, N. punctiforme PCC 73102 and Trichodesmium erythraeum IMS101), but also in NRPS-free strains (Gloeobacter violaceus PCC 7421, Prochlorococcus marinus SS120, Synechococcus elongatus PCC 6301 and Synechocystis sp. PCC 6803).
Methylation of hydroxyl groups (O-methyltransferases) or methylene groups is catalyzed by autonomous methyltransferases, which can be readily identified by a set of sequence motifs involved in S-adenosyl-methionine (SAM) binding. The role of McyJ in O-methylation of Adda in microcystin biosynthesis has been confirmed by gene disruption, which led to the production of des-methyl-Adda-microcystin (Christiansen, 2003). O-methyl transferases involved in microcystin formation share more than 80% identity, whereas other methyl transferases like ApdE or an unidentified Prochlorococcus enzyme, both of unknown function, share only about 40% of the amino acid residues around the SAM-binding region, as can be inferred from a BLAST search with McyJ. This group of modifying enzymes is thus fairly diverse with respect to the substrates encountered.
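Identity values such as those quoted above depend on how the comparison is counted. The following minimal sketch computes percent identity for two sequences that have already been aligned; it assumes that columns containing a gap in either sequence are skipped, which is only one of several conventions (search tools may instead divide by the full alignment length). The function name and toy sequences are illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment constructed for the example.
print(percent_identity("AC-EFGHKL", "ACDEFGAAL"))
```

Restricting the input to the region of interest, for example the residues around the SAM-binding motifs, gives a regional identity of the kind discussed in the text.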
These types of enzymes resemble Zn-dependent dehydrogenases and have been found in cyanobacteria in the nostopeptolide cluster and the nostocyclopeptide cluster, where they are involved in methyl-proline formation from leucine together with delta1-pyrroline-5-carboxylic acid reductase (P5C reductase; Luesch, 2003). Similar enzyme pairs are found in as yet unidentified clusters of N. punctiforme and A. variabilis.
Chlorine is found in 22% of cyanobacterial metabolites (Guyot, 2004), but little information is available about respective halogenases. So far, vanadium haloperoxidases known from bromination of metabolites from marine algae have not been found in cyanobacteria. Enzymes involved in the halogenation of aromatic side chains, especially of Tyr and Trp, contain an NAD binding motif and are known from Pseudomonas, Xanthomonas, Myxococcus and Streptomycetes.
The putative halogenase, ApdC, within the anabaenopeptilide synthetase cluster of Anabaena 90, is presumably involved in the chlorination of a tyrosine residue, but has so far no cyanobacterial homologs (Rouhiainen, 2000).
A set of enzymes of the barbamide biosynthetic cluster is involved in leucine chlorination, BarB1, BarB2 and BarC (Chang, 2002). They show similarities to Phytanoyl-CoA dioxygenase, belong to a group of putative 2-oxoglutarate iron-dependent halogenases and have also been identified in various strains of Oscillatoria spongeliae from the marine sponge Dysidea (Lamellodysidea) herbacea, known as a source of halogenated peptides (Faulkner, 1994).
Non-integrated thioesterases are involved in deacylation of pantetheine thiols, activating NRPS systems primed with acetyl CoA, or reactivating mischarged and thus stalled carrier domains (Yeh, 2004; Yeh, 2004; Sieber & Marahiel, 2005). Only McyT in the microcystin cluster of Planktothrix is directly linked to NRPS genes, though it is absent in other microcystin clusters. Six other similar thioesterases found so far in cyanobacterial genomes are not parts of the detected orphan NRPS biosynthetic clusters.
Non-ribosomal peptide synthetase and polyketide synthase genes in cyanobacteria
The high diversity of cyanobacterial secondary metabolites and their chemical structures indicates the presence of diverse NRPS and PKS gene clusters in cyanobacterial genomes, though only a minor part has been sequenced so far. A peculiarity of cyanobacterial secondary metabolite biosynthesis is the frequently observed mixing of NRPS and PKS genes, often within a single open reading frame (see below).
One approach to estimate the potential of secondary metabolite biosynthesis of cyanobacterial taxa is the search for NRPS and PKS genes by degenerate primer PCR. Conducting such a study, Christiansen (2001) confirmed the presence of NRPS genes in 75% of 146 axenic cyanobacterial strains of all subsections. Nonetheless, no homologous genes were detected in a number of genera, mainly in the Chroococcales (Cyanothece, Gloeobacter and Gloeothece and the genetically diverse Synechococcus and Synechocystis strains). A similar analysis has been carried out for stromatolite communities (Burns, 2005) where diverse PKS gene fragments were identified. NRPS genes have been identified in a symbiotic Prochloron strain that could not be cultivated. This indicates that some metabolites that have been attributed to the ascidian host may in fact be produced by the symbiont (Schmidt, 2004).
Hence, it is reasonable to assume a wide distribution and a high diversity of NRPS and PKS clusters in cyanobacterial genomes. The increasing number of completed genomic sequences is also valuable for the study of natural product biosynthesis.
Genomic sequence data
A wealth of sequence data on cyanobacterial NRPS/PKS genes is available today, although the published sequences may represent only a small part of cyanobacterial genes for secondary metabolite synthesis. Fourteen complete or nearly complete genomes are available to date (Table 1).
Table 1. NRPS genes in cyanobacterial genomes. The numbers of genes containing at least one condensation domain (COG1020 or pfam00668) and the genome size in Mbp are given. In unfinished genomes, numbers are estimates.
It is also remarkable that especially species with small genomes like Synechocystis do not contain NRPS/PKS genes, whereas large genomes like Nostoc or Crocosphaera contain numerous clusters. In such genomes, NRPS/PKS genes may constitute more than 5% of the genomic sequence, comparable to prominent actinobacteria (Streptomyces clavuligerus and Streptomyces avermitilis), firmicutes (Bacillus subtilis) or myxobacteria (Myxococcus xanthus, Sorangium cellulosum).
For most NRPS clusters in genomic sequences, however, a peptide product has not been identified. Only for N. punctiforme ATCC 29133 (syn. PCC73102) could a gene cluster be identified as the nostopeptolide cluster (nosA-D) previously described from another Nostoc strain (Hoffmann, 2003) by sequence homology and the chemical detection of the peptide (Hunsucker, 2004). Both clusters are nearly identical except for an epimerization domain in NosC of Nostoc ATCC 29133 that is lacking in Nostoc GSV224. For all other NRPS clusters, no product has been identified so far and the corresponding gene clusters can be considered orphan clusters.
Considering the high number of individual NRPS metabolite pathways, it is likely that the number of clusters inherited in a single cyanobacterial genome has a limit. On the basis of genome sequences, a number of three to five NRPS or NRPS/PKS clusters seems to be exceeded only rarely. This corresponds well with the up to four peptide classes detected in single colonies of Microcystis (Welker, 2004a) or strains of Planktothrix (Welker, 2004b), whereas at least twice that number of peptide classes is produced by each genus as a whole.
Known cyanobacterial non-ribosomal peptide synthetases
Fig. 1. Cyanobacterial NRPS and NRPS/PKS gene clusters. Gene and domain size are given on a consistent scale. For respective references see text.
Microcystin and nodularin synthetases
The first NRPS gene characterized from M. aeruginosa PCC7806 was a gene from the microcystin biosynthetic cluster (Dittmann, 1997). This cluster has since been characterized in detail in Microcystis (Nishizawa, 1999, 2000; Tillett, 2000), Planktothrix (Christiansen, 2003) and Anabaena (Rouhiainen, 2004). The 10 genes of the mcy-cluster code for a mixed polyketide/peptide synthetase system accounting for 44 of the 48 expected biosynthetic reactions involved.
The supply of two direct precursor amino acids of the product, d-Glu and N-Me-d-Asp, and the origin of phenylacetate have not been determined. The amino acid racemase McyF included in the cluster is apparently not involved in the process (Sielaff, 2003). The function(s) of McyI, a phosphoglycerate dehydrogenase homologue, are also unclear. Further, the mechanism for the formation of the important dehydro-Ala residue by dehydration of a seryl-intermediate is not known. By analogy with heterocyclization domains, a special condensation–dehydration has been suspected, but evidence is still missing. An ABC transporter, McyH, proved to be essential for microcystin production, linking export and synthesis (Pearson, 2004). Similar genes for export proteins that are usually required for the biosynthetic processes have been found in other NRPS systems (von Döhren, 2004; Finking & Marahiel, 2004; Sieber & Marahiel, 2005). Though essential for NRPS function, the 4′-phosphopantetheine protein transferase (PPT) gene was not part of the cluster.
The absence of a PPT gene or of genes involved in precursor supply has parallels in NRPS clusters described in various prokaryotes and eukaryotes. Direct precursors utilized by NRPS systems are often primary metabolites and do not need to be provided by the respective cluster. On the other hand, it has frequently been observed that genes for the biosynthesis of precursors are cluster constituents, such as the PKS system providing Adda for microcystin.
Cloning and sequencing of the microcystin biosynthetic clusters from Microcystis (Chroococcales), Planktothrix (Oscillatoriales) and Anabaena (Nostocales) revealed a highly conserved set of multidomain proteins accounting for the same basic reaction steps (Tillett, 2000; Christiansen, 2003; Rouhiainen, 2004). Differences in the clusters have been found with respect to the arrangement of genes, the localization and orientation of promoter regions and the content of genes not directly involved in the peptide assembly. However, the structural organization of the biosynthetic NRPS/PKS genes, including their modular arrangement, has been conserved (Fig. 1). A sequence analysis comparing key regions of the three microcystin synthetases from the different genera with the respective 16S rRNA gene sequences and a fragment of the DNA-dependent RNA polymerase (rpoC1) indicates the co-evolution of the complete gene set of these synthetases in different subsections (Rantala, 2004). This hints at an ancient existence of complete sets of biosynthetic genes, predating the eukaryote lineage. A comparison with nodularin biosynthetic genes supports the close relation of these systems and suggests that the nodularin biosynthetic cluster evolved from the microcystin cluster by domain deletion (Moffitt & Neilan, 2004; Rantala, 2004). Indeed, the two amino acids following the dehydro-residue in position 3 in microcystins are missing in nodularin. The respective modules corresponding to parts of McyA and McyB are lacking, and the remains are fused into the two-module synthetase NdaA. All other genes have orthologues in the microcystin cluster.
Non-producing strains of the various subsections or genera generally lack the complete mcy cluster, although rare exceptions are known (Kaebernick, 2001; Kurmayer, 2004).
To explain the patchy occurrence of microcystins and other peptides, horizontal gene transfer has been discussed as a possible mechanism for the distribution of biosynthetic clusters. Horizontal gene transfer is well known within the frame of pathogenicity islands that often contain biosynthetic clusters (Dobrindt, 2004; Hochhut, 2005). The uptake of such DNA fragments can dramatically increase pathogenicity and thus the range of host interactions. Another mechanism of diversity increase is the horizontal gene transfer of fragments of biosynthetic clusters like domains or sets of domains, which may be acquired by DNA uptake followed by recombination. Such modifications have been documented in the microcystin clusters from M. aeruginosa (Tanabe, 2004) as supposed genetic exchange within a species, but have also been inferred between diverse prokaryotes (Lopez, 2003) on the basis of a careful analysis of the epothilone cluster.
The presence of transposase genes close to all three mcy-clusters is intriguing in this respect. Several insertion sequence (IS) elements have been described in cyanobacteria, including M. aeruginosa (Mlouka, 2004).
Anabaenopeptilide (cyanopeptolin) synthetase
Anabaenopeptilides (Fujii, 1996) are members of the cyanopeptolin class of cyanobacterial peptides (see below). The anabaenopeptilide synthetase gene cluster from Anabaena 90 contains three NRPS genes (apdA, B and D) and a putative halogenase (apdC) thought to be involved in chlorination of a Tyr residue (Rouhiainen, 2000).
Two remarkable features of the anabaenopeptilides are N-formylation and the unusual amino acid 3-amino-6-hydroxy-2-piperidone (Ahp). The initiating reaction, formylation of a Gln residue, is carried out by a formyl-transferase domain, first described in this system.
Gln is also the predicted substrate of the adenylation domain in position 2 (see section below on Cyanopeptolins), which is proposed to be linked to the nitrogen of the adjacent Thr by a new type of domain inserted into the Thr activating A-domain between the motifs A8 and A9, then generating Ahp. This insert has about 30% amino acid identity with protein arginine N-methyltransferases, but so far it has not been found in any other NRPS systems.
In addition, there are two genes of as yet unclear function, a methyltransferase gene (apdE) and a putative acyl carrier protein reductase gene (apdF). The methyltransferase gene shows 40–43% identity to genes of unknown insect and sponge symbionts and a similar SAM-binding protein in Prochlorococcus marinus. The microcystin biosynthesis-associated O-methyl transferases, McyJ, of Anabaena 90, M. aeruginosa and Planktothrix agardhii all have only 27% identity when compared to ApdF.
Nostopeptolides are produced by the terrestrial cyanobacterium Nostoc sp. GSV224 and are branched acylated octapeptides with a heptapeptide lactone structure (Golakoti, 2000). An unusual component is leucyl acetate, which is derived from a leucyl intermediate by acetate addition, thus linking NRPS and PKS systems directly. The ring structure consists of seven amino (imino) acids and one acetate unit. The nos gene cluster contains three NRPS genes (nosA, C and D), one PKS gene (nosB) for the acetate insertion and two genes, encoding a zinc-dependent long-chain dehydrogenase (nosE) and a delta(1)-pyrroline-5-carboxylic acid reductase (nosF), involved in the formation of 4-methyl-proline (Luesch, 2003), as well as an ABC transporter gene (nosG; Hoffmann, 2003). The unassigned orf5 encoding a 265-amino acid protein has been found in other NRPS clusters as well (nostocyclopeptide cluster and an orphan cluster in A. variabilis ATCC 29413). A detailed analysis of the amino acid binding sites of the peptide synthetases showed the co-linearity of the protein template with the peptide sequence. The first adenylation domain expressed as fragment in E. coli showed a relaxed specificity, accounting for the presence of Val or Ile in nostopeptolide A and B. Activation of additional analogues like Leu, which are not incorporated in the peptide, indicated additional control mechanisms.
Nostocyclopeptides are cycloheptapeptides produced by Nostoc ATCC 53789, isolated from a lichen (Golakoti, 2001). Interestingly, this Nostoc strain does not contain nostopeptolides, but it does produce a set of more than 25 cryptophycins, also produced by Nostoc GSV224. However, the respective 16S rRNA genes differ by 2.8%, reflecting a significant genetic distance. The 33-kb cluster contains two NRPS genes (ncpA and B) in a first operon, which assemble the peptide (Tyr-Gly-DGln-Ile-Ser-mPro-Leu/Phe)-S-NcpB. In a unique termination reaction, this peptide is reduced by the terminal reductase domain to a linear aldehyde that is subsequently captured intramolecularly by the amino group of the N-terminal amino acid residue tyrosine to form a stable imine bond (Becker, 2004). The second operon contains five additional genes, with ncpC, ncpD and ncpE being orthologues to the respective nos-genes involved in methyl-Pro supply, NcpF resembling an ABC-transporter (77 kDa) and the peptidase NcpG with homologies to D-amino acid specific hydrolases. A putative transposase is located downstream of ncpA.
Barbamide (Orjala & Gerwick, 1996) is one of about 200 bioactive cyanobacterial metabolites of marine origin (Burja, 2001; Gerwick, 2001), many of which have been isolated from Lyngbya majuscula samples. In the barbamide biosynthetic cluster (bar), 12 genes were identified and are transcribed in two coinciding polycistronic mRNAs (Chang, 2002). The biosynthesis of barbamide has several unique features, including a trichloro-Leu as starter unit, which is deaminated, extended by a diketide with E-double bond formation, and heterocyclization and decarboxylation of the terminal Cys.
The chlorination of Leu has been proposed to occur at the thioester intermediate level (attached to the carrier protein BarA) in a complex involving BarB1, BarB2 and BarC (Sitachitta, 2000a). The reaction is likely to be similar to the chlorination of Thr in syringomycin biosynthesis, thought to be catalyzed by the related proteins SyrB2 and SyrC (Guenzi, 1998). The substrate binding pocket of the respective stand-alone adenylation domain BarD has been shown to accept Leu, 3-chloro-Leu and Val. The BarE adenylation domain specifically accepts 3-chloro-Leu, but it is not clear whether 3-chloro-Leu is a free intermediate, or is channelled somehow between BarA and BarE (Chang, 2002).
Curacins (Gerwick, 1994; Yoo & Gerwick, 1995) are polyketides with a single cysteine converted to thiazoline and are produced by strains of Lyngbya majuscula. Curacin A is a potent cancer cell toxin interacting with the colchicine drug binding site on microtubules (Wipf, 2004). The 64-kb cluster contains 14 genes, including eight monomodular PKS and one PKS-NRPS hybrid (CurF) with a heterocyclization domain, a Cys activating domain and a thiolation domain (Chang, 2004). Preceding the NRPS module is a unique gene cassette that contains an HMG-CoA synthase likely responsible for formation of the cyclopropyl ring. A highly unusual feature of CurA is three tandem acyl carrier proteins, followed by an adjacent module of autonomous domains (CurB-E). This particular region is similar to that found for another Lyngbya polyketide, jamaicamide (Edwards, 2004).
The lyngbyatoxins are potent skin irritants with a prenylated indolactam structure derived from Val and Trp (Edwards & Gerwick, 2004). The biosynthetic gene cluster cloned from a field sample of Lyngbya majuscula spans 11.3 kbp and encodes for a two-module NRPS (LtxA), a P450 mono-oxygenase (LtxB), an aromatic prenyltransferase (LtxC) and an oxidase/reductase protein (LtxD). LtxC has been expressed in E. coli and shown to catalyze the transfer of a geranyl group to (–)-indolactam V as the final step in the biosynthesis of lyngbyatoxin A.
Ribosomal peptide synthesis of complex peptides
Though the majority of cyanobacterial peptides have been shown to be synthesized non-ribosomally, the recent characterization of the patellamide biosynthesis cluster and its heterologous expression (Long, 2005; Schmidt, 2005) indicates that complex and modified peptides may nonetheless be synthesized independently of NRPS enzymes.
Patellamides, pseudosymmetrical cyclo-octapeptides with each substructure having the sequence thiazole-nonpolar amino acid-oxazoline-nonpolar amino acid, are moderately cytotoxic and reverse multidrug resistance. In patellamides, the primary sequence is encoded in a rather small gene, patE, which has been identified by a tblastn search of the draft genome sequence, querying for all eight possible peptides that could lead to the formation of the cyclic structure. This gene contains both octapeptide sequences of patellamides A and C. The translated peptide of 71 amino acids is processed by proteolytic cleavage, cyclization and heterocyclization. Respective genes, a protease (patA), a possible adenylating enzyme-hydrolase hybrid (patD) and an oxidoreductase-protease hybrid (patG), immediately surround patE. These genes and the organization of the cluster are reminiscent of the lantibiotic and microcin biosynthetic machinery, which has been characterized in other bacteria (Garneau, 2002; Rebuffat, 2004; Chatterjee, 2005). A similar cluster has been found in the genome of Trichodesmium erythraeum IMS101 (Schmidt, 2005).
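The set of eight query peptides mentioned above corresponds to the eight linear readings of one head-to-tail cyclic octapeptide, i.e. its cyclic rotations. The following Python sketch illustrates this; the sequence used is a made-up placeholder in one-letter code, not a patellamide sequence, and the function name is hypothetical.

```python
def cyclic_rotations(peptide: str) -> list[str]:
    """Return every linear reading of a head-to-tail cyclic peptide."""
    return [peptide[i:] + peptide[:i] for i in range(len(peptide))]

# Hypothetical octapeptide (placeholder letters, not taken from patE).
queries = cyclic_rotations("ACDEFGHI")
for query in queries:
    print(query)
```

For a pseudosymmetrical sequence some rotations may coincide, so a real query set would be de-duplicated, for example with `set(queries)`, before the search.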
Structural diversity of cyanobacterial peptides
To date, some 600 cyanobacterial peptides have been described. New peptide structures have been given names that are not included in a naming system and thus many structurally similar peptides have names that do not reflect these similarities. Peptide names eventually chosen by the authors often refer to the taxon from which the new compound has been isolated or to the geographic locality where the sample was taken (e.g. micro-, anabaeno-, kasumig-, banyas-) combined with suffixes referring to structural properties (e.g. -peptin, -peptilide, -cyclin, -cyclamide). Thus, for clearly confined groups of similar peptides, a multitude of names can exist. For example, the peptides aeruginopeptin 917S-C (Harada, 2001), anabaenopeptilide 90-A (Fujii, 1996), cyanopeptolin S (Jakobi, 1995), symplostatin 2 (Harrigan, 1999), hofmannolin (Matern, 2003a), microcystilide A (Tsukamoto, 1993), micropeptin 88-A (Ishida, 1998a), nostocyclin (Kaya, 1996), oscillapeptilide 97-B (Fujii, 2000), oscillapeptin F (Itou, 1999a), scyptolin A (Matern, 2001), somamide A (Nogle, 2001) and tasipeptin A (Williams, 2003) are all cyclic depsipeptides of one peptide class – called cyanopeptolins in this review. The suffixes to the peptide names refer to the strain number (e.g. anabaenopeptilide 90-A from Anabaena 90; Fujii, 1996), the origin of a bloom sample (e.g. micropeptin T from Lake Teganuma; Kodani, 1999), or the mass (e.g. micropeptin SF995, where the letters SF refer to the central pond of Tel Aviv Safari Park and 995 to the mass in Da; Banker & Carmeli, 1999), or are given in alphabetical order (e.g. anabaenopeptins A through K).
Part of the diversity of names for similar peptides is attributable to the nearly simultaneous publication of peptide structures. This is the case for anabaenopeptin A (Harada, 1995), ferintoic acid A (Williams, 1996) and oscillamide Y (Sano & Kaya, 1995), for example. In one case, the same name was given to two different structures that had been isolated from P. agardhii and published in the same year: anabaenopeptins G with one molecule having a mass of 908.5 Da and an amino acid sequence of [Tyr-MIle-Hty-Ile-Lys]-CO-Arg (Erhard, 1999) and the other having a mass of 929.5 Da and an amino acid sequence of [Ile-MHty-Hty-Ile-Lys]-CO-Tyr (Itou, 1999b). The opposite has occurred, too: aeruginopeptin 917S-A (Harada, 2001) has exactly the same structure as the previously published microcystilide A ([Tyr-Ahp-Leu-MTyr-Ile-O-Thr]-Gln-Hpla; Tsukamoto, 1993). Other misleading overlaps can arise with different types of peptides produced by other organisms like microcin SF608 (Banker & Carmeli, 1999), an aeruginosin from a Microcystis bloom, and microcin J25, a 21-residue, ribosomal peptide antibiotic from E. coli (Blond, 1999) or aeruginosin A, a pigment from Pseudomonas aeruginosa (Holliman, 1969) and cyanobacterial aeruginosins, linear tetrapeptides (Murakami, 1995).
All naming efforts are driven by the need to have a unique name for a unique structure and any system has to guarantee this. Unfortunately, there is currently no naming system for cyanobacterial peptides, except for microcystins, where an effort to standardize the naming of structural variants was made at a stage when the number of described variants was much lower than it is today (Carmichael, 1988). Nonetheless, this naming system proved to be very valuable and, most importantly, also applicable to new variants without the addition of too many prefixes and suffixes – at most like [Asp3,ADMAdda5,Dhb7]Mcyst-RR for a variant of Mcyst-RR isolated from a Nostoc strain (Beattie, 1998).
For other classes of peptides, it appears to be much harder to design a scheme corresponding to that used for microcystins, as the number of main variable positions in the molecules is higher than the two in microcystins (see below) and amino acid modifications are much more variable. In anabaenopeptins (Harada, 1995), for example, a class of cyclic hexapeptides (see below), all positions are variable except for a conserved lysine. As will be discussed later, we have to be aware of the possibility that the number of known structural variants in any peptide class is only a minor proportion of the total number of naturally produced congeners. Thus, any naming system has to ensure that further variants to be described will fit in the system.
In addition to the variability of amino acids at particular positions in the peptide molecules, various modifications of amino and other organic acids can occur, again complicating the introduction of a practical naming scheme. In microcystins in general, only N- or O-methylation is common as an amino acid modification, whereas in aeruginosins, for example, additional chlorination, sulphation and hydroxylation have been described. Further, the number of variable, non-proteinogenic organic acids is high in cyanopeptolins, whereas in microcystins the non-proteinogenic amino acids are in conserved positions. Since a one-letter code is restricted to 26 amino acids, it is not applicable when residues like formic, acetic, glyceric, butanoic, hexanoic, or octanoic acids and others have to fit into the same scheme.
In summary, we think that a simple but efficient naming system such as that used for microcystins is not applicable to most other peptide classes and that other systems have to be developed. This is, however, not the aim of the present paper, which is focused on the genetics and biochemistry of cyanobacterial oligopeptide synthesis. The basic classification of known peptide structures in biologically meaningful groups, as presented here, may be a good starting point for a comprehensive nomenclature.
Building blocks of cyanobacterial peptides
One distinct characteristic of NRPS biosynthetic pathways is the possibility to combine proteinogenic amino acids with non-proteinogenic amino acids, fatty acids, carbohydrates and other building blocks into complex molecules. It further allows modifications of proteinogenic amino acids from epimerization to the formation of heterocycles or dehydration.
As a basis of cyanobacterial peptide synthesis, all proteinogenic amino acids in l-configuration can be found. However, certain amino acids are rarely incorporated (like methionine) or have been reported only once (like histidine; Ishida, 2000b). All l-amino acids can, in principle, be epimerized and incorporated in d-configuration. Further, homo-variants of many proteinogenic amino acids are common, for example homotyrosine (Hty) or homoserine (Hse). Catalyzed by N-methyl transferase domains, N-methylation at the α-amino nitrogen is a common modification. N-methylation has been reported for all amino acids in cyanobacterial peptides (except proline), whereas O-methylation is less common and restricted to amino acids with free hydroxy groups.
A number of modifications of proteinogenic amino acids have been reported: hydroxylated derivatives like hydroxy-Leu or hydroxy-Pro; dehydrated amino acids like dehydrated Ser, then named dehydro-alanine (Dha); and Cys and Thr converted to thiazoles and oxazoles, respectively, by heterocyclization and oxidation.
Halogenation has been reported for aromatic amino acids as well as for aliphatic structures. Aromatic chlorination is most common for Tyr and Trp and has not been reported for Phe. Bromination predominantly occurs – with one exception (Ishida, 1999) – in peptides from marine environments.
Glycosylation is possible in principle in amino acids that possess a free hydroxy-group – hydroxy-leucine, for example. Some common sugars (glucose, mannose, xylose) have been reported for cyanobacterial peptides (Fujii, 1997a; Shin, 1997; Neuhof, 2005).
Amino acid derivatives that also occur in the primary metabolism have been reported for multiple types of peptides: hydroxy-phenyl lactic acid (Hpla, a tyrosine derivative), 2-hydroxy-4-methylvaleric acid (leucic acid, a leucine derivative), 2-hydroxy-3-methylpentanoic acid (2-hydroxy-3-methylvaleric acid or isoleucic acid), or agmatine (an arginine derivative). Unfortunately, identical building blocks are sometimes not abbreviated identically. For example, the acronyms Hmpa as well as Hmva are used for isoleucic acid. Similarly, a threonine derivative (see section on Microcystins and nodularins below) is designated as 2-amino-2-butenoic acid (Aba), dehydro-homoalanine (Dhha), or dehydrobutyrine (Dhb).
A large variety of fatty acids are building blocks of cyanobacterial peptides. The simplest cases are unbranched aliphatic fatty acids like hexanoic or octanoic acid (HA and OA, respectively). Modifications of these simple fatty acids include methylation (branching), hydroxylation, or amination. By amination, β-amino acids can be formed, like the complex Adda moiety in microcystins.
Last but not least, the high diversity of modified fatty acids makes cyanobacterial peptides and non-ribosomal peptides in general a structurally extremely diverse type of metabolite. Fatty acids often have a pronounced influence on physico-chemical properties of peptides, e.g. by influencing the hydrophobicity.
The following classification is based on the molecular structures, irrespective of the original source of individual congeners (Table 2). It includes a major part of the known cyanobacterial peptides but by far not all. Many individual structures are, at present, the only representatives of other peptide classes, with more members potentially to be discovered.
Main classes of cyanobacterial peptides as described in the text. Synonyms refer to names in original publications. As producing organisms, the taxa from which the respective peptides have been originally isolated are listed; homologue peptides can be found in other taxa. The number of variants reflects the structural variability of known congeners in early 2005
The peptides grouped in an individual peptide class are thought to be synthesized by homologous NRPS/PKS systems or ribosomal operons encoded in gene clusters with high sequence similarity. As shown for microcystins, the overall organization of a gene cluster coding for structural congeners in different taxa can be different even though the individual genes are clearly homologous (Rantala, 2004).
For each peptide class, the structure of a representative peptide is shown (generally the first one that has been described). In the flat formula, stereochemistry is not considered but it is mentioned in the text. For more detailed information, the reader is referred to the original publications.
Further, a schematic structure is given that lists all amino acid and other moieties that have been found at particular positions in the peptides of that class. When further modifications of amino acids have been observed, this is indicated in italics preceding the corresponding positions. All amino acids are abbreviated by standard three-letter codes and other abbreviations will be explained in the text. It has to be emphasized that not all possible combinations have been found in cyanobacterial samples. Representatives of described aeruginosins are, for example, aeruginosin 98-A (Murakami, 1995): ClHpla-Ile-SuChoi-Agmatine or aeruginosin 89-A (Ishida, 1999) Su,ClHpla-Leu-Choi-Argininal. New aeruginosins could be predicted to have, for example, the amino acid sequences ClHpla-Tyr-Choi-Argininol or ClHpla-Leu-Choi-Agmatine, but they have not yet been described.
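How such a schematic structure implies congeners that have not been described can be illustrated by enumerating the combinations of residues per position. The Python sketch below is only an illustration: the residue lists are restricted to the moieties appearing in the four sequences quoted above, not the complete schematic structure, and a combination generated here is not necessarily biosynthetically feasible.

```python
from itertools import product

# Residues per position, taken only from the four sequences quoted above
# (an illustrative subset, not the complete schematic structure).
positions = [
    ["ClHpla", "Su,ClHpla"],
    ["Ile", "Leu", "Tyr"],
    ["Choi", "SuChoi"],
    ["Agmatine", "Argininal", "Argininol"],
]

# The two described structures quoted in the text.
described = {
    "ClHpla-Ile-SuChoi-Agmatine",    # aeruginosin 98-A
    "Su,ClHpla-Leu-Choi-Argininal",  # aeruginosin 89-A
}

all_combinations = ["-".join(combo) for combo in product(*positions)]
candidates = [seq for seq in all_combinations if seq not in described]

print(len(all_combinations), "combinations,",
      len(candidates), "not among the two quoted structures")
```

Even this small subset yields 36 combinations, which includes the two hypothetical sequences named above.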
Because of the high number of individual peptides in some peptide classes, we could not cite all original publications, only those articles that refer to particular features of individual peptides. This does not, of course, imply that the publications not cited individually are in any way less important; the selection was made solely for practical reasons.
Aeruginosins
The linear peptides of this class are characterized by a derivative of hydroxy-phenyl lactic acid (Hpla) at the N-terminus, the amino acid 2-carboxy-6-hydroxyoctahydroindole (Choi) and an arginine derivative at the C-terminus (Fig. 2) (Murakami, 1995). The biosynthesis is achieved putatively by an NRPS (K. Ishida & E. Dittmann, personal communication; N. Tandeau de Marsac & M. Welker, unpublished).
Structure of aeruginosin 98-A (Murakami et al., 1995) and schematic general structure of aeruginosin type peptides. Aeap, 3-aminoethyl-1-N-amidino-d-3-pyrroline. Bold lines in the flat structure represent the conserved part of the molecule that can be found in all or most congeners; thin lines refer to variable parts. For stereochemical information see text. In the schematic structure all amino acid and other residues that have been found in the respective positions of the molecule are listed together with possible respective modifications (Cl, chlorination; Br, bromination; Su, sulphation; pent, glycosylation; M, α-amino-methylation; m, O-methylation). The first line represents the structure shown above. The numbering of amino acid and other residues is generally according to the (presumed) biosynthetic steps if not stated otherwise. Amino acids are given in three-letter codes including homo-variants of proteinogenic amino acids; for all other abbreviations explanations are given in the text.
The C-terminal arginine derivatives are agmatine, derived from Arg by decarboxylation (e.g. in microcin SF608; Banker & Carmeli, 1999), argininol, derived from Arg by reduction of the carboxy group to an alcohol (e.g. in aeruginosin 298A; Ishida, 1999) or argininal, derived from argininol by cyclization (e.g. in aeruginosin 102A; Matsuda, 1996a). At position 2, variable amino acids such as Tyr (Hty), Phe, Leu, or Ile can be incorporated that are predominantly in d-configuration. In an individual strain, however, aeruginosins with the amino acid in position 2 in either l- or d-configuration can be found (Ishida, 1999).
Hpla is a compound that is readily available for NRPS from the tyrosine metabolism and has been found to be in d-configuration in most congeners. Choi has been synthesized in vitro from tyrosine but it is not yet clear whether tyrosine is the precursor during peptide biosynthesis (Valls, 2001). Recently, the chemical synthesis of aeruginosins has been accomplished (Valls, 2002). Chlorination (indicated by Cl) and sulphation (Su) can occur at the Choi (e.g. aeruginosin 205-A; Shin, 1997) or Hpla (e.g. aeruginosin 101; Ishida, 1999) residues, but have never been observed simultaneously at both positions in an individual peptide. In one aeruginosin, Hpla was found to be brominated (aeruginosin 98-C; Ishida, 1999), which is remarkable as bromine was not detectable in the natural environment or in the culture medium. Glycosylation with a pentose sugar (xylose) has been described for a variant isolated from Planktothrix (aeruginosin 205-A; Shin, 1997). In Planktothrix such a glycosylation seems to be common (Welker, 2004b), whereas it has not been observed in Microcystis so far. At present, 27 variants have been published (Table 2). However, mass spectral analyses of strains and bloom samples indicate a high number of further structural variants, which differ mainly in chlorination and sulphation and less in amino acid sequences, as is the case in other peptides. Aeruginosins have been isolated from Microcystis and Planktothrix, and variants with mPro instead of Choi (spumigin; Fujii, 1997b) from Nodularia. Aeruginosins also have similarities to dysinosins, linear tetrapeptides from a dysideid sponge (Carroll, 2002) and to suomilide (Fujii, 1997a) and banyasides (Ploutno & Carmeli, 2005), peptides from Nodularia and Nostoc, respectively.
Microginins
This class of linear peptides is characterized by a decanoic acid derivative, 3-amino-2-hydroxy-decanoic acid (Ahda), and a predominance of two tyrosine units at the C-terminus (Fig. 3) (Okino, 1993a). Microginins vary in length from four (e.g. microginin 91A; Ishida, 2000a) to six (e.g. microginin 299C; Ishida, 1998b) amino acids with the variability occurring at the C-terminal end. Position 2 is most variable with seven different amino acids reported, while in the two following positions, three to four different amino acids have been reported. N-methylation can occur at positions 1, 3 and 4 (Ishida, 1998b). Aliphatic chlorination has been reported for Ahda (Kodani, 1999) and in some cases also as dichlorination at the terminal carbon atom (Ishida, 1998b). Aromatic chlorination has not been observed at any of the (homo)tyrosine units. The putative gene cluster coding for microginin synthetase has been sequenced in strain Microcystis HUB 5.3 (Kramer, 2000) but a corresponding knock-out mutant has not yet been produced. Ahda is formed by a PKS enzyme complex and is presumably the starter unit of microginins.
Structure of microginin (Okino et al., 1993) and schematic general structure of microginin type peptides. For further explanations see Fig. 2.
Microginins sensu stricto have been found in blooms and strains of Microcystis and Planktothrix so far. Two peptides, carmabins A and B, isolated from Lyngbya are similar but have different decanoic acid derivatives (dimethyl decanoic acid and oxo-dimethyl decanoic acid; Hooper, 1998). Nostoginins have been isolated from Nostoc with an N-terminal 3-amino-2-hydroxy-octanoic acid, Ahoa (Ploutno & Carmeli, 2002).
Anabaenopeptins
These cyclic peptides are characterized by a lysine in position 2 and the formation of the ring by an N-6-peptide bond between Lys and the carboxy group of the amino acid in position 6 (Fig. 4) (Harada, 1995). A side chain of one amino acid unit is attached to the ring by a ureido bond formed between the α-N of Lys and the α-N of the side chain amino acid. All other positions in the ring and side chain are variable, with three to five amino acids reported for the respective positions. The amino acid in position 5 is N-methylated. Methionine in position 3 has been reported as an S-oxygenated variant (nodulapeptin B; Fujii, 1997b). Homo-variants of tyrosine and phenylalanine can be found in positions 4 and 5 (Murakami, 1997; Reshef & Carmeli, 2002). A putative respective NRPS gene cluster has been found in the genome of Anabaena strain 90 (K. Sivonen & L. Rouhiainen, personal communication).
Structure of anabaenopeptin A (Harada et al., 1995) and schematic general structure of anabaenopeptin type peptides. For further explanations see Fig. 2.
Biosynthesis presumably starts with the side chain amino acid that forms a pseudo-C-terminus when the ureido bond is formed. Ring closure is then accomplished by the peptide bond formation between the C-terminal carboxy-group in position 6 and the 6-amino group of lysine.
The mass range of anabaenopeptins spans from 759 Da for anabaenopeptin I ([Leu-MAla-Hty-Val-Lys]-CO-Ile; Murakami, 2000) to 956 Da for Oscillamide C ([Phe-MHty-Hty-Ile-Lys]-CO-Arg; Sano, 2001).
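The two masses quoted above can be reproduced from the bracket notation. The Python sketch below is a minimal illustration under stated assumptions: it uses standard monoisotopic residue masses, treats the prefix M (N-methylation) and the homo-variant Hty as the addition of one CH2, and adds the mass of CO2 to the sum of the six residues to account for the ureido carbonyl and the free carboxy group of the side chain amino acid. The function names are hypothetical.

```python
# Monoisotopic residue masses in Da (amino acid minus H2O); standard values.
RESIDUE = {
    "Ala": 71.03711, "Val": 99.06841, "Leu": 113.08406, "Ile": 113.08406,
    "Lys": 128.09496, "Arg": 156.10111, "Phe": 147.06841, "Tyr": 163.06333,
}
CH2 = 14.01565  # added by N-methylation (prefix M) or by a homo-variant
CO2 = 43.98983  # ureido CO plus H2O of the free carboxy group, minus 2 H
RESIDUE["Hty"] = RESIDUE["Tyr"] + CH2  # homotyrosine

def residue_mass(token: str) -> float:
    """Mass of one residue; a leading 'M' denotes N-methylation."""
    if token in RESIDUE:
        return RESIDUE[token]
    if token.startswith("M") and token[1:] in RESIDUE:
        return RESIDUE[token[1:]] + CH2
    raise KeyError(f"unknown residue: {token}")

def anabaenopeptin_mass(ring: list[str], side_chain: str) -> float:
    """Mass of [ring]-CO-side_chain, with the ring closed via the Lys side chain."""
    return sum(residue_mass(t) for t in ring + [side_chain]) + CO2

mass_i = anabaenopeptin_mass(["Leu", "MAla", "Hty", "Val", "Lys"], "Ile")
mass_c = anabaenopeptin_mass(["Phe", "MHty", "Hty", "Ile", "Lys"], "Arg")
print(f"anabaenopeptin I: {mass_i:.2f} Da")  # about 759.45
print(f"oscillamide C:    {mass_c:.2f} Da")  # about 956.51
```

The values in the text correspond to these monoisotopic masses truncated to integers.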
All amino acids, except the Lys in position 2, are in l-configuration.
Anabaenopeptins have been reported from cyanobacteria isolated from a variety of habitats: freshwater (Harada, 1995), terrestrial (Reshef & Carmeli, 2002) and brackish water (Fujii, 1997b) and also from marine sponges (konbamide and keramamide A from Theonella sp.; Kobayashi, 1991a, b). As sponges host a broad variety of prokaryotic symbionts, these peptides may well be produced by cyanobacteria rather than by the sponge itself (Harrigan & Goetz, 2002; Piel, 2004). Indeed, two congeners of anabaenopeptins differ only in hydroxylation of tryptophan: mozamide A ([Phe-MhoTrp-Leu-Val-Lys]-CO-Ile; Schmidt, 1997) was isolated from a theonellid sponge, whereas plectonemid A ([Phe-MTrp-Leu-Val-Lys]-CO-Ile; Müller, 2005) originated from a culture of Plectonema sp.
Cyanopeptolins
This class of cyclic peptides is characterized by the amino acid 3-amino-6-hydroxy-2-piperidone (Ahp) and the cyclization of the peptide ring by an ester bond of the β-hydroxy group of threonine with the carboxy group of the terminal amino acid (Fig. 5) (Martin, 1993). In two cases, the threonine unit is substituted by a hydroxy methyl proline unit and the ring-closing ester bond is formed with this hydroxy group (Nostopeptins; Okino, 1997). The general type of this peptide class is thus a branched peptidolactone.
Structure of cyanopeptolin A (Martin et al., 1993) and schematic general structure of cyanopeptolin type peptides. GA, glyceric acid; FA, formic acid; AA, acetic acid; BA, butanoic acid; HA, hexanoic acid; OA, octanoic acid. The numbering of the side chain is in reverse order and S1 is the moiety next to Thr (1). Biosynthesis starts from the side chain and ends with 6. For further explanations see Fig. 2.
A side chain of variable length is attached via the amino group of the threonine unit. Two major types of side chains are common: one consisting of one or two amino acids and an aliphatic fatty acid from formic (e.g. in anabaenopeptilide 202-A; Fujii, 1996) to octanoic acid (e.g. in micropeptin A; Okino, 1993b) and one with a glyceric acid unit at the N-terminus (e.g. cyanopeptolin S; Jakobi, 1995). The glyceric acid can be attached directly to the threonine in position 1 or to an amino acid side chain (e.g. in A90720A; Bonjouklian, 1996). Sulphation and O-methylation of the glyceric acid have been observed (e.g. in oscillapeptins A–C; Itou, 1999a). A branched side chain has been reported for scyptolin B, where two alanine-butanoic acids are joined to a threonine in position S1 (Matern, 2001). In several variants, hydroxyphenyl lactic acid (Hpla) in the side chain has been reported (microcystilide A; Tsukamoto, 1993; aeruginopeptin 95A/B; Harada, 1993). Other non-proteinogenic amino acids or hydroxy acids in cyanopeptolins are: tetrahydro-tyrosine (H4Tyr, also called hydroxy-cyclohexenyl alanine, HcAla) in position 2 (e.g. aeruginopeptin-95B, Harada, 1993; micropeptin 88-D, Ishida, 1998a); kynurenine in position 5 (micropeptin SD999; Reshef & Carmeli, 2001); hydroxy-methyl valeric acid (Hmv) in the side chain (hofmannolin; Matern, 2003a); amino-butenoic acid (Aba or Dhb) in position 2 (somamide A; Nogle, 2001).
The biosynthetic gene cluster has been sequenced in Anabaena 90 (Rouhiainen, 2000) and another one is present in the Microcystis PCC7806 genome (Martin, 1993; Bister, 2004; N. Tandeau de Marsac, personal communication). The arrangement of the genes and the structure of the peptides suggest that biosynthesis starts with the side chain and that the final step is the ring closure between the amino acid in position 6 and threonine in position 1. All amino acids are in l-configuration in all cyanopeptolins described so far. Given the highly variable side chain, the gene clusters would be expected to show much variation in the corresponding genes.
All positions in the ring, except threonine and Ahp, can be occupied by variable amino acids. However, the number of amino acids that have been reported for individual positions varies from two in position 6 to nine in position 2. In position 5, an aromatic amino acid is found in all variants; position 6 is occupied by neutral amino acids. Position 2 of the ring can be occupied by a broad variety of amino acids like aromatic, basic, aliphatic and hydroxy amino acids. Dhb, which is common in the microcystins of Planktothrix, has also been reported in this position (Harrigan, 1999).
In all variants, the amino acid in position 5 is N-methylated. Derivatization by O-methylation can occur when a free hydroxy group is available, as in tyrosine (e.g. anabaenopeptilide 90-A; Fujii, 1996). Chlorination has been reported for position 5 when it is occupied by tyrosine but no dichlorination has been found so far (micropeptin 478-A, Ishida, 1997a; scyptolin A/B, Matern, 2001).
The high structural variability of cyanopeptolins is also reflected by the wide range of molecular masses spanning from 770 Da for tasipeptin A (Williams, 2003) to 1181 Da for oscillapeptin B (Itou, 1999a).
Cyanopeptolin type peptides have been isolated from Chroococcales, Oscillatoriales and Nostocales. Further, one congener was reported from Dolabella, a marine herbivorous gastropod, suggesting a cyanobacterial origin (Harrigan, 1999).
The naming of this class of peptides is very incoherent (Table 2), which is partly explained by the nearly synchronous publication of several names within the same year: cyanopeptolin (Martin, 1993), micropeptin (Okino, 1993b), microcystilide (Tsukamoto, 1993) and aeruginopeptin (Harada, 1993).
The total synthesis of one congener, micropeptin T-20, has been recently achieved (Yokokawa, 2005).
Microcystins and nodularins
Microcystins (originally described as cyanoginosins; Botes, 1984, 1985) and nodularins are characterized by the amino acid (2S,3S,8S,9S)-3-amino-9-methoxy-2,6,8-trimethyl-10-phenyldeca-4,6-dienoic acid (Adda), glutamate and an aspartate derivative at positions 5, 6 and 3, respectively, of the ring (Fig. 6). The aspartate derivative is referred to as d-erythro-2-methyl-iso-aspartate (DmiA). Other d-amino acids in most structural variants are d-Ala in position 1 and d-Glu in position 6. The numbering of the particular positions was assigned before the biosynthetic pathway had been discovered (Carmichael, 1988) and thus does not reflect the sequence of chain elongation during biosynthesis (Tillett, 2000). Two positions show high variability, namely positions 2 and 4, whereas all other positions are more conserved. For this reason, the nomenclature of microcystins was revised at an early stage and it was proposed to name variants according to the two most variable positions by applying the one-letter code for amino acids, e.g. microcystin-LR for the variant with leucine in position 2 and arginine in position 4 (Carmichael, 1988). Position 7 is occupied in most variants by dehydro alanine (Dha) or, when methylated, methyl-dehydro alanine (Mdha), which originates from dehydration of a seryl-intermediate. Several variants still have a native Ser in this position (Namikoshi, 1992b). In Planktothrix and Nostoc, an analogous threonine derivative, 2-amino-2-butenoic acid (Aba or Dhb), can frequently be found in this position. While N-methylation is observed in many Dha7-variants (then Mdha), it has never been found for Dhb-variants in microcystins, indicating a lack or deletion of the N-methyl-transferase domain in the respective module of McyA. In nodularins, the Dhb moiety is methylated and an N-methyl-transferase domain has been reported for NdaA (Moffitt & Neilan, 2004). Dhb can have E- or Z-configurations, by which toxicity is influenced (Blom, 2001).
Similar isomers have been reported for the conjugated double bond in the Adda side chain, though only in photochemical experiments and not as compounds in vivo, so far (Harada, 1996).
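The microcystin naming convention described above, the one-letter codes of the residues in positions 2 and 4 with deviations at other positions listed in a bracketed prefix, can be sketched in a few lines of Python. The function name and the small lookup table are illustrative assumptions, not part of the published nomenclature.

```python
# Standard one-letter codes for a few amino acids (subset for illustration).
ONE_LETTER = {"Ala": "A", "Arg": "R", "Leu": "L", "Phe": "F", "Trp": "W", "Tyr": "Y"}

def microcystin_name(pos2: str, pos4: str, modifications=()) -> str:
    """Build a name from the residues in positions 2 and 4 plus an optional prefix."""
    name = f"microcystin-{ONE_LETTER[pos2]}{ONE_LETTER[pos4]}"
    if modifications:
        name = "[" + ",".join(modifications) + "]" + name
    return name

print(microcystin_name("Leu", "Arg"))
# microcystin-LR
print(microcystin_name("Arg", "Arg", ["Asp3", "ADMAdda5", "Dhb7"]))
# [Asp3,ADMAdda5,Dhb7]microcystin-RR
```

The sketch also shows where the scheme stops working: a residue without a one-letter code has no entry in the lookup table and raises a `KeyError`.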
Structure of microcystin-LA (Botes et al., 1984) and schematic general structure of microcystin type peptides. Note that the numbering does not correspond to the sequence of biosynthetic steps, which starts with position 5 and ends with position 4. For further explanations see Fig. 2.
At the Adda side chain, the O-methyl group can be lacking (Namikoshi, 1992a) or be substituted by an acetyl group (Namikoshi, 1990; Sivonen, 1992). Considering all possible variability at the individual moieties, it is not surprising that new variants can still be found in various strains (Grach-Pogrebinsky, 2004; Oksanen, 2004; Welker, 2004b), although nearly 90 structural variants have already been described. The methylation at two positions alone allows four possible variants of any microcystin-XZ: [Asp3]microcystin-XZ, [Asp3,Dha7]microcystin-XZ, [Dha7]microcystin-XZ, together with the ‘native’ compound. With the possible variations in other positions, like O-methylation of the Glu6-moiety, the potentially high number of structural variants is evident. Nonetheless, despite the structural variability in field samples as well as in isolated strains, a few variants are dominant and most structural variants occur only in low concentrations (Fastner, 1999; Welker, 2004b). Chlorination or sulphation has never been observed in microcystins.
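The naming convention and the count of four methylation variants described above can be illustrated with a short enumeration. The following Python sketch is illustrative only: the function name `demethyl_variants` and its output format are assumptions introduced here, not part of any established tool. It treats the two methylation sites (the aspartate derivative in position 3 and Mdha in position 7) as independently methylated or demethylated.

```python
from itertools import product


def demethyl_variants(x, z):
    """Return the four names of microcystin-XZ that arise when the methyl
    groups at position 3 (DmiA -> Asp) and position 7 (Mdha -> Dha) are
    each either present or absent.

    x, z: one-letter codes of the amino acids in positions 2 and 4.
    (Hypothetical helper for illustration only.)
    """
    base = "microcystin-" + x + z
    names = []
    # True means the methyl group at that site is missing.
    for asp3, dha7 in product([False, True], repeat=2):
        tags = [tag for tag, missing in (("Asp3", asp3), ("Dha7", dha7)) if missing]
        prefix = "[" + ",".join(tags) + "]" if tags else ""
        names.append(prefix + base)
    return names


for name in demethyl_variants("L", "R"):
    print(name)
```

Further modifications mentioned in the text, such as O-methylation of Glu6 or changes at the Adda side chain, would each add another independent factor to this enumeration, which is why the number of conceivable variants grows so quickly.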
Biosynthesis gene clusters have been sequenced for the genera Microcystis (Tillett, 2000), Planktothrix (Christiansen, 2003) and Anabaena (Rouhiainen, 2004). The biosynthesis gene cluster of nodularin (Moffitt & Neilan, 2004) is homologous to the mcy-cluster but lacks two modules (Rantala, 2004). Consequently, nodularins are cyclic pentapeptides with an Adda moiety and show much similarity to microcystins. A nodularin-like, cyclic pentapeptide with Adda and Mdhb, motuporin, has been isolated from the sponge Theonella (de Silva, 1992), where it might be produced by a symbiotic (cyano)bacterium (Bewley & Faulkner, 1998).
Biosynthesis in microcystins and nodularins starts with the formation of the Adda moiety by an NRPS/PKS hybrid enzyme, presumably with phenylacetate as starter unit (Moore, 1991) (and thus with position 5). The d-configuration of alanine in position 1 is achieved by an epimerization domain in McyA while the d-configuration of glutamate (position 6) and DmiA is achieved by a separate racemase (Sielaff, 2003).
Various structures have been described with characteristic thiazole and oxazole moieties thought to be cysteine and threonine derivatives, respectively (Fig. 7), as shown by nuclear magnetic resonance techniques for another type of peptide, barbamide from Lyngbya (Williamson, 1999). Corresponding moieties are most likely formed from native amino acids by dehydration and reduction to form the heterocycle. In typical peptides of this class, e.g. nostocyclamide (Todorova, 1995), thiazole/oxazole units occur in alternation with unmodified amino acids to form a cyclic hexapeptide. In one case, westiellamide (Prinsep, 1992), the molecule is built exclusively of alternating oxazole and valine residues, whereas in other congeners, all six moieties are different from each other (as in raocyclamides) (Admi, 1996). In the hexapeptides with thiazole moieties, not all cysteine/threonine units are necessarily dehydrated, as in banyascyclamides A and B (Ploutno & Carmeli, 2002), where the threonine moiety in position 1 is unaltered. In other peptides, only one thiazole moiety is found, as in ulongamides A–F (Luesch, 2002), and instead of further thiazole/oxazole units, proteinogenic amino acids or lactic acid and amino methyl-hexanoic acid are incorporated. The naming of this peptide class is also inconsistent: the names of the peptides are linked either to the producing organism (e.g. nostocyclamide) or to the origin of the sample or strain (e.g. banyascyclamide). The biosynthesis and respective genes are not known and therefore the numbering of amino acid residues is arbitrary.
Structure of nostocyclamide (Todorova et al., 1995) and schematic general structure of cyclamide type peptides. dh, dehydrated. The numbering is arbitrary due to unknown biosynthesis. For further explanations see Fig. 2.
The first peptides of this class, however, have been described from the marine ascidian Lissoclinum bistratum. Bistratamides (Degnan, 1989a; Foster, 1992) were thought to be produced by the symbiotic Prochloron sp. rather than by the ascidian itself. Recently, Schmidt (2004) found gene sequences in Prochloron indicating the presence of NRPS clusters that might be involved in the biosynthesis of cyclic peptides.
Peptides recently isolated from the ascidian Didemnum molle, didmolamides A and B (Rudi, 2003), have exactly the same (flat) structure as banyascyclamides A and C (Ploutno & Carmeli, 2002), respectively.
Further peptides with similar alternating thiazole/oxazole units, patellamides (Ireland, 1982) and lissoclinamides (Degnan, 1989b), have been isolated from Lissoclinum. In patellamides A–C, four thiazole/oxazole units and four alternating amino acids form a cyclic octapeptide, whereas the heptapeptides lissoclinamides 1–4 are built of three thiazole/oxazole and four amino acid units. Recently, it has been shown that patellamides are synthesized ribosomally from a linear octapeptide by post-translational modification (Schmidt, 2005) and thus a similar biosynthetic pathway may be responsible for cyclamide formation (see section on Ribosomal peptide synthesis of complex peptides, above).
The largest known cyanobacterial oligopeptides are the microviridins (Fig. 8) (Ishitsuka, 1990). This group is characterized by the multicyclic structure established by secondary peptide and ester bonds and a side chain of variable length. The main peptide ring consists of seven amino acids with an ester bond between the 4-carboxy group of aspartate (position 10) and the hydroxy group of threonine (position 4) and a peptide bond between the 6-amino group of lysine (position 6) and the 4-carboxy group of glutamate (position 7).
Structure of microviridin A (Ishitsuka et al., 1990) and schematic general structure of microviridin type peptides. Ester bonds and a secondary amide bond are indicated. For further explanations see Fig. 2.
The members of this class of peptides all share these features and variations are primarily due to substitutions in the side chain and at position 5 in the ring. However, in many natural samples and isolated strains, peptides with fragment mass spectra similar to those of microviridins have been detected. This indicates a much greater structural variability than suggested by the number of congeners isolated so far (Fastner, 2001; Welker, 2004a, b). A recently isolated variant, microviridin J, proved to be toxic to the planktonic crustacean Daphnia, while other structural variants were inactive (Rohrlack, 2004).
All amino acids in microviridins are in l-configuration and the only non-proteinogenic unit is the N-terminal acetic acid. Therefore, it could be that microviridins are synthesized ribosomally and that the tri-cyclic structure is completed by post-translational modifications similar to those that have been described for other prokaryotic peptides, e.g. microcin J25 (Blond, 1999).
This class of cyclic depsipeptides is special in that most of the reported congeners have been isolated exclusively from Nostoc and the majority from a single strain (Fig. 9a) (Schwartz, 1990; Golakoti, 1994, 1995) and not, as for other peptide classes, from a wide taxonomic range of cyanobacteria. Cryptophycins are composed of two hydroxy-acid units – a valeric acid derivative and a derivative of phenyl-octenoic acid – and two amino acid units – a tyrosine derivative and a β-amino acid. Structural variability arises mainly from the chlorination of the tyrosine unit and an optional epoxy group at the phenyl-octanoic acid.
Structures of cryptophycin C (Golakoti et al., 1994) (a), microcolin A (Koehn et al., 1992) (b) and tantazole A (Carmeli et al., 1990) (c).
Cryptophycins show cytotoxicity toward various tumor cell lines and are potential candidates for anticancer drugs (Edelman, 2003).
Microcolins and mirabimids
These linear peptides have been isolated from Lyngbya and Scytonema, respectively, and are characterized by a C-terminal pyrrolin-2-one moiety (Fig. 9b) (Carmeli, 1991a; Koehn, 1992). Mirabimids have an N-methylated or acetylated N-terminal amino acid. Microcolins and majusculamid D (Moore & Entzeroth, 1988) possess a modified octanoic acid (dimethyl-OA).
Tantazoles and mirabazoles
Tantazoles A, B, F and I and mirabazoles A–C (Fig. 9c) (Carmeli, 1990, 1991b) have been isolated from Scytonema mirabile. These peptides are composed nearly exclusively of (methylated) thiazole and oxazole units forming linear tetra- and pentapeptides, respectively. A similar compound, thiangazole, has been isolated from the myxobacterium Polyangium (Kunze, 1993).
More than half of the known peptides can be assigned to the major peptide classes described above, whereas the remaining peptides cannot be grouped in larger classes with many structural variants. For most of these peptide types, only a few congeners are known and these often have been isolated as minor compounds from the same strain or sample. It is beyond the scope of this review to present all known peptides in detail and several peptide types will be mentioned as examples with a focus on structural peculiarities.
Thiazole and oxazole moieties are reported for several types of cyanobacterial peptides with structural properties too specific to assume a homology to the cyclic thiazole/oxazole peptides mentioned above. Nonetheless, the thiazole formation may well be homologous in the corresponding NRPS enzymes. Lyngbyabellin B (Luesch, 2000a) is a cyclic hexapeptide containing two thiazoles and a modified octanoic acid (2-dimethyl,3-hydroxy,7-dichloro-OA). Aeruginosinamide (Fig. 10a) (Lawton, 1999b), a linear tetrapeptide, contains a C-terminal thiazole and an N-terminal leucine with a di-isoprenylated amino group. These features are similar to those of virenamide A, a peptide isolated from the ascidian Diplosoma (Carroll, 1996). The linear tetrapeptide barbamide (Fig. 10b) contains a C-terminal thiazole (Orjala & Gerwick, 1996) and a triply chlorinated fatty acid moiety probably derived from a tri-chloro leucine (Sitachitta, 2000a). Apramides A–G (Luesch, 2000b) are linear nonapeptides characterized by a C-terminal thiazole and a modified, N-terminal 7-octenoic or 7-octynoic acid unit. Wewekazole from Lyngbya is a cyclic undecapeptide with three (methyl-)oxazole moieties (Nogle, 2003).
Structures of aeruginosinamide (Lawton et al., 1999) (a) and barbamide (Orjala & Gerwick, 1996) (b).
A number of cyclic deca- and undecapeptides have been reported to possess a Dhb moiety and hydroxy amino acids. Examples are puwainaphycins A–E from Anabaena (Gregson, 1992), lipopeptides with a modified stearic or palmitic acid; laxaphycins A–E from Anabaena laxa (Frankmölle, 1992) (Fig. 11a) with hydroxy-amino acids; hormothamnin A from Hormothamnion (Gerwick, 1992), which has the same structure as laxaphycin A except for the configuration of Dhb; and lobocyclamides A–B (MacMillan, 2002), which differ from similar laxaphycins in two amino acids. Calophycin from Calothrix sp. (Moon, 1992) possesses a 2-hydroxy-3-amino-4-methylpalmitic acid (Hamp) similar to that in puwainaphycin E.
Structures of laxaphycin A (Frankmölle et al., 1992) (a), kawaguchipeptin A (Ishida et al., 1996) (b) and oscillatorin (Sano & Kaya, 1996) (c).
Further cyclic deca- and undecapeptides are, for example, kawaguchipeptins A (Fig. 11b) and B, undecapeptides from Microcystis that differ in two Trp moieties modified by prenyl-groups (Ishida, 1996, 1997b). Oscillatorin (Sano & Kaya, 1996) (Fig. 11c) is a cyclic decapeptide with an unusual amino acid, oscillatoric acid, which is a prenylated tryptophan derivative.
Some 50 peptides have been isolated that do not fit in any of the peptide classes or types described above. The smallest known cyanobacterial peptide is radiosumin, built of two acetylated amino acids derived from p-aminophenylalanine (Fig. 12a) (Matsuda, 1996b). A particular type of peptides is the aeruginoguanidines, tripeptides built of two arginine moieties and a tyrosine-like amine. The tyrosine-like amine is triply sulphated whereas the arginine moieties are modified by prenyl or geranyl groups (Ishida, 2002). Kasumigamide (Fig. 12b), a linear pentapeptide, is built entirely from non-proteinogenic amino acids like β-alanine or the tryptophan derivative Ahipa (4-amino-3-hydroxy-5-indolylpentanoic acid; Ishida & Murakami, 2000).
Structures of radiosumin (Matsuda et al., 1996) (a), kasumigamide (Ishida & Murakami, 2000) (b), antanapeptin A (Nogle & Gerwick, 2002) (c) and malevamide C (Horgen et al., 2000) (d).
In a number of cyclic peptides, it is not possible to find any particularly characteristic feature. When several congeners are described, they have often been isolated from a single sample. The smallest peptides of this incoherent group are antanapeptins (A–D, Fig. 12c) isolated from Lyngbya with a characteristic 3-hydroxy-2-methyl-octynoic acid and a 2-hydroxyisovaleric acid unit (Nogle & Gerwick, 2002). The largest mono-cyclic peptide is malevamide C (Fig. 12d) (Horgen, 2000) with 14 (amino) acid units. In malevamide C, a modified octanoic acid is also present, 3-amino-2-methyl-7-octynoic acid, which was also found in pitipeptolides A and B, cyclic heptapeptides with dihydroxy octynoic and octenoic acid, respectively (Luesch, 2001). A hydroxy-dimethyl octynoic acid is also found in yanucamides A and B together with a hydroxy-isovaleric acid (Sitachitta, 2000b), peptides isolated from a Lyngbya/Schizothrix assemblage that resemble kulolides, peptides isolated from a nudibranch gastropod (Reese, 1996).
Distribution and function of peptides in cyanobacteria
Given the structural diversity of cyanobacterial peptides, the question arises as to how common particular peptides or peptide types are with respect to their taxonomic and geographic distributions. When this question is asked, it always has to be kept in mind that the biosynthesis of non-ribosomal peptides requires a significant part of the cell's energy and nutrient resources. As a rule of thumb, any single amino acid incorporated in a non-ribosomal peptide requires genetic information of about 4–5 kbp. For highly modified amino acids or building blocks that are synthesized by PKS-systems, this number can be substantially higher. The share of peptide synthetase enzymes in the cellular protein pool is unknown, but it can be assumed that the translation of the enzymes has significant costs for the cell.
Further, only few data are available on the actual taxonomic and geographic distribution of individual peptides or peptide classes. Therefore, the data reviewed below should be considered as a first insight.
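The rule of thumb of about 4–5 kbp of genetic information per incorporated amino acid can be turned into a rough back-of-the-envelope estimate. The Python sketch below is only an illustration of that arithmetic; the function name `nrps_coding_range_kbp` and its parameters are assumptions introduced here. Because highly modified or PKS-derived building blocks are stated to need substantially more, the result should be read as a lower bound rather than a prediction of actual gene cluster sizes.

```python
def nrps_coding_range_kbp(n_residues, low_kbp=4.0, high_kbp=5.0):
    """Rough (low, high) estimate, in kbp, of the genetic information needed
    to incorporate n_residues amino acids non-ribosomally, using the
    4-5 kbp per residue rule of thumb quoted in the text.

    PKS-derived or heavily modified building blocks are not accounted for,
    so the estimate is a lower bound. (Hypothetical helper.)
    """
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return (n_residues * low_kbp, n_residues * high_kbp)


for label, n in [("cyclic pentapeptide", 5), ("cyclic heptapeptide", 7)]:
    low, high = nrps_coding_range_kbp(n)
    print(f"{label}: at least about {low:.0f} to {high:.0f} kbp")
```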
Taxonomic distribution of peptides
For most individual peptides, the distribution among cyanobacterial taxa is basically unknown and the only existing reference is the taxon from which the respective peptide has been isolated for the first time. However, summarizing the data available from the original publications already indicates that oligopeptides that are synthesized by NRPS can be found in genera from all sections of cyanobacteria (Christiansen, 2001). As can be expected for true secondary metabolites, these biosynthetic activities have a patchy distribution.
The only peptide class with a good database in this respect is the microcystins, which have been studied intensively in many countries and cyanobacterial strains because of their potentially hazardous role in drinking water reservoirs (Carmichael, 1992; Codd, 1995; Sivonen & Jones, 1999; Svrcek & Smith, 2004). The respective biosynthetic gene cluster is found in genera of three subsections (Microcystis, Planktothrix and Anabaena). Strikingly, in the potentially producing species and genera, some strains contain the gene cluster whereas other strains do not produce microcystins and lack the complete gene cluster (Kurmayer, 2002; Hisbergues, 2003; Via-Ordorika, 2004).
Soon after the structure elucidation of microcystins that made detection and quantification possible in many laboratories, it became evident that microcystins can be found worldwide and are produced by a broad variety of cyanobacterial genera. Today, it is of no surprise when microcystins are detected in field samples containing Microcystis, Planktothrix, Anabaena, or Nostoc, independent of the geographical origin of the samples (Chorus & Bartram, 1999). Higher microcystin concentrations most often are associated with higher biomass of toxigenic taxa and thus are more likely found in eutrophic than in clear lakes (Svrcek & Smith, 2004). A number of studies have dealt with the cellular microcystin content in cyanobacterial strains under various growth conditions (Sivonen, 1990; Utkilen & Gjolme, 1992; Rapala, 1997; Orr & Jones, 1998; Oh, 2000; Wiedner, 2003), with the presence of mcy-genes (Tillett, 2001; Bittencourt-Oliveira, 2003; Hisbergues, 2003; Via-Ordorika, 2004; Mbedi, 2005), or with both factors (Mikalsen, 2003). An important conclusion of these studies is that mcy-genes are present nearly exclusively in those strains in which microcystins can actually be detected. Exceptions to this general rule are rarely found in Microcystis (Kaebernick, 2001) but seem to be more common in Planktothrix rubescens, where natural mutants can make up to 10% of a population (Kurmayer, 2004). Globally, microcystins are constitutively present in mcy+-strains and not only when the synthesis is triggered by distinct environmental signals, such as can be observed for most other microbial NRPS systems (Du & Shen, 2001). As the structural variants actually synthesized as well as the cellular concentrations are more or less constant, independent of growth conditions, it is evident that there is a genetic rather than a physiological control of peptide production (Mikalsen, 2003). 
Moreover, the presence or lack of mcy-genes does not correspond to any phylogeny based on housekeeping genes (Neilan, 1995, 1997). Discussing the organization and sequences of the mcy-gene cluster in different taxa and the distribution within potentially toxigenic genera, Rantala (2004) concluded that the mcy-gene cluster is a very ancient unit dating back to the common ancestor of modern Anabaena, Microcystis, Nostoc and Planktothrix. Within these genera, the distribution of mcy-genes was interpreted as the result of repeated and independent losses. Remarkably, in genera closely related to the ones mentioned above, no microcystins have been found yet, e.g. in Aphanizomenon, Synechocystis, or Limnothrix.
A similar distribution of biosynthesis gene clusters and respective peptides among cyanobacterial taxa and strains can reasonably be assumed for at least the main classes and types of cyanobacterial peptides. The data available are only fragmentary at present but they fit well into this picture. Anabaenopeptins, for example, are produced by strains of the genera Microcystis, Planktothrix and Aphanizomenon belonging to sections I, III and IV, respectively, but not by all strains of these genera (Fastner, 2001; Welker, 2003, 2004a, b). Interestingly, in distantly related taxa exactly the same structural variants can be found (for example anabaenopeptin B). The cellular concentration in a producing strain, Anabaena 90, showed only moderate response to varying growth factors, in a range comparable to that found for microcystins (Repka, 2004).
Another peptide class with a very wide distribution is the cyanopeptolins (Table 2), which have been isolated from various environments and from diverse taxa (Anabaena, Microcystis, Planktothrix, Scytonema, Symploca, Nostoc, Lyngbya, Oscillatoria). Most of the structural variants originate from Microcystis and Planktothrix but this most likely does not reflect the global distribution of cyanopeptolins among cyanobacteria. However, considering the data available, the highest diversity of cyanopeptolins occurs in planktonic freshwater taxa. As with microcystins, strains without and with cyanopeptolin synthetases exist that are very closely related, e.g. in Microcystis (N. Tandeau de Marsac, personal communication; own unpublished data).
Other peptides such as microviridins, aeruginosins and microginins have been reported from a similar discontinuous array of cyanobacterial genera.
Figure 13 summarizes data on the distribution of major classes of cyanopeptides in sections, genera of section I, and strains of Microcystis. In sections I, III and IV, all classes of peptides can be found while for sections II (Pleurocapsales) and V (Stigonematales) little or no information is available, mainly due to the low number of available strains. Within section I, only the genus Microcystis has been found to produce oligopeptides so far. This is in accordance with Christiansen (2001) and the data available from cyanobacterial genomes (see above). In individual strains of Microcystis, oligopeptides can be found in various combinations, already giving an impression of the chemotype diversity within a single genus. When individual peptides are considered rather than peptide classes, the number of possible chemotypes seems endless and indeed, when clones were analyzed as single colonies or filaments, or as cultured isolates, the number of peptide chemotypes by far exceeds the number of morphotypes (Fastner, 2001; Welker, 2004a, b).
Distribution of classes of peptides among taxa of cyanobacteria. From left to right: sections, genera of section I, strains of the genus Microcystis. For the sections II and V only scarce data are available.
Also in Microcystis, some strains do not produce any peptides [by high-performance liquid chromatography analysis supported by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MS) or liquid chromatography-MS/MS], whereas in others, peptides of up to four classes can be found (Czarnecki, 2006).
Among marine cyanobacteria in the genera Lyngbya and Symploca, a similar chemotype diversity is to be expected. Though respective studies on chemotype diversity are only in a beginning phase (Guyot, 2004; Thacker & Paul, 2004), the sheer number of peptides (and other metabolites) isolated from Lyngbya underlines the potential metabolic diversity in this genus (Shimizu, 2003).
The taxonomic distribution of oligopeptides in cyanobacteria, as assessed by chemical analyses, might not give a complete picture, partly because the number of known and characterized peptides is still low compared to the expected total structural diversity. Nonetheless, the data indicate that the production of oligopeptides is concentrated in certain genera among taxonomic sections. Within these genera, a multitude of peptide chemotypes exist where we assume that the chemotype directly reflects the genotype with respect to NRPS/PKS gene clusters.
Geographic distribution of peptides and peptide classes
As with the taxonomic distribution, only few data are available on the geographic distribution of cyanobacterial peptides, with the exception of microcystins. For microcystins, the available data indicate that the biosynthesis of these peptides is not restricted to any climatic zone or other geographic range. Microcystin-producing cyanobacteria can be found in tropical waters (Cuvin-Aralar, 2002) as well as in Antarctic samples (Hitzfeld, 2000), in large lakes (Brittain, 2000) as well as in shallow ponds (Welker, 2005), and in high-altitude (Mez, 1997) as well as coastal waters (Henriksen, 2001).
For other classes and types of peptides, the few available data indicate a similar global distribution. Certain peptides that have been isolated from a particular cyanobacterial taxon can be potentially found in samples and strains of this taxon, independently of the geographical origin. For example, peptides isolated from Japanese samples or strains, like kasumigamide or anabaenopeptins, can be found in Microcystis samples from several locations in Europe and Canada (Williams, 1996; Fastner, 2001; Barco, 2004; Welker, 2004a; own unpublished data). This does not exclude the possibility that other peptides might be restricted within certain geographic limits.
Co-production of peptides in individual strains
A major aspect of cyanobacterial peptides is the production of multiple peptide classes and congeners by individual strains. Combinations of individual peptides can be considered peptide fingerprints typical of individual clonal strains, allowing the distinction of morphologically indistinguishable strains as chemotypes (Fastner, 2001; Welker, 2004a, 2005). Congeners of a peptide class produced by one strain vary in a few amino acid positions, whereas other positions are conserved (Martin, 1993; Czarnecki, 2006). This indicates that certain positions apparently need to be preserved to retain bioactivities, whereas natural selection likely prevents the persistence of non-active peptide variants. In addition to variable amino acids (and other organic acids) in peptides produced by an individual strain, further structural diversity can arise from modifications, like methylation (Harada, 1991) or halogenation (Murakami, 1995; Ishida, 1998b; Rouhiainen, 2000).
The positional control of amino acid specificity is well known from various examples of peptides found in other bacterial phyla or in fungi (Kleinkauf & von Döhren, 1997). A positional control apparently acts on different levels of fidelity for a specific position. Thus, for example, an l-Leu residue in a specific position of a peptide might be exchanged with similar residues (such as Ile and Val), or with unrelated amino acids (such as Tyr and/or Arg) as in the case of microcystins. This is a remarkable property of cyanobacterial peptide synthetases because, in various heterotrophic bacteria and fungi, only exchanges by similar amino acids have been documented (e.g. aromatic amino acids Tyr, Phe and Trp, or aliphatic branched amino acids Leu and Ile; Kleinkauf & von Döhren, 1997). It has to be underlined that available data clearly indicate that there is only a single enzyme system in each strain producing a set of congeners. Besides the recognition and activation of specific amino acids by adenylation domains, the processing of the aminoacyl intermediates by condensation domains is also important for the selective incorporation of specific amino acids during chain elongation. The corresponding mechanisms, however, have not been intensively studied. A study by Belshaw (1999) showed that the incorporation of the ‘wrong’ amino acid slowed down the speed of the further peptide synthesis to a point where the production of corresponding variants became extremely unlikely. The production of congeners is thus related to the level of control in activation and processing reactions of each step in a biosynthetic pathway.
Amino acid modifications catalyzed by N-methyl transferases or halogenases (see above) are reactions that are not absolutely required for the biosynthetic process. Therefore, non-methylated or non-halogenated analogues are frequently observed. However, as has been shown in fungal systems, the rate of processing of non-methylated intermediates can be substantially reduced, so levels of respective analogues are quite variable (Billich & Zocher, 1990).
The number of congeners of a single peptide class that can be found in an individual strain can reach more than 10, although often a few congeners are dominant. This is well known for microcystins. The most toxigenic strains of Microcystis, for example, produce the variants Mcyst-LR, -RR and -YR, whereas other strains either have single variants or combinations of variants (Lawton, 1999a; Rohrlack, 2001). In addition, the respective unmethylated variants (e.g. [Dha7]Mcyst-RR) may be present. Multiple microcystin variants have also been detected in Nostoc sp. (six variants in strain IO-102-l; Oksanen, 2004) or in P. agardhii (12 variants in strain Max 06; Welker, 2004b). Similar structural diversity is also observed for other peptide classes. At least 13 cyanopeptolins are produced by Microcystis HUB08B03 (Czarnecki, 2006), varying in three positions with a maximum of three different building blocks each. In many strains of Microcystis and Planktothrix, the same structural variants of anabaenopeptins are produced, namely anabaenopeptins A, B, F and oscillamide Y (own unpublished data).
In most cyanobacterial strains that produce oligopeptides, there is more than one class of peptides. Peptides of two or three classes can frequently be found in Microcystis, Planktothrix, Anabaena or Nostoc strains (Harada, 1993, 1995; Fujii, 1997b; Kodani, 1998). Multiplying the number of peptide classes produced by an individual strain by the number of congeners actually produced per class yields several dozen individual peptide structures. In natural populations, the co-existence of distinct chemotypes can dramatically increase the number of peptides that can be found in a bloom sample.
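The counting arguments in the two preceding paragraphs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the per-class congener counts in the second example are invented placeholder values for a fictitious strain, not measured data. The first function gives the upper bound on congeners when each variable position accepts a given number of building blocks (for three positions with at most three building blocks each, 27 combinations are conceivable, of which at least 13 were reported for Microcystis HUB08B03). The second sums the congeners over the peptide classes of one strain.

```python
from math import prod


def max_congeners(blocks_per_variable_position):
    """Upper bound on the congeners of one peptide class, assuming that all
    variable positions can be combined freely (an assumption; real
    synthetases need not realize every combination)."""
    return prod(blocks_per_variable_position)


def peptides_per_strain(congeners_per_class):
    """Total number of individual peptides in one strain, given the number
    of congeners produced in each peptide class."""
    return sum(congeners_per_class)


# Three variable positions with up to three building blocks each.
print(max_congeners([3, 3, 3]))

# Hypothetical strain with three peptide classes (placeholder counts).
print(peptides_per_strain([10, 13, 5]))
```

The gap between the combinatorial upper bound and the number of congeners actually observed is consistent with the positional control of amino acid incorporation discussed above.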
Functions of cyanobacterial peptides
From the distribution of particular peptides among clones of Microcystis, for example, it is evident that none of the peptides (peptide classes) is required by non-producing clones either for growth in the laboratory or to be competitive in natural environments. On the other hand, strains that grow under laboratory conditions for decades do not lose the ability to produce the peptides typical for that strain. Microcystis PCC7806, for example, was isolated in 1973 from a Dutch lake, deposited in the Pasteur Culture Collection as an axenic strain in 1978 and has since then been the object of many studies on peptide production (Martin, 1993; Dittmann, 1997; Rohrlack, 1999a; Kaebernick, 2000; Tillett, 2000; Wiedner, 2003; Pearson, 2004). Under all culture conditions and in all laboratories, the strain produced the same peptides (microcystins and cyanopeptolins) except, of course, in mcy-knockout mutants. This contrasts with the production of gas vesicles, for which spontaneous mutants that lack the ability to produce them frequently occur (Mlouka, 2004). Paradoxically, the production of peptides in an individual strain seems to be selectively stabilized, even in axenic cultures, whereas other strains that completely lack the corresponding biosynthetic genes or the respective mutants do not seem to exhibit any severe disadvantage (Hesse, 2001; Kaebernick, 2001). Natural populations and communities are mixtures of producers and non-producers with respect to particular peptides or peptide classes (Fastner, 2001; Rohrlack, 2001; Welker, 2004a, b).
Several hypotheses on the function of peptides in the physiology and ecology of cyanobacteria have been discussed, mostly related either to grazing protection or to allelopathy.
The bioactivities exhibited by many cyanobacterial peptides towards mammalian (or vertebrate) test systems are often similar to effects observed in invertebrate animals that might be potential consumers of cyanobacteria. The toxicity of microcystins to mammals is caused in principle by the inhibition of protein phosphatases 1 and 2a, which are important enzymes in the intracellular regulatory mechanisms (Honkanen, 1994; Dawson, 1998). A similar inhibition has been demonstrated for protein phosphatases of Daphnia, the most important grazer in pelagic freshwater systems (DeMott & Dhawale, 1995). In due course, it has been shown that an intoxication of Daphnia upon ingestion of cyanobacterial cells is largely dependent on the microcystin content of the cells (Rohrlack, 1999a, b). On the other hand, no clear evidence has been produced that Daphnia intoxication plays a major role in plankton dynamics, whereas grazing resistance due to colony or filament formation has been recognized as important grazing protection (Hansson, 1998; DeMott, 2001; Kagami, 2002). Other cyanobacterial peptides have been reported to inhibit protein phosphatases (Sano, 2001), too, but respective IC50-values were at least an order of magnitude higher compared to that of microcystins (Honkanen, 1994).
Many cyanobacterial peptides have been studied for potential pharmaceutical applications and, in many cases, protease inhibitory activity has been found. Protease inhibition is known as an (inducible) form of grazing protection in terrestrial plants (Pena-Cortés, 1995; Bowles, 1998). For a number of cyanopeptolin-type peptides, inhibitory activity against serine/threonine proteases has been reported. In one case, hofmannolin (a cyanopeptolin), the interaction with elastase has been demonstrated by co-crystallization and X-ray crystallography, which revealed the importance of the Ahp moiety for the inhibitory activity (Matern, 2003b). However, cyanopeptolins are not the only protease inhibitors among cyanobacterial peptides: inhibitory activity has also been reported for aeruginosins and microviridins (Shin, 1997, 1995; Ishida, 1999). Microviridin J has been shown to inhibit the molting of Daphnia and thus might efficiently reduce the grazing pressure exerted by a Daphnia population without directly killing the grazers (Rohrlack, 2004). Several studies have demonstrated the effective inhibition of Daphnia proteases by cyanobacterial peptides (Agrawal, 2001, 2005; Rohrlack, 2003; von Elert, 2004; Czarnecki, 2006). From marine cyanobacteria, other feeding deterrents have been isolated that are highly modified peptides, such as ypaoamide (Nagle & Paul, 1998).
Allelopathy in aquatic systems is not well studied compared to terrestrial systems (Legrand, 2003), although it is considered an important regulatory factor for community composition and dynamics (Gross, 1999, 2003). Allelopathic effects of cyanobacterial metabolites mostly concern the reduction of photosynthetic activity and growth rates of other planktonic autotrophs (Smith & Doan, 1999), eventually leading to cyanobacterial dominance (von Elert & Jüttner, 1997; Schagerl, 2002; Suikkanen, 2004). Peptides may also affect the competition between individual clones, as shown in recent enclosure experiments (Schatz, 2005).
Inhibition of photosynthetic activity by microcystins has been observed (Pflugmacher, 2002), suggesting that microcystins may act as allelochemicals. However, effective concentrations were generally higher than those expected in natural waters (LeBlanc, 2005). Moreover, living cells release microcystins to the surrounding water only in small amounts (Welker, 2001). Other peptides have been tested for their allelopathic capacity, e.g. kasumigamide (Ishida & Murakami, 2000), but effective concentrations were also rather high compared to concentrations that can be assumed under field conditions. In freshwater systems, the most important adverse effect of cyanobacterial blooms on macrophytes and eukaryotic phytoplankton might be the reduction of light by shading (Casanova, 1999).
In terrestrial or benthic cyanobacteria, allelopathic metabolites could be more important because, in the respective habitats, diffusion-driven dilution is much slower. Such compounds have been isolated, but most of them have non-peptidic structures (Klein, 1995; Hagmann & Jüttner, 1996). Antifungal and anti-algal peptides have been isolated from terrestrial strains, but it remains unclear whether these peptides act as allelochemicals in situ (Todorova, 1995; Neuhof, 2005).
A further hypothesis relates cyanobacterial peptides to bacterial quorum sensing mechanisms (Kaebernick, 2000).
At present, only a few experimental studies have been published that address the ecological role of cyanobacterial oligopeptides. Most of these studies were related to microcystins, simply because these peptides are the best studied and because funding is easier to obtain owing to their problematic role in drinking water hygiene. For an understanding of the ecological and evolutionary role of cyanobacterial oligopeptides, two general observations are probably essential. Firstly, the NRPS/PKS biosynthetic pathway is a very ancient part of (cyano)bacterial metabolism, i.e. cyanobacteria synthesized oligopeptides long before higher plants or animals existed. Secondly, natural selection did not reduce the pool of peptide structures to a few very efficient ones but, on the contrary, evidently favoured the production of a vast array of individual structures.
Although a wealth of data on cyanobacterial peptides and the respective biosynthetic pathways has become available in the last decade, it is likely that we still know very little about these metabolites. Even less is known about their ecological or physiological functions, which are obscure at present or have only been hypothesized for a few peptide types. The homology of various NRPS genes and suspected intergenic recombination events suggest a similar function of various peptides, or at least a tight co-evolution.
New congeners of known peptide classes as well as entirely new peptides remain to be discovered together with their respective biosynthesis genes. Structural and chemical data will improve our understanding of the biosynthetic potential of cyanobacteria and the distribution of peptide types at the taxonomic and geographic levels. Genetic data will provide insights into the evolution of the gene clusters responsible for the production of myriads of peptides that are often variations on a single theme, and will help to unveil the mechanisms of Nature's own combinatorial biosynthesis.
The collation of peptide, sequence and biochemical data and references was supported by Marcel Erhard (AnagnosTec GmbH, Germany), Jutta Fastner (Umweltbundesamt) and Karina Hesse (TU Berlin), and we are very thankful for this support. The work on this review was largely made possible through funding by the EU-project ‘Bioactive Peptides from Cyanobacteria’ (PEPCY). Comments on earlier drafts of the manuscript by Elke Dittmann and Annick Wilmotte were highly appreciated as were the comments made by two anonymous reviewers.
& Bagchi SN (2005) Cysteine and serine protease-mediated proteolysis in body homogenate of a zooplankter, Moina macrocopa, is inhibited by the toxic cyanobacterium, Microcystis aeruginosa PCC7806. Comp Biochem Physiol B 141: 33–41.
& Caixach J (2004) Determination of microcystin variants and related peptides present in a water bloom of Planktothrix (Oscillatoria) rubescens in a Spanish drinking water reservoir by LC/ESI-MS. Toxicon 44: 881–886.
& Ravel J (2000) Coelichelin, a new peptide siderophore encoded by the Streptomyces coelicolor genome: structure prediction from the sequence of its non-ribosomal peptide synthetase. FEMS Microbiol Lett 187: 111–114.
& Jungblut P (1999) Rapid identification of the new anabaenopeptin G from Planktothrix agardhii HUB 011 using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Rapid Comm Mass Spectrom 13: 337–343.
& Von Döhren H (2001) Determination of oligopeptide diversity within a natural population of Microcystis spp. (Cyanobacteria) by typing single colonies by matrix-assisted laser desorption ionization-time of flight mass spectrometry. Appl Environ Microbiol 67: 5069–5076.
& Sivonen K (1996) Novel cyclic peptides together with microcystins produced by toxic cyanobacteria. Harmful and Toxic Algal Blooms (Yasumoto T, Oshima Y & Fukuyo Y, eds), Intergovernmental Oceanographic Commission of UNESCO, Paris.
& Müller R (2001) In vitro reconstitution of the myxochelin biosynthetic machinery of Stigmatella aurantiaca Sg a15: biochemical characterization of a reductive release mechanism from nonribosomal peptide synthetases. Proc Natl Acad Sci USA 98: 11136–11141.
, et al. (1995) Structure determination, conformational analysis, chemical stability studies, and antitumor evaluation of the cryptophycins. Isolation of 18 new analogs from Nostoc sp. strain GSV 224. J Am Chem Soc 117: 12030–12049.
(1999) Allelopathy in benthic and littoral areas: case studies on allelochemicals from benthic cyanobacteria and submersed macrophytes. Principles and Practices in Plant Ecology (Dakshini KMM & Foy CF, eds), pp. 179–199. CRC Press, Boca Raton.
& Marahiel MA (2004) The linear pentadecapeptide gramicidin is assembled by four multimodular nonribosomal peptide synthetases that comprise 16 modules with 56 catalytic domains. J Biol Chem 279: 7413–7419.
& Börner T (2000) Characterization of the microginin synthetase gene cluster in the cyanobacterium Microcystis aeruginosa HUB 5.3. International Symposium on Phototrophic Prokaryotes, Barcelona, Spain (abstract).
(2003) Naturally mosaic operons for secondary metabolite biosynthesis: variability and putative horizontal transfer of discrete catalytic domains of the epothilone synthase locus. Mol Gen Genomics 270: 420–431.
& Moore RE (2003) Biosynthesis of 4-methylproline in cyanobacteria: cloning of nosE and nosF genes and biochemical characterization of the encoded dehydrogenase and reductase activity. J Org Chem 68: 83–91.
& Evans AM (1992a) Identification of 12 hepatotoxins from Homer Lake bloom of the cyanobacterium Microcystis aeruginosa, Microcystis viridis, and Microcystis wesenbergii: nine new microcystins. J Org Chem 57: 866–872.
& Walsh CT (2001) In vitro reconstitution of the Pseudomonas aeruginosa nonribosomal peptide synthesis of pyochelin: characterization of backbone tailoring thiazoline reductase and N-methyltransferase activities. Biochemistry 40: 9023–9031.
& Walsh CT (2003) Epimerization of an L-cysteinyl to a D-cysteinyl residue during thiazoline ring formation in siderophore chain elongation by pyochelin synthetase from Pseudomonas aeruginosa. Biochemistry 42: 10514–10527.
& Sivonen K (2004) Effects of phosphate and light on growth of and bioactive peptide production by the cyanobacterium Anabaena strain 90 and its anabaenopeptilide mutant. Appl Environ Microbiol 70: 4551–4560.
& Walsh CT (1999) Assembly line enzymology by multimodular nonribosomal peptide synthetases: the thioesterase domain of E. coli EntF catalyzes both elongation and cyclolactonization. Chem Biol 6: 385–400.
& Willis CL (2000a) Biosynthetic pathway and origin of the chlorinated methyl group in barbamide and dechlorobarbamide, metabolites from the marine cyanobacterium Lyngbya majuscula. Tetrahedron 56: 9103–9113.
& Neilan BA (2001) Detection of toxigenicity by a probe for the microcystin synthetase A gene (mcyA) of the cyanobacterial genus Microcystis, comparison of toxicities with 16S rRNA and phycocyanin operon (phycocyanin intergenic spacer) phylogenies. Appl Environ Microbiol 67: 2810–2818.
& Bonjoch J (2001) First total syntheses of aeruginosin 298-A and aeruginosin 298-B, based on a stereocontrolled route to the new amino acid 6-hydroxyoctahydroindole-2-carboxylic acid. Chemistry-A European Journal 7: 3446–3460.
& Lawen A (2003) Mapping and molecular modelling of S-adenosyl-L-methionine binding sites in N-methyltransferase domains of the multifunctional polypeptide cyclosporin synthase. J Biol Chem 278: 1137–1148.
& Chorus I (2004) Distribution of microcystin-producing and non-microcystin-producing Microcystis sp. in European freshwater bodies: detection of microcystins and microcystin genes in individual colonies. Syst Appl Microbiol 27: 592–602.
& Von Döhren H (2004a) Diversity and distribution of Microcystis (Cyanobacteria) oligopeptide chemotypes from natural communities studied by single colony mass spectrometry. Microbiology 150: 1785–1796.
& Gerwick WH (1999) Biosynthesis of the marine cyanobacterial metabolite barbamide. 2: elucidation of the origin of the thiazole ring by application of a new GHNMBC experiment. Tetrahedron Lett 40: 5175–5178.
& Shioiri TA (2005) Synthetic studies of the cyclic depsipeptides bearing the 3-amino-6-hydroxy-2-piperidone (Ahp) unit. Total synthesis of the proposed structure of micropeptin T-20. Tetrahedron 61: 1459–1480.