The bacterial replication initiator DnaA. DnaA and oriC, the bacterial mode to initiate DNA replication

Walter Messer
DOI: http://dx.doi.org/10.1111/j.1574-6976.2002.tb00620.x 355-374 First published online: 1 November 2002

Abstract

The initiation of replication is the central event in the bacterial cell cycle. As with all macromolecular synthesis, cells control the rate of DNA synthesis by modulating the frequency with which new chains are initiated. The end of the replication cycle provides a checkpoint that must be executed for cell division to occur. This review summarizes recent insights into the biochemistry, genetics and control of the initiation of replication in bacteria, and the central role of the initiator protein DnaA.

Keywords
  • Protein domain
  • DnaB helicase
  • ATP-binding DnaA box
  • Plasmid

1 Introduction

The initiator protein DnaA is found in all eubacteria analyzed so far. The name derives from the fact that dnaA was the first mutant isolated that is affected in DNA replication. In fact, mutants from this series were the first temperature-sensitive mutants in bacteria and established the use of this type of conditional lethal mutation for cellular functions [1, 2]. The replicon hypothesis by Jacob et al. [3] postulates two basic elements for the initiation of replication: the initiator, a trans-acting substance, DnaA protein, and the cis-acting replicator, which we now call replication origin, oriC. Most bacteria possess a unique replication origin on the usually circular chromosome. Several aspects of the initiation of bacterial replication are covered in a number of recent reviews [4–7].

2 The different functions of DnaA protein, an overview

All functions of DnaA protein depend on its ability to bind specifically to an asymmetric 9-bp recognition sequence, the DnaA box: 5′-TTATNCACA. A replication origin, oriC, usually consists of an array of several DnaA boxes. Binding of DnaA to such an array, origin recognition, is the first step in the assembly of a specialized nucleoprotein complex [8], the initiation complex. Structural distortions of the DNA within such a complex are the prerequisite for the second function of DnaA in the initiation process. It acts as a DnaA primosome. Protein–protein interaction results in the loading of the replicative helicase, DnaB in the case of Escherichia coli.

DnaA protein is also a transcription factor (for a review see [9]). Binding to one or two DnaA boxes in a promoter region may result in repression. Most importantly, the dnaA gene itself is repressed by DnaA, i.e. it is autoregulated. Other promoters are activated by DnaA binding. If DnaA boxes are within a transcription unit, DnaA binding may result in transcription termination.

3 Steps in the initiation of E. coli DNA replication

Most of our knowledge of DnaA and oriC is based on E. coli. In particular, the elegant work carried out over many years in the Kornberg laboratory, using the DnaA- and oriC-dependent in vitro replication system, is the basis of our understanding [10].

oriC of E. coli comprises 260 bp containing five DnaA boxes and, at its left end, an AT-rich region consisting of three 13-mer repeats and the so-called AT cluster (see Fig. 1). In addition, there are binding sites for accessory proteins like IHF and FIS, and for control factors like IciA, Rob and H-NS, whose precise function is not known (see [5] and references therein). The importance of 11 GATC sites, recognition sequences for the Dam methyltransferase, will be discussed in Section 16.

Fig. 1. Schematic replication origin of E. coli.

DnaA protein binds to its five binding sites in oriC as a monomer, introducing a 40° bend at each site [11]. Only DnaA complexed with ATP is active in initiation [12]. ATP-DnaA binds to additional sites, 6-mer ATP-DnaA boxes with the sequence 5′-AGATCT or a close match [13, 14]. These sites are predominant in the AT-rich region, which is unwound and subsequently stabilized by specific binding of ATP-DnaA to single-stranded ATP-DnaA boxes. In this sequence of interactions, binding to 9-mer DnaA boxes in oriC is a high-affinity interaction (KD=1 nM), binding to 6-mer ATP-DnaA boxes in double-stranded DNA is a low-affinity interaction that requires cooperativity among the proteins, and binding to 6-mer ATP-DnaA boxes in single-stranded DNA is again a high-affinity interaction [14]. A threshold level of DnaA protein is required for the conversion of the initial to the open complex. It is presumably because of these graded affinities and the cooperativity between the proteins that a precise timing of initiation in the cell cycle is possible. The net result is unwinding of the AT-rich origin region [15–17]. FIS protein has a negative effect on the reaction [18, 19], whereas HU and IHF proteins [20, 21], a high ATP concentration (>2 mM), high temperature (38°C) [22], and transcriptional activation enhance unwinding [21, 23–25]. The replicatively active complex contains 20–30 DnaA monomers, as determined by electron microscopy [26, 27]. Direct measurement by surface plasmon resonance gives a stoichiometry of 18 DnaA monomers per oriC in the initial complex [7]. The open complex with six more binding sites in the single-stranded AT-rich region thus must contain 24 DnaA monomers, corroborating the electron microscopic results. However, lower stoichiometries have also been reported, and the influence of binding conditions on DnaA stoichiometries has been emphasized [28].

The unwound region spans 28 bp without and 44–46 bp with SSB present [17]. Since single-stranded DNA covered with SSB is a poor substrate for DnaB helicase, it has to be loaded with the help of DnaA. Two double hexamers of DnaB and the helicase loader DnaC, one double hexamer for each replication direction, are positioned by DnaA into the loop [28, 29]. DnaC leaves the complex immediately after or during loading, accompanied by ATP hydrolysis. This activates the helicase activity of DnaB [30]. The two DnaB hexamers slide past each other in 5′-3′ direction, and expand the bubble to about 65 nucleotides [29]. Now primase can enter the complex and synthesize two leading strand primers. Synthesis of opposed and overlapping leading strands has been postulated before in order to ensure complete synthesis of oriC [31].

The sliding clamp, a ring-shaped dimer of the β-subunit of DNA polymerase III, is loaded onto each primed template by the clamp loader, the polymerase III γ complex [32]. This activates the intrinsic ATPase activity of DnaA [33, 34], in cooperation with a protein with sequence homology to DnaA, Hda [35]. Inactivation of DnaA by ATP hydrolysis prevents further initiations. The E. coli initiation cycle is shown schematically in Fig. 2.

Fig. 2. The initiation cycle of E. coli.

4 Domains of DnaA and the homology between DnaA proteins

Sequence homology based on four species [36, 37] and later on 30 species [6] made it possible to subdivide DnaA protein into four domains. These domains were later shown to correspond to functional domains, with minor correction of their borders [38, 39]. The functional studies were inspired by earlier attempts to combine sequence comparisons with secondary structure predictions [40]. These results were recently corroborated and extended by a thorough study that compares 104 different DnaA proteins (C. Weigel and W. Messer, http://www.molgen.mpg.de/~messer/). Despite considerable effort, no three-dimensional structure of a DnaA protein is available. However, the large number of proteins available for comparison and the possibility of relating the predicted structure to experimentally determined structures of other members of this protein family, the AAA+ family, make this prediction fairly reliable. The AAA+ class of ATPases is a large family of proteins with common sequence motifs. Many of them are involved in the initiation of replication and other functions of DNA metabolism throughout all kingdoms, with a presumptive role in remodeling and loading of proteins. DnaA and also DnaC are members of this family [41].

Grouping of DnaA sequences from eubacteria on the basis of similarity reproduced closely the phylogenetic relations. DnaA domains are shown in Fig. 3. For domain 1 (86 residues) a consensus sequence is not very pronounced, except for a few patches of conserved residues. Domain 2 is highly variable in sequence and length, between 6 and 247 (Streptomyces) amino acids. Therefore we assume it to form a flexible region. The C-terminal two-thirds are quite similar and make it possible to derive a reasonable consensus sequence. Domain 3 contains an ATP-binding site, i.e. Walker A and B motifs, and other motifs found in the AAA+ protein family. Domain 4 is also well conserved and lacks homologues in the archaea as well as in the eukarya. It contains the DnaA signature, a definition used in the SWISS PROSITE online search facility, which has been updated on the basis of the now known 104 DnaA sequences: [STPQ]-x(3)-[IL]-[GA]-x(2)-[FLIM]-x(1,2)-[RK]-[DTSNER]-[HA]-[TSPA]-[TSVA]-[VILM] (C. Weigel and W. Messer, http://www.molgen.mpg.de/~messer/).
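The DnaA signature above is written in PROSITE pattern syntax, which translates almost mechanically into a regular expression. The following is a minimal sketch of that translation, not the PROSITE implementation: the function names `prosite_to_regex` and `find_signature` are hypothetical, only the syntax elements that occur in this one pattern are handled, and the test peptide is synthetic rather than a real DnaA sequence.

```python
import re

# DnaA signature exactly as quoted in the text (PROSITE syntax).
DNAA_SIGNATURE = (
    "[STPQ]-x(3)-[IL]-[GA]-x(2)-[FLIM]-x(1,2)-"
    "[RK]-[DTSNER]-[HA]-[TSPA]-[TSVA]-[VILM]"
)

# One pattern element: 'x', a single residue, or a bracketed set,
# optionally followed by a repeat such as (3) or (1,2).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+(?:,\d+)?)\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into Python regex syntax.

    Hypothetical helper: exclusions ({..}) and anchors (<, >) are not
    supported because the DnaA signature does not use them.
    """
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported PROSITE element: {element!r}")
        residue, repeat = match.groups()
        if residue == "x":
            residue = "."  # 'x' means any residue
        parts.append(residue + (f"{{{repeat}}}" if repeat else ""))
    return "".join(parts)


def find_signature(protein: str, pattern: str = DNAA_SIGNATURE):
    """Return (start, matched_text) for each signature hit (0-based start)."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(protein.upper())]


if __name__ == "__main__":
    print(prosite_to_regex(DNAA_SIGNATURE))
    # Synthetic peptide built to satisfy the pattern, for illustration only.
    print(find_signature("MMSAAAIGAAFAKDHTTVMM"))
```

A real search would run `find_signature` over domain 4 of a DnaA sequence taken from a database; the synthetic peptide only shows that the translated pattern behaves as intended.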

Fig. 3. Secondary structure prediction for E. coli DnaA protein with sequence motifs and interaction domains.

Secondary structure prediction was done by the PHD server (http://www.embl-heidelberg.de/predictprotein/predictprotein.html) [42]. Results were virtually identical for different prediction methods. The secondary structure prediction showed a much higher conservation for the structural elements than for primary sequence. It is shown in Fig. 3. Domain 3 shows an [αβ]5 Rossmann fold-type ATPase motif [43], suggesting a tight structure which aligns perfectly with the known structure of polymerase III δ′ [44]. This illustrates the reliability of the prediction. The prediction for the DNA-binding domain 4 shows four α-helices; the long C-terminal one is divided in two in some DnaA proteins. α-Helices 12 and 13 (Fig. 3) with a basic loop in between were suggested to represent the DNA-binding motif [45].

5 Domain functions

The N-terminal domain 1 mediates protein–protein interactions, DnaA oligomerization and interaction with DnaB protein. Several techniques were used to demonstrate oligomerization. Domains 1 of E. coli [46] and of Streptomyces lividans [47] were able to functionally replace the oligomerization domain of λcI repressor. Physical interaction of isolated domain 1 was demonstrated with a solid phase binding assay in vitro, showing that dimerization via domain 1 does not require other parts of DnaA protein [46]. Cooperative binding of DnaA to two adjacent DnaA boxes depended on domain 1 [48]. Expression of isolated domain 1 was inhibitory to initiation, as measured by suppression of the overinitiation phenotype of dnaA219 (see Section 6) [46], showing that dimerization via domain 1 is an essential step in initiation in vivo.

The larger part of domain 1, amino acids 24–86, is also involved in the interaction of DnaA with DnaB, as shown in vitro with a solid phase binding assay [49]. In in vivo experiments domain 1, or amino acids 24–86, was required for the loading of DnaB [49, 50]. There is a second site for DnaA–DnaB interaction in the N-terminal region of domain 3 (see below). In summary, physical interaction between DnaA and DnaB requires amino acids 24–86 of DnaA, whereas DnaA dimerization requires the complete domain 1, amino acids 1–86; in later experiments amino acids 1–77 were found to be sufficient. Surprisingly, domains 1 and 2 were found not to be required for DNA unwinding of the AT-rich region in vitro [50]. However, it has also been reported that mutants with altered residues 26 or 40 within domain 1 were unable to unwind the AT-rich region [51]. Additional experiments are required to clarify this point.

Domain 2 is very different in size in different bacteria, as mentioned above. In E. coli DnaA all of domain 2, amino acids 87–134, can be deleted without loss of function. A spontaneous mutant, dnaA216, ‘DnaA light’ lacking residues 87–104, was functionally indistinguishable from wild-type [38]. However, long domains 2, e.g. in Streptomyces, have been conserved in evolution with respect to length not sequence. Therefore the length of domain 2 may well have an important function in certain conditions.

Domain 3 is organized as an open twisted αβ-structure [11] and contains the ATP-binding region of AAA+ ATPase-type proteins, a P-loop or Walker A motif [41, 43, 52–54]. The presence of an ATPase is shown by the Walker B motif as well as by the isolation of mutants affected in ATPase activity, E204Q and R334H [55, 56]. The importance of ATP binding and hydrolysis will be discussed in Section 6.

In the N-terminal part of domain 3 is a region, amino acids 130–148, that comprises a second interaction site with DnaB [49, 50, 57, 58]. DnaA thus contains two sites for interaction with DnaB: amino acids 24–86, interacting with DnaB amino acids 154–210 (C-terminal βγ fragment [59]), and DnaA amino acids 130–148, interacting with the DnaB N-terminus (α fragment) [49].

Domain 3 also contains a second region that promotes oligomerization. This has been shown for Streptomyces DnaA by kinetic analysis [48] and by gel retardation that even showed the existence of mixed dimers containing full-length and truncated DnaA [47]. Likewise, in E. coli DnaA the presence of a region from the C-terminal part of domain 3 gave a strongly reduced dissociation rate in surface plasmon resonance analysis, suggesting cooperativity between monomers that did not contain domain 1 [7]. This same region is conserved in the NtrC family of transcription factors [45].

The DNA-binding domain 4 is represented by the 94 C-terminal amino acids [45]. The DNA-binding motif probably consists of α-helices 12 and 13 (see Fig. 3) with a basic loop between the helices. In fact, mutation of some basic amino acids in the loop abolishes DNA binding [60, 61]. A thorough mutational analysis of this domain revealed that mutations with impaired binding were found throughout domain 4 [60]. Mutations that affect sequence specificity are clustered in the beginning of helix 15 [60, 62]. Suppressors of temperature-sensitive dnaX mutants [63] also map to this specificity region. Binding of ATP to its site in domain 3 modifies the sequence specificity of DnaA protein [13] (see Section 9). In concert with this property, bound ATP is located close to the DNA-binding domain in the three-dimensional structure, as found by cross-linking the γ-phosphate of ATP to Lys-415 in domain 4 [64].

6 The role of ATP binding on DnaA structure and function

Binding of ATP is required for the unwinding of the AT-rich region by DnaA [12]. In fact, this is the only reaction that requires ATP-complexed DnaA, since helicase loading occurs in mutants that are deficient in ATP binding [49]. ATP promotes an allosteric modification and does not provide energy for the unwinding reaction since non-hydrolyzable ATP analogues are equally effective [12].

Several of the ‘classical’ dnaA mutants, e.g. dnaA5 and dnaA46, carry a mutation A184V close to the ATP-binding site in the Walker A motif [65]. These mutants are temperature-sensitive and do not bind ATP or ADP at any temperature [66–68], but can be activated to ATP binding by DnaK and GrpE proteins [68]. All A184V mutants carry secondary mutations in dnaA [65, 69]. However, when the A184V mutation is separated from the secondary mutations, it confers similar temperature sensitivity [68]. When A184V is present on multicopy plasmids, such plasmids confer cold sensitivity to dnaA+ or dnaA46 hosts [65, 68, 70]. Intragenic suppressors of dnaA46 have been isolated, dnaAcos [71] which carries two additional mutations (Q156L and Y271H) [69], and dnaA219 with a R342C mutation [46]. Both suppressor strains are cold-sensitive due to DNA overinitiation at 30°C [46, 71]. The dnaA219-carrying strain has been successfully used to monitor for conditions that are detrimental to initiation and thereby relieve the cold sensitivity and provide a positive selection for impaired initiation [46, 49].

The active structure of ATP-complexed DnaA is sensed by a region close to or in DNA-binding domain 4, since ATP-DnaA acquires an additional sequence specificity. Mutations that affect ATP hydrolysis were found just upstream of domain 4, R334H [56], as well as at a position in domain 3, E204Q, that is involved in hydrolysis in similar ATPases [55]. In addition, physical proximity between ATP bound in the P-loop to the DNA-binding region was demonstrated by crosslinking the γ-phosphate position of ATP with K415 in domain 4 (see Section 5) [64]. The intrinsic ATPase of DnaA is weak. As outlined in Section 3 it can be stimulated and DnaA thereby inactivated. Since DnaA protein with the A184V mutation does not bind ATP (or ADP) we must assume that the suppressing mutations confer an active conformation onto the protein without nucleotide binding. Of course, such a protein cannot be inactivated by ATP hydrolysis, which may explain the phenotype of overinitiation [72, 73].

The intrinsic ATPase activity of DnaA is stimulated at the end of the initiation cycle. This reaction is promoted by the loading of the sliding clamp to single-stranded DNA, the dimeric β-subunit of DNA polymerase III holoenzyme [33]. This reaction requires Hda as cofactor [35] (see Section 3). Therefore, hda mutants have an overinitiation phenotype similar to dnaAcos or dnaA219. The overall ratio of ADP to ATP in exponentially growing cultures was 4:1. In synchronized cultures ATP-DnaA increased prior to initiation and was hydrolyzed to ADP-DnaA during initiation [74]. This also results in a fluctuation of DnaA synthesis in the cell cycle since ATP-DnaA is the active repressor for the dnaA promoter [13]. The cycling between ATP- and ADP-bound states of initiation proteins is apparently a ubiquitous regulatory principle [75].

It has been described that DnaA at low Mg2+ or nucleotide-free DnaA protein binds DNA unspecifically [28, 76, 77]. However, this is an artifact of the Sekimizu protocol of DnaA preparation [78] which involves a denaturation step. Native DnaA protein is always complexed with ATP or ADP.

7 DnaA and membranes

A large part of the DnaA protein in E. coli cells is found in a membrane fraction [78, 79]. An E. coli mutant, pgsA, with a limited ability to synthesize acidic phospholipids leads to arrest of cell growth [80]. This can be suppressed by overproduction of DnaA proteins with mutations in the membrane interaction domain described below [81]. Also, some dnaA mutants have an altered cell membrane permeability [82]. There are thus a number of indications that DnaA–membrane interaction is physiologically relevant.

Acidic phospholipids decrease the affinity of DnaA for ATP or ADP, and therefore enhance the exchange of these nucleotides and promote a ‘rejuvenation’ of inactive ADP-DnaA [83–86]. Phospholipids also decrease the affinity of DnaA for oriC sequences [87]. The region of DnaA that interacts with membrane was defined by mutations in α-helices 10 and 11 (see Fig. 3), amino acids 327–344 and 357–374 [88–91]. α-helix 11 (amino acids 357–374) was found to be indispensable for membrane-mediated release of nucleotides [92], and a peptide between amino acids 373–381, adjacent to α-helix 11 on the C-terminal side, was found to bind to phospholipids [92]. So far, it is not known whether the physiological relevance for DnaA–membrane interaction lies in DnaA-mediated binding of the initiation complex to the membrane or in the ATP-ADP ‘rejuvenation’ reaction.

8 DnaA boxes

ATP-DnaA and ADP-DnaA bind with the same affinity (KD between 0.6 and 50 nM) to an asymmetric 9-mer consensus sequence, the DnaA box 5′-TT(A/T)TNCACA, as measured by gel retardation of oligonucleotides carrying a single DnaA box [11] and by surface plasmon resonance [60]. UDG footprinting shows that within this sequence T2, T4, T7′, and T9′ are of particular importance. In this technique specific T residues are replaced by dU, and their accessibility to uracil-DNA glycosylase is determined with and without DnaA [93]. Previously, a more relaxed consensus sequence which allowed 1–2-bp deviation from this stringent consensus sequence was found using DNase I footprinting or retention on nitrocellulose filters [94–97]. When the ability of DnaA-DNA box complexes to block transcribing RNA polymerase was used to define DnaA boxes, an even more relaxed consensus sequence was found: 5′-(T,C)(T,C)(A,T,C)T(A,C)C(A,G)(A,C,T)(A,C) [98]. The reason for the apparent discrepancy lies in the rules for binding of E. coli DnaA protein described below. A monomer of E. coli DnaA binds only to the stringent DnaA box. Relaxed consensus sequences require the cooperation of two monomers, and hence two DnaA boxes [7, 13, 99].
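The distinction between the stringent DnaA box and the relaxed variants with 1–2 deviations can be made concrete by counting mismatches against the 9-mer consensus on both strands. The sketch below is only an illustration under stated assumptions: the consensus is encoded with IUPAC codes as `TTWTNCACA` (W = A or T at position 3, N = any base at position 5), the function names are hypothetical, and the test sequences are synthetic, not taken from a real origin.

```python
# IUPAC codes needed for the 9-mer consensus (W = A or T, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "N": "ACGT"}
CONSENSUS = "TTWTNCACA"  # assumed IUPAC encoding of the stringent DnaA box

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(site: str) -> int:
    """Count positions where a 9-mer deviates from the consensus."""
    return sum(base not in IUPAC[code] for base, code in zip(site, CONSENSUS))


def find_dnaa_boxes(seq: str, max_mismatches: int = 2):
    """Scan both strands; return (start, strand, site, mismatches) tuples.

    'start' is the 0-based position of the window on the given (top)
    strand. 0 mismatches corresponds to the stringent box, 1-2 mismatches
    to the relaxed consensus described in the text.
    """
    seq = seq.upper()
    width = len(CONSENSUS)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            site = s[i:i + width]
            mm = mismatches(site)
            if mm <= max_mismatches:
                start = i if strand == "+" else len(seq) - i - width
                hits.append((start, strand, site, mm))
    return sorted(hits)


if __name__ == "__main__":
    # Synthetic test sequence carrying one stringent box on the top strand.
    for hit in find_dnaa_boxes("GGGGTTATCCACAGGGG", max_mismatches=0):
        print(hit)
```

Mismatch counting says nothing about which positions matter most, so it is a cruder criterion than the footprinting data discussed above; it only separates candidate stringent boxes from candidate relaxed ones.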

9 DnaA oligomerization and cooperativity. Rules for DnaA binding

The dnaA promoter region was analyzed by DNase I footprinting and surface plasmon resonance for the modes of DnaA binding. The surprising observation was that ATP-complexed DnaA, but not ADP-DnaA, has a second sequence specificity for a 6-mer consensus sequence, the ATP-DnaA box [13]. This is also true for the dnaA promoter region of Vibrio harveyi [100]. DnaA binding to ‘non-canonical’ sites was observed before with λ DNA [101]. A summary of these sites and the ATP-DnaA box consensus sequence is given in Table 1.

Table 1. E. coli ATP-DnaA boxes

DNA background            ATP-DnaA boxes
E. coli dnaAp [13]        AGAACT, AGATCT, AGTTTA, AGATTT
E. coli oriC [14]         AGATCT, AGATCT, AGATCA, AGGATC
λ [101]                   AGAACT, AGATCC, AGTCAT, AGTATT
V. harveyi dnaAp [100]    TGATCG, AGATCG, AGACTG, AGATCT, AGATCA
Consensus                 AGatct
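The consensus row of Table 1 can be reproduced by tallying the bases at each of the six positions across the listed sites. The sketch below does this with a simple majority rule. The 0.9 cut-off for writing a base in upper case is an assumption chosen for illustration, not a criterion stated in the source, and the names `ATP_DNAA_BOXES` and `consensus` are hypothetical.

```python
from collections import Counter

# ATP-DnaA boxes exactly as listed in Table 1, grouped by DNA background.
ATP_DNAA_BOXES = {
    "E. coli dnaAp": ["AGAACT", "AGATCT", "AGTTTA", "AGATTT"],
    "E. coli oriC": ["AGATCT", "AGATCT", "AGATCA", "AGGATC"],
    "lambda": ["AGAACT", "AGATCC", "AGTCAT", "AGTATT"],
    "V. harveyi dnaAp": ["TGATCG", "AGATCG", "AGACTG", "AGATCT", "AGATCA"],
}


def consensus(sites, upper_threshold=0.9):
    """Majority-rule consensus of equal-length sites.

    The most frequent base at each position is written in upper case if
    its frequency reaches upper_threshold (an assumed cut-off), otherwise
    in lower case.
    """
    sites = list(sites)
    letters = []
    for column in zip(*sites):
        base, count = Counter(column).most_common(1)[0]
        conserved = count / len(sites) >= upper_threshold
        letters.append(base if conserved else base.lower())
    return "".join(letters)


if __name__ == "__main__":
    all_sites = [s for group in ATP_DNAA_BOXES.values() for s in group]
    print(consensus(all_sites))  # AGatct
```

With the 17 sites of the table, A at position 1 (16 of 17) and G at position 2 (17 of 17) pass the cut-off, while positions 3 to 6 are only majority bases. This matches the upper and lower case pattern of the published consensus.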

The rules for DnaA binding as derived from the dnaA promoter region are as follows:

  1. E. coli DnaA, both ATP-DnaA and ADP-DnaA, binds to a single DnaA box as a monomer. However, this box must conform to the stringent consensus sequence 5′-TT(A/T)TNCACA, henceforth called a ‘strong’ box.

  2. Both forms do not bind to single ‘weak’ boxes, i.e. DnaA boxes with one or two mismatches to the consensus sequence, unless cooperation occurs with a second DnaA protein bound to a DnaA box close by.

  3. ATP-DnaA recognizes in addition 6-mer ATP-DnaA boxes with a consensus sequence that corresponds to BglII recognition sites. These constitute weak boxes, and therefore require an adjacent strong box for DnaA binding.

  4. ATP-DnaA also binds to single-stranded ATP-DnaA boxes. This was observed for the unwound region in E. coli oriC [14] and for a substrate with a double-stranded strong box and adjacent single-stranded DNA with an ATP-DnaA box [102, 234]. In the latter case the strong box was required for binding to the single-stranded ATP-DnaA box, whereas in oriC ATP-DnaA could bind to single-stranded ATP-DnaA boxes without the help of a strong box, presumably because of the high number of such boxes in oriC. Single-stranded ATP-DnaA boxes have an intermediate affinity (KD=40 nM); strong double-stranded boxes have a KD of about 1 nM, and weak double-stranded boxes, including ATP-DnaA boxes, have a KD of around 400 nM [14].
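To get a feeling for what the three KD values quoted in rule 4 imply, the fractional occupancy of an isolated site can be estimated with a simple one-site binding isotherm, occupancy = [DnaA] / ([DnaA] + KD). This is a deliberately simplified sketch. It assumes independent, non-cooperative binding and treats the DnaA concentration as the free concentration, whereas the rules above stress that weak boxes depend on cooperativity. The names `KD_NM` and `occupancy` are hypothetical.

```python
# Dissociation constants quoted in the text [14], in nM.
KD_NM = {
    "strong double-stranded 9-mer box": 1.0,
    "single-stranded ATP-DnaA box": 40.0,
    "weak double-stranded box": 400.0,
}


def occupancy(free_dnaa_nm: float, kd_nm: float) -> float:
    """Fractional occupancy of one isolated site (non-cooperative model).

    Assumes a single independent binding site and that free_dnaa_nm is
    the free (unbound) DnaA concentration in nM.
    """
    if free_dnaa_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return free_dnaa_nm / (free_dnaa_nm + kd_nm)


if __name__ == "__main__":
    free_dnaa = 40.0  # nM, an arbitrary illustrative concentration
    for site, kd in KD_NM.items():
        print(f"{site}: {occupancy(free_dnaa, kd):.2f}")
```

At an illustrative 40 nM free DnaA, this model puts a strong box near saturation, a single-stranded ATP-DnaA box at half occupancy, and an isolated weak box below 10%. That is consistent with the statement that weak boxes need help from a neighbouring DnaA molecule.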

As discussed in Section 5, there are two regions responsible for oligomerization of DnaA monomers, domain 1, amino acids 1–77, and a region in domain 3. This second oligomerization region was localized in the C-terminal part of domain 3 between amino acids 334 and 373 [7]. The domain 1 oligomerization domain can operate over a distance of at least 150 bp [46, 49], and does not even require DnaA protein to be bound to DNA. In contrast, the domain 3 oligomerization region requires a precise spacing of adjacent DnaA boxes [7].

Rules for binding of DnaA proteins from other bacteria are similar but not identical, and therefore show the variability of the binding reaction (see also Section 12). Bacillus subtilis DnaA apparently follows similar rules, since a hybrid origin with the main part of oriC from E. coli and the AT-rich region from B. subtilis can execute most initiation reactions; most importantly, it can be unwound [103]. DnaA from S. lividans has the same two interaction domains with similar kinetic consequences [47, 61, 104]. However, oligomerization via domain 3 does not require binding to a second DnaA box as in the case of E. coli, since mixed oligomers made out of full-length and truncated DnaA proteins, respectively, can bind to a single DnaA box [47]. Streptomyces species have a high G/C content in their DNA, and consequently the consensus sequence of the Streptomyces DnaA box has a G or C at position 3 [105, 106]. Although this is a strong box, binding affinity to the E. coli consensus sequence is still better, and although binding to a single box of this kind is possible, binding to two boxes is preferred [48]. It is not known whether the ATP-complexed form of Streptomyces DnaA recognizes a 6-mer box.

Thermus thermophilus is also a G/C-rich organism. In contrast to Streptomyces, however, the DnaA boxes in the Thermus origin have the stringent E. coli consensus sequence. Nevertheless, single DnaA boxes do not bind T. thermophilus DnaA (approx. KD>1500 nM), and only cooperativity allows reasonable binding with an apparent KD of 30–60 nM. This cooperativity requires ATP-DnaA; the affinity of T. thermophilus oriC to ADP-DnaA is about 10 times lower [107].

10 DnaA-mediated origin unwinding

The unwinding of the AT-rich region is the crucial step in the initiation of most origins, prokaryotic and eukaryotic ones. In the case of E. coli, the AT-rich region contains six ATP-DnaA boxes adjacent to the 9-mer DnaA box R1 (Fig. 4). The sequential binding of ATP-DnaA to these sites has been determined using DNase I footprinting and surface plasmon resonance [14]. Initial binding is to DnaA box R1. This then serves as an anchor for the cooperative binding of ATP-DnaA to the double-stranded ATP-DnaA boxes. As outlined above, these boxes promote low-affinity binding. Therefore cooperative interaction is required. DnaA bends the DNA by 40° upon binding [11]. The topological stress in this complex unwinds the DNA in the AT-rich region. ATP-DnaA then binds to the single-stranded ATP-DnaA boxes, providing an initial stabilization of the single-stranded state. This is again a high-affinity interaction. It seems to be a general feature in the assembly of protein complexes at specific sites and at defined times that a sequence of high-low-high affinity is followed. Cooperative binding to the double-stranded AT-rich region is therefore presumably the limiting step in the initiation reaction, followed by unwinding [14].

Fig. 4. Mechanism of unwinding of the AT-rich region. ATP-DnaA is shown in red, ATP-DnaA boxes are shown in blue, the high-affinity box R1 in magenta.

Initially 28 bp are unwound by DnaA protein alone, and 44–52 when SSB is also present. This is true both for E. coli and for B. subtilis oriC [17]. A hybrid origin with the main part from oriC of E. coli and the AT-rich region from B. subtilis could be unwound by E. coli DnaA, and helicase was loaded into the bubble [103]. Therefore we assume that the initiation process and the action of DnaA are similar for E. coli and for B. subtilis. We also conclude that the species specificity of the DnaA–oriC interaction lies in the main part of oriC with its five 9-mer DnaA boxes.

11 DnaA-mediated helicase loading: the DnaA primosome

Two double hexamers of the replicative helicase DnaB and the helicase loader protein DnaC, one for each replication direction, must be loaded into the unwound region. Since single-stranded DNA covered with SSB is not a substrate for DnaB, helicase must be positioned by DnaA. As outlined in Section 5, DnaA and DnaB interact physically. The DnaA N-terminus (amino acids 24–86) interacts with the DnaB βγ fragment (amino acids 154–210), and the DnaB N-terminus (α fragment) with domain 3 of DnaA (amino acids 130–148) [49, 50, 57, 58]. In an elegant study, O'Donnell and co-workers could show that two DnaB6–DnaC6 complexes are actually introduced into the bubble in a reaction that does not require helicase activity [29]. This stoichiometry was subsequently confirmed [28]. The DnaB hexamer forms a ring around single-stranded DNA [108, 109]. The channel through the helicase ring corresponds to about 20 nucleotides. The bubble can therefore accommodate just two helicase rings. A binding activity to single-stranded DNA induced in DnaC upon interaction with DnaB has been described. This seems to facilitate the transfer of DnaB to the template [110]. DnaC dissociates from the complexes with concomitant ATP hydrolysis resulting in two head-to-head helicase hexamers. This activates DnaB as a helicase [111]. The two helicase complexes move past each other in 5′-3′ direction, and only after they moved >65 nucleotides can they interact with primase [29].

The strand in oriC to which the first helicase is positioned by DnaA was determined using an artificial substrate that consisted of a double-stranded DnaA box R1 of oriC and single-stranded DNA from the AT-rich region in one strand and random single-stranded DNA in the other strand. At limiting DnaA concentrations DnaB was loaded to the lower strand [102, 234]. This suggests that the first helicase hexamer is positioned onto the lower strand, as illustrated in Fig. 2. This is compatible with results obtained by footprinting [29].

There is a very striking similarity between the region of E. coli oriC where DnaB is loaded into the AT-rich bubble and structures found in other systems where DnaB helicase is loaded with the help of DnaA. Even the distance between the 3′-end of the DnaA box and the potentially single-stranded DNA is similar in most cases. Such a structure is found in several bacterial origins, e.g. B. subtilis [17, 112], Pseudomonas putida and P. aeruginosa [113, 114], Coxiella burnetii [115], T. thermophilus [107], and others. This also holds for some plasmid origins like P1 [116] or mini-F [117]. In an artificial system, the ABC primosome, a DnaA box is positioned in a hairpin structure within single-stranded phage DNA [118]. In all cases, a single double-stranded DnaA box is followed by a single-stranded region to its 3′-side.

12 Variations of the theme: DnaA and oriC in other bacteria

Different bacteria have replication origins with widely differing sizes, but all (except Synechocystis, see below) contain several DnaA boxes and an AT-rich region (Fig. 5). For example, E. coli oriC consists of 260 bp and five DnaA boxes, whereas B. subtilis oriC contains three clusters of DnaA boxes which are separated by the dnaA gene [119]. T. thermophilus has 12 or 13 symmetrically arranged boxes (boxes 12 and 13 overlap) [107], and Streptomyces has 19 DnaA boxes in 600 bp [106, 120].

Fig. 5. Replication origins of different bacteria. DnaA boxes are shown as red arrows, AT-rich regions are shown in yellow.

12.1 B. subtilis

All three clusters of DnaA boxes are required for oriC function. The intervening dnaA gene can be largely deleted if DnaA is provided in trans. There are three AT-rich 16-mers at the rpmH-proximal border of oriC, and an AT-rich cluster of 27 bp at the dnaN side [121]. The unwinding reaction occurs at this 27-bp AT cluster and surrounding sequences [122]. DnaA proteins bound to the DnaA box clusters interact and form loops that can be visualized in the electron microscope. This is also seen with E. coli oriC, but the physiological relevance is not clear [122]. The DnaA protein is biochemically very similar to that of E. coli [123]. As described above, the structure of the region between the right-most DnaA box in Fig. 5 and the AT cluster is very similar to the corresponding region in oriC of E. coli, including the positions of ATP-DnaA boxes, and the extent and positions of unwound nucleotides are virtually identical between E. coli and B. subtilis [17].

The location of oriC with DnaA box clusters upstream of dnaA or between dnaA and dnaN is conserved in many bacteria [124], e.g. Micrococcus luteus [36], Mycoplasma capricolum [125], Spiroplasma citri [126], Mycobacterium [127], Helicobacter pylori [128], and Streptomyces [106]. Likewise, many bacteria carry a ‘dnaA’ operon downstream, containing besides dnaA and dnaN, recF and gyrB. It has therefore been suggested that this genomic arrangement represents a primordial structure [104, 113, 121, 124]. However, there are also many exceptions with dnaA and oriC located in different environments on the genome [129, 130].

A minichromosome replicating from oriC of B. subtilis has been isolated [131]. However, in contrast to minichromosomes of E. coli, such plasmids show strong incompatibility with the chromosomal origin, and consequently their copy number is low. Incompatibility is also observed when isolated DnaA box clusters from oriC are cloned in high-copy-number vectors [112]. Presumably, the relaxed copy number control of E. coli minichromosomes is due to the Dam-SeqA system (see Section 16), which allows the cell to discriminate between replicated and not yet replicated origins. An organism like B. subtilis, which lacks such a methylation system, must control its replication initiation much more stringently. An in vitro replication system for oriC plasmids has also been established for B. subtilis [132]. Expression of the dnaA gene is autoregulated, as in the case of E. coli (see Section 13) [133].

One of the additional controls found in B. subtilis but not in E. coli is a pair of replication checkpoints located 100–200 kb downstream of oriC, one on either side. They depend on the stringent response and on a binding site for RTP protein of the type normally used for replication arrest at the chromosomal terminus [134].

12.2 Streptomyces

Streptomyces species have large linear chromosomes with a centrally located oriC. This oriC can be cloned as a circular, autonomously replicating minichromosome, which also has a low copy number [120]. Each oriC contains 19 DnaA boxes whose location and orientation are conserved [106]. Streptomyces oriC contains no extended AT-rich region, but five short AT-rich stretches. Consequently, unwound bases are interspersed with paired ones (J. Majka and J. Zakrzewska-Czerwinska, personal communication). Therefore it is unknown where and how the replicative helicase is loaded.

Streptomyces DnaA protein is similar to that from E. coli in many respects [7]. It can oligomerize via two regions, one in domain 1 and one in domain 3 [47, 48, 61, 104]. The consensus sequence of Streptomyces DnaA boxes shows a G or C at position 3 [105, 106], similar to M. luteus [36]. Both are organisms with a high G+C content. Such a strong Streptomyces DnaA box (5′-TTGTCCACA) can bind a DnaA monomer and, unlike the situation in E. coli, also a dimer [47, 61]. However, binding to a DnaA box with the E. coli consensus sequence is still better (KD around 10 nM for TTGTCCACA and around 3 nM for TTATCCACA) [48]. Weak boxes with mismatches require cooperation between DnaA proteins, and therefore require two boxes [48, 135]. In an experiment where optimal binding sites were selected by evolution from random sequences, the optimal spacing was 3 bp and the boxes were facing each other [48]. At higher concentrations, long-range interactions between DnaA protein bound to distant DnaA boxes in oriC promote loop formation [47, 104]. The promoter region of dnaA contains two DnaA boxes with the same spacing and arrangement as found in the selection from random sequence [136]. Cooperative binding of DnaA to these boxes results in repression, and thus in autoregulation of the dnaA gene [135].
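The difference between the two dissociation constants quoted above can be made concrete with a single-site binding isotherm. The following is a minimal sketch, assuming simple non-cooperative 1:1 binding with DnaA in excess; this idealization is not taken from the cited work and ignores dimer binding. The function name and the concentrations in the loop are hypothetical and chosen only for illustration.

```python
def fraction_bound(dnaa_nM: float, kd_nM: float) -> float:
    """Fractional occupancy of one site: theta = [DnaA] / (KD + [DnaA])."""
    if dnaa_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return dnaa_nM / (kd_nM + dnaa_nM)


# Approximate KD values quoted in the text for Streptomyces DnaA [48].
KD_NM = {
    "TTGTCCACA": 10.0,  # strong Streptomyces box
    "TTATCCACA": 3.0,   # box with the E. coli consensus sequence
}

# Free DnaA concentrations (nM) are assumptions, used only to show the trend.
for conc in (1.0, 3.0, 10.0, 30.0):
    row = ", ".join(
        f"{box}: {fraction_bound(conc, kd):.2f}" for box, kd in KD_NM.items()
    )
    print(f"[DnaA] = {conc:5.1f} nM -> {row}")
```

Under these assumptions each box is half-occupied when the free DnaA concentration equals its KD, so the consensus box reaches half-occupancy at roughly one-third of the concentration needed for the Streptomyces box.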

12.3 T. thermophilus

The T. thermophilus replication origin is also located between the dnaA and dnaN genes. It has a quite unusual, nearly symmetrical structure [107]. Six DnaA boxes in the first half of oriC point toward dnaA, six boxes in the second half point toward the dnaN gene. Box number 13 overlaps with box number 12. Like Streptomyces, T. thermophilus is an organism with a high G+C content in its chromosome. Nevertheless, all DnaA boxes, except numbers 6 and 13, have the E. coli consensus sequence, 5′-TTATCCACA. At the dnaN side is a 40-bp-long AT-rich stretch.

The Thermus DnaA protein has the same overall structure as the other DnaA proteins. The binding properties, however, are different. A single DnaA box is nearly inert to DnaA. DnaA protein binding to oriC requires cooperativity between monomers, and this cooperativity is enhanced when DnaA is in the ATP-complexed form (Table 2) [107]. There are no DnaA boxes in the promoter region of the dnaA gene, and consequently, T. thermophilus dnaA is not autoregulated [137].

Table 2. Apparent dissociation constants of T. thermophilus DnaA protein with DnaA boxes from T. thermophilus oriC

DnaA box(es)    [DnaA] at 50% of Rmax (nM)
                ADP-DnaA    ATP-DnaA
4               1300        1500
7               2700        2200
1–3             735         82
1–6             270         31
1–13            420         60
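The effect of the nucleotide form on cooperative binding can be summarized as a ratio of the two columns of Table 2. The sketch below is only an illustration: the values are those read from the table above, while the ratio as a measure of ATP preference and the names used are assumptions introduced here, not part of the cited analysis.

```python
# (ADP-DnaA, ATP-DnaA) concentrations in nM at 50% of Rmax, as read from Table 2.
TABLE_2 = {
    "4": (1300, 1500),
    "7": (2700, 2200),
    "1-3": (735, 82),
    "1-6": (270, 31),
    "1-13": (420, 60),
}


def atp_preference(adp_nM: float, atp_nM: float) -> float:
    """Ratio > 1 means the ATP form reaches half-saturation at a lower concentration."""
    return adp_nM / atp_nM


for boxes, (adp, atp) in TABLE_2.items():
    print(f"boxes {boxes:>5}: ADP/ATP = {atp_preference(adp, atp):.1f}")
```

With these numbers the ratio stays close to 1 for the single boxes 4 and 7 and rises to roughly 7 to 9 for the multi-box fragments, in line with the statement that cooperativity is enhanced in the ATP-complexed form.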

A comparison of the two G+C-rich organisms, Streptomyces and T. thermophilus, shows that they follow different strategies to adjust the properties of their DnaA proteins to their origins which are both rich in DnaA boxes. Streptomyces with a high-affinity DnaA protein reduces the quality of its DnaA boxes. Thermus, on the other hand, compensates potentially high-affinity DnaA boxes with a DnaA protein with low binding affinity.

12.4 H. pylori

H. pylori DnaA also has structural properties similar to those of the other DnaA proteins [128]. The replication origin of H. pylori has been defined by its location close to dnaA, by a strand switch in the G/C composition, and by five DnaA boxes in a 180-bp stretch. These DnaA boxes, however, all have at least one mismatch to the consensus sequence. Despite this, the DNA-binding domain 4 of Helicobacter DnaA can bind to each isolated DnaA box as a monomer [128].
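Counting mismatches to the consensus, as done for the boxes above, is easy to express in code. The following is a minimal sketch that scans both strands of a sequence for 9-mers resembling the DnaA box consensus 5′-TTATNCACA given in Section 2. The function names, the mismatch threshold and the test sequence are hypothetical, and a mismatch count is only a crude proxy for binding affinity.

```python
CONSENSUS = "TTATNCACA"  # DnaA box consensus from Section 2; N matches any base
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(site: str, consensus: str = CONSENSUS) -> int:
    """Number of positions where site differs from the consensus (N is ignored)."""
    return sum(1 for s, c in zip(site, consensus) if c != "N" and s != c)


def find_dnaa_boxes(seq: str, max_mismatches: int = 1):
    """List (position, strand, site, mismatches) for candidate boxes on both strands."""
    seq = seq.upper()
    k = len(CONSENSUS)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            n = mismatches(site)
            if n <= max_mismatches:
                hits.append((i, strand, site, n))
    return hits


# Made-up test sequence containing one consensus box and one Streptomyces-type box.
print(find_dnaa_boxes("GGGTTATCCACAGGGTTGTCCACAGGG"))
```

In this scheme the Streptomyces box TTGTCCACA scores one mismatch (position 3), and the H. pylori boxes described above would each score at least one.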

12.5 Synechocystis sp.

Cyanobacteria also have regular DnaA proteins [129, 138]. The Synechocystis DnaA protein binds to DnaA boxes, as does its C-terminal domain 4 with affinities comparable to E. coli DnaA. The genome of Synechocystis has been completely sequenced, and there is no stretch with several DnaA boxes that could qualify as a replication origin. In addition, the dnaA gene could be deleted without loss of viability [139]. However, DnaA has been conserved during evolution, and its transcription is not autoregulated but light-dependent instead, and follows the circadian rhythm of DNA synthesis [139]. The easiest way to interpret these seemingly incompatible observations is to assume that normal replication initiation operates via DnaA and an origin that is even more difficult to detect than H. pylori oriC. In addition, Synechocystis must have a very efficient alternative initiation pathway, possibly involving R-loops or D-loops as in the case of stable replication in E. coli [140].

12.6 Caulobacter crescentus

This is a bacterium with a developmentally differentiated cell cycle. Asymmetric cell division results in a replicating stalked cell and a non-replicating swarmer cell. Motile swarmer cells must differentiate into sessile stalked cells before they can replicate [141]. An essential control system for the cell cycle in this bacterium is a response regulator, CtrA (for cell cycle transcription regulator), which is homologous to E. coli OmpR. The Caulobacter replication origin, Cori, contains five DnaA boxes, none of which has the E. coli consensus sequence, and five binding sites for CtrA [142]. CtrA binds to these sites with a 40-fold enhanced affinity if it is phosphorylated by the histidine kinase CckA [143]. CtrA is present only in swarmer cells, where it represses replication from Cori in addition to many genes [144]. During differentiation from swarmer to stalked cells CtrA is degraded by regulated proteolysis [145]. Four of the five binding sites at which CtrA represses replication could, judging by their AT-richness and their location relative to the left-most DnaA box of Cori, represent potential sites of unwinding which have to be cleared before initiation. Whatever the precise mechanism, regulation of replication by a transcriptional regulator, namely a response regulator that can be phosphorylated, is a very unusual way to regulate replication in this unusual organism. However, initiation of chromosome replication is strictly dependent on DnaA in Caulobacter as well [146]. Like E. coli, Caulobacter has a DNA methyltransferase, CcrM, with the recognition sequence GANTC, that is cell cycle-regulated. However, it is not clear whether it has a role similar to that of Dam methyltransferase in E. coli [147].

13 DnaA as a transcription factor

DnaA boxes are found in the promoter regions of many genes where they can mediate repression, transcriptional activation, or transcription termination due to loop formation between two DnaA boxes in a transcription unit and long-range DnaA–DnaA interaction [99, 148]. The property of DnaA as a transcription factor has been covered in a review by Messer and Weigel [9]. Here I want to limit myself to some aspects of the autoregulation of the dnaA gene. E. coli dnaA is transcribed from two promoters. There is a consensus DnaA box between them. Inactivation of DnaA in temperature-sensitive dnaA mutants results in derepression of dnaA promoters, whereas overproduction of dnaA cloned in a plasmid represses dnaA transcription [149–154]. However, a dnaA gene with a mutation of the DnaA box (dnaA820) was still subject to autoregulation [155]. The solution to this paradox is the presence of additional binding sites for DnaA between the promoters. There is a second DnaA box with a mismatch and four ATP-DnaA boxes. The dnaA promoter region was the model sequence for derivation of the rules for DnaA binding [13]. ADP-DnaA binds to the two 9-mer DnaA boxes, whereas ATP-DnaA binds to all boxes, as shown by DNase I footprinting and surface plasmon resonance. The result is that ATP-DnaA is a much better repressor for the dnaA promoter than ADP-DnaA [13]. The promoter region with the dnaA820 mutation is bound about 60% as efficiently by ATP-DnaA as the wild-type promoter. There are additional controls for dnaA expression, e.g. methylation of GATC sites present in the region (see Section 16), or Fis and IciA proteins. These are listed in [9].

14 Joint action of DnaA and plasmid initiators in plasmid replication

DnaA is involved in the initiation of replication of many E. coli plasmids. These plasmids contain in their replication origins one or several DnaA boxes and an AT-rich region for unwinding, and, in addition, several iterons, repeated binding sites for a plasmid-encoded initiation protein. Whereas bacterial initiation requires the single initiator DnaA, initiation of plasmids with iterons requires dual initiator proteins; DnaA works in concert with a plasmid-encoded initiator protein [156]. DnaA can assist in the unwinding reaction, in helicase loading, or it can have a primary role in the structure of the initiation complex, e.g. in pSC101. The dependence of the unwinding reaction on DnaA protein is different for different plasmids. It is absolutely required for some, e.g. P1, where DnaA can unwind the AT-rich region even alone, although quite inefficiently. DnaA is required, in concert with the plasmid initiator, for unwinding in mini-F, RK2 and R6K. At the other end of the list is plasmid R1 which requires DnaA in vitro, but not in vivo. The cooperation between DnaA and plasmid initiator proteins extends to the loading of DnaB helicase, at least in the cases that have been analyzed.

14.1 P1

The origin of plasmid P1 contains two groups of DnaA boxes at either end. The left box region is followed by an AT-rich region containing GATC sites. These sites must be methylated by Dam methyltransferase for efficient initiation [157]. Adjacent to the right box region are five 19-bp repeats, binding sites for the plasmid initiator protein RepA. Only one of the two DnaA box regions is required. Even one box is sufficient, provided it has the consensus sequence for a strong box. Optimal replication needs both DnaA box regions [158, 159]. The phasing of DnaA boxes relative to one another and the rest of the origin is important [160]. DnaA is required for the unwinding of the AT-rich region [159]. DnaA, together with HU protein, is sufficient for unwinding. However, the reaction is much more efficient in the presence of RepA, as measured by KMnO4 footprinting [116, 160].

14.2 F

Plasmid mini-F possesses all necessary functions for replication of F plasmids. The replication origin (ori2) has two DnaA boxes, followed by an AT-rich region and four 19-bp direct repeats, iterons to which the plasmid-encoded initiator protein RepE binds as a monomer. DnaA or RepE alone, each together with HU protein, is able to unwind mini-F. For efficient unwinding, as well as for sequence specificity for the AT-rich region, however, the concerted action of DnaA, RepE, and HU is required [117].

14.3 RK2

The replication origin of the broad-host-range plasmid RK2/RP4 (393 bp) carries four (lousy) DnaA boxes, five 17-bp iterons for binding of the plasmid initiator TrfA in its monomeric form, and an AT-rich region consisting of four 13-mer repeats that are similar to the 13-mers in E. coli oriC. DnaA binds cooperatively to the four DnaA boxes, in agreement with the rules for DnaA binding formulated above. DnaA box number 4 seems to serve as an anchor point [161]. However, DnaA binding by itself does not result in unwinding of the AT-rich region, different from the plasmids discussed above [162]. TrfA, bound to the iterons, unwinds the AT-rich region in the presence of HU and ATP. This unwinding is enhanced by DnaA [162]. DnaA is indispensable for the following step, the delivery of the helicase [163]. However, DnaA cannot by itself activate the DnaB helicase at the RK2 origin. Both DnaA and TrfA are required for DnaB-induced template unwinding [163]. Surprisingly, it has been found that the DnaA–DnaB–DnaC complex is not assembled at the unwound AT-rich region but at the DnaA box region that is about 200 bp away [164]. We must therefore assume a physical interaction by DNA looping between the DnaA box region with the bound complex and the unwound AT-rich region, presumably with the help of TrfA.

14.4 R6K

The replication of plasmid R6K is initiated at any of three origins, α, β, and γ. The α and β origins require the presence of ori γ in cis for function. Initiation from ori α and ori γ requires both the plasmid-encoded initiator π and DnaA protein (reviewed in [165]). ori γ contains (in this order) a DnaA box, an AT-rich region, and seven iterons for binding of π. Between the AT-rich region and the iterons is a binding site for IHF. It has been suggested that its function is to bend the origin and thereby bring DnaA and π into contact [166]. Unwinding of the AT-rich region in ori γ depends on an interaction between DnaA and the N-terminal part (amino acids 1–116) of π [167]. The ATP-complexed form of DnaA is not required, since DnaA with a mutation in the ATP-binding site (K178A) is as effective [167].

14.5 R1

DnaA protein is required for the in vitro replication of plasmid R1, whereas it is dispensable for replication in vivo [168, 169]. Therefore, dnaA(Null) mutants can be integratively suppressed by R1, but not by F [170]. R1 replication initiation is different from the other plasmids discussed so far. Instead of iterons it contains two partially palindromic 10-bp sequences at each end of a 100-bp region. This region is completely protected in DNase I footprinting experiments. At one end is a DnaA box, and at the other end an AT-rich region [171]. A model has been proposed in which a DNA loop is formed by binding and interaction of two RepA molecules, probably dimers, to the binding sites at the ends of ori R, and then the 100-bp loop is filled with more RepA. The AT-rich region is then unwound in this higher-order complex [156, 171]. Binding of DnaA to its box is very inefficient and requires DnaA–RepA contacts [172, 173]. Although DnaA is required for initiation in vitro, the DnaA box can be mutated [172]. Presumably, similar to loading of DnaA to a DnaA box with mismatches, such a loading is also possible due to interaction with RepA. The role of DnaA in R1 replication is, however, not clear.

14.6 pSC101

The origin of pSC101 contains a consensus DnaA box followed by an AT-rich region, a binding site for IHF, and five iterons for RepA binding. Within the iterons is a weak DnaA box with two mismatches to the consensus sequence [174]. DnaA is required for unwinding of the AT-rich region and for helicase loading, in concert with RepA and IHF [174, 175]. However, since domains 2 and 3 of DnaA are dispensable for pSC101 replication [49, 176], the involvement of DnaA must be indirect. It has been suggested that DnaA fulfills a structural role in pSC101 initiation, together with IHF, by stabilizing a loop between the DnaA box at the left end of the origin and the iteron region [174]. Mutant DnaA protein consisting of domains 1 and 4 only might dimerize with domain 1, bind with domain 4 to the left DnaA box and with domain 4 of the second monomer to the iteron region. This might happen by binding to the weak DnaA box or by interaction with RepA. In fact, interaction of DnaA protein domain 1 and of domain 4 with RepA has been demonstrated [176]. An alternative anchor at the iteron side of pSC101 ori could be a second region of DnaA that can support pSC101 replication [39]. DnaA domains 1 and 4 do not need to be covalently linked. A leucine zipper added to each domain allows a non-covalent association that is sufficient to provide a bridge between the left DnaA box and the iteron region in the origin [176].

15 The special case of DnaA and λ plasmids

Replication of bacteriophage λ, and of plasmids derived from phage λ, depends on the phage-encoded proteins O and P (for review see [177]). O is a functional analogue of DnaA, and P replaces DnaC during loading of DnaB helicase. The replication origin (ori λ) is located within the O gene. Transcription from the rightward λ promoter pR is required for expression of O and P, but also for transcriptional activation of the origin (for review see [177]). Transcriptional activation is required for early replication of phage λ, and for replication of λ plasmids. Replication of λ plasmids depends on DnaA protein [178]. This was surprising, since there is no DnaA box in the λ origin region. However, binding of DnaA to several 9-bp sequences that have similarity to DnaA boxes, as well as to 6-mer ATP-DnaA boxes, has been demonstrated [101]. Since all these boxes represent low-affinity binding sites, cooperative binding of DnaA is required. DnaA activates transcription from pR in vivo and in vitro [179, 180]. A DnaA box located several base pairs downstream of pR is particularly important for this activation, since mutation of this box abolishes transcription stimulation by DnaA [180]. It has been shown that early λ replication via the bidirectional θ mode depends on activation of pR by DnaA. Replication is predominantly unidirectional in dnaA mutants [181]. Unidirectional θ type of replication precedes the late σ type of replication. We may therefore assume that in dnaA+ hosts a depletion of the DnaA pool due to binding to DnaA boxes is the reason for the switch from θ to σ replication [181]. The activation is delicately balanced, e.g. certain dnaA(Ts) mutants are unable to activate pR even at permissive temperature [182]. All these results show that the role of DnaA in λ plasmid replication is that of a transcription factor, and not of a replisome organizer.

16 How to limit initiation to once per generation

All organisms have developed mechanisms that ensure chromosomal replication occurs once and only once per generation [183]. In an E. coli cell there are at least three systems that prevent reinitiation of origins that have already initiated: (i) sequestration of oriC and of the dnaA promoter region after replication and hence blocking of their activities for a given time window; (ii) binding of DnaA protein to a region close to oriC that provides a sink for DnaA; (iii) regulatory inactivation of ATP-DnaA at the end of the initiation cycle. This has been discussed in Sections 3 and 6.

16.1 Sequestration

The replication origin of E. coli contains an unusually high number (11) of GATC sequences, recognition sites for the Dam methyltransferase. These sites must be methylated for efficient initiation [184, 185]. Especially hemimethylated GATC sites, as they are present shortly after replication, are detrimental to initiation [186]. Hemimethylated GATC sites in oriC, and also in the promoter region of dnaA, remain hemimethylated for about one-third of a generation, whereas elsewhere on the chromosome they become remethylated within about 1 min [187, 188]. The length of this eclipse period depends on the supply of Dam methyltransferase [184, 189, 190]. Since in vitro initiation can occur on unmethylated and hemimethylated origins [184, 191, 192], it has been concluded that within cells these regions are sequestered by binding to cellular factors. The most prominent of these factors is SeqA [193, 194]. Another member of the membrane-bound sequestration complex is SeqB [195].
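How unusual 11 GATC sites are can be estimated with a short calculation. The sketch below assumes random DNA with equal and independent base frequencies, and assumes that the 11 sites fall within the 260-bp oriC length quoted in Section 12; both are simplifications introduced here for illustration, and the function names are hypothetical.

```python
def count_gatc(seq: str) -> int:
    """Count GATC sites; the motif is palindromic, so scanning one strand is enough."""
    return seq.upper().count("GATC")


def expected_gatc(length_bp: int) -> float:
    """Expected GATC count in random DNA with equal, independent base frequencies."""
    positions = max(length_bp - 3, 0)  # number of 4-bp windows
    return positions / 4 ** 4


ORIC_LENGTH_BP = 260   # oriC length quoted in Section 12
ORIC_GATC_SITES = 11   # number of GATC sites quoted in this section

expected = expected_gatc(ORIC_LENGTH_BP)
print(f"expected by chance: {expected:.2f}, observed: {ORIC_GATC_SITES}")
print(f"enrichment: about {ORIC_GATC_SITES / expected:.0f}-fold")
```

Under these assumptions a 260-bp stretch is expected to contain about one GATC site by chance, so the observed number corresponds to an enrichment of roughly an order of magnitude.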

SeqA has a high affinity for hemimethylated oriC, somewhat less for fully methylated oriC, and does not bind specifically to unmethylated DNA [196, 197]. It binds specifically to two sites in oriC, one on each side of DnaA box R1. From there it seems to spread by cooperative binding to adjacent regions, competing with DnaA for binding [198–200]. This is likely to be the main reason for the negative effect of SeqA on initiation of replication. Sequestration of the dnaA promoter region also affects initiation negatively [193], since methylation of GATC sites in the promoter region is required for maximum promoter activity [151, 201]. Hence, shortly after initiation, at the time when the dnaA promoter is sequestered, there is a transient suppression of dnaA expression [202]. Other inhibitory effects could come from changes in oriC topology made by SeqA protein [203, 204].

Sequestration by Dam methylation is restricted to Enterobacteriaceae. It gives E. coli and its relatives the opportunity to regulate initiation in a very relaxed way. Since the cells are able to tell a new origin from an old one, they do not need to keep track of individual copies and need not control the copy number of oriC tightly. Consequently, the copy numbers of E. coli minichromosomes are around 10 per cell, and they show a very large variation, and therefore a high loss rate [205, 206]. In contrast, the copy number of minichromosomes from B. subtilis [131] or Streptomyces [120] is around 1 per cell, and there is strong incompatibility between minichromosomes and the chromosomal origin. As expected, initiation of origins that are not subject to methylation is more tightly controlled.

16.2 Titration of DnaA to the datA locus

About 300 DnaA boxes, located around the E. coli chromosome, are able to bind DnaA protein. Five chromosomal regions show an especially high affinity [207]. The locus with the highest DnaA-binding capacity is datA, located close to oriC [208]. It can bind about eight times more DnaA than the combined oriC-mioC region with a similar number of DnaA boxes. This high capacity, compared to oriC, might result from a missing competition with SeqA since there are only few GATC sites in datA. So far, it has not been analyzed whether there are ATP-DnaA boxes in datA or close to the other high-affinity sites that might be responsible for the high binding capacity.

There is evidence for a direct involvement of datA in the regulation of initiation. Deletion of datA results in overinitiation [209]. Additional copies of datA on a plasmid limit initiation or block it completely [209, 210]. We can therefore safely assume that datA is a sink for DnaA protein which regulates the availability of DnaA and hence affects initiation.

17 DnaA and oriC in the cell cycle, control of initiation

When E. coli grows in a rich medium it contains several chromosomes. These initiate synchronously within a very short time interval [183, 211, 212]. Copies of oriC present on minichromosomes initiate in the same interval [213]. A number of conditions upset this synchrony. These could either be suboptimal conditions of initiation, e.g. in some dnaA(Ts) mutants even at permissive temperature [214], or interference with the sequestration apparatus, e.g. in dam and seqA mutants [189, 212].

Initiation occurs at a defined time in the cell cycle. Donachie discovered that at the time of initiation cells have a more or less constant volume per replication origin, called the initiation volume [215]. Later it was discovered that there is some variation in this initiation volume [216]. After initiation it takes a constant time to replicate the chromosome, independent of growth rate as long as it is <60 min at 37°C, and a constant time between termination of replication and cell division [217, 218]. The frequency of initiation thus determines the rate of DNA synthesis, and indirectly the rate of cell division.
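The consequence of a constant replication time and a constant interval between termination and division (often called the Cooper-Helmstetter model) can be worked through numerically. The sketch below is an illustration under stated assumptions: the values C = 40 min and D = 20 min are commonly used textbook figures and are not taken from this review, the function names are hypothetical, and an initiation that coincides with division is assigned to the newborn cells.

```python
import math


def initiation_age(c_min: float, d_min: float, tau_min: float) -> float:
    """Cell age (min after birth) at which initiation occurs."""
    return (-(c_min + d_min)) % tau_min


def origins_at_initiation(c_min: float, d_min: float, tau_min: float) -> int:
    """Number of origins per cell just before they initiate."""
    return 2 ** (math.ceil((c_min + d_min) / tau_min) - 1)


def mean_origins_per_cell(c_min: float, d_min: float, tau_min: float) -> float:
    """Average origins per cell in an exponentially growing, asynchronous culture."""
    return 2 ** ((c_min + d_min) / tau_min)


C_MIN, D_MIN = 40, 20  # assumed values, for illustration only

for tau in (20, 30, 40, 60):  # doubling times in min, within the range where C is constant
    print(
        f"tau = {tau:2d} min: initiation at age {initiation_age(C_MIN, D_MIN, tau):2d} min, "
        f"{origins_at_initiation(C_MIN, D_MIN, tau)} origin(s) initiate together, "
        f"mean origins per cell = {mean_origins_per_cell(C_MIN, D_MIN, tau):.2f}"
    )
```

With these assumed values, cells doubling faster than C + D must initiate new rounds before the previous ones are finished, which is why fast-growing cells carry several origins that have to fire together.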

The DNA-bending proteins Fis and IHF have specific binding sites in oriC (Fig. 1) [219–224]. The occupation of binding sites by these architectural proteins and by DnaA has been determined in synchronized cells by in vivo footprinting. The result was that throughout most of the cell cycle, DnaA was bound to DnaA boxes R1, R2, and R4, and Fis was bound to its site, presumably inhibiting initiation. At the time of initiation Fis was lost from oriC, DnaA bound to DnaA box R3, and IHF bound to its site [225, 226]. This results in bending of oriC and in a redistribution of DnaA [227]. So far a correlation of these in vivo results with the results obtained in vitro has not been attempted.

The molecular basis of the initiation volume discussed above apparently is the concentration of DnaA protein. It has been shown that the initiation volume can be changed by modulation of the DnaA level [228]. A hypothesis has been formulated [229, 230] that postulates that initially all newly synthesized DnaA protein is bound to DnaA boxes around the chromosome. These boxes increase in number due to replication, and DnaA continues to be synthesized until the number of DnaA molecules exceeds the number of DnaA boxes. This is the moment of initiation. A necessary condition for this hypothesis is that the binding event that triggers initiation is of lower affinity than binding to the other boxes [229]. The switch to cooperative binding at ATP-DnaA boxes (see Section 10) is ideally suited to fulfill this condition.
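The threshold logic of this titration hypothesis can be illustrated with a toy calculation. The sketch below is deliberately crude and rests on assumptions introduced here: DnaA accumulates at a constant, hypothetical rate, the number of boxes is held fixed (the increase in boxes due to replication is not modelled), and initiation is triggered as soon as DnaA molecules outnumber boxes. Only the figure of about 300 boxes comes from the text (Section 16.2).

```python
def minutes_to_initiation(boxes, synthesis_per_min, dnaa_start=0.0, max_minutes=10_000):
    """Return the first minute at which DnaA molecules exceed DnaA boxes, else None."""
    dnaa = dnaa_start
    for minute in range(max_minutes + 1):
        if dnaa > boxes:
            return minute
        dnaa += synthesis_per_min
    return None


CHROMOSOMAL_BOXES = 300   # about 300 DnaA boxes per chromosome (Section 16.2)
SYNTHESIS_PER_MIN = 10    # hypothetical synthesis rate, molecules per minute

print(minutes_to_initiation(CHROMOSOMAL_BOXES, SYNTHESIS_PER_MIN))
# Doubling the number of binding sites delays the threshold crossing.
print(minutes_to_initiation(2 * CHROMOSOMAL_BOXES, SYNTHESIS_PER_MIN))
```

In this toy model, adding binding sites delays the moment at which the threshold is crossed and removing them advances it, which is the qualitative behaviour described for extra copies and for deletion of datA in Section 16.2.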

In addition, it has been suggested that DnaA protein molecules that become free due to initiation and replication are immediately used for initiation at adjacent origins in the form of an initiation cascade or an avalanche, thereby ensuring the highly synchronous initiation of all origins in a cell [231].

Both these hypotheses were formulated before the discovery of inactivation of ATP-DnaA due to ATP hydrolysis [33], and before the drastic changes in the proportion of ATP-DnaA before and after initiation had been measured [74]. This does not invalidate these hypotheses, but they have to be refined in view of the newer results.

18 Perspectives

DnaA, oriC and the replicative helicase are the major elements in bacterial replication initiation. They have been the topic of extensive research for close to 40 years, but many questions are still unanswered. So far most of the research has centered on E. coli, but this one-sided view is slowly broadening. Following good tradition in molecular biology, the initiation cycle in bacteria, in E. coli, can serve as a paradigm for other more complicated systems. Eukaryotic viruses like SV40, polyoma or papilloma have origins and initiators that are basically comparable to those of E. coli [232]. Yeast origins have an AT-rich region and a binding site for the six-protein origin recognition complex (ORC). ORC binds, like DnaA, to double-stranded and to single-stranded DNA, and the change between the binding specificities is associated with an ATP/ADP switch [75, 233]. ORCs and corresponding origins have been found in all metazoa that have been looked at. The unwinding of double-stranded origin DNA and the loading of helicase into this region is one of the basic reactions in the cell cycle. The E. coli way of life gives us a good example of how to think about this process and design experiments.

19 Note added in proof

A crystal structure of domains 3 and 4 of one DnaA protein has recently been published [Erzberger, J.P., Pirruccello, M.M. and Berger, J.M. (2002) The structure of bacterial DnaA: implications for general mechanisms underlying DNA replication initiation. EMBO J. 21, 4763–4773]. It extends and corroborates many of the predictions given in Section 4.

Acknowledgements

I thank Harald Seitz, Kirsten Skarstad, Grzegorz Wegrzyn, Christoph Weigel and Jolanta Zakrzewska-Czerwinska for critically reading the manuscript. I gratefully acknowledge support from the Fonds der Chemischen Industrie.

References

  1. [1].
  2. [2].
  3. [3].
  4. [4].
  5. [5].
  6. [6].
  7. [7].
  8. [8].
  9. [9].
  10. [10].
  11. [11].
  12. [12].
  13. [13].
  14. [14].
  15. [15].
  16. [16].
  17. [17].
  18. [18].
  19. [19].
  20. [20].
  21. [21].
  22. [22].
  23. [23].
  24. [24].
  25. [25].
  26. [26].
  27. [27].
  28. [28].
  29. [29].
  30. [30].
  31. [31].
  32. [32].
  33. [33].
  34. [34].
  35. [35].
  36. [36].
  37. [37].
  38. [38].
  39. [39].
  40. [40].
  41. [41].
  42. [42].
  43. [43].
  44. [44].
  45. [45].
  46. [46].
  47. [47].
  48. [48].
  49. [49].
  50. [50].
  51. [51].
  52. [52].
  53. [53].
  54. [54].
  55. [55].
  56. [56].
  57. [57].
  58. [58].
  59. [59].
  60. [60].
  61. [61].
  62. [62].
  63. [63].
  64. [64].
  65. [65].
  66. [66].
  67. [67].
  68. [68].
  69. [69].
  70. [70].
  71. [71].
  72. [72].
  73. [73].
  74. [74].
  75. [75].
  76. [76].
  77. [77].
  78. [78].
  79. [79].
  80. [80].
  81. [81].
  82. [82].
  83. [83].
  84. [84].
  85. [85].
  86. [86].
  87. [87].
  88. [88].
  89. [89].
  90. [90].
  91. [91].
  92. [92].
  93. [93].
  94. [94].
  95. [95].
  96. [96].
  97. [97].
  98. [98].
  99. [99].
  100. [100].
  101. [101].
  102. [102].
  103. [103].
  104. [104].
  105. [105].
  106. [106].
  107. [107].
  108. [108].
  109. [109].
  110. [110].
  111. [111].
  112. [112].
  113. [113].
  114. [114].
  115. [115].
  116. [116].
  117. [117].
  118. [118].
  119. [119].
  120. [120].
  121. [121].
  122. [122].
  123. [123].
  124. [124].
  125. [125].
  126. [126].
  127. [127].
  128. [128].
  129. [129].
  130. [130].
  131. [131].
  132. [132].
  133. [133].
  134. [134].
  135. [135].
  136. [136].
  137. [137].
  138. [138].
  139. [139].
  140. [140].
  141. [141].
  142. [142].
  143. [143].
  144. [144].
  145. [145].
  146. [146].
  147. [147].
  148. [148].
  149. [149].
  150. [150].
  151. [151].
  152. [152].
  153. [153].
  154. [154].
  155. [155].
  156. [156].
  157. [157].
  158. [158].
  159. [159].
  160. [160].
  161. [161].
  162. [162].
  163. [163].
  164. [164].
  165. [165].
  166. [166].
  167. [167].
  168. [168].
  169. [169].
  170. [170].
  171. [171].
  172. [172].
  173. [173].
  174. [174].
  175. [175].
  176. [176].
  177. [177].
  178. [178].
  179. [179].
  180. [180].
  181. [181].
  182. [182].
  183. [183].
  184. [184].
  185. [185].
  186. [186].
  187. [187].
  188. [188].
  189. [189].
  190. [190].
  191. [191].
  192. [192].
  193. [193].
  194. [194].
  195. [195].
  196. [196].
  197. [197].
  198. [198].
  199. [199].
  200. [200].
  201. [201].
  202. [202].
  203. [203].
  204. [204].
  205. [205].
  206. [206].
  207. [207].
  208. [208].
  209. [209].
  210. [210].
  211. [211].
  212. [212].
  213. [213].
  214. [214].
  215. [215].
  216. [216].
  217. [217].
  218. [218].
  219. [219].
  220. [220].
  221. [221].
  222. [222].
  223. [223].
  224. [224].
  225. [225].
  226. [226].
  227. [227].
  228. [228].
  229. [229].
  230. [230].
  231. [231].
  232. [232].
  233. [233].
  234. [234].