In the previous section we discussed the basic physics of homopolymers, i.e. chains of segments which are all the same. As a result, their physics is that of rather homogeneous little blobs and globules. The two cases we focused on were:
Random-walk polymer:
Size $R \approx N^{1/2} b$
Volume $V \approx N^{3/2} b^3$
Concentration $c = N/V \approx 1/(N^{1/2} b^3)$
Number of self-contacts $\approx N b^3 c \approx N^{1/2}$
Collapsed polymer ($w < -b^3$):
Size $R \approx N^{1/3} b$
Volume $V \approx N b^3$
Concentration $c = N/V \approx 1/b^3$
Number of self-contacts $\approx N b^3 c \approx N$
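The scaling estimates above can be tabulated numerically. The following is a minimal Python sketch rather than part of the original notes: the function names and the default $b = 1$ are illustrative assumptions, and all prefactors of order one are dropped, exactly as in the estimates themselves.

```python
def random_walk_scaling(N, b=1.0):
    """Order-of-magnitude estimates for an ideal random-walk polymer."""
    R = N ** 0.5 * b             # size, R ~ N^(1/2) b
    V = R ** 3                   # volume, V ~ N^(3/2) b^3
    c = N / V                    # segment concentration inside the coil
    contacts = N * b ** 3 * c    # number of self-contacts, ~ N^(1/2)
    return R, V, c, contacts


def collapsed_scaling(N, b=1.0):
    """Order-of-magnitude estimates for a collapsed (globular) polymer."""
    V = N * b ** 3               # volume, V ~ N b^3
    R = V ** (1.0 / 3.0)         # size, R ~ N^(1/3) b
    c = N / V                    # segment concentration, ~ 1/b^3
    contacts = N * b ** 3 * c    # number of self-contacts, ~ N
    return R, V, c, contacts


if __name__ == "__main__":
    for N in (100, 10000):
        print(N, random_walk_scaling(N), collapsed_scaling(N))
```

For N = 100 the random-walk coil has only about 10 self-contacts, while the collapsed globule has about 100, one per segment.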
In both these cases, the number of equivalent states is large, i.e. exponential in N. We can sort of count up the number of states if we consider polymers drawn on a cubic lattice, where the lattice step is equal to the polymer segment length b.
For the simple random walk on the lattice, we have $W = 6^N$, since at each step there are six choices of direction.
If we disallow direction reversals, which cause sharp hairpins, we have $W \approx 5^N$, still large.
For the collapsed polymer, it turns out that the number of states is still exponential in N; numerical studies indicate that $W \approx 1.85^N$ for the simple cubic lattice (see Pande et al, J. Phys. A 27, 6231 (1994) for a recent numerical study).
One other case worth mentioning is that of the self-avoiding walk on the simple cubic lattice, for which $W \approx 4.7^N$ (see de Gennes' book, p.39 for more details).
In any case, we see that all of these homopolymer cases are characterized by exponentially many states, $W \approx z^N$, and therefore by an entropy proportional to the number of segments, $S/k_B = \ln W \propto N$.
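For short chains these state counts can be checked by brute force. The sketch below is an illustrative Python enumeration, not taken from the notes; the function name `count_walks` and its flags are hypothetical. It counts walks on the simple cubic lattice starting from the origin, optionally forbidding immediate reversals or any revisited site.

```python
# The six nearest-neighbour steps on the simple cubic lattice.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def count_walks(n_steps, self_avoiding=False, no_reversal=False):
    """Count n_steps-step walks from the origin by exhaustive enumeration."""
    origin = (0, 0, 0)
    visited = {origin}

    def extend(pos, last, remaining):
        if remaining == 0:
            return 1
        total = 0
        for step in STEPS:
            # Skip the step that exactly undoes the previous one.
            if no_reversal and last is not None and step == tuple(-s for s in last):
                continue
            new = (pos[0] + step[0], pos[1] + step[1], pos[2] + step[2])
            if self_avoiding:
                if new in visited:
                    continue
                visited.add(new)
            total += extend(new, step, remaining - 1)
            if self_avoiding:
                visited.remove(new)
        return total

    return extend(origin, None, n_steps)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_walks(n),
              count_walks(n, no_reversal=True),
              count_walks(n, self_avoiding=True))
```

The unrestricted count is exactly $6^N$ and the no-reversal count is $6 \cdot 5^{N-1}$ (the first step is unconstrained). The cost of the enumeration grows exponentially with N, so this is only practical for short walks.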
For these folded structures, considered at the spatial resolution of our lattice model, $W \ll z^N$.
In some cases (e.g. globular proteins with one structure) really only one conformational state is populated, i.e. $W = 1$.
NAs are held together by hydrogen bonding and base stacking, usually acting between complementary bases (a=t and g≡c for DNA).
Base-pairing defines both the DNA double helix, and the stem regions of RNA stem-loop (or helix-loop) structures. I will tend to talk about DNA, but most of the following comments apply to RNA base-paired stem structures as well.
We can easily understand the stability of these structures by studying base-unpairing fluctuations for a few cases.
1. Unzipping from an end:
Suppose we have a DNA double helix made of complementary-sequence strands, for example:
5'-ccatgattcg-3'
3'-ggtactaagc-5'
In general, let's talk about the total length of the strands as N nucleotides, so in the specific example above we have N = 10. Now consider the unzipping of this double helix from the right end. If we unpair n bases, leaving the other N-n bases paired, we will get a `frayed' molecule.
In our example, if n = 5 we have
         ttcg-3'
        a
5'-ccatg
3'-ggtac
        t
         aagc-5'
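The frayed state can also be generated programmatically. This is a small illustrative Python sketch; the helper names `complement` and `fray` are hypothetical and not from the notes. The bottom strand is written left to right in the 3'-to-5' direction, as in the diagrams above.

```python
# Watson-Crick partners for DNA bases (lower case, as in the notes).
COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def complement(strand):
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def fray(top, n):
    """Unpair n bases at the right end of the duplex formed by `top`.

    Returns ((paired_top, paired_bottom), (free_top, free_bottom)).
    """
    bottom = complement(top)
    paired = len(top) - n
    return (top[:paired], bottom[:paired]), (top[paired:], bottom[paired:])


if __name__ == "__main__":
    print(complement("ccatgattcg"))
    print(fray("ccatgattcg", 5))
```

With `n = 5`, the paired part is `ccatg`/`ggtac` and the two free single-stranded ends are `attcg` and `taagc`.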
We can imagine that for each base-pair that we break apart, there is an energy cost $\epsilon$. Of course, this energy must depend on the sequence, and great effort has been made to determine the `right' values of $\epsilon$ for each base pair.
One of the most commonly used models for base-pairing free energies is due to Breslauer et al (Proc. Natl. Acad. Sci. USA 83, 3746 (1986)), which actually considers the energies of breaking adjacent bases, with the prediction of the melting temperature of short DNAs in mind. Table II of Breslauer et al shows that {a,t}-rich regions contribute (Gibbs) free energies of about 2 kcal/mol per base pair, while {g,c}-rich regions contribute closer to 3-4 kcal/mol per base pair. Using $1\,k_B T = 0.6$ kcal/mol at 300 K tells us that we have something like $3\,k_B T$ of Gibbs free energy holding a-t pairs together, and more like $5$-$6\,k_B T$ holding g-c pairs together.
These data are for rather strongly bound double helices at 25 °C and 1 M NaCl; at more physiological salt concentrations of 0.1 M NaCl, the Gibbs free energies of base-pairing are closer to $2\,k_B T$ for a-t pairs, and $4\,k_B T$ for g-c pairs. Therefore we can roughly use $\epsilon = 3\,k_B T$ as the free energy difference between paired and unpaired bases in our unzipping example.
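The conversion between kcal/mol and units of $k_B T$ used above is simple to reproduce. A minimal Python sketch follows; the function name is an illustrative choice, and the only physical input is the molar gas constant, which plays the role of $k_B$ per mole.

```python
# Molar gas constant in kcal/(mol K); this is k_B expressed per mole.
R_GAS = 1.987e-3


def kcal_per_mol_to_kT(energy_kcal_per_mol, temperature_K=300.0):
    """Express a molar free energy in units of k_B T at the given temperature."""
    return energy_kcal_per_mol / (R_GAS * temperature_K)


if __name__ == "__main__":
    print("k_B T at 300 K in kcal/mol:", R_GAS * 300.0)
    for g in (2.0, 3.0, 4.0):
        print(g, "kcal/mol =", round(kcal_per_mol_to_kT(g), 2), "k_B T")
```

At 300 K this gives $k_B T \approx 0.6$ kcal/mol, so 2 kcal/mol is a little over $3\,k_B T$, consistent with the rounded figures in the text.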
So, the work that must be done to unzip n bases will be something like
$$W_{\rm unzip}(n) \approx n\epsilon$$
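To get a feel for the numbers, one can evaluate the Boltzmann factor for a thermal fluctuation that unzips n bases, taking the work to be $n\epsilon$ with $\epsilon \approx 3\,k_B T$ as estimated above. This is an illustrative Python sketch; the function names are hypothetical, energies are in units of $k_B T$, and the translational entropy of fully separated strands is deliberately ignored.

```python
import math


def unzip_work(n, eps=3.0):
    """Work (in units of k_B T) to unpair n bases from an end, at eps per pair."""
    return n * eps


def unzip_boltzmann_factor(n, eps=3.0):
    """Boltzmann weight exp(-n*eps) of a state with n unzipped bases."""
    return math.exp(-unzip_work(n, eps))


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 24):
        print(n, unzip_work(n), unzip_boltzmann_factor(n))
```

Fraying one or two bases costs only a few $k_B T$ and happens readily, but the weight falls off exponentially with n: for n = 24 it is $e^{-72}$, roughly $5 \times 10^{-32}$.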
Note that for the case n = N, where the two strands are pulled completely apart, there is additional entropy associated with the relative translation of the now-separated ssDNAs. This entropy is roughly
[equation missing in source]
The net result is that not-too-long DNAs (e.g. 24 bp) are extremely stable against accidental disassembly by randomly acting thermal forces. On the other hand, two strands of DNA can be pulled apart by directed forces which progressively do $\approx 3\,k_B T$ of work per base pair.
To do this you will need to interpret the Boltzmann distribution as giving the probability for a single attempt to excite an unzipping state. This assumption is widely used as a simple kinetic model for thermally excited processes.
Plot your result for lifetime as a function of N, for N < 20. You will want to use a logarithmic scale for the lifetime. How long does a dsDNA have to be for it to be considered as `stable'?
Impressive experiments on unzipping of long double helices have been done in the last few years. See especially Essevaz-Roulet et al, Proc. Natl. Acad. Sci. USA 94, 11935 (1997).
Find the temperature dependence of $\Delta G$ for the 12-mer
5'-aggtcgccgccc-3'
3'-tccagcggcggg-5'
using the model of Breslauer et al (Proc. Natl. Acad. Sci. 83, 3746 (1986)).
At what temperature does this double helix `melt'?
Why is it not very important to worry about the `ideal gas' translational entropy?
We might worry about the probability of a single-stranded bubble opening up in the middle of a double helix or RNA stem:
Suppose n base-pairs unpair, opening a bubble of 2n bases. As before, the energy associated with breaking the n base-pairs will be $n\epsilon$, where $\epsilon \approx 3\,k_B T$ on average.
So far this is the same as the end-fraying discussed above. But for a loop, there is the constraint that the loop close. We previously discussed the entropy cost associated with this, and found it to be $-(3/2)\,k_B \ln(2n)$. Therefore, the free energy to open n bases in the interior of a dsNA region is
$$F_{\rm bubble}(n) \approx n\epsilon + \frac{3}{2}\,k_B T \ln(2n)$$
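The extra cost of an internal bubble relative to end fraying can be made concrete. The Python sketch below is illustrative only and its function names are hypothetical. It assumes the bubble free energy is the pairing cost $n\epsilon$ plus the loop-closure term $(3/2)\,k_B T \ln(2n)$ that follows from the entropy quoted above, with energies in units of $k_B T$ and $\epsilon = 3$.

```python
import math


def end_fray_free_energy(n, eps=3.0):
    """Free energy (units of k_B T) to unpair n bases from an end."""
    return n * eps


def bubble_free_energy(n, eps=3.0):
    """Free energy (units of k_B T) to open an internal bubble of n base-pairs.

    Adds the loop-closure entropy cost (3/2) ln(2n) to the pairing cost n*eps.
    """
    return n * eps + 1.5 * math.log(2 * n)


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, end_fray_free_energy(n), round(bubble_free_energy(n), 2))
```

In this model a bubble is always more costly than fraying the same number of bases from an end, but the difference grows only logarithmically with n.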
The one situation where end fraying and internal bubbles become important is when $\epsilon \to 0$, which occurs when the temperature approaches the melting temperature, which is between 40 and 60 °C depending on the DNA sequence.
RNAs are organized into a hierarchy of stem-loop structures, each of which can be considered to be a short base-paired `stem' region terminated by a ssRNA loop. Here is a picture of a long loop formed by unpairing n base-pairs:
Such loops are always at least 4 bases in circumference. The existence of a minimum loop size follows from some simple physical considerations, which we can see by looking at very short loops formed by unpairing only a few bases:
We suppose that we have an RNA sequence that could completely base-pair. However, to do this would require a sharp `hairpin' bend which would cost a lot of energy (and which would really be stereochemically impossible). This large energy cost can be thought of in terms of a bending energy contribution from the loop. If there are 2n unpaired bases in the loop (i.e. n unpaired complementary base-pairs) the net stem-loop energy has the form:
$$F(n) = n\epsilon + E_{\rm bend}(n)$$
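What should we use for the bending energy of the 2n unpaired bases? The 2n unpaired bases form a polygon with 2n+1 sides, and the bending angle at each vertex is $\theta = 2\pi/(2n+1)$.
We can suppose that at each vertex there is a bending energy $\propto \theta^2$, say $A\theta^2$ with $A$ a stiffness constant, and thus a rough model for the bending energy is
$$E_{\rm bend}(n) \approx (2n+1)\,A\,\theta^2 = \frac{4\pi^2 A}{2n+1}$$
This gives a net loop free energy of
$$F(n) \approx n\epsilon + \frac{4\pi^2 A}{2n+1}$$
The first term favors small loops while the second penalizes them, so $F(n)$ has its minimum at a finite loop size, $2n+1 \approx 2\pi\sqrt{2A/\epsilon}$.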
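The competition between pairing energy and bending energy can be explored numerically. The Python sketch below is a rough illustration under stated assumptions, not the notes' own calculation: the loop free energy is taken to be $n\epsilon$ plus a bending term $A\theta^2$ at each of the $2n+1$ vertices, with $\theta = 2\pi/(2n+1)$. The stiffness value $A = 2\,k_B T$ is a purely hypothetical choice made for illustration, and the function names are likewise invented here.

```python
import math


def loop_free_energy(n, eps=3.0, A=2.0):
    """Loop free energy (units of k_B T) for n unpaired base-pairs.

    eps: pairing free energy per base-pair.
    A:   hypothetical per-vertex bending stiffness (an assumed value).
    """
    sides = 2 * n + 1
    theta = 2 * math.pi / sides          # bending angle at each vertex
    return n * eps + sides * A * theta ** 2


def optimal_loop(eps=3.0, A=2.0, n_max=50):
    """Integer n in 1..n_max that minimizes the loop free energy."""
    return min(range(1, n_max + 1), key=lambda n: loop_free_energy(n, eps, A))


def continuum_optimum(eps=3.0, A=2.0):
    """Minimum of F(n) treating n as continuous: 2n + 1 = 2*pi*sqrt(2A/eps)."""
    return (2 * math.pi * math.sqrt(2 * A / eps) - 1) / 2


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, round(loop_free_energy(n), 2))
    print("best integer n:", optimal_loop())
    print("continuum estimate:", round(continuum_optimum(), 2))
```

With these assumed parameters the minimum falls at n = 3, i.e. a loop of a few bases. Stiffer chains (larger $A$) or weaker pairing (smaller $\epsilon$) push the optimum toward larger loops.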
The model presented here is very rough but illustrates the basic physics that controls the size of single-stranded nucleic acid loops.