Proteins are also linear polymers made up of a string of subunits called
amino-acids. Each amino-acid has a central carbon (Cα)
atom covalently linked to an amino (NH2)
group, a carboxyl (COOH) group, and a side-chain denoted as R in the figure
below. The different amino-acids differ in their side-chains of which there
are 20 different kinds.
Each amino-acid
is connected to the adjacent one via a covalent linkage called the peptide
bond (hence the name polypeptides). The protein chains, like the DNA and
RNA chains, have a direction, from the amino end of the chain to the carboxy
end. The first amino-acid at the NH2 end
is specified by the first codon on the mRNA, and the subsequent amino-acids
are added to the COOH end. Each peptide bond formation results in the release
of a water molecule.
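The bookkeeping implied above (one water molecule released per peptide bond) can be sketched in a few lines. This is only an illustration, not part of the original notes: the function names are invented for the example, sequences are assumed to be written from the amino end to the carboxy end, and the small mass table (average masses of the free amino-acids, in daltons) covers just three amino-acids.

```python
# Sketch: a chain of n amino-acids contains n - 1 peptide bonds,
# and each peptide bond formation releases one water molecule.

WATER_MASS = 18.015  # average molecular mass of H2O, in daltons

# Average masses of the free amino-acids (daltons).
# Illustrative subset only; extend the table as needed.
FREE_AMINO_ACID_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
}


def waters_released(n_residues):
    """Number of water molecules released when n residues are joined."""
    if n_residues < 1:
        raise ValueError("a chain needs at least one amino-acid")
    return n_residues - 1


def chain_mass(sequence):
    """Approximate mass (daltons) of a chain given as one-letter codes."""
    total = sum(FREE_AMINO_ACID_MASS[aa] for aa in sequence)
    return total - waters_released(len(sequence)) * WATER_MASS
```

Calling `chain_mass("GA")` subtracts one water from the summed masses of glycine and alanine, because a dipeptide has a single peptide bond.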
The backbone of the protein chain is highly polar with the amino (-NH-)
group a strong donor of hydrogen bonds and the carbonyl (-CO-) group an
acceptor of hydrogen bonds. The size, shape, charge and polarity of the
side-chains are highly varied and are responsible for the three-dimensional
structure that the protein chain adopts and ultimately its function.
Proteins are chains of subunits (amino-acids) linked together as shown
in the figure above.
Structural
biologists prefer a slightly different representation in which the peptide
planes in between two successive Cα atoms
are taken as a unit. The atoms of the carboxy group of subunit 1 and the
amino group of subunit 2 lie in a plane (called the peptide plane), and
the only degrees of freedom along the backbone are rotations about the
N-Cα bond (called the φ angle) and the Cα-C' bond (called the ψ angle).
Therefore, the φ and ψ angles corresponding to each Cα atom
specify the 'trajectory' of the backbone for a particular protein structure.
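The two backbone angles described above are ordinary dihedral (torsion) angles, so they can be computed from atomic coordinates. The following is a minimal sketch, assuming coordinates are given as (x, y, z) tuples in any consistent length unit; the helper names `dihedral`, `phi` and `psi` are made up for this example. It uses the usual definitions: φ is the torsion C'(i-1), N, Cα, C' and ψ is the torsion N, Cα, C', N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond.

    0 degrees is the cis arrangement (p0 and p3 eclipsed),
    +/-180 degrees is the trans arrangement.
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)   # normal to the plane p0, p1, p2
    n2 = _cross(b1, b2)   # normal to the plane p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """Rotation about the N-Calpha bond of a residue."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Rotation about the Calpha-C' bond of a residue."""
    return dihedral(n, ca, c, n_next)
```

Listing the (φ, ψ) pair for every residue along a chain gives the backbone 'trajectory' in the sense used above.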
The side-chains can be divided into three main classes according to their properties: hydrophobic, charged and polar. Hydrophobic side-chains are found primarily in the interior of water-soluble proteins, whereas charged residues are usually on the surface. Polar residues are good hydrogen bond donors or acceptors and form hydrogen bonds as readily with water as with other parts of the protein.
19 out of the 20 amino-acids are shown below. The missing one is Glycine (Gly, G), the smallest amino-acid, which has only a hydrogen atom (H) as its side-chain.
Amino-acids
Amino-acids continued
Secondary structures in proteins
One of the basic principles of protein folding architecture is that
hydrophobic residues prefer to be on the inside of the protein. However,
the backbone is highly polar and prefers contact with water, which
creates a problem because part of the backbone also has to be buried along
with the hydrophobic side-chains. The protein solves this problem by adopting
regular 'secondary' structures that allow the -NH- and -CO- groups along
the backbone to form hydrogen bonds with each other. There are two main
classes of secondary structures, α-helices and β-sheets.
α-helices
The backbone chain in an α-helix
adopts a right-handed helical conformation such that the carbonyl group
of amino-acid i makes a hydrogen bond with the amino group of
amino-acid i+4. The rise per amino-acid is ~1.5 Å, and one turn of the
helix contains ~3.6 residues, giving a pitch of ~5.4 Å.
The above figure shows various representations of the α-helix:
(a) is a ribbon diagram showing schematically the backbone chain, (b) shows
the hydrogen bonds (shown as red springs) between the C' atom of the carbonyl
group and the -NH- group, (c) shows the real conformation of the backbone
chain, and (d) shows the residues (in pink) sticking out from the helix.
Note that unlike in a DNA helical structure where the bases are pointing
toward the helical axis, in proteins the residues point outwards.
Sometimes there are variations on this theme: a π-helix,
in which amino-acid i makes a hydrogen bond with i+5 (more
loosely coiled than an α-helix),
and a 3₁₀-helix, with a hydrogen bond between i
and i+3 (more tightly coiled).
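The helix numbers quoted above fit together by simple arithmetic, which the sketch below spells out. The rise per residue and residues per turn are the approximate values from the text. The function names and the labels "3_10", "alpha" and "pi" are invented for this example, and residues are numbered from 1 purely for convenience.

```python
# Approximate alpha-helix parameters quoted in the text.
RISE_PER_RESIDUE = 1.5    # Angstroms travelled along the axis per residue
RESIDUES_PER_TURN = 3.6   # residues in one full turn

# Hydrogen-bond offset k: the carbonyl of residue i bonds to
# the amino group of residue i + k.
HBOND_OFFSET = {"3_10": 3, "alpha": 4, "pi": 5}


def alpha_helix_pitch():
    """Distance along the axis covered by one turn (Angstroms)."""
    return RISE_PER_RESIDUE * RESIDUES_PER_TURN


def alpha_helix_length(n_residues):
    """Approximate axial length (Angstroms) of an n-residue alpha-helix."""
    return n_residues * RISE_PER_RESIDUE


def alpha_helix_turn_angle():
    """Rotation about the helix axis per residue (degrees)."""
    return 360.0 / RESIDUES_PER_TURN


def backbone_hbonds(n_residues, helix="alpha"):
    """List the (i, i + k) backbone hydrogen-bond pairs in a helical segment."""
    k = HBOND_OFFSET[helix]
    return [(i, i + k) for i in range(1, n_residues - k + 1)]
```

The pitch works out to about 5.4 Å and the rotation per residue to about 100 degrees. For a six-residue segment, `backbone_hbonds(6, "alpha")` yields the pairs (1, 5) and (2, 6), while the tighter 3₁₀ pattern yields three pairs and the looser π pattern only one.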
β-sheets
β-sheets
are formed when the amino and carbonyl groups of one strand (part of the
polypeptide chain) make hydrogen bonds with the carbonyl and amino groups
of another strand that lies alongside the first strand. There are
two kinds of β-sheet
formations: anti-parallel β-sheets
(in which the two strands run in opposite directions) and parallel β-sheets
(in which the two strands run in the same direction).
Anti-parallel β-sheets
Parallel β-sheets
Hairpin loops
Protein structures have many helices and
β-sheets
connected to each other by loop regions of various lengths and irregular
shapes. The loop regions are found at the surface of the folded protein
molecules.
Tertiary structures in proteins
The secondary structural elements dock together to give a close-packed three-dimensional shape to the protein molecule, also called its tertiary structure. The figure below shows the backbone conformation of three small water-soluble globular proteins:
Schematic drawing
of the protein thioredoxin from E. coli. The red arrows in
the middle represent a parallel β-sheet
structure with the direction of each strand (amino end to carboxy end) indicated
by the arrows. Regions of the polypeptide chain that form the strands in the
β-sheet structure are interrupted by α-helical
regions shown in white. Almost all β-sheet
structures, parallel, anti-parallel, or mixed, have a right-handed
twist as shown in the figure.
The structure
of myoglobin, an oxygen storage protein, which is essentially all α-helical.
The red group in the middle is called the heme group, where the oxygen binds.
The heme group is not part of the polypeptide chain. Its iron atom is bonded
to one of the amino-acid residues (a histidine) and the rest of the chain forms a globule
around the heme group.
Structure of
the enzyme triosephosphate isomerase. The central core is again a β-sheet
structure with α-helices
on the outside.
An all atom representation of proteins
The ribbon
diagrams shown above illustrate very well the conformation of the backbone
chain, but are somewhat misleading in that they don't reveal the 'real'
shape of the protein. An alternative view is to draw every atom in the
protein as a ball of size determined by its van der Waals radius. This
view shows the close-packed nature of the protein molecule, and shows how
all the atoms fit together like a three-dimensional jigsaw puzzle.
The packing density of atoms in a protein is close to that in a crystal.
Proteins have precisely engineered moving parts that are essential for their function
The three-dimensional structures obtained from X-ray crystallography or NMR are average structures. The protein molecule, just like DNA and RNA, does not have a static structure but instead is constantly bombarded by water molecules and subjected to thermal motions. Remember that, apart from the covalent interactions that hold the chain together, all other interactions are weak, in the range of a few kBT (where kB is Boltzmann's constant and T is the temperature), and are therefore subject to thermal fluctuations.
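To put a number on the thermal energy scale mentioned above, the sketch below evaluates kBT using the SI values of the Boltzmann and Avogadro constants. The default temperature of 298.15 K (25 °C) and the function names are assumptions made for this example, not values taken from the text.

```python
K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro constant, 1/mol


def thermal_energy_joules(temperature_k=298.15):
    """kB*T for a single molecule, in joules."""
    return K_B * temperature_k


def thermal_energy_kj_per_mol(temperature_k=298.15):
    """kB*T expressed per mole of molecules, in kJ/mol."""
    return K_B * temperature_k * N_A / 1000.0


def in_units_of_kbt(energy_kj_per_mol, temperature_k=298.15):
    """Express an interaction energy (kJ/mol) as a multiple of kB*T."""
    return energy_kj_per_mol / thermal_energy_kj_per_mol(temperature_k)
```

At room temperature kBT comes to roughly 2.5 kJ/mol, so an interaction worth only a few kJ/mol is comparable to the thermal energy and can be broken and re-formed by thermal fluctuations.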
One of the primary functions of proteins is to catalyze reactions in the cell. Catalysis is usually accompanied by larger-scale conformational changes in the protein in which, for example, different domains of the protein can come together or move apart by several Angstroms, usually in response to binding of small molecules called ligands or substrates. The following example illustrates the conformational change in an enzyme called hexokinase whose function is to catalyze the transfer of the terminal phosphate of an ATP molecule to glucose in an early step of sugar metabolism.
The hexokinase molecule has two distinct conformations, 'open' and 'closed',
and the conformational fluctuation between the two states involves relative
motion of the two domains or two halves of the protein shown in blue and
green in the picture below. (Hexokinase is an example of a protein in which
the two domains fold up from a single polypeptide chain).
In the open
conformation, the molecule has a low affinity for both the glucose molecule
and the ATP molecule. The binding of one of the molecules, say glucose,
shifts the equilibrium to the closed conformation of the protein, which
has a higher affinity for ATP because now the ATP binding site has the
correct conformation to accommodate ATP. By the same reasoning, if
ATP were to bind first, that would also shift the equilibrium to the closed
conformation and hence increase the affinity for glucose. Therefore the
binding of glucose and ATP are coupled and this kind of conformational
coupling is known as allostery. The protein is called an allosteric
protein.
The allosteric coupling can be positive, as in the above example, or negative, as shown below, where the binding of one molecule to the protein and the corresponding conformational change inhibits the binding of another molecule.
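The coupling argument above can be made quantitative with a simple two-state equilibrium model. The sketch below is one possible illustration, not a description of hexokinase itself: it assumes the protein is either 'open' or 'closed', that a ligand binds each conformation with its own dissociation constant, and that all numerical values passed in are hypothetical. The function names are invented for the example.

```python
def fraction_closed(ligand, k_closed_over_open, kd_open, kd_closed):
    """Fraction of protein in the closed conformation.

    ligand             -- free concentration of the first ligand
    k_closed_over_open -- [closed]/[open] ratio with no ligand bound
    kd_open, kd_closed -- dissociation constants of that ligand for
                          the open and closed conformations
    (concentrations and dissociation constants share the same units)
    """
    open_weight = 1.0 + ligand / kd_open
    closed_weight = k_closed_over_open * (1.0 + ligand / kd_closed)
    return closed_weight / (open_weight + closed_weight)


def apparent_kd(frac_closed, kd_open, kd_closed):
    """Apparent dissociation constant of a second ligand.

    The apparent association constant of a mixture of conformations is
    the population-weighted average of the association constants
    (1/Kd) of the individual conformations.
    """
    association = (1.0 - frac_closed) / kd_open + frac_closed / kd_closed
    return 1.0 / association
```

If the first ligand binds the closed conformation more tightly (`kd_closed` smaller than `kd_open`), adding it raises the closed fraction and lowers the apparent dissociation constant of a second ligand that also prefers the closed form, which is positive coupling. Swapping the two constants for the first ligand, so that it prefers the open form, lowers the closed fraction and gives negative coupling.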
There are several examples in biology where the binding of ATP or GTP by a protein and the subsequent hydrolysis of ATP / GTP is used to convert the energy released from hydrolysis into a large conformational change of the protein.
This figure shows
a GTP-binding protein called the Elongation Factor protein (EF-Tu) which
participates in protein synthesis. Its function is to bind to a tRNA molecule
that is already covalently linked to the appropriate amino-acid, and to
load it onto the ribosome. The protein is made up of three distinct domains
(single polypeptide chain). When GTP is bound to EF-Tu, the domains are
close together and form the binding site for tRNA. When GTP is hydrolyzed,
there is a small conformational change at the GTP-binding site of a
few Angstroms. This small conformational change is propagated along a helix
(shown in red in the figure) which acts like a switch and causes the domains
to swing apart through a distance of about 4 nm, thus releasing the bound
tRNA onto the ribosome.
ATP hydrolysis is also used to do mechanical work in cells.
A classic example of a protein that uses energy from ATP hydrolysis to do work is myosin, a protein involved in muscular contraction. Muscle contraction takes place by the mutual sliding of two sets of filaments made up of fibrous proteins: thick filaments that contain myosin and thin filaments that contain another protein called actin. The thick and thin filaments are organized into units which are about 2-3 microns long, and which also contain other proteins. Another fibrous component of muscles is the protein titin which is the largest known polypeptide chain. Titin helps restore the stretched muscle back to the correct length.
The relative sliding
of thick and thin filaments is brought about by the myosin molecules whose
'heads' stick out from the myosin filaments and act as 'cross-bridges'
between the two sets of filaments. The thin filaments slide over the thick
filaments as a result of a 'rowing' action of myosin. Myosin has two conformations,
one in which the head is in a 90 degree conformation and another in which
it is in a 45 degree conformation relative to an actin monomer (shown in red).
Membrane proteins
All cells and nuclei are bounded by membranes, which are thin films (~4.5 nm) of lipids and protein molecules. The lipids are bilayered, with hydrophilic groups facing the surface of the membranes and hydrophobic groups in the interior of the membranes. Membrane proteins span the two sides of the membrane. Therefore, unlike water-soluble proteins, where the hydrophilic amino-acids are found on the surface of the folded protein, in membrane proteins the hydrophobic amino-acids that make contact with the interior of the membranes are found on the surface of the proteins.
Several examples of membrane proteins are shown below. Alpha-helices are drawn as cylinders and beta-sheets as arrows. The membrane-bound regions are shown in green.
Ion channels
Ion channels are membrane proteins that have highly selective pores that regulate transport of inorganic ions across cell membranes. Shown in the figure below is a potassium channel which is tetrameric (made up of four subunits) and has a pore in the middle. Potassium channels are selective for K+ over Na+ by a factor of 10,000.
A cross-sectional view of the ion-pore. The overall length is
4.5 nm.
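The 10,000-fold selectivity of the potassium channel can be translated into an energy scale. The sketch below assumes, as a simplification that is not stated in the text, that the selectivity ratio can be read as a Boltzmann factor exp(ΔΔG / kBT), so that ΔΔG = kBT ln(ratio). The temperature of 298.15 K and the function names are likewise assumptions for the example.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K); equals kB times Avogadro's number


def selectivity_in_kbt(ratio):
    """Free-energy difference implied by a selectivity ratio, in units of kB*T."""
    return math.log(ratio)


def selectivity_in_kj_per_mol(ratio, temperature_k=298.15):
    """The same free-energy difference expressed in kJ/mol."""
    return R * temperature_k * math.log(ratio) / 1000.0
```

Under this assumption a selectivity of 10,000 corresponds to a free-energy difference of about 9 kBT between K+ and Na+, well above the scale of thermal fluctuations.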
Fibrous proteins
Fibrous proteins are long-chain molecules that serve as structural materials
for the same reason that other polymers do. They can cross-link and inter-twine
to provide strength and flexibility. Examples are coiled-coil alpha-helices
in muscles, a triple helix in collagen and beta-sheets in silks and spiders'
webs. The figure below shows the composition of fibers found in a spider's
web in which beta-sheets stack up to form microcrystals interspersed with
regions of less-ordered (random coil) structures.