Proteins are polypeptides

Proteins are also linear polymers, made up of a string of subunits called amino-acids. Each amino-acid has a central carbon (Cα) atom covalently linked to an amino (NH2) group, a carboxyl (COOH) group, and a side-chain denoted as R in the figure below. The amino-acids differ in their side-chains, of which there are 20 different kinds.

Each amino-acid is connected to the adjacent one via a covalent linkage called the peptide bond (hence the name polypeptides). The protein chains, like the DNA and RNA chains, have a direction, from the amino end of the chain to the carboxy end. The first amino-acid at the NH2 end is specified by the first codon on the mRNA, and the subsequent amino-acids are added to the COOH end. Each peptide bond formation results in the release of a water molecule.
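
Because each peptide bond releases one water molecule, the mass of a chain is the sum of the masses of its free amino-acids minus one water per bond. The following is a minimal sketch of that bookkeeping; the function name peptide_mass and the three-entry mass table are illustrative choices, not part of the original notes, and the masses are rounded average values.

```python
# Average molecular masses (g/mol) of three free amino-acids and of water.
# Only three amino-acids are listed to keep the sketch short.
AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09, "Ser": 105.09}
WATER_MASS = 18.015


def peptide_mass(residues):
    """Mass of a linear chain: sum of the free amino-acid masses minus one
    water per peptide bond (a chain of n residues has n - 1 bonds)."""
    if not residues:
        return 0.0
    total = sum(AMINO_ACID_MASS[name] for name in residues)
    return total - (len(residues) - 1) * WATER_MASS


# The chain is written from the amino end to the carboxy end.
chain = ["Gly", "Ala", "Ser"]
print(f"{'-'.join(chain)}: {peptide_mass(chain):.2f} g/mol")
```

A single amino-acid forms no peptide bond, so peptide_mass(["Gly"]) simply returns the mass of free glycine.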

The backbone of the protein chain is highly polar with the amino (-NH-) group a strong donor of hydrogen bonds and the carbonyl (-CO-) group an acceptor of hydrogen bonds. The size, shape, charge and polarity of the side-chains are highly varied and are responsible for the three-dimensional structure that the protein chain adopts and ultimately its function.

Proteins are chains of subunits (amino-acids) linked together as shown in the figure above.

Structural biologists prefer a slightly different representation in which the peptide plane between two Cα atoms is taken as a unit. The atoms of the carboxy group of subunit 1 and the amino group of subunit 2 lie in a plane (called the peptide plane), and the only degrees of freedom along the backbone are rotations about the N-Cα bond (called the φ angle) and the Cα-C' bond (called the ψ angle). Therefore, the φ and ψ angles at each Cα atom specify the 'trajectory' of the backbone for a particular protein structure.
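
The two backbone rotation angles described above are dihedral angles, conventionally written φ (about the N-Cα bond) and ψ (about the Cα-C' bond). Each can be computed from the coordinates of four consecutive backbone atoms. The sketch below shows one way to do this in plain Python; the helper names and the toy coordinates are made up for illustration and do not come from a real structure.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    For phi of residue i use C'(i-1), N(i), CA(i), C'(i);
    for psi of residue i use N(i), CA(i), C'(i), N(i+1).
    """
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals of the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy coordinates (hypothetical): the central bond lies along the z axis.
p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
print(dihedral(p0, p1, p2, (1.0, 0.0, 1.0)))    # eclipsed (cis) arrangement
print(dihedral(p0, p1, p2, (0.0, 1.0, 1.0)))    # quarter turn
```

Applying dihedral to every residue of a structure gives the list of (φ, ψ) pairs that describes the backbone trajectory.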
 


 

The side-chains can be divided into three main classes according to their properties: hydrophobic, charged and polar. Hydrophobic side-chains are found primarily in the interior of water-soluble proteins, and charged residues are usually on the surface. Polar residues are good hydrogen bond donors or acceptors and are equally happy making hydrogen bonds with water as with other parts of the protein.
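
As a rough illustration of this three-way classification, the sketch below tallies the classes present in a sequence written in one-letter codes. The assignment of individual amino-acids to classes is one common textbook grouping and is an assumption here (other groupings exist, and glycine is kept separate because its side-chain is a single H); the example sequence is made up.

```python
from collections import Counter

# One common grouping of the 20 amino-acids (an assumption, not the only one).
SIDE_CHAIN_CLASS = {}
for code in "AVLIMFP":
    SIDE_CHAIN_CLASS[code] = "hydrophobic"
for code in "DEKR":
    SIDE_CHAIN_CLASS[code] = "charged"
for code in "STYHCNQW":
    SIDE_CHAIN_CLASS[code] = "polar"
SIDE_CHAIN_CLASS["G"] = "glycine"   # side-chain is just H; treated separately


def class_composition(sequence):
    """Count how many residues of each side-chain class a sequence contains."""
    return Counter(SIDE_CHAIN_CLASS[aa] for aa in sequence.upper())


# Hypothetical seven-residue sequence, for illustration only.
for name, count in sorted(class_composition("MKVLDSG").items()):
    print(name, count)
```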

Nineteen of the 20 amino-acids are shown below. The missing one is glycine (Gly, G), the smallest amino-acid, which has only an H atom as its side-chain.

Amino-acids


 

Amino-acids continued


 
 

Secondary structures in proteins

One of the basic principles of protein folding architecture is that hydrophobic residues prefer to be on the inside of the protein. However, the backbone is highly polar and prefers contact with water, which creates a problem because part of the backbone also has to be buried along with the hydrophobic side-chains. The protein solves this problem by adopting regular 'secondary' structures that allow the -NH- and -CO- groups along the backbone to form hydrogen bonds with each other. There are two main classes of secondary structures, α-helices and β-sheets.

α-helices

The backbone chain in an α-helix adopts a right-handed helical conformation such that the carbonyl group of amino-acid i makes a hydrogen bond with the amino group of amino-acid i+4. The rise per amino-acid is ~1.5 Å, and each turn of the helix contains ~3.6 residues, giving a pitch of ~5.4 Å.
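
The helix numbers quoted above are related by simple arithmetic: pitch = rise per residue × residues per turn, and each residue is rotated by 360/3.6 = 100 degrees relative to the previous one. The sketch below checks this and places idealized Cα positions on a right-handed helix. The radius of 2.3 Å is an assumed typical value that is not given in the text, and the function names are illustrative.

```python
import math

RISE_PER_RESIDUE = 1.5      # Angstrom, from the text
RESIDUES_PER_TURN = 3.6     # from the text
CA_RADIUS = 2.3             # Angstrom; assumed typical value, not from the text


def helix_pitch():
    """Distance advanced along the helix axis in one full turn."""
    return RISE_PER_RESIDUE * RESIDUES_PER_TURN


def rotation_per_residue():
    """Rotation about the helix axis between successive residues, in degrees."""
    return 360.0 / RESIDUES_PER_TURN


def ca_coordinates(n_residues):
    """Idealized C-alpha positions on a right-handed helix along the z axis."""
    step = math.radians(rotation_per_residue())
    return [(CA_RADIUS * math.cos(i * step),
             CA_RADIUS * math.sin(i * step),
             i * RISE_PER_RESIDUE)
            for i in range(n_residues)]


print(f"pitch = {helix_pitch():.1f} A, "
      f"rotation = {rotation_per_residue():.0f} degrees per residue")
for x, y, z in ca_coordinates(5):
    print(f"{x:6.2f} {y:6.2f} {z:6.2f}")
```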

The above figure shows various representations of the α-helix: (a) is a ribbon diagram showing schematically the backbone chain, (b) shows the hydrogen bonds (drawn as red springs) between the carbonyl (C'=O) group and the -NH- group, (c) shows the real conformation of the backbone chain, and (d) shows the side-chains (in pink) sticking out from the helix. Note that, unlike in a DNA helix where the bases point toward the helical axis, in proteins the side-chains point outwards.

Sometimes there are variations on this theme: a π-helix, in which amino-acid i makes a hydrogen bond with i+5 (more loosely coiled than an α-helix), and a 3₁₀-helix, with a hydrogen bond between i and i+3 (more tightly coiled).
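
The three helix types differ only in the offset between the hydrogen-bonded residues: i to i+3 (3₁₀-helix), i to i+4 (α-helix) and i to i+5 (usually called a π-helix). As a small sketch, the function below lists the backbone hydrogen-bond pairs expected in an ideal helical segment; the names HELIX_OFFSET and backbone_hbonds are illustrative, and end effects in real helices are ignored.

```python
# Offset k such that the carbonyl of residue i bonds the amino group of i + k.
HELIX_OFFSET = {"3-10": 3, "alpha": 4, "pi": 5}


def backbone_hbonds(n_residues, helix_type="alpha"):
    """(carbonyl residue, amino residue) pairs in an ideal helix,
    with residues numbered from 1 at the amino end."""
    k = HELIX_OFFSET[helix_type]
    return [(i, i + k) for i in range(1, n_residues - k + 1)]


for helix_type in HELIX_OFFSET:
    pairs = backbone_hbonds(10, helix_type)
    print(f"{helix_type:>5}: {len(pairs)} bonds, first {pairs[0]}, last {pairs[-1]}")
```

Note that in every case the last k carbonyl groups and the first k amino groups of the segment have no partner within the helix.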

β-sheets

β-sheets are formed when the amino and carbonyl groups of one strand (part of the polypeptide chain) make hydrogen bonds with the carbonyl and amino groups of another strand that lies alongside the first. There are two kinds of β-sheet: anti-parallel β-sheets (in which the two strands run in opposite directions) and parallel β-sheets (in which the two strands run in the same direction).

Anti-parallel β-sheets


 
 

Parallel β-sheets

Hairpin loops

Protein structures have many helices and β-sheets connected to each other by loop regions of various lengths and irregular shapes. The loop regions are found at the surface of the folded protein molecules.
 
 

Tertiary structures in proteins

The secondary structural elements dock together to give the protein molecule a close-packed three-dimensional shape, called its tertiary structure. The figures below show the backbone conformation of three small water-soluble globular proteins:

Schematic drawing of the protein thioredoxin from E. coli. The red arrows in the middle represent a parallel β-sheet structure, with the direction of each strand (amino end to carboxy end) indicated by the arrowhead. Regions of the polypeptide chain that form the strands of the β-sheet are interrupted by α-helical regions shown in white. Almost all β-sheet structures, whether parallel, anti-parallel or mixed, have a right-handed twist as shown in the figure.

The structure of myoglobin, an oxygen storage protein, which is essentially all α-helical. The red group in the middle is the heme group, where the oxygen binds. The heme group is not part of the polypeptide chain. Its iron atom is coordinated to one of the amino-acid residues (a histidine), and the rest of the chain forms a globule around the heme group.

Structure of the enzyme triosephosphate isomerase. The central core is again a β-sheet structure with α-helices on the outside.
 
 
 
 

An all-atom representation of proteins

The ribbon diagrams shown above illustrate the conformation of the backbone chain very well, but are somewhat misleading in that they don't reveal the 'real' shape of the protein. An alternative view is to draw every atom in the protein as a ball whose size is determined by its van der Waals radius. This view shows the close-packed nature of the protein molecule, and how all the atoms fit together like a three-dimensional jigsaw puzzle.
 
 
 

The packing density of atoms in a protein is close to that in a crystal.

Proteins have precisely engineered moving parts that are essential for their function

The three-dimensional structures obtained from X-ray crystallography or NMR are average structures. The protein molecule, just like DNA and RNA, does not have a static structure but instead is constantly bombarded by water molecules and subjected to thermal motions. Remember that, apart from the covalent interactions that hold the chain together, all the other interactions are weak, with energies in the range of a few kBT, and are therefore subject to thermal fluctuations.
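
To put 'a few kBT' in perspective, the sketch below converts kBT to kJ/mol at room temperature and uses a deliberately crude two-state Boltzmann estimate of how often an interaction of a given strength is broken by thermal motion. The two-state picture (formed versus broken, no entropy term), the example energies and the function names are all simplifying assumptions made here for illustration.

```python
import math

GAS_CONSTANT = 8.314462618   # J/(mol K); equals kB times Avogadro's number


def thermal_energy_kj_per_mol(temperature_k=298.15):
    """kBT expressed per mole, in kJ/mol."""
    return GAS_CONSTANT * temperature_k / 1000.0


def fraction_broken(energy_in_kbt):
    """Two-state Boltzmann estimate of the fraction of time an interaction
    of the given strength (in units of kBT) is broken."""
    return 1.0 / (1.0 + math.exp(energy_in_kbt))


kbt = thermal_energy_kj_per_mol()
print(f"kBT at 25 C = {kbt:.2f} kJ/mol")

# Illustrative strengths: a weak interaction of 2 kBT and a covalent bond of
# roughly 350 kJ/mol (a typical textbook figure, assumed here).
for label, energy in [("weak interaction, 2 kBT", 2.0),
                      ("covalent bond, ~350 kJ/mol", 350.0 / kbt)]:
    print(f"{label}: fraction broken = {fraction_broken(energy):.3g}")
```

In this picture weak interactions are broken a sizeable fraction of the time, whereas covalent bonds are essentially never broken by thermal motion alone.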

One of the primary functions of proteins is to catalyze reactions in the cell. Catalysis is usually accompanied by larger-scale conformational changes in the protein in which, for example, different domains of the protein come together or move apart by several Angstroms, usually in response to the binding of small molecules called ligands or substrates. The following example illustrates the conformational change in an enzyme called hexokinase, whose function is to catalyze the transfer of the terminal phosphate of an ATP molecule to glucose in an early step of sugar metabolism.

The hexokinase molecule has two distinct conformations, 'open' and 'closed', and the conformational fluctuation between the two states involves relative motion of the two domains or two halves of the protein shown in blue and green in the picture below. (Hexokinase is an example of a protein in which the two domains fold up from a single polypeptide chain).
In the open conformation, the molecule has a low affinity for both the glucose molecule and the ATP molecule. The binding of one of the molecules, say glucose, shifts the equilibrium to the closed conformation of the protein, which has a higher affinity for ATP because the ATP binding site now has the correct conformation to accommodate ATP. By the same reasoning, if ATP were to bind first, that would also shift the equilibrium to the closed conformation and hence increase the affinity for glucose. The binding of glucose and the binding of ATP are therefore coupled; this kind of conformational coupling is known as allostery, and the protein is called an allosteric protein.
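
The shift in the open/closed equilibrium on ligand binding can be sketched with a simple two-state thermodynamic model. The protein is either open or closed, with an equilibrium constant L = [closed]/[open] in the absence of ligand, and glucose binds to each conformation with its own dissociation constant. All parameter values below are hypothetical and chosen only to illustrate the trend; they are not measured values for hexokinase.

```python
def fraction_closed(glucose, l_closed_open, kd_open, kd_closed):
    """Fraction of protein in the closed conformation at a given glucose
    concentration (same units as the dissociation constants).

    Statistical weights: open/free = 1, open/bound = glucose / kd_open,
    closed/free = L, closed/bound = L * glucose / kd_closed.
    """
    open_weight = 1.0 + glucose / kd_open
    closed_weight = l_closed_open * (1.0 + glucose / kd_closed)
    return closed_weight / (open_weight + closed_weight)


# Hypothetical parameters: the open form dominates without ligand (L = 0.1)
# and glucose binds 100 times more tightly to the closed form.
L, KD_OPEN, KD_CLOSED = 0.1, 1000.0, 10.0   # concentrations in micromolar

for glucose in (0.0, 10.0, 100.0, 1000.0):
    f = fraction_closed(glucose, L, KD_OPEN, KD_CLOSED)
    print(f"glucose = {glucose:7.1f} uM -> fraction closed = {f:.2f}")
```

Because the closed form is the one with the high-affinity ATP site, any increase in the closed fraction on adding glucose translates directly into a higher apparent affinity for ATP, which is the positive coupling described above.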

The allosteric coupling can be positive, as in the above example, or negative, as shown below, where the binding of one molecule to the protein and the corresponding conformational change inhibits the binding of another molecule.

There are several examples in biology where the binding of ATP or GTP by a protein and the subsequent hydrolysis of ATP / GTP is used to convert the energy released from hydrolysis into a large conformational change of the protein.

This figure shows a GTP-binding protein called the Elongation Factor protein (EF-Tu), which participates in protein synthesis. Its function is to bind to a tRNA molecule that is already covalently linked to the appropriate amino-acid, and to load it onto the ribosome. The protein is made up of three distinct domains (in a single polypeptide chain). When GTP is bound to EF-Tu, the domains are close together and form the binding site for tRNA. When GTP is hydrolyzed, there is a small conformational change of a few Angstroms at the GTP-binding site. This small change is propagated along a helix (shown in red in the figure), which acts like a switch and causes the domains to swing apart through a distance of about 4 nm, thus releasing the bound tRNA onto the ribosome.
 

ATP hydrolysis is also used to do mechanical work in cells.

A classic example of a protein that uses energy from ATP hydrolysis to do work is myosin, a protein involved in muscular contraction. Muscle contraction takes place by the mutual sliding of two sets of filaments made up of fibrous proteins: thick filaments that contain myosin and thin filaments that contain another protein called actin. The thick and thin filaments are organized into units which are about 2-3 microns long, and which also contain other proteins. Another fibrous component of muscles is the protein titin which is the largest known polypeptide chain. Titin helps restore the stretched muscle back to the correct length.

The relative sliding of thick and thin filaments is brought about by the myosin molecules, whose 'heads' stick out from the myosin filaments and act as 'cross-bridges' between the two sets of filaments. The thin filaments slide over the thick filaments as a result of a 'rowing' action of myosin. Myosin has two conformations: one in which the head is at a 90 degree angle, and another in which it is at a 45 degree angle, to an actin monomer (shown in red).

Membrane proteins

All cells and nuclei are bounded by membranes, which are thin films (~4.5 nm thick) of lipid and protein molecules. The lipids form a bilayer, with hydrophilic groups facing the surfaces of the membrane and hydrophobic groups in its interior. Membrane proteins span the two sides of the membrane. Therefore, unlike water-soluble proteins, where the hydrophilic amino-acids are found on the surface of the folded protein, membrane proteins have hydrophobic amino-acids on the parts of their surface that make contact with the interior of the membrane.

Several examples of membrane proteins are shown below. Alpha-helices are drawn as cylinders and beta-sheets as arrows.  The membrane-bound regions are shown in green.

Ion channels

Ion channels are membrane proteins that have highly selective pores that regulate the transport of inorganic ions across cell membranes. Shown in the figure below is a potassium channel, which is tetrameric (made up of four subunits) and has a pore in the middle. Potassium channels are selective for K+ over Na+ by a factor of 10,000.
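
One way to get a feel for a selectivity factor of 10,000 is to ask what free-energy difference between the two ions would produce such a ratio if selectivity were governed by a simple Boltzmann factor. That is an assumption made here for illustration (real channel selectivity also involves kinetics), and the function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618   # J/(mol K)


def selectivity_free_energy(ratio, temperature_k=298.15):
    """Free-energy difference, in units of kBT and in kJ/mol, that gives the
    stated selectivity ratio under a simple Boltzmann-factor assumption."""
    in_kbt = math.log(ratio)
    in_kj_per_mol = in_kbt * GAS_CONSTANT * temperature_k / 1000.0
    return in_kbt, in_kj_per_mol


in_kbt, in_kj = selectivity_free_energy(10_000)
print(f"ratio 10,000 -> {in_kbt:.1f} kBT, about {in_kj:.1f} kJ/mol")
```

Under this assumption the pore has to discriminate between the two ions by roughly nine kBT, far more than the thermal energy available to blur the distinction.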


 

A cross-sectional view of the ion-pore.  The overall length is 4.5 nm.


Fibrous proteins

Fibrous proteins are long-chain molecules that serve as structural materials for the same reason that other polymers do: they can cross-link and intertwine to provide strength and flexibility. Examples are coiled-coil alpha-helices in muscles, a triple helix in collagen, and beta-sheets in silks and spiders' webs. The figure below shows the composition of the fibers found in a spider's web, in which beta-sheets stack up to form microcrystals interspersed with regions of less-ordered (random coil) structure.