Proteins, from the Greek proteios, meaning first, are a class of organic compounds which are present in and vital to every living cell. In the form of skin, hair, callus, cartilage, muscles, tendons and ligaments, proteins hold together, protect, and provide structure to the body of a multi-celled organism. In the form of enzymes, hormones, antibodies, and globulins, they catalyze, regulate, and protect the body chemistry. In the form of hemoglobin, myoglobin and various lipoproteins, they effect the transport of oxygen and other substances within an organism.
Proteins are generally regarded as beneficial, and are a necessary part of the diet of all animals. Humans can become seriously ill if they do not eat enough suitable protein, the disease kwashiorkor being an extreme form of protein deficiency. Protein-based antibiotics and vaccines help to fight disease, and we warm and protect our bodies with clothing and shoes that are often protein in nature (e.g. wool, silk and leather).
The deadly properties of protein toxins and venoms are less widely appreciated. Botulinum toxin A, from Clostridium botulinum, is regarded as the most powerful poison known. Based on toxicology studies, a teaspoon of this toxin would be sufficient to kill a fifth of the world's population. The toxins produced by tetanus and diphtheria microorganisms are nearly as poisonous. A list of highly toxic proteins or peptides would also include the venoms of many snakes, and ricin, the toxic protein found in castor beans.
Despite the variety of their physiological function and differences in physical properties--silk is a flexible fiber, horn a tough rigid solid, and the enzyme pepsin forms water-soluble crystals--proteins are sufficiently similar in molecular structure to warrant treating them as a single chemical family. When compared with carbohydrates and lipids, the proteins are obviously different in fundamental composition. The lipids are largely hydrocarbon in nature, generally being 75 to 85% carbon. Carbohydrates are roughly 50% oxygen, and like the lipids, usually have less than 5% nitrogen (often none at all). Proteins and peptides, on the other hand, are composed of 15 to 25% nitrogen and about an equal amount of oxygen. The distinction between proteins and peptides is their size. Peptides are in a sense small proteins, having molecular weights less than 10,000.
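The composition ranges quoted above can be checked with a short calculation. The following Python sketch computes mass percentages from molecular formulas. The three example compounds (tripalmitin, glucose, and a single alanine residue of a peptide chain) are illustrative choices made for this sketch and are not taken from the text; the atomic masses are standard rounded values.

```python
# Standard atomic masses in g/mol, rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def mass_percent(formula, element):
    """Mass percent of `element` in a compound given as {symbol: atom count}."""
    total = sum(ATOMIC_MASS[sym] * n for sym, n in formula.items())
    return 100.0 * ATOMIC_MASS[element] * formula.get(element, 0) / total

# Illustrative compounds (assumptions of this sketch, not from the text).
tripalmitin = {"C": 51, "H": 98, "O": 6}            # a representative fat
glucose = {"C": 6, "H": 12, "O": 6}                 # a representative carbohydrate
alanine_residue = {"C": 3, "H": 5, "N": 1, "O": 1}  # one -NH-CH(CH3)-CO- unit

print(f"tripalmitin:     {mass_percent(tripalmitin, 'C'):.1f}% C")
print(f"glucose:         {mass_percent(glucose, 'O'):.1f}% O")
print(f"alanine residue: {mass_percent(alanine_residue, 'N'):.1f}% N, "
      f"{mass_percent(alanine_residue, 'O'):.1f}% O")
```

For these particular examples the results fall within the ranges stated above; amino acid residues with larger hydrocarbon side chains would give lower nitrogen percentages.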
2. Natural α-Amino Acids
Hydrolysis of proteins by boiling aqueous acid or base yields an assortment of small molecules identified as α-aminocarboxylic acids. More than twenty such components have been isolated, and the most common of these are listed in the following table. Those amino acids having green colored names are essential diet components, since they are not synthesized by human metabolic processes. The best food source of these nutrients is protein, but it is important to recognize that not all proteins have equal nutritional value. For example, peanuts have a higher weight content of protein than fish or eggs, but the proportion of essential amino acids in peanut protein is only a third of that from the two other sources. For reasons that will become evident when discussing the structures of proteins and peptides, each amino acid is assigned a one or three letter abbreviation.
Some common features of these amino acids should be noted. With the exception of proline, they are all 1º-amines; and with the exception of glycine, they are all chiral. The configurations of the chiral amino acids are the same when written as a Fischer projection formula, as in the drawing on the right, and this was defined as the L-configuration by Fischer. The R-substituent in this structure is the remaining structural component that varies from one amino acid to another, and in proline R is a three-carbon chain that joins the nitrogen to the alpha-carbon in a five-membered ring. Applying the Cahn-Ingold-Prelog notation, all these natural chiral amino acids, with the exception of cysteine, have an S-configuration.
For the first seven compounds in the left column the R-substituent is a hydrocarbon. The last three entries in the left column have hydroxyl functional groups, and the first two amino acids in the right column incorporate thiol and sulfide groups respectively. Lysine and arginine have basic amine functions in their side-chains; histidine and tryptophan have less basic nitrogen heterocyclic rings as substituents. Finally, carboxylic acid side-chains are substituents on aspartic and glutamic acid, and the last two compounds in the right column are their corresponding amides.
The formulas for the amino acids written above are simple covalent bond representations based upon previous understanding of mono-functional analogs. The formulas are in fact incorrect. This is evident from a comparison of the physical properties listed in the following table. All four compounds in the table are roughly the same size, and all have moderate to excellent water solubility. The first two are simple carboxylic acids, and the third is an amino alcohol. All three compounds are soluble in organic solvents (e.g. ether) and have relatively low melting points. The carboxylic acids have pKa's near 4.5, and the conjugate acid of the amine has a pKa of 10. The simple amino acid alanine is the last entry. By contrast, it is very high melting (with decomposition), insoluble in organic solvents, and a million times weaker as an acid than ordinary carboxylic acids.
These differences all point to internal salt formation by a proton transfer from the acidic carboxyl function to the basic amino group. The resulting ammonium carboxylate structure, commonly referred to as a zwitterion, is also supported by the spectroscopic characteristics of alanine.
As expected from its ionic character, the alanine zwitterion is high melting, insoluble in nonpolar solvents and has the acid strength of a 1º-ammonium ion. Note that in lysine the amine function farthest from the carboxyl group is more basic than the alpha-amine. Consequently, the positively charged ammonium moiety formed at the chain terminus is attracted to the negative carboxylate, resulting in a coiled conformation.
Since amino acids, as well as peptides and proteins, incorporate both acidic and basic functional groups, the predominant molecular species present in an aqueous solution will depend on the pH of the solution. In order to determine the nature of the molecular and ionic species that are present in aqueous solutions at different pH's, we make use of the Henderson-Hasselbalch equation: pH = pKa + log([A(-)]/[HA]). Here, the pKa represents the acidity of a specific conjugate acid function (HA). When the pH of the solution equals pKa, the concentrations of HA and A(-) must be equal (log 1 = 0).
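As a minimal sketch of how this relationship is used, the following Python functions rearrange pH = pKa + log([A(-)]/[HA]) to give the ratio of conjugate base to acid, and from it the fraction of the acid that is deprotonated. The function names are hypothetical, and the pKa of 4.5 is the value cited earlier for a simple carboxylic acid.

```python
def base_to_acid_ratio(pH, pKa):
    """[A(-)]/[HA], from pH = pKa + log10([A(-)]/[HA])."""
    return 10.0 ** (pH - pKa)

def fraction_deprotonated(pH, pKa):
    """Fraction of the acid present as its conjugate base A(-)."""
    ratio = base_to_acid_ratio(pH, pKa)
    return ratio / (1.0 + ratio)

PKA_CARBOXYLIC_ACID = 4.5  # typical simple carboxylic acid, as cited earlier
for pH in (2.5, 4.5, 6.5):
    frac = fraction_deprotonated(pH, PKA_CARBOXYLIC_ACID)
    print(f"pH {pH}: {100.0 * frac:.1f}% present as A(-)")
```

Each pH unit away from the pKa changes the ratio by a factor of ten, so two units above the pKa the acid is about 99% ionized, and two units below it is about 99% protonated.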
The titration curve for alanine, shown below, demonstrates this relationship. At a pH lower than 2, both the carboxylate and amine functions are protonated, so the alanine molecule has a net positive charge. At a pH greater than 10, the amine exists as a neutral base and the carboxyl as its conjugate base, so the alanine molecule has a net negative charge. At intermediate pH's the zwitterion concentration increases, and at a characteristic pH, called the isoelectric point (pI), the negatively and positively charged molecular species are present in equal concentration. This behavior is general for simple (difunctional) amino acids. Starting from a fully protonated state, the pKa's of the acidic functions range from 1.8 to 2.4 for -CO2H, and 8.8 to 9.7 for -NH3(+). The isoelectric points range from 5.5 to 6.2. Titration curves show the neutralization of these acids by added base, and the change in pH during the titration.
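The charge behavior described here can be approximated numerically. The sketch below treats the two acidic functions of alanine as independent and applies the Henderson-Hasselbalch relationship to each, using the pKa values quoted for alanine in this document (2.34 for -CO2H and 9.69 for -NH3(+)). The function name is hypothetical, and treating the groups as independent is a simplifying assumption.

```python
def net_charge_simple(pH, pKa_carboxyl, pKa_ammonium):
    """Average net charge of a simple (difunctional) amino acid at a given pH."""
    # Fraction of -CO2H groups that have lost their proton (each contributes -1).
    carboxylate = 1.0 / (1.0 + 10.0 ** (pKa_carboxyl - pH))
    # Fraction of -NH3(+) groups that still hold their proton (each contributes +1).
    ammonium = 1.0 / (1.0 + 10.0 ** (pH - pKa_ammonium))
    return ammonium - carboxylate

PKA_CARBOXYL, PKA_AMMONIUM = 2.34, 9.69  # alanine
for pH in (1.0, 2.34, 6.0, 9.69, 12.0):
    charge = net_charge_simple(pH, PKA_CARBOXYL, PKA_AMMONIUM)
    print(f"pH {pH:5.2f}: average net charge {charge:+.3f}")
```

The average charge is close to +1 at low pH, passes through +0.5 and -0.5 at the two pKa's, is essentially zero over a broad intermediate range, and approaches -1 at high pH.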
The distribution of charged species in a sample can be shown experimentally by observing the movement of solute molecules in an electric field, using the technique of electrophoresis. For such experiments an ionic buffer solution is incorporated in a solid matrix layer, composed of paper or a crosslinked gelatin-like substance. A small amount of the amino acid, peptide or protein sample is placed near the center of the matrix strip and an electric potential is applied at the ends of the strip, as shown in the following diagram. The solid structure of the matrix retards the diffusion of the solute molecules, which will remain where they are inserted, unless acted upon by the electrostatic potential. In the example shown here, four different amino acids are examined simultaneously in a pH 6.00 buffered medium. Note that the colors in the display are only a convenient reference, since these amino acids are colorless.
At pH 6.00 alanine and isoleucine exist on average as neutral zwitterionic molecules, and are not influenced by the electric field. Arginine is a basic amino acid. Both base functions exist as "onium" conjugate acids in the pH 6.00 matrix. The solute molecules of arginine therefore carry an excess positive charge, and they move toward the cathode. The two carboxyl functions in aspartic acid are both ionized at pH 6.00, and the negatively charged solute molecules move toward the anode in the electric field. Structures for all these species are shown to the right of the display.
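The outcome at pH 6.00 can be estimated with a short calculation. The Python sketch below sums the average charge contributed by each acidic function, treating the groups as independent. The pKa values for alanine, the two carboxyl functions of aspartic acid, and the alpha-ammonium and guanidinium functions of arginine are those quoted in this document; the remaining values (isoleucine, the ammonium group of aspartic acid, the carboxyl group of arginine) are typical textbook figures assumed for this sketch. The 0.05 charge threshold used to decide whether a compound moves is likewise an arbitrary assumption.

```python
def net_charge(pH, groups):
    """Average net charge; `groups` is a list of (pKa, charge_when_protonated)."""
    total = 0.0
    for pKa, charge_protonated in groups:
        # Fraction of this group that has lost its proton at the given pH.
        deprotonated = 1.0 / (1.0 + 10.0 ** (pKa - pH))
        total += charge_protonated - deprotonated
    return total

def migration(charge, threshold=0.05):
    """Direction of movement in the field; the threshold is an assumed cutoff."""
    if charge > threshold:
        return "cathode"
    if charge < -threshold:
        return "anode"
    return "stays at origin"

# (pKa, charge of the protonated form): 0 for -CO2H, +1 for "onium" acids.
AMINO_ACIDS = {
    "alanine":       [(2.34, 0), (9.69, 1)],
    "isoleucine":    [(2.36, 0), (9.60, 1)],            # assumed typical values
    "arginine":      [(2.17, 0), (9.0, 1), (12.5, 1)],  # 2.17 is assumed
    "aspartic acid": [(2.1, 0), (3.9, 0), (9.8, 1)],    # 9.8 is assumed
}

for name, groups in AMINO_ACIDS.items():
    charge = net_charge(6.00, groups)
    print(f"{name:14s} net charge {charge:+.2f} -> {migration(charge)}")
```

With these values, alanine and isoleucine carry essentially no average charge, arginine carries close to one positive charge, and aspartic acid close to one negative charge, in agreement with the behavior described above.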
It should be clear that the result of this experiment is critically dependent on the pH of the matrix buffer. If we were to repeat the electrophoresis of these compounds at a pH of 3.80, the aspartic acid would remain at its point of origin, and the other amino acids would move toward the cathode. Ignoring differences in molecular size and shape, the arginine would move twice as fast as the alanine and isoleucine because its solute molecules on average would carry a double positive charge.
As noted earlier, the titration curves of simple amino acids display two inflection points, one due to the strongly acidic carboxyl group (pKa1 = 1.8 to 2.4), and the other for the less acidic ammonium function (pKa2 = 8.8 to 9.7). For the 2º-amino acid proline, pKa2 is 10.6, reflecting the greater basicity of 2º-amines.
Some amino acids have additional acidic or basic functions in their side chains. These compounds are listed in the table on the right. A third pKa, representing the acidity or basicity of the extra function, is listed in the fourth column of the table. The pI's of these amino acids (last column) are often very different from those noted above for the simpler members. As expected, such compounds display three inflection points in their titration curves, illustrated by the titrations of arginine and aspartic acid shown below. For each of these compounds four differently protonated species are possible, one of which has no overall charge. Formulas for these species are written to the right of the titration curves, together with the pH at which each is expected to predominate. The very high pH required to remove the last acidic proton from arginine reflects the exceptionally high basicity of the guanidine moiety at the end of the side chain.
3. The Isoelectric Point
As defined above, the isoelectric point, pI, is the pH of an aqueous solution of an amino acid (or peptide) at which the molecules on average have no net charge. In other words, the positively charged groups are exactly balanced by the negatively charged groups. For simple amino acids such as alanine, the pI is an average of the pKa's of the carboxyl (2.34) and ammonium (9.69) groups. Thus, the pI for alanine is calculated to be: (2.34 + 9.69)/2 = 6.02, the experimentally determined value. If additional acidic or basic groups are present as side-chain functions, the pI is the average of the pKa's of the two most similar acids. To assist in determining similarity we define two classes of acids. The first consists of acids that are neutral in their protonated form (e.g. CO2H & SH). The second includes acids that are positively charged in their protonated state (e.g. -NH3+). In the case of aspartic acid, the similar acids are the alpha-carboxyl function (pKa = 2.1) and the side-chain carboxyl function (pKa = 3.9), so pI = (2.1 + 3.9)/2 = 3.0. For arginine, the similar acids are the guanidinium species on the side-chain (pKa = 12.5) and the alpha-ammonium function (pKa = 9.0), so the calculated pI = (12.5 + 9.0)/2 = 10.75.
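The pI estimates in this section can be reproduced with a few lines of Python. Rather than classifying acids by similarity, this sketch uses a different but related formulation: it sorts the pKa's and averages the two that flank the species with no net charge. For the three examples in this section this gives the same pairs as the similarity rule. The function name is hypothetical, and the pKa values not quoted above (9.8 for the ammonium group of aspartic acid, 2.17 for the carboxyl group of arginine) are typical textbook figures assumed here; neither affects the result.

```python
def isoelectric_point(groups):
    """Estimate pI from a list of (pKa, charge_when_protonated) pairs.

    The fully protonated molecule carries one positive charge per cationic
    acid (-NH3(+), guanidinium). Removing protons in order of increasing pKa
    lowers the charge by one each time, so the neutral species is bounded by
    the pKa's at positions n-1 and n of the sorted list, where n is the
    number of cationic acids.
    """
    pKas = sorted(pKa for pKa, _ in groups)
    n_cationic = sum(charge for _, charge in groups)
    return (pKas[n_cationic - 1] + pKas[n_cationic]) / 2.0

alanine = [(2.34, 0), (9.69, 1)]
aspartic_acid = [(2.1, 0), (3.9, 0), (9.8, 1)]   # 9.8 is an assumed value
arginine = [(2.17, 0), (9.0, 1), (12.5, 1)]      # 2.17 is an assumed value

for name, groups in (("alanine", alanine),
                     ("aspartic acid", aspartic_acid),
                     ("arginine", arginine)):
    print(f"{name:14s} pI = {isoelectric_point(groups):.3f}")
```

The sketch assumes every molecule has at least one cationic acid and at least one neutral acid, which holds for all the amino acids discussed here.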
4. Other Natural Amino Acids
The twenty alpha-amino acids listed above are the primary components of proteins, their incorporation being governed by the genetic code. Many other naturally occurring amino acids exist, and the structures of a few of these are displayed below. Some, such as hydroxylysine and hydroxyproline, are simply functionalized derivatives of a previously described compound. These two amino acids are found chiefly in collagen, a common structural protein. Homoserine and homocysteine are higher homologs of their namesakes. The amino group in beta-alanine has moved to the end of the three-carbon chain. It is a component of pantothenic acid, HOCH2C(CH3)2CH(OH)CONHCH2CH2CO2H, a member of the vitamin B complex and an essential nutrient. Acetyl coenzyme A is a pyrophosphorylated derivative of a pantothenic acid amide. The gamma-amino homolog GABA is an inhibitory neurotransmitter and antihypertensive agent.
Many unusual amino acids, including D-enantiomers of some common acids, are produced by microorganisms. These include ornithine, which is a component of the antibiotic bacitracin A, and statine, found as part of a pentapeptide that inhibits the action of the digestive enzyme pepsin.