Proteinogenic amino acid
Proteinogenic amino acids are amino acids that are found in proteins and whose isolated production requires cellular machinery coded for in the genetic code [1] of an organism. There are 22 standard amino acids, but only 21 are found in eukaryotes. Of the 22, 20 are directly encoded by the universal genetic code. Humans can synthesize 11 of these 20 from each other or from other molecules of intermediary metabolism. The other 9 must be consumed in the diet and are therefore called essential amino acids: histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine. The remaining two, selenocysteine and pyrrolysine, are incorporated into proteins by unique synthetic mechanisms.
The word proteinogenic means "protein building". Proteinogenic amino acids can be assembled into a polypeptide (the subunit of a protein) through a process called translation (the second stage of protein biosynthesis, part of the overall process of gene expression).
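As a rough illustration of translation as a codon-by-codon lookup, here is a minimal Python sketch. The `CODON_TABLE` dictionary and the `translate` function are hypothetical names introduced for this example. The table holds only a handful of the codons from the codon table later in this article, so any codon missing from it is reported as X (unidentified). It ignores start-codon scanning, reading frames, and the context-dependent recoding of UAG and UGA.

```python
# Partial RNA codon table, for illustration only (a few entries from the full table).
CODON_TABLE = {
    "AUG": "M",                           # methionine
    "GCU": "A", "GCC": "A",               # alanine
    "UGG": "W",                           # tryptophan
    "AAA": "K", "AAG": "K",               # lysine
    "UUU": "F", "UUC": "F",               # phenylalanine
    "UAA": "*", "UAG": "*", "UGA": "*",   # stop codons
}


def translate(mrna: str) -> str:
    """Read an mRNA string three bases at a time until a stop codon or the end."""
    peptide = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        residue = CODON_TABLE.get(mrna[i:i + 3], "X")  # X: codon not in this partial table
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


print(translate("AUGGCUUGGAAAUAA"))  # MAWK
```

A complete implementation would carry all 64 codons and would need extra context (for example a SECIS or PYLIS element) to decide when UGA or UAG encodes an amino acid rather than a stop.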
In contrast, non-proteinogenic amino acids are either not found in proteins (like carnitine, GABA, or L-DOPA), or are not produced directly and in isolation by standard cellular machinery (like hydroxyproline and selenomethionine). The latter often result from posttranslational modification of proteins.
There are clear reasons why organisms have not evolved to incorporate certain non-proteinogenic amino acids into proteins: for example, ornithine and homoserine cyclize against the peptide backbone and fragment the protein with relatively short half-lives, while others are toxic because they can be mistakenly incorporated into proteins, such as the arginine analog canavanine.
Non-proteinogenic amino acids are found in nonribosomal peptides, which are not produced by the ribosome during translation.
Structures
The following illustrates the structures and abbreviations of the 21 amino acids that are directly encoded for protein synthesis by the genetic code of eukaryotes. The structures given below are standard chemical structures, not the typical zwitterion forms that exist in aqueous solutions.
- L-Alanine (Ala / A)
- L-Arginine (Arg / R)
- L-Asparagine (Asn / N)
- L-Aspartic acid (Asp / D)
- L-Cysteine (Cys / C)
- L-Glutamic acid (Glu / E)
- L-Glutamine (Gln / Q)
- Glycine (Gly / G)
- L-Histidine (His / H)
- L-Isoleucine (Ile / I)
- L-Leucine (Leu / L)
- L-Lysine (Lys / K)
- L-Methionine (Met / M)
- L-Phenylalanine (Phe / F)
- L-Proline (Pro / P)
- L-Serine (Ser / S)
- L-Threonine (Thr / T)
- L-Tryptophan (Trp / W)
- L-Tyrosine (Tyr / Y)
- L-Valine (Val / V)
IUPAC/IUBMB now also recommends standard abbreviations for the following two amino acids:
- L-Selenocysteine (Sec / U)
- L-Pyrrolysine (Pyl / O)
Non-specific abbreviations
Sometimes the specific identity of an amino acid cannot be determined unambiguously. Certain protein sequencing techniques do not distinguish among certain pairs. Thus, the following codes are used:
- Asx (B) is "asparagine or aspartic acid"
- Glx (Z) is "glutamic acid or glutamine"
- Xle (J) is "leucine or isoleucine"
In addition, the symbol X is used to indicate an amino acid that is completely unidentified.
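The ambiguity codes can be treated as small sets of candidate residues. The sketch below (the names `AMBIGUITY` and `expand` are made up for this example) enumerates every concrete sequence consistent with a sequence containing B, Z, or J. The symbol X is left untouched, since it stands for a completely unidentified residue rather than a specific pair.

```python
from itertools import product

# Ambiguity codes and the one-letter residues each may stand for.
AMBIGUITY = {
    "B": "DN",  # Asx: aspartic acid or asparagine
    "Z": "EQ",  # Glx: glutamic acid or glutamine
    "J": "IL",  # Xle: isoleucine or leucine
}


def expand(sequence: str) -> list[str]:
    """List all concrete sequences consistent with the ambiguity codes B, Z, and J."""
    choices = [AMBIGUITY.get(symbol, symbol) for symbol in sequence]
    return ["".join(combo) for combo in product(*choices)]


print(expand("ABZ"))  # ['ADE', 'ADQ', 'ANE', 'ANQ']
```

The number of expansions doubles with each ambiguous position, so for long sequences it is usually better to keep the ambiguity codes and compare position by position.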
Chemical properties
Following is a table listing the one-letter symbols, the three-letter symbols, and the chemical properties of the side-chains of the standard amino acids. The masses listed are based on weighted averages of the elemental isotopes at their natural abundances. Note that forming a peptide bond results in elimination of a molecule of water, so the mass of an amino acid unit within a protein chain is reduced by 18.01524 Da.
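The water correction can be checked with a few lines of Python. This is a small sketch using three average masses from the table in this section and the 18.01524 Da figure quoted above. The names `WATER_AVG`, `FREE_AVG_MASS`, and `residue_mass` are introduced only for illustration.

```python
WATER_AVG = 18.01524  # Da, average mass of the water lost per peptide bond

# Average masses of free amino acids (Da), taken from the table in this section.
FREE_AVG_MASS = {
    "Ala": 89.09404,
    "Gly": 75.06714,
    "Ser": 105.09344,
}


def residue_mass(abbrev: str) -> float:
    """Average mass of an amino acid unit once it is part of a chain."""
    return FREE_AVG_MASS[abbrev] - WATER_AVG


for name in FREE_AVG_MASS:
    print(name, round(residue_mass(name), 4))
```

To four decimal places the results (71.0788, 57.0519, and 87.0782 Da) agree with the average residue masses listed in the mass spectrometry section.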
General chemical properties
| Amino Acid | Short | Abbrev. | Avg. Mass (Da) | pI | pK1 (α-COOH) | pK2 (α-NH3+) |
|---|---|---|---|---|---|---|
| Alanine | A | Ala | 89.09404 | 6.01 | 2.35 | 9.87 |
| Cysteine | C | Cys | 121.15404 | 5.05 | 1.92 | 10.70 |
| Aspartic acid | D | Asp | 133.10384 | 2.85 | 1.99 | 9.90 |
| Glutamic acid | E | Glu | 147.13074 | 3.15 | 2.10 | 9.47 |
| Phenylalanine | F | Phe | 165.19184 | 5.49 | 2.20 | 9.31 |
| Glycine | G | Gly | 75.06714 | 6.06 | 2.35 | 9.78 |
| Histidine | H | His | 155.15634 | 7.60 | 1.80 | 9.33 |
| Isoleucine | I | Ile | 131.17464 | 6.05 | 2.32 | 9.76 |
| Lysine | K | Lys | 146.18934 | 9.60 | 2.16 | 9.06 |
| Leucine | L | Leu | 131.17464 | 6.01 | 2.33 | 9.74 |
| Methionine | M | Met | 149.20784 | 5.74 | 2.13 | 9.28 |
| Asparagine | N | Asn | 132.11904 | 5.41 | 2.14 | 8.72 |
| Pyrrolysine | O | Pyl | | | | |
| Proline | P | Pro | 115.13194 | 6.30 | 1.95 | 10.64 |
| Glutamine | Q | Gln | 146.14594 | 5.65 | 2.17 | 9.13 |
| Arginine | R | Arg | 174.20274 | 10.76 | 1.82 | 8.99 |
| Serine | S | Ser | 105.09344 | 5.68 | 2.19 | 9.21 |
| Threonine | T | Thr | 119.12034 | 5.60 | 2.09 | 9.10 |
| Selenocysteine | U | Sec | 168.053 | 5.47 | | |
| Valine | V | Val | 117.14784 | 6.00 | 2.39 | 9.74 |
| Tryptophan | W | Trp | 204.22844 | 5.89 | 2.46 | 9.41 |
| Tyrosine | Y | Tyr | 181.19124 | 5.64 | 2.20 | 9.21 |

Side chain properties
| Amino Acid | Short | Abbrev. | Side chain | Hydrophobic | pKa | Polar | pH | Small | Tiny | Aromatic or Aliphatic | van der Waals volume |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Alanine | A | Ala | -CH3 | X | - | - | - | X | X | - | 67 |
| Cysteine | C | Cys | -CH2SH | X | 8.18 | - | acidic | X | - | - | 86 |
| Aspartic acid | D | Asp | -CH2COOH | - | 3.90 | X | acidic | X | - | - | 91 |
| Glutamic acid | E | Glu | -CH2CH2COOH | - | 4.07 | X | acidic | - | - | - | 109 |
| Phenylalanine | F | Phe | -CH2C6H5 | X | - | - | - | - | - | Aromatic | 135 |
| Glycine | G | Gly | -H | X | - | - | - | X | X | - | 48 |
| Histidine | H | His | -CH2-C3H3N2 | - | 6.04 | X | weak basic | - | - | Aromatic | 118 |
| Isoleucine | I | Ile | -CH(CH3)CH2CH3 | X | - | - | - | - | - | Aliphatic | 124 |
| Lysine | K | Lys | -(CH2)4NH2 | - | 10.54 | X | basic | - | - | - | 135 |
| Leucine | L | Leu | -CH2CH(CH3)2 | X | - | - | - | - | - | Aliphatic | 124 |
| Methionine | M | Met | -CH2CH2SCH3 | X | - | - | - | - | - | - | 124 |
| Asparagine | N | Asn | -CH2CONH2 | - | - | X | - | X | - | - | 96 |
| Pyrrolysine | O | Pyl | | | | | | | | | |
| Proline | P | Pro | -CH2CH2CH2- | X | - | - | - | X | - | - | 90 |
| Glutamine | Q | Gln | -CH2CH2CONH2 | - | - | X | - | - | - | - | 114 |
| Arginine | R | Arg | -(CH2)3NH-C(NH)NH2 | - | 12.48 | X | strongly basic | - | - | - | 148 |
| Serine | S | Ser | -CH2OH | - | - | X | - | X | X | - | 73 |
| Threonine | T | Thr | -CH(OH)CH3 | - | - | X | weak acidic | X | - | - | 93 |
| Selenocysteine | U | Sec | -CH2SeH | X | 5.73 | - | - | X | - | - | |
| Valine | V | Val | -CH(CH3)2 | X | - | - | - | X | - | Aliphatic | 105 |
| Tryptophan | W | Trp | -CH2C8H6N | X | - | - | - | - | - | Aromatic | 163 |
| Tyrosine | Y | Tyr | -CH2-C6H4OH | - | 10.46 | X | - | - | - | Aromatic | 141 |

Note: The pKa values of amino acids are typically slightly different when the amino acid is inside a protein. Protein pKa calculations are sometimes used to calculate the change in the pKa value of an amino acid in this situation.
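To see what a side-chain pKa implies at a given pH, one can apply the Henderson–Hasselbalch relation. This relation is not stated in the article and is brought in here as an assumption, and the function name `fraction_protonated` is hypothetical. The sketch uses the free amino acid pKa values from the table, which, as the note above says, may shift inside a protein.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of an ionizable group in its protonated form at a given pH.

    Assumes ideal Henderson-Hasselbalch behaviour for a single, independent
    group; real protein environments can shift the effective pKa.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


HIS_SIDE_CHAIN_PKA = 6.04  # histidine side chain, from the table above

for ph in (5.0, 6.04, 7.4):
    share = fraction_protonated(HIS_SIDE_CHAIN_PKA, ph)
    print(f"pH {ph}: {share:.2f} protonated")
```

With these table values the histidine side chain comes out mostly protonated at pH 5 and mostly deprotonated at pH 7.4. This matches the qualitative remark later in the article that histidine changes protonation state in even slightly acidic conditions.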
Gene expression and biochemistry
| Amino Acid | Short | Abbrev. | Codon(s) | Occurrence in human proteins (%) | Essential‡ in humans |
|---|---|---|---|---|---|
| Alanine | A | Ala | GCU, GCC, GCA, GCG | 7.8 | - |
| Cysteine | C | Cys | UGU, UGC | 1.9 | Conditionally |
| Aspartic acid | D | Asp | GAU, GAC | 5.3 | - |
| Glutamic acid | E | Glu | GAA, GAG | 6.3 | Conditionally |
| Phenylalanine | F | Phe | UUU, UUC | 3.9 | Yes |
| Glycine | G | Gly | GGU, GGC, GGA, GGG | 7.2 | Conditionally |
| Histidine | H | His | CAU, CAC | 2.3 | Yes |
| Isoleucine | I | Ile | AUU, AUC, AUA | 5.3 | Yes |
| Lysine | K | Lys | AAA, AAG | 5.9 | Yes |
| Leucine | L | Leu | UUA, UUG, CUU, CUC, CUA, CUG | 9.1 | Yes |
| Methionine | M | Met | AUG | 2.3 | Yes |
| Asparagine | N | Asn | AAU, AAC | 4.3 | - |
| Pyrrolysine | O | Pyl | UAG* | | - |
| Proline | P | Pro | CCU, CCC, CCA, CCG | 5.2 | - |
| Glutamine | Q | Gln | CAA, CAG | 4.2 | - |
| Arginine | R | Arg | CGU, CGC, CGA, CGG, AGA, AGG | 5.1 | Conditionally |
| Serine | S | Ser | UCU, UCC, UCA, UCG, AGU, AGC | 6.8 | - |
| Threonine | T | Thr | ACU, ACC, ACA, ACG | 5.9 | Yes |
| Selenocysteine | U | Sec | UGA** | | - |
| Valine | V | Val | GUU, GUC, GUA, GUG | 6.6 | Yes |
| Tryptophan | W | Trp | UGG | 1.4 | Yes |
| Tyrosine | Y | Tyr | UAU, UAC | 3.2 | Conditionally |
| Stop codon† | - | Term | UAA, UAG, UGA | - | - |

* UAG is normally the amber stop codon, but encodes pyrrolysine if a PYLIS element is present.
** UGA is normally the opal (or umber) stop codon, but encodes selenocysteine if a SECIS element is present.
† The stop codon is not an amino acid, but is included for completeness.
‡ An essential amino acid cannot be synthesized in humans and must, therefore, be supplied in the diet. Conditionally essential amino acids are not normally required in the diet, but must be supplied exogenously to specific populations that do not synthesize them in adequate amounts.

Mass spectrometry
In mass spectrometry of peptides and proteins, it is useful to know the masses of the residues. The mass of the peptide or protein is the sum of the residue masses plus the mass of water.[2]
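The rule above (sum of residue masses plus one water) can be sketched in Python as follows. The residue masses are copied from four rows of the table in this section. The function name `peptide_mass` and the choice of residues are illustrative only. The average mass of water (18.01524 Da) is the figure quoted earlier in this article, whereas the monoisotopic mass of water (18.01056 Da) is a standard value that does not appear in the article.

```python
# (monoisotopic, average) residue masses in Da, from the table in this section.
RESIDUE_MASS = {
    "G": (57.02146, 57.0519),
    "A": (71.03711, 71.0788),
    "S": (87.03203, 87.0782),
    "K": (128.09496, 128.1741),
}

WATER_MONO = 18.01056  # Da, standard value; not stated in the article
WATER_AVG = 18.01524   # Da, as quoted in the article


def peptide_mass(sequence: str, monoisotopic: bool = True) -> float:
    """Mass of a linear, unmodified peptide: residue masses plus one water."""
    column = 0 if monoisotopic else 1
    water = WATER_MONO if monoisotopic else WATER_AVG
    return sum(RESIDUE_MASS[residue][column] for residue in sequence) + water


print(round(peptide_mass("GASK"), 4))                      # monoisotopic
print(round(peptide_mass("GASK", monoisotopic=False), 4))  # average
```

This gives the neutral mass only. Charge state, adducts, and modifications, which matter when matching observed m/z values, are left out of the sketch.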
| Amino Acid | Short | Abbrev. | Formula | Monoisotopic mass (Da) | Avg. Mass (Da) |
|---|---|---|---|---|---|
| Alanine | A | Ala | C3H5NO | 71.03711 | 71.0788 |
| Cysteine | C | Cys | C3H5NOS | 103.00919 | 103.1388 |
| Aspartic acid | D | Asp | C4H5NO3 | 115.02694 | 115.0886 |
| Glutamic acid | E | Glu | C5H7NO3 | 129.04259 | 129.1155 |
| Phenylalanine | F | Phe | C9H9NO | 147.06841 | 147.1766 |
| Glycine | G | Gly | C2H3NO | 57.02146 | 57.0519 |
| Histidine | H | His | C6H7N3O | 137.05891 | 137.1411 |
| Isoleucine | I | Ile | C6H11NO | 113.08406 | 113.1594 |
| Lysine | K | Lys | C6H12N2O | 128.09496 | 128.1741 |
| Leucine | L | Leu | C6H11NO | 113.08406 | 113.1594 |
| Methionine | M | Met | C5H9NOS | 131.04049 | 131.1986 |
| Asparagine | N | Asn | C4H6N2O2 | 114.04293 | 114.1039 |
| Pyrrolysine | O | Pyl | C12H21N3O3 | 255.15829 | 255.3172 |
| Proline | P | Pro | C5H7NO | 97.05276 | 97.1167 |
| Glutamine | Q | Gln | C5H8N2O2 | 128.05858 | 128.1307 |
| Arginine | R | Arg | C6H12N4O | 156.10111 | 156.1875 |
| Serine | S | Ser | C3H5NO2 | 87.03203 | 87.0782 |
| Threonine | T | Thr | C4H7NO2 | 101.04768 | 101.1051 |
| Selenocysteine | U | Sec | C3H5NOSe | 150.95364 | 150.0388 |
| Valine | V | Val | C5H9NO | 99.06841 | 99.1326 |
| Tryptophan | W | Trp | C11H10N2O | 186.07931 | 186.2132 |
| Tyrosine | Y | Tyr | C9H9NO2 | 163.06333 | 163.1760 |

Stoichiometry and metabolic cost in cell
The following table lists the abundance of amino acids in an E. coli cell and the metabolic cost (in ATP) of synthesizing each amino acid. Negative numbers indicate that the metabolic process is energetically favorable and does not cost the cell net ATP.[3] Note that the abundance figures include amino acids in both free form and polymerized form (proteins).
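As a small worked example of how such costs add up, the sketch below sums the synthesis cost of the amino acids in a short hypothetical peptide. It uses five rows from the table in this section, keyed by one-letter code. The names `ATP_COST` and `synthesis_cost` are introduced for illustration. Only amino acid synthesis is counted, not the cost of polymerization into protein.

```python
# (aerobic, anaerobic) ATP cost per molecule, copied from the table in this section.
ATP_COST = {
    "M": (21, 23),  # methionine
    "A": (-1, 1),   # alanine
    "G": (-2, 2),   # glycine
    "K": (5, 9),    # lysine
    "W": (-7, 7),   # tryptophan
}


def synthesis_cost(sequence: str, aerobic: bool = True) -> int:
    """Net ATP spent synthesizing the amino acids of a sequence."""
    column = 0 if aerobic else 1
    return sum(ATP_COST[residue][column] for residue in sequence)


print(synthesis_cost("MAGK"))                 # 23
print(synthesis_cost("MAGK", aerobic=False))  # 35
```

For the hypothetical peptide MAGK, the expensive methionine dominates the total under both conditions.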
| Amino acid | Abundance (number of molecules (×10^8) per E. coli cell) | ATP cost in synthesis under aerobic conditions | ATP cost in synthesis under anaerobic conditions |
|---|---|---|---|
| Alanine | 2.9 | -1 | 1 |
| Cysteine | 0.52 | 11 | 15 |
| Aspartic acid | 1.4 | 0 | 2 |
| Glutamic acid | 1.5 | -7 | -1 |
| Phenylalanine | 1.1 | -6 | 2 |
| Glycine | 3.5 | -2 | 2 |
| Histidine | 0.54 | 1 | 7 |
| Isoleucine | 1.7 | 7 | 11 |
| Lysine | 2.0 | 5 | 9 |
| Leucine | 2.6 | -9 | 1 |
| Methionine | 0.88 | 21 | 23 |
| Asparagine | 1.4 | 3 | 5 |
| Proline | 1.3 | -2 | 4 |
| Glutamine | 1.5 | -6 | 0 |
| Arginine | 1.7 | 5 | 13 |
| Serine | 1.2 | -2 | 2 |
| Threonine | 1.5 | 6 | 8 |
| Tryptophan | 0.33 | -7 | 7 |
| Tyrosine | 0.79 | -8 | 2 |
| Valine | 2.4 | -2 | 2 |

Remarks
| Amino Acid | Short | Abbrev. | Remarks |
|---|---|---|---|
| Alanine | A | Ala | Very abundant, very versatile. Stiffer than glycine, but small enough to pose only small steric limits for the protein conformation. It behaves fairly neutrally, and can be located in both hydrophilic regions on the protein outside and the hydrophobic areas inside. |
| Asparagine or aspartic acid | B | Asx | A placeholder when either amino acid may occupy a position. |
| Cysteine | C | Cys | The sulfur atom bonds readily to heavy metal ions. Under oxidizing conditions, two cysteines can join together in a disulfide bond to form the amino acid cystine. When cystines are part of a protein, insulin for example, the tertiary structure is stabilized, which makes the protein more resistant to denaturation; therefore, disulfide bonds are common in proteins that have to function in harsh environments, including digestive enzymes (e.g., pepsin and chymotrypsin) and structural proteins (e.g., keratin). Disulfides are also found in peptides too small to hold a stable shape on their own (e.g., insulin). |
| Aspartic acid | D | Asp | Behaves similarly to glutamic acid. Carries a hydrophilic acidic group with strong negative charge. Usually located on the outer surface of the protein, making it water-soluble. Binds to positively charged molecules and ions, and is often used in enzymes to fix the metal ion. When located inside the protein, aspartate and glutamate are usually paired with arginine and lysine. |
| Glutamic acid | E | Glu | Behaves similarly to aspartic acid. Has a longer, slightly more flexible side chain. |
| Phenylalanine | F | Phe | Essential for humans. Phenylalanine, tyrosine, and tryptophan contain a large rigid aromatic group on the side chain. These are the biggest amino acids. Like isoleucine, leucine and valine, these are hydrophobic and tend to orient towards the interior of the folded protein molecule. Phenylalanine can be converted into tyrosine. |
| Glycine | G | Gly | Because of the two hydrogen atoms at the α carbon, glycine is not optically active. It is the smallest amino acid, rotates easily, and adds flexibility to the protein chain. It is able to fit into the tightest spaces, e.g., the triple helix of collagen. As too much flexibility is usually not desired, as a structural component it is less common than alanine. |
| Histidine | H | His | In even slightly acidic conditions protonation of the nitrogen occurs, changing the properties of histidine and the polypeptide as a whole. It is used by many proteins as a regulatory mechanism, changing the conformation and behavior of the polypeptide in acidic regions such as the late endosome or lysosome, enforcing conformation change in enzymes. However, only a few histidines are needed for this, so it is comparatively scarce. |
| Isoleucine | I | Ile | Essential for humans. Isoleucine, leucine and valine have large aliphatic hydrophobic side chains. Their molecules are rigid, and their mutual hydrophobic interactions are important for the correct folding of proteins, as these chains tend to be located inside the protein molecule. |
| Leucine or isoleucine | J | Xle | A placeholder when either amino acid may occupy a position. |
| Lysine | K | Lys | Essential for humans. Behaves similarly to arginine. Contains a long flexible side chain with a positively charged end. The flexibility of the chain makes lysine and arginine suitable for binding to molecules with many negative charges on their surfaces. For example, DNA-binding proteins have their active regions rich in arginine and lysine. The strong charge makes these two amino acids prone to be located on the outer hydrophilic surfaces of proteins; when they are found inside, they are usually paired with a corresponding negatively charged amino acid, e.g., aspartate or glutamate. |
| Leucine | L | Leu | Essential for humans. Behaves similarly to isoleucine and valine. See isoleucine. |
| Methionine | M | Met | Essential for humans. Always the first amino acid to be incorporated into a protein; sometimes removed after translation. Like cysteine, contains sulfur, but with a methyl group instead of hydrogen. This methyl group can be activated, and is used in many reactions where a new carbon atom is being added to another molecule. |
| Asparagine | N | Asn | Similar to aspartic acid. Asn contains an amide group where Asp has a carboxyl. |
| Pyrrolysine | O | Pyl | Similar to lysine, with a pyrroline ring attached. |
| Proline | P | Pro | Contains an unusual ring closed onto the N-terminal amine group, which forces the CO-NH amide sequence into a fixed conformation. Can disrupt protein folding structures like the α helix or β sheet, forcing the desired kink in the protein chain. Common in collagen, where it often undergoes a posttranslational modification to hydroxyproline. |
| Glutamine | Q | Gln | Similar to glutamic acid. Gln contains an amide group where Glu has a carboxyl. Used in proteins and as a storage for ammonia. The most abundant amino acid in the body. |
| Arginine | R | Arg | Functionally similar to lysine. |
| Serine | S | Ser | Serine and threonine have a short side chain ending in a hydroxyl group. Its hydrogen is easy to remove, so serine and threonine often act as hydrogen donors in enzymes. Both are very hydrophilic; therefore, the outer regions of soluble proteins tend to be rich in them. |
| Threonine | T | Thr | Essential for humans. Behaves similarly to serine. |
| Selenocysteine | U | Sec | Selenium analog of cysteine, in which selenium replaces sulfur. |
| Valine | V | Val | Essential for humans. Behaves similarly to isoleucine and leucine. See isoleucine. |
| Tryptophan | W | Trp | Essential for humans. Behaves similarly to phenylalanine and tyrosine (see phenylalanine). Precursor of serotonin. Naturally fluorescent. |
| Unknown | X | Xaa | Placeholder when the amino acid is unknown or unimportant. |
| Tyrosine | Y | Tyr | Behaves similarly to phenylalanine (precursor to tyrosine) and tryptophan (see phenylalanine). Precursor of melanin, epinephrine, and thyroid hormones. Naturally fluorescent, although fluorescence is usually quenched by energy transfer to tryptophans. |
| Glutamic acid or glutamine | Z | Glx | A placeholder when either amino acid may occupy a position. |

Catabolism
References
- [1] Ambrogelly A, Palioura S, Söll D (Jan 2007). "Natural expansion of the genetic code". Nat Chem Biol 3 (1): 29–35. doi:10.1038/nchembio847. PMID 17173027. http://www.nature.com/nchembio/journal/v3/n1/abs/nchembio847.html.
- [2] "The amino acid masses". ExPASy. http://education.expasy.org/student_projects/isotopident/htdocs/aa-list.html. Retrieved 2009-01-06.
- [3] Physical Biology of the Cell (Garland Science) p. 178
- [4] Ferrier, Denise R. Chapter 20 (Amino Acid Degradation and Synthesis) in: Lippincott's Illustrated Reviews: Biochemistry. Hagerstown, MD: Lippincott Williams & Wilkins. ISBN 0-7817-2265-9.
- Nelson, David L.; Cox, Michael M. (2000). Lehninger Principles of Biochemistry (3rd ed.). Worth Publishers. ISBN 1-57259-153-6.
- Kyte, J.; Doolittle, R. F. (1982). "A simple method for displaying the hydropathic character of a protein". J. Mol. Biol. 157 (1): 105–132. doi:10.1016/0022-2836(82)90515-0. PMID 7108955.
- Meierhenrich, Uwe J. (2008). Amino acids and the asymmetry of life (1st ed.). Springer. ISBN 978-3-540-76885-2.
Wikimedia Foundation. 2010.