Codes for amino acids

Posted: July 2, 2008, 12:22

It is well known that amino acids and nucleotides can be represented by letter codes, either three-letter or single-letter. While learning pairwise alignment algorithms a few days ago, I found a letter ‘J’ in NCBI’s scoring matrix, a single-letter amino acid code that I had never seen before. I posted the question to a newsgroup and got an answer the next day. Thanks to Hamish McWilliam of EBI, who mailed me a very detailed explanation. The codes for amino acids follow (from

Standard Encoded Amino Acids
A Ala alanine
C Cys cysteine
D Asp aspartic acid (“asparDic acid”)
E Glu glutamic acid (“gluEtamic acid”)
F Phe phenylalanine (“Fenylalanine”)
G Gly glycine
H His histidine
I Ile isoleucine
K Lys lysine (“K” next to “L”)
L Leu leucine
M Met methionine
N Asn asparagine (“asparagiNe”, “aspartic-NH2”)
P Pro proline
Q Gln glutamine (“Qutamine”)
R Arg arginine (“aRginine”)
S Ser serine
T Thr threonine
V Val valine
W Trp tryptophan (“tWiptophan”, or double-ring)
Y Tyr tyrosine (“tYrosine”)
Amino Acid Ambiguities
B Asx aspartic acid or asparagine (“B” near “D”, uncertain result of hydrolysis)
J Xle leucine or isoleucine (“J” between “I” and “L”, uncertain result of mass-spec)
X Xaa unknown or unspecified amino acid (“Unk” is sometimes used as an abbreviation)
Z Glx glutamic acid or glutamine (“Z” near “X”, uncertain result of hydrolysis)
Special Encoded Amino Acids
U Sec selenocysteine (the UniProt Knowledgebase uses “C” and a feature rather than “U”)
O Pyl pyrrolysine (“pyrrOlysine”; the UniProt Knowledgebase uses “K” and a feature rather than “O”)
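
For anyone who wants to use these codes in a script, here is a minimal Python sketch of the tables above as lookup dictionaries. The names (STANDARD, SPECIAL, AMBIGUOUS, EXPANSIONS, to_three_letter, expand) are my own and do not come from NCBI or EBI tools. Treating “X” as "any of the 20 standard residues" is an assumption made for illustration; real tools may handle it differently.

```python
# One-letter -> three-letter codes, transcribed from the tables above.
STANDARD = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}
SPECIAL = {"U": "Sec", "O": "Pyl"}
AMBIGUOUS = {"B": "Asx", "J": "Xle", "X": "Xaa", "Z": "Glx"}

# Residues each ambiguity code may stand for.
# Assumption: "X" is expanded to the 20 standard residues only.
EXPANSIONS = {
    "B": "DN",  # aspartic acid or asparagine
    "J": "LI",  # leucine or isoleucine
    "Z": "EQ",  # glutamic acid or glutamine
    "X": "".join(sorted(STANDARD)),
}


def to_three_letter(seq):
    """Convert a one-letter sequence to dash-separated three-letter codes."""
    table = {**STANDARD, **SPECIAL, **AMBIGUOUS}
    # An unknown letter raises KeyError from the dictionary lookup.
    return "-".join(table[code] for code in seq.upper())


def expand(code):
    """Return the residues a single one-letter code may represent."""
    code = code.upper()
    if code in EXPANSIONS:
        return EXPANSIONS[code]
    if code in STANDARD or code in SPECIAL:
        return code
    raise KeyError(f"unknown amino acid code: {code!r}")


if __name__ == "__main__":
    print(to_three_letter("MJK"))
    print(expand("J"))
```

With these tables, the mysterious ‘J’ simply resolves to “Xle”, and expand("J") lists leucine and isoleucine as the two candidates.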