This page describes the three-letter code used for residue names in files relevant to the GLYCAM force fields. Other nomenclatures recognized by GLYCAM-Web are documented separately.
Current Carbohydrate Naming Convention in GLYCAM_04 and GLYCAM_06
To develop a versatile set of 3-D structural modeling tools for carbohydrates, glycoproteins and carbohydrate-protein complexes (http://glycam.ccrc.uga.edu), we have introduced the following standards for one-letter (Table 1) and three-letter (Tables 2-5) codes for monosaccharides. There are only 26 possible one-letter codes, and they have been assigned to all of the hexoses and pentoses, as well as to the common 6‑deoxy‑ and 2‑N‑acetamido derivatives, uronic acids and sialic acid. Where possible, the code is the first letter of the monosaccharide name (A = Ara, F = Fuc, G = Glc, I = Ido, M = Man, P = Psi, Q = Qui, R = Rib, T = Tal, X = Xyl). Gal is the exception: it was assigned L for alliterative reasons, although L could instead have been used for the much less common Lyx.
To incorporate carbohydrates into modeling programs in a standardized way, and to provide a standard for X-ray and NMR Protein Data Bank (PDB) files, we have developed the following three-letter code nomenclature. The restriction to three letters follows the standards imposed on PDB files by the RCSB PDB Advisory Committee (http://www.rcsb.org/pdb/pdbac.html). It also has a practical basis: current modeling and experimental software has been developed to read three-letter residue codes, historically for amino acids.
Figure 1. Illustration of the GLYCAM naming convention applied to cellobiose.
Using three letters (Figure 1, Tables 1-5), the present system encodes the following content: carbohydrate residue name (Glc, Gal, etc.), ring form (pyranosyl or furanosyl), anomeric configuration (α or β), enantiomeric form (D or L) and occupied linkage positions (2-, 2,3-, 2,4,6-, etc.). Including the linkage position is particularly useful because, unlike the linkage between amino acids, the linkage between monosaccharides cannot be inferred from the residue name alone. Further, the three-letter codes were chosen to be orthogonal to those currently employed for amino acids.
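As an illustration of how such a code could be decoded, the following Python sketch splits a three-letter code into its parts. Only the one-letter sugar codes quoted above come from this page. The character layout used here is an assumption that should be checked against Tables 2-5: the first character is taken as the linkage code (0 for a terminal residue, a digit for a single linked position), the second as the one-letter sugar code with upper case for D and lower case for L, and the third as ring form plus anomer (A, B for α/β pyranose; D, U for α/β furanose). The names `parse_code` and `Residue` are hypothetical and not part of any GLYCAM software.

```python
from dataclasses import dataclass

# One-letter sugar codes quoted in the text above (Table 1 holds the full set).
ONE_LETTER = {
    "A": "Ara", "F": "Fuc", "G": "Glc", "I": "Ido", "L": "Gal",
    "M": "Man", "P": "Psi", "Q": "Qui", "R": "Rib", "T": "Tal", "X": "Xyl",
}

# ASSUMPTION: meaning of the third character; verify against the tables.
RING_ANOMER = {
    "A": ("pyranose", "alpha"),
    "B": ("pyranose", "beta"),
    "D": ("furanose", "alpha"),
    "U": ("furanose", "beta"),
}


@dataclass(frozen=True)
class Residue:
    code: str
    sugar: str
    enantiomer: str
    ring: str
    anomer: str
    linkage: str


def parse_code(code: str) -> Residue:
    """Decode a three-letter residue code under the assumptions stated above."""
    if len(code) != 3:
        raise ValueError(f"expected exactly three characters, got {code!r}")
    link_char, sugar_char, form_char = code

    sugar = ONE_LETTER.get(sugar_char.upper())
    if sugar is None:
        raise ValueError(f"sugar code {sugar_char!r} is not in this sketch's table")
    if form_char not in RING_ANOMER:
        raise ValueError(f"unknown ring/anomer code {form_char!r}")
    ring, anomer = RING_ANOMER[form_char]

    # ASSUMPTION: upper case marks the D form, lower case the L form.
    enantiomer = "D" if sugar_char.isupper() else "L"

    # ASSUMPTION: '0' means no occupied linkage position; a digit names one.
    # Letter codes for multiple linkage positions are left undecoded here.
    if link_char == "0":
        linkage = "terminal"
    elif link_char.isdigit():
        linkage = f"{link_char}-linked"
    else:
        linkage = f"multiply linked (code {link_char!r}, see the linkage table)"

    return Residue(code, sugar, enantiomer, ring, anomer, linkage)


if __name__ == "__main__":
    for example in ("0GB", "4GB"):
        print(parse_code(example))
```

Under these assumptions, the two β-D-glucopyranose residues of cellobiose would decode from `0GB` (terminal) and `4GB` (linked at position 4). A string such as `GLC` is rejected because its third character is not a recognized ring/anomer code.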