GMML Main Classes
MolecularModeling
This contains the classes that make up the central data structure (CDS). It contains classes for atoms, residues (list of atoms), and assemblies (list of atoms). As you can see, our core unit is an atom. In OOP-speak, our atom class is a child of multiple parent classes, including qmAtom, dockingatom, and MD atom. This can be confusing to a non-programmer, but it allows us to have an atom with a QM mass and a different MD mass, a docking point charge and an MD point charge, etc. In OOP-speak: the child class atom can have whatever attributes of the parents we need for a particular situation.
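As a rough illustration of the multiple-parent idea, here is a minimal C++ sketch. The names QmAtomData, MdAtomData, DockingAtomData and all of their members are hypothetical and are not copied from the GMML headers; the sketch only shows how one child object can hold a QM mass alongside a different MD mass.

```cpp
#include <string>

// Hypothetical parent classes: names and members are illustrative only,
// not the actual GMML declarations.
struct QmAtomData {
    double qm_mass = 0.0;
};

struct MdAtomData {
    double md_mass = 0.0;
    double md_charge = 0.0;
};

struct DockingAtomData {
    double docking_charge = 0.0;
};

// The child inherits the attributes of every parent, so a single Atom
// can carry a QM mass and a separate MD mass at the same time.
struct Atom : QmAtomData, MdAtomData, DockingAtomData {
    std::string name;
};
```

With this layout, code that only cares about MD reads `atom.md_mass` and `atom.md_charge`, while docking code reads `atom.docking_charge` from the same object.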
FileSet
We use a separate data structure in the FileSet folder for each type of file (e.g. PDB or AMBER topology), as there is a lot of unique information in each file type beyond the atomic coordinates. All of this information (e.g. LINK cards, header cards, title cards, seqres) is stored in its own data structure. Using PDB as an example, the PDBfile class is a holder for all the other PDB classes, and everything links into and through this one main class. From the PDBfile class it is possible to import a subset of the information, such as the coordinates, into the CDS if you wish to manipulate the 3D structure. Some actions, such as the PDBpreprocessor, do not need to manipulate the 3D structure, so they work directly on the PDBfile data structure without using the CDS. Each data structure is an aggregate of smaller, more fundamental objects, which you can see in the “attributes” section of the header file.
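The "holder" arrangement can be pictured with a small C++ sketch. Everything here (PdbFileSketch, AtomCard, LinkCard, ExtractCoordinates) is a hypothetical stand-in, not the real PDBfile interface; it only shows one class aggregating smaller card objects and handing a subset (the coordinates) to other code.

```cpp
#include <string>
#include <vector>

// Hypothetical card types: illustrative stand-ins for the PDB card classes.
struct Coordinate {
    double x, y, z;
};

struct AtomCard {
    std::string name;
    Coordinate position;
};

struct LinkCard {
    std::string first_atom;
    std::string second_atom;
};

// Hypothetical holder: one class aggregates all the smaller card objects.
struct PdbFileSketch {
    std::string title;
    std::vector<AtomCard> atoms;
    std::vector<LinkCard> links;

    // Hands over only the coordinates, e.g. for building a 3D structure,
    // and leaves the remaining cards in this holder.
    std::vector<Coordinate> ExtractCoordinates() const {
        std::vector<Coordinate> out;
        out.reserve(atoms.size());
        for (const AtomCard& card : atoms) {
            out.push_back(card.position);
        }
        return out;
    }
};
```

A preprocessing step that only inspects cards would use the `links` or `title` members directly and never call `ExtractCoordinates`.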
ParameterSet
Other file types that we need to read include AMBER and GLYCAM library, parameter, and prep files. Libraryfile.hpp is the holder for each class. FileSet and ParameterSet together contain all the files that are supported by GMML.
Geometry
Geometry holds entries such as coordinate, cell (used for addion), grid (a collection of cells), and plane. To define a plane we use two different coordinates, each treated as a vector originating from zero, and the cross-product of those two vectors gives the normal vector of the plane. There is also an internal coordinate as well as distance, angle, and dihedral.
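The plane definition comes down to a single cross product. The sketch below assumes a plain three-component array for a coordinate; the Vec3 alias and the Cross function are illustrative names, not the GMML Coordinate API.

```cpp
#include <array>

// Illustrative coordinate type: x, y, z measured from the origin.
using Vec3 = std::array<double, 3>;

// Cross product of two vectors that both originate from zero.
// The result is perpendicular to both, i.e. the normal of their plane.
Vec3 Cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]};
}
```

For example, the coordinates (1, 0, 0) and (0, 1, 0) span the xy-plane, and their cross product is (0, 0, 1), the z-axis.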
GlycamNamespace (Poorly named, better would be SugarID)
This data structure contains everything we need to deal with identifying sugars in PDB files. We have a structure to hold Lachele’s chemical code (a structure normally doesn’t have functions, whereas classes do). There are no functions, so there is nothing in the source; these are information holders that are assigned by the MolecularModeling code. Recognition of carbohydrates in the CDS is done via a call to extract sugars (poor name, better would be determine sugars). There is a monosacc class which stores single ring and attachment information in Lachele’s chemical code and a map of derivatives (a sideatom is the first atom off a ring atom, and derivatives are anything past that). The last atom in the ring has three spaces to put side atoms: +1, +2, +3. The monosacc class also stores cyclic atoms and the method used to determine the anomeric carbon:
Determining the anomeric carbon
Only inspect the two carbons that are neighbours of the ring oxygen.
1. If only one has an oxygen sideatom, then that one is the anomeric carbon.
2. If both or neither have oxygen sideatoms, decide using the atom name. If that fails, look for the atom that has a carbon sideatom, and if that also fails, pick one at random.
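The decision steps above can be sketched as a short C++ function. This is not the GMML implementation: RingCarbon and PickAnomeric are hypothetical names, treating the atom name "C1" as the anomeric name is an assumption, and the text does not say which way the carbon sideatom test points, so the sketch assumes the candidate that has the carbon sideatom is chosen.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical summary of one ring carbon that neighbours the ring oxygen.
struct RingCarbon {
    std::string name;
    bool has_oxygen_side_atom;
    bool has_carbon_side_atom;
};

// Returns 0 if the first candidate is chosen as anomeric, 1 for the second.
int PickAnomeric(const RingCarbon& a, const RingCarbon& b) {
    // Step 1: exactly one candidate has an oxygen sideatom.
    if (a.has_oxygen_side_atom != b.has_oxygen_side_atom) {
        return a.has_oxygen_side_atom ? 0 : 1;
    }
    // Step 2a: decide by atom name.
    // ASSUMPTION: the name "C1" marks the anomeric carbon.
    const bool a_named = (a.name == "C1");
    const bool b_named = (b.name == "C1");
    if (a_named != b_named) {
        return a_named ? 0 : 1;
    }
    // Step 2b: fall back to the carbon sideatom test.
    // ASSUMPTION: the candidate that has the carbon sideatom is chosen.
    if (a.has_carbon_side_atom != b.has_carbon_side_atom) {
        return a.has_carbon_side_atom ? 0 : 1;
    }
    // Step 2c: nothing distinguishes the candidates, so pick at random.
    return std::rand() % 2;
}
```

Keeping each fallback as its own early return makes the priority order of the rules easy to read and to change if the naming or sideatom assumptions turn out to differ from the library's behaviour.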
GMML Header Files
Common.hpp
Used to define:
1. Constants
2. A lookup table for sugar identification.
3. Three types of parameter file: main is the standard file, frcmod is the modified file, and ionicmod is a frcmod file that has two sections, mass and polarizability.
The use of enumerators to find out what type of file is being read just cleans up the code (instead of if-else statements everywhere). Most enumerators in Common.hpp are useful for the programmer, but not really for the user.
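To show why an enumerator tidies up the file-type handling, here is a minimal C++ sketch. ParameterFileType, its three values, and Describe are hypothetical names chosen for illustration; the actual enumerators in Common.hpp may be named and laid out differently.

```cpp
#include <string>

// Hypothetical enumerator for the three parameter file types.
enum class ParameterFileType { kMain, kModified, kIonicModified };

// One switch on the enumerator replaces repeated if-else chains
// that would otherwise compare strings or flags.
std::string Describe(ParameterFileType type) {
    switch (type) {
        case ParameterFileType::kMain:
            return "standard parameter file";
        case ParameterFileType::kModified:
            return "modified (frcmod) file";
        case ParameterFileType::kIonicModified:
            return "frcmod file with mass and polarizability sections";
    }
    return "unknown";
}
```

A reader function can take a `ParameterFileType` argument once and branch on it, and the compiler can warn when a newly added enumerator value is not handled in the switch.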
Gmml.hpp
Adds all the header files. Other software can then just include gmml.hpp.
Utils.hpp
A list of general functions that become like built-in C++ functions (just like printf).