GlycoSuiteDB: A curated relational database of glycoprotein glycan structures

Hiren J. Joshi¹, Sarah Jarvis², Jonathan W. Arthur, Mathew J. Harrison, Marc R. Wilkins, Nicolle H. Packer, Catherine A. Cooper
¹hirenj@proteomesystems.com, Proteome Systems; ²sjarvis@proteomesystems.com, Proteome Systems

GlycoSuiteDB is a relational database of published glycan structures. The database currently contains over 8100 entries and serves as a glycomic reference tool. It can be queried by multiple criteria, including molecular mass, biological source, reference and structural epitope. In addition to the query interface, GlycoSuite offers a number of tools that are useful in the analysis of glycans.
The design of GlycoSuiteDB is based around efficiency and interchange. The efficiency of GlycoSuiteDB is achieved through the use of a relational table structure that has been optimised to allow complicated queries to run in minimal time. For example, a patented epitope query method uses a custom algorithm to allow users to search for any glycan substructure, including those that are not fully characterised, in the entire database.
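To make the idea of substructure searching concrete, the following is a minimal Python sketch of naive tree matching in which a linkage that has not been characterised acts as a wildcard. It is not the patented GlycoSuiteDB method, whose details are not described here. The `Residue` class, the `contains` function and the in-memory tree representation are all assumptions made for illustration; the real database stores structures in relational tables.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Residue:
    """Hypothetical in-memory node of a glycan tree (not the GlycoSuiteDB schema)."""
    name: str                      # monosaccharide, e.g. "Man", "GlcNAc"
    linkage: Optional[str] = None  # linkage to the parent, e.g. "b1-4"; None = not characterised
    children: list[Residue] = field(default_factory=list)


def _compatible(query: Residue, target: Residue) -> bool:
    """Residue names must agree; an uncharacterised linkage on either side matches anything."""
    if query.name != target.name:
        return False
    return query.linkage is None or target.linkage is None or query.linkage == target.linkage


def _assign(q_children: list[Residue], t_children: list[Residue]) -> bool:
    """Backtracking search: map every query branch onto a distinct target branch."""
    if not q_children:
        return True
    first, rest = q_children[0], q_children[1:]
    for i, candidate in enumerate(t_children):
        if _match_at(first, candidate) and _assign(rest, t_children[:i] + t_children[i + 1:]):
            return True
    return False


def _match_at(query: Residue, target: Residue) -> bool:
    """True if the query tree can be overlaid on the target starting at this node."""
    return _compatible(query, target) and _assign(query.children, target.children)


def contains(target: Residue, query: Residue) -> bool:
    """True if the query occurs anywhere inside the target structure."""
    if _match_at(query, target):
        return True
    return any(contains(child, query) for child in target.children)


# N-glycan core: Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc
core = Residue("GlcNAc", children=[
    Residue("GlcNAc", "b1-4", [
        Residue("Man", "b1-4", [Residue("Man", "a1-3"), Residue("Man", "a1-6")]),
    ]),
])

# Query: a branched mannose in which one linkage is not fully characterised.
query = Residue("Man", children=[Residue("Man"), Residue("Man", "a1-3")])

print(contains(core, query))
```

The backtracking step matters in this example: the uncharacterised branch is first tried against the a1-3 mannose, which leaves nothing for the a1-3 branch of the query, so the search retries with the a1-6 mannose. A search run against every entry in a database would need indexing or pre-filtering that this sketch does not attempt.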
Interchange is also an important factor in the design choices behind GlycoSuiteDB, and wherever possible, data stored in GlycoSuiteDB is in a standardised format. The glycan sequences stored in the database conform to the nomenclature given by the International Union of Pure and Applied Chemistry (IUPAC), with various modifications to overcome shortcomings in the standard. The database currently cross-references MEDLINE/PubMed and SWISSPROT/TrEMBL, with more links planned.
The GlycoSuite tools take advantage of the database and provide complementary functionality to users. One such tool is GlycosidIQ, a glycan mass fingerprinting tool which allows glycan sequences to be assigned to MS/MS spectra. Glycan mass fingerprinting is particularly difficult due to the size of the solution space, which increases exponentially as the level of fragmentation increases. GlycosidIQ uses intelligent query optimisation and scoring to reduce the computational load of the mass fingerprinting technique and produce results in a timely manner.
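The general idea of scoring candidate structures against a fragment spectrum can be sketched as below. This is a simplified Python illustration and not the GlycosidIQ algorithm: it considers only a single glycosidic cleavage per fragment, which sidesteps the growth in solution space noted above, and it compares bare residue-mass sums, ignoring ion types, water retention, adducts and cross-ring cleavages. The function names, the nested-tuple representation, the tolerance and the peak list are all hypothetical; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in Da (free monosaccharide minus water).
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "dHex": 146.0579, "NeuAc": 291.0954}

# A glycan is written as (residue_name, [child_glycans]), reducing end first.


def _subtree_mass(node, collected):
    """Return the mass of the subtree at `node`; record every subtree mass in post-order."""
    name, children = node
    mass = RESIDUE_MASS[name] + sum(_subtree_mass(child, collected) for child in children)
    collected.append(mass)
    return mass


def single_cleavage_fragments(glycan):
    """Masses of both pieces produced by breaking each glycosidic bond once."""
    collected = []
    total = _subtree_mass(glycan, collected)
    collected.pop()  # the last entry is the intact glycan, not a fragment
    fragments = set()
    for mass in collected:
        fragments.add(round(mass, 4))          # non-reducing-end piece
        fragments.add(round(total - mass, 4))  # remaining reducing-end piece
    return fragments


def score(glycan, peaks, tol=0.02):
    """Count observed peaks explained by at least one theoretical fragment."""
    fragments = single_cleavage_fragments(glycan)
    return sum(1 for peak in peaks if any(abs(peak - f) <= tol for f in fragments))


def rank(candidates, peaks, tol=0.02):
    """Order candidate names from best to worst score."""
    return sorted(candidates, key=lambda name: score(candidates[name], peaks, tol), reverse=True)


# Two isomers with the same composition but a different fucose (dHex) position.
candidates = {
    "core_fucosylated": ("HexNAc", [("Hex", [("Hex", [])]), ("dHex", [])]),
    "arm_fucosylated": ("HexNAc", [("Hex", [("Hex", []), ("dHex", [])])]),
}
peaks = [162.05, 324.11, 349.14]  # made-up peak list for illustration only

for name in rank(candidates, peaks):
    print(name, score(candidates[name], peaks))
```

With this made-up peak list the first isomer explains all three peaks and the second only one, which shows how fragment matching can separate structures of identical mass. Allowing several cleavages per fragment would multiply the number of theoretical fragments for every candidate, which is where query optimisation becomes necessary.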