A genetic polymorphism object model and XML implementation: Biological Variation Markup Language.

Greg Tyrelle1, Garry C. King2
1greg@kinglab.unsw.edu.au, UNSW; 2garry@kinglab.unsw.edu.au, UNSW

A standard object model for describing genetic variation is essential for the future interoperability of genetic polymorphism databases and of data produced by clinical genotyping technologies. We have developed a hierarchical object model for representing polymorphism data and describing clinical genotypes. This object model is part of a platform specification for providing secure distributed access to clinical genotype data [1]. For electronic data exchange to be effective, information must be expressed in a standard machine-readable format. We have therefore implemented our hierarchical model for communicating polymorphism information in Extensible Markup Language (XML). We call this format Biological Variation Markup Language (BVML); its application and use are described in the BVML draft specification (http://bioinformatics.kinglab.unsw.edu.au). The BVML object model allows accurate description of polymorphisms and of the metadata associated with the experimental techniques used to generate the data. We have used BVML to describe polymorphisms derived from multiple alignments and single base extension assays. We also describe a Python library for input-output processing of BVML documents.

1. Tyrelle, G. D. and King, G. C. (2003) A platform for the description, distribution and analysis of genetic polymorphism data. Proceedings of the First Asia-Pacific Bioinformatics Conference 19. Australian Computer Science Communications 25 (4) pp 173-180.
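
As an illustration of the kind of hierarchical XML representation and Python input-output processing described in the abstract, the following is a minimal sketch only. The element and attribute names (`variation-set`, `polymorphism`, `position`, `allele`, `method`), the `Polymorphism` class, and the functions `parse_polymorphisms` and `to_xml` are all hypothetical; they are not taken from the BVML draft specification or from the authors' library, and the sample record is invented for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical document: element and attribute names are illustrative only
# and are NOT taken from the BVML draft specification.
SAMPLE = """\
<variation-set>
  <polymorphism id="snp1" type="SNP">
    <position reference="seq1">1042</position>
    <allele>A</allele>
    <allele>G</allele>
    <method>single base extension</method>
  </polymorphism>
</variation-set>
"""


@dataclass
class Polymorphism:
    """One polymorphism plus metadata about how it was observed."""
    ident: str
    kind: str
    reference: str
    position: int
    alleles: list = field(default_factory=list)
    method: str = ""


def parse_polymorphisms(xml_text):
    """Read the hypothetical XML layout into Polymorphism objects."""
    root = ET.fromstring(xml_text)
    records = []
    for node in root.findall("polymorphism"):
        pos = node.find("position")
        records.append(Polymorphism(
            ident=node.get("id"),
            kind=node.get("type"),
            reference=pos.get("reference"),
            position=int(pos.text),
            alleles=[a.text for a in node.findall("allele")],
            method=node.findtext("method", default=""),
        ))
    return records


def to_xml(records):
    """Write Polymorphism objects back out in the same hypothetical layout."""
    root = ET.Element("variation-set")
    for rec in records:
        node = ET.SubElement(root, "polymorphism", id=rec.ident, type=rec.kind)
        pos = ET.SubElement(node, "position", reference=rec.reference)
        pos.text = str(rec.position)
        for allele in rec.alleles:
            ET.SubElement(node, "allele").text = allele
        ET.SubElement(node, "method").text = rec.method
    return ET.tostring(root, encoding="unicode")


# Parse, serialise, and parse again to check the round trip.
records = parse_polymorphisms(SAMPLE)
round_tripped = parse_polymorphisms(to_xml(records))
```

Keeping the experimental method alongside the alleles in each record reflects the abstract's point that the model carries metadata about how the data were generated; a real BVML document would follow the structure defined in the draft specification.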