GeneView: A Dynamic Gene Annotation System and Its Application to Microarray Data Analysis

Xiang Yao (1), Heng Dai (2), Bin Tian, David Zhao, Albert Leung, Simon Smith, and Jackson Wan
(1) Johnson and Johnson PRD; (2) Johnson and Johnson PRD

Motivation: Up-to-date, comprehensive, and structured annotations for all genes on a microarray chip are essential for interpreting its expression data. Currently, most chip gene annotations are one-line free-text descriptions that are often partial, outdated, and unsuitable for large-scale data analysis, which limits the interpretation of microarray gene expression clusters. Although researchers can manually search a collection of databases for better annotations, this is practical only for a limited number of genes. We have developed an automatic system to address this issue.

Results: The GeneView system monitors various data sources, extracts gene information from a source whenever it is updated, comprehensively matches genes, and integrates them into a central database by category, such as pathway, genetic mapping, phenotype, expression profile, domain structure, protein interaction, disease association, and references. The system consists of four major components: (1) a database that holds internal gene indices, categorized gene annotations from various data sources, and their gene-to-gene matches; (2) a data processing component that monitors, retrieves, and integrates gene information from these sources; (3) a user curation component that allows users to edit gene annotations and gene-to-gene matches; and (4) a data query component that provides both single-gene navigation and large batch retrieval. We evaluated the system by analyzing genes on cDNA and Affymetrix oligo chips. In both cases, it provided more accurate and comprehensive information than that provided by the vendors or the chip users, and it helped identify new common functions among genes in the same expression clusters.

Availability: GeneView software and data from public sources are freely available to academic institutions on request from the authors.

Contact:
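
To make the database and data query components described under Results more concrete, the following is a minimal in-memory sketch in Python. It is an illustration only, not the GeneView implementation: the class `AnnotationStore`, its method names, and the placeholder identifiers in the usage note are all hypothetical, and the real system presumably uses a persistent database rather than Python dictionaries. Only the category names come from the text above.

```python
from collections import defaultdict

# Category names as listed in the abstract; everything else is an assumption.
CATEGORIES = frozenset({
    "pathway", "genetic mapping", "phenotype", "expression profile",
    "domain structure", "protein interaction", "disease association",
    "references",
})


class AnnotationStore:
    """Toy stand-in for a central annotation database (hypothetical design)."""

    def __init__(self):
        # (source name, external id) -> internal gene index
        self._matches = {}
        # internal gene index -> category -> list of (source name, text)
        self._annotations = defaultdict(lambda: defaultdict(list))

    def add_match(self, source, external_id, gene_index):
        """Record that an external identifier maps to an internal gene index."""
        self._matches[(source, external_id)] = gene_index

    def add_annotation(self, gene_index, category, source, text):
        """Attach a categorized annotation to an internal gene index."""
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self._annotations[gene_index][category].append((source, text))

    def drop_source(self, source):
        """Remove all annotations from one source, e.g. before reloading it."""
        for by_category in self._annotations.values():
            for category, entries in by_category.items():
                by_category[category] = [e for e in entries if e[0] != source]

    def query(self, source, external_id, category):
        """Single-gene lookup; returns an empty list for unmatched identifiers."""
        gene_index = self._matches.get((source, external_id))
        if gene_index is None or gene_index not in self._annotations:
            return []
        return list(self._annotations[gene_index].get(category, []))

    def batch_query(self, source, external_ids, category):
        """Batch lookup: one result list per requested identifier."""
        return {eid: self.query(source, eid, category) for eid in external_ids}
```

A usage sketch with placeholder identifiers: after `store.add_match("chip", "probe_001", 1)` and `store.add_annotation(1, "pathway", "source_a", "example pathway")`, a call to `store.batch_query("chip", ["probe_001", "probe_999"], "pathway")` returns the annotation list for the matched probe and an empty list for the unmatched one. Keying annotations by an internal index rather than by external identifier is what lets several probes or source records that match the same gene share one set of annotations.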