Simon Lin (1) and Harry Zuzan (2)
(1) Duke Bioinformatics Shared Resource, Box 3958, Duke University Medical Center, Durham, NC 27710
(2) Institute of Statistics and Decision Science, Box 90251, Duke University, Durham, NC 27708
Large-scale gene expression data has placed increasing demand on informatics support. In this tutorial, we will share our experience in strategic planning and practical implementation of an institute-wide bioinformatics core at Duke for Affymetrix DNA chip and spotted cDNA microarray studies. A comprehensive review of current literature for existing practices will be presented. These practices include clustering expression profiles, mapping expression data to metabolic pathways and to chromosome locations, and most recently, modeling and simulating regulatory networks. This tutorial will focus on the following three issues: 1) Storage, retrieval and dissemination of expression data, 2) Auxiliary domain expert databases for data mining and knowledge discovery, 3) Data analysis and visualization tools. Both academic solutions and commercial packages will be reviewed.
General-purpose image analysis software is usually adequate for applying commonly used algorithms to images of moderate size. Images obtained for gene expression analysis, however, are typically too large to be handled easily by commercial software. It may also be desirable to analyse a set of gene expression images simultaneously, which is beyond the scope of most software designed for single images. Analysis of gene expression data is further complicated by the prior knowledge that specific genes are measured at predetermined locations in the image. This information is valuable, but it comes at a cost: the image can no longer be treated as a plain signal, because some pixel values are more important than others. Their relative importance is subjective and not readily quantified. For these reasons, it is advantageous to shift the analysis of gene expression images from an algorithmic approach to the object-oriented paradigm, which implements algorithms just as efficiently but makes it easy to retain the underlying data structures instead of only data summaries.
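As an illustration of the object-oriented idea, the following is a minimal Python sketch and not the software used at Duke. The class names (Feature, ExpressionImageSet), the rectangular feature layout, and the toy pixel values are all hypothetical. Each feature object records which gene sits at which predetermined location, and the image set keeps the raw pixels of several images so that any summary can be computed later.

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import Callable, Dict, List

Image = List[List[float]]  # row-major pixel intensities


@dataclass(frozen=True)
class Feature:
    """A gene measured at a predetermined rectangle (hypothetical layout)."""
    gene: str
    row: int
    col: int
    height: int
    width: int

    def pixels(self, image: Image) -> List[float]:
        """Return every pixel in this feature's region, not a summary."""
        return [
            image[r][c]
            for r in range(self.row, self.row + self.height)
            for c in range(self.col, self.col + self.width)
        ]


class ExpressionImageSet:
    """Several images sharing one array layout, analysed together."""

    def __init__(self, features: List[Feature]) -> None:
        self.features: Dict[str, Feature] = {f.gene: f for f in features}
        self.images: Dict[str, Image] = {}

    def add_image(self, name: str, image: Image) -> None:
        self.images[name] = image

    def pixels_for(self, gene: str) -> Dict[str, List[float]]:
        """Raw pixels for one gene across all images."""
        feature = self.features[gene]
        return {name: feature.pixels(img) for name, img in self.images.items()}

    def summarize(
        self, gene: str, statistic: Callable[[List[float]], float] = median
    ) -> Dict[str, float]:
        """Apply a summary chosen by the analyst; the raw pixels are kept."""
        return {n: statistic(px) for n, px in self.pixels_for(gene).items()}


# Toy data: two 2x4 images, two genes occupying 2x2 regions (assumed values).
layout = [Feature("geneA", 0, 0, 2, 2), Feature("geneB", 0, 2, 2, 2)]
arrays = ExpressionImageSet(layout)
arrays.add_image("sample1", [[10, 12, 50, 52], [11, 13, 51, 53]])
arrays.add_image("sample2", [[20, 22, 40, 42], [21, 23, 41, 43]])

gene_a_pixels = arrays.pixels_for("geneA")
gene_b_median = arrays.summarize("geneB")
gene_a_mean = arrays.summarize("geneA", statistic=mean)
```

Because the summary statistic is passed in as an argument and not fixed in advance, the choice of how to weight or combine pixels stays with the analyst, which reflects the point that the relative importance of pixels is subjective.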
Because array informatics is advancing rapidly, the contents of this tutorial will be updated regularly. To share these updates with the scientific community, we have developed a Microarray Informatics Portal (www.array.tech.nu). Please consult this portal for the latest information on this ISMB tutorial.
Scope of the Tutorial
In today's competitive functional genomics business, understanding and applying informatics technology in microarray studies is crucial to staying ahead. This tutorial explains how to build microarray informatics support from an applied perspective, without undue bias toward vendor-specific implementations.
Objective of the Tutorial
Upon completion of the tutorial, you will be able to:
Tutorial Outline
INTRODUCTION
Review of Related Subjects
Comparative Costs of Gene Expression Data
Image Analysis
Exploratory Data Analysis (EDA) and Data Mining (DM)
Data Visualization
Computational Approaches
Knowledge Bases for Functional Genomics
Deciphering the Genetic Network
e-Publication
Research Collaboration Environment
Appendices