GENAW: GEnetic Network Analysis Workbench for microarray raw data

Pan-Gyu Kim1, Kyung Shin Lee2, Seon-Hee Park, Mi Young Shin, Hwan-Gue Cho
1pgkim@pearl.cs.pusan.ac.kr, Department of Computer Science, Pusan National University; 2kslee@pearl.cs.pusan.ac.kr, Department of Computer Engineering, Pusan National University

Since the introduction of microarrays for large-scale expression analysis and the completion of the Human Genome Project (HGP), many researchers have been intensively investigating genetic regulatory networks from expression data. Time-series microarrays are especially useful for investigating regulatory relations because they capture the expression of thousands of genes simultaneously. Although many researchers are now studying genetic regulatory networks with microarray experiments, there are few useful and user-friendly software tools for the automatic construction of genetic networks. Here we propose GENAW (GEnetic Network Analysis Workbench for microarray raw data), a system that automatically generates a network from raw expression data. One important feature of GENAW is that it accepts the data formats of various commercial tools, including ImaGene™ and GenePix Pro™. The input to GENAW is the expression profile of the microarrays, and the output is a genetic network produced with minimal user interaction. GENAW also provides various visualization tools that enable biologists to investigate the regulatory relationships among genes efficiently. Because GENAW generates genetic regulatory networks using both Boolean network and Bayesian network methods, it quantizes the gene expression data into a binary or ternary domain. Since generating an optimal genetic network with Boolean or Bayesian network methods is known to be an NP-hard problem, GENAW uses a heuristic that proceeds as follows: (1) for a single gene G, we try to find its n-core, the set of the n genes most influential on G; (2) once the initial n-cores are found, we try to find the next influential genes starting from those n-cores in order to construct the whole network. We experimented with yeast cell cycle data from Stanford University. The experimental results show that a 3-core set is, on average, sufficiently reasonable.
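The quantization step and the n-core search can be illustrated with a minimal Python sketch. It is not GENAW's implementation: the abstract does not specify how influence is measured, so the sketch assumes (a) ternary quantization of log-ratio values with a fixed, hypothetical threshold, (b) mutual information between the candidate genes' states at time t and the target gene's state at time t+1 as the influence score, and (c) greedy forward selection as the search strategy. The function names (quantize, find_n_core) are hypothetical as well.

```python
from collections import Counter
from math import log2


def quantize(profile, threshold=1.0):
    """Map log-ratio values to -1 (down), 0 (unchanged), +1 (up).

    The threshold is an assumed cut-off, not a value from the abstract.
    """
    return [1 if v >= threshold else -1 if v <= -threshold else 0
            for v in profile]


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of hashable symbols."""
    total = len(symbols)
    counts = Counter(symbols)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equally long sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def find_n_core(target, states, n=3):
    """Greedily pick up to n genes whose joint state at time t is most
    informative about the target gene's state at time t+1.

    `states` maps gene name -> quantized time series (equal lengths).
    The target itself is excluded as a candidate (an assumption).
    """
    future = states[target][1:]          # target shifted one step ahead
    core = []
    candidates = [g for g in states if g != target]
    for _ in range(min(n, len(candidates))):
        def score(gene):
            # Joint state of the current core plus the candidate gene.
            joint = list(zip(*(states[c][:-1] for c in core + [gene])))
            return mutual_information(joint, future)
        best = max(candidates, key=score)
        core.append(best)
        candidates.remove(best)
    return core
```

Calling find_n_core("G", states, n=3) for every gene G gives one candidate regulator set per gene. Greedy selection evaluates a number of candidate sets that grows linearly with the number of genes at each step, rather than scoring every possible n-gene subset, which is the kind of saving a heuristic needs when the exact problem is NP-hard.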