| Poster Abstracts
 
 
	
	| Data Mining
 |  
| B-1  Biclustering Microarray Data by Gibbs Sampling Qizheng Sheng1, Yves Moreau2, Bart De Moor
 1qizheng.sheng@esat.kuleuven.ac.be, Katholieke Universiteit Leuven, Department of Electrical Engineering; 2yves.moreau@esat.kuleuven.ac.be, Katholieke Universiteit Leuven, Department of Electrical Engineering
 Correspondence address: qizheng.sheng@esat.kuleuven.ac.be
 
 
We have adapted the Gibbs sampling strategy, which has become a method of choice for the discovery of motifs in DNA and protein sequences, to the biclustering of discretized microarray data.  In contrast with standard clustering, biclustering reveals similar expression behavior of genes over a subset of conditions in a microarray data set.
 Long abstract
 
 
 
 |  
| B-2  Hidden Multivariate Markov Models for Pattern Recognition in Genomic DNA Sequences Leo Wang-Kit Cheung1
 1lcheung@crch.hawaii.edu, Cancer Research Center of Hawaii, University of Hawaii
 Correspondence address: lcheung@crch.hawaii.edu
 
 
Hidden Multivariate Markov Models (HM3s) are introduced for modeling multi-dimensional genomic DNA data. A bivariate version of HM3s is developed for studying the joint behavior of the C+G richness pattern and the bendability pattern of DNA. Applications of the bivariate HM3s for recognition/prediction of eukaryotic promoter regions are illustrated.
 Long abstract
 
 
 
 |  
| B-4  Domain-Domain correlations in Yeast protein complexes Doron Betel1, Christopher W.V. Hogue2
 1doron.betel@utoronto.ca, Samuel Lunenfeld Research Institute, Mt. Sinai Hospital and Department of Biochemistry, University of Toronto; 2hogue@mshri.on.ca, Samuel Lunenfeld Research Institute, Mt. Sinai Hospital and Department of Biochemistry, University of Toronto
 Correspondence address: doron.betel@utoronto.ca
 
 
We introduce a new method for detecting statistically meaningful functional associations between domains from molecular complexes. Two random control sets were used to compute P-values for domain co-occurrences in complexes. Results from four different datasets show that many of the correlations are between domains of similar or associated function.
 Long abstract
 
 
 
 |  
| B-5  ReBIL : Relating Biological Information through Literature Francisco M. Couto1, Pedro Coutinho2, Mário J. Silva
 1fjmc@di.fc.ul.pt, Faculdade de Ciencias, Universidade de Lisboa; 2pedro@afmb.cnrs-mrs.fr, UMR 6098, Architecture et Fonction des  Macromolécules Biologiques, CNRS
 Correspondence address: fjmc@di.fc.ul.pt
 
 
ReBIL aims to improve the efficiency of information extraction systems applied to biological literature. The project is based on the correlation between structural and functional classifications of gene products. We evaluate extracted information by checking whether it preserves this correlation. More information about ReBIL is available at http://xldb.fc.ul.pt/rebil/.
 Long abstract
 
 
 
 |  
| B-6  POLII TRANSCRIPTION TERMINATION SIGNALS IN HUMAN Aroul Selvam1, Thomas Down2, Tim Hubbard
 1asr25@cam.ac.uk, The Wellcome Trust Sanger Institute; 2td2@sanger.ac.uk, The Wellcome Trust Sanger Institute
 Correspondence address: asr25@cam.ac.uk
 
 
Although RNA polymerase II is important because it transcribes all the protein-coding genes in the cell, little is known about its termination process. This study focuses on identifying motifs linked to transcription termination and the polymerase release process.
 Long abstract
 
 
 
 |  
| B-7  GOstat: Find statistically overrepresented Gene Tim Beissbarth1, Terry Speed2
 1beissbarth@wehi.edu.au, WEHI; 2terry@wehi.adu.au, WEHI
 Correspondence address: beissbarth@wehi.edu.au
 
 
GOstat is a tool for finding biological processes or
annotations characteristic of a group of genes. This greatly helps in
analyzing lists of genes resulting from high-throughput screening experiments,
such as microarrays, for their biological meaning. GOstat is accessible via the Internet at http://gostat.wehi.edu.au.
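
The abstract does not say which statistic GOstat applies, so the following is only a minimal sketch of one common way to score over-representation: a one-sided hypergeometric test. The function name and the example counts are hypothetical and are not taken from GOstat.

```python
from math import comb

def hypergeom_upper_tail(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the annotation."""
    max_hits = min(annotated, sample)
    total = comb(population, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return favourable / total

# Hypothetical counts: 20 genes in total, 5 of them annotated with a term,
# and a list of 4 genes of interest of which 3 carry the term.
p_value = hypergeom_upper_tail(20, 5, 4, 3)
print(f"p = {p_value:.4f}")
```

A small p-value suggests the annotation occurs in the gene list more often than expected by chance. When many annotations are tested, a multiple-testing correction would normally be applied on top of this.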
 Long abstract
 
 
 
 |  
| B-8  Multi-Dynamic Bayesian Networks for Pattern Recognition in Genomic DNA Sequences Leo Wang-Kit Cheung1, Angel Yee-Man Cheung2
 1lcheung@crch.hawaii.edu, Cancer Research Center of Hawaii, University of Hawaii; 2angelymch@yahoo.com, Department of Computer Science, Chu Hai College
 Correspondence address: angelymch@yahoo.com
 
 
Multi-Dynamic Bayesian Networks (MDBNs) are introduced for analyzing 
multi-dimensional genomic DNA data. A two-dimensional version of MDBNs is 
developed for learning and predicting the joint behaviour of the C+G richness 
pattern and the bendability pattern of DNA. Applications of these MDBNs for 
recognition/prediction of eukaryotic promoter regions are illustrated.
 Long abstract
 
 
 
 |  
| B-9  High-throughput gene expression analysis with GATO David Vilanova1, James Holzwarth2, Marie Camille Zwahlen, Frank Desiere, Matthew Alan Roberts
 1david.vilanova@rdls.nestle.com, Nestle Research Center; 2james.holzwarth@rdls.nestle.com, Nestle Research Center
 Correspondence address: david.vilanova@rdls.nestle.com
 
 
We present GATO (Gene Annotation TOol), a tool for analysing gene expression data based on the Ensembl database and Gene Ontology. We describe how the tool can be used to rapidly mine gene expression data from Affymetrix arrays and to drive biological interpretation.
 Long abstract
 
 
 
 |  
| B-10  Continuous in situ Analysis of Cell Growth and Cell Viability Petra Haenel1, Franz Kummert2, Karl Friehs, Erwin Flaschel, Gerhard Sagerer
 1iphaenel@techfak.uni-bielefeld.de, Bielefeld University, Germany; 2franz@techfak.uni-bielefeld.de, Bielefeld University, Germany
 Correspondence address: iphaenel@techfak.uni-bielefeld.de
 
 
We present an image analysis system that detects and counts eukaryotic cells in darkfield
microscopy images. Analyzing undiluted yeast suspensions, the tool
differentiates between single cells, budding cells and cell clusters, as well as
between dead and vital cells. The cells within a cluster, along with their
features, are detected by active contours, which yields
precise information for fermentation.
 Long abstract
 
 
 
 |  
| B-11  Partially supervised clustering of gene expression time course data Alexander Schoenhuth1, Alexander Schliep2, Christine Steinhoff
 1aschoen@zpr.uni-koeln.de, Center for Applied Computer Science, University Cologne; 2schliep@molgen.mpg.de, Max Planck Institute for Molecular Genetics, Berlin
 Correspondence address: aschoen@zpr.uni-koeln.de
 
 
As the number of genes with known function grows, there is a need for 
classification methods that allow the use of prior knowledge. Partially supervised clustering of 
time courses stemming from gene expression experiments addresses this problem. Here a 
model-based clustering approach using Hidden Markov Models is proposed.
 Long abstract
 
 
 
 |  
| B-12  ExMI: Extracting Molecular Interaction from Large Biomedical Literature Yoshihiro Ohta1, Tohru Natsume2, Tetsuo Nishikawa, Hiroko Ohi, Tohru Hisamitsu
 1yoh@crl.hitachi.co.jp, HITACHI Central Research Laboratory; 2natsume@jbirc.aist.go.jp, National Institute of Advanced Industrial Science and Technology
 Correspondence address: yoh@crl.hitachi.co.jp
 
 
Extracting molecular interactions from the rapidly growing biomedical literature is important for the systematic exploration of relationships between genes and proteins. However, many existing computer-aided methods cannot process very large amounts of literature. Extraction techniques include molecule name detection, interaction event detection, and graphical interface construction. We explore these techniques and show system examples.
 Long abstract
 
 
 
 |  
| B-13  An algorithm to select abstracts from MEDLINE concerning UV-regulated genes Hiroko Ao1, Toshihisa Takagi
 1aohiroko@ims.u-tokyo.ac.jp, Department of Computational Biology
 Correspondence address: aohiroko@ims.u-tokyo.ac.jp
 
 
With the rapid growth of machine-readable literature, such as the MEDLINE database, searching for articles has become an important task. We therefore propose an efficient algorithm to select information from the results of a PubMed search. When applied to 487 UV-regulated genes, it extracted sentences
containing the query with 97% precision and 97% recall.
 Long abstract
 
 
 
 |  
| B-14  TextLens: A Fast and Practical Partial Parser for Biomedical Literature Yasunori Yamamoto1, Hiroko Ao2, Toshihisa Takagi
 1yayamamo@ims.u-tokyo.ac.jp, Department of Computer Science, University of Tokyo; 2aohiroko@ims.u-tokyo.ac.jp, Department of Computational Biology
 Correspondence address: yayamamo@ims.u-tokyo.ac.jp
 
 
TextLens Partial Parser is a parser that captures the main subject-predicate pair of a sentence. It aims to improve information extraction by appropriately capturing each chunk of words and the structure of the sentence. It uses a rule-based algorithm that produces an abstract expression of the sentence.
 Long abstract
 
 
 
 |  
| B-16  MutationMiner: A Graph Theoretic Approach to Extract Point Mutation Data from Biomedical Literature Lawrence C. Lee1, Florence Horn2, Fred E. Cohen
 1lle8@itsa.ucsf.edu, University of California San Francisco; 2horn@cmpharm.ucsf.edu, University of California San Francisco
 Correspondence address: lle8@itsa.ucsf.edu
 
 
MutationMiner is a program which automates extraction of point mutation data from biomedical literature.  It uses regular expressions and a graph theoretic approach to find point mutations in the text and then confirms the mutations with SwissProt data.  MutationMiner can search over one thousand journal articles in 24 hours.
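
As an illustration of the regular-expression step only, the sketch below recognises point-mutation mentions written as wild-type residue, position, mutant residue. The pattern, the function name and the example sentence are assumptions for illustration and are not MutationMiner's actual expressions; the graph-theoretic step and the SwissProt confirmation are not shown.

```python
import re

# Three-letter amino acid codes, for mentions such as "Arg45Lys".
THREE_LETTER = (
    "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
    "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"
)
# One-letter amino acid codes, for mentions such as "A123G".
ONE_LETTER = "ACDEFGHIKLMNPQRSTVWY"

POINT_MUTATION = re.compile(
    rf"\b(?:(?P<wt3>{THREE_LETTER})(?P<pos3>\d+)(?P<mut3>{THREE_LETTER})"
    rf"|(?P<wt1>[{ONE_LETTER}])(?P<pos1>\d+)(?P<mut1>[{ONE_LETTER}]))\b"
)

def find_point_mutations(text):
    """Return (wild_type, position, mutant) tuples in order of appearance."""
    found = []
    for m in POINT_MUTATION.finditer(text):
        if m.group("wt3"):
            found.append((m.group("wt3"), int(m.group("pos3")), m.group("mut3")))
        else:
            found.append((m.group("wt1"), int(m.group("pos1")), m.group("mut1")))
    return found

sentence = "The A123G and Arg45Lys substitutions abolished binding."
mutations = find_point_mutations(sentence)
print(mutations)
```

A pattern this permissive also matches strings that are not mutations (for example, some gene or cell-line names), which is one reason a confirmation step against sequence data is useful.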
 Long abstract
 
 
 
 |  
| B-17  Use of hidden Markov models and phylogenetic algorithms to predict functionally distinct subclasses of chromodomains in different families of chromatin-modifying proteins Khairina Tajul-Arifin1, Rohan Teasdale2, John S. Mattick
 1k.tajularifin@imb.uq.edu.au, IMB, UQ; 2r.teasdale@imb.uq.edu.au, IMB, UQ
 Correspondence address: k.tajularifin@imb.uq.edu.au
 
 
We have used phylogenetic algorithms and hidden Markov models to identify functionally distinct subsets within the family of chromodomains. Our results demonstrate the validity of using bioinformatic analysis of large datasets to predict subtle but meaningful differences in protein domain function and structure-function relationships.
 Long abstract
 
 
 
 |  
| B-18  FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes Fatima Al-Shahrour1, Ramon Diaz-Uriarte2, Joaquin Dopazo
 1falshahrour@cnio.es, Bioinformatics Unit, Spanish National Cancer Center, CNIO.; 2rdiaz@cnio.es, Bioinformatics Unit, Spanish National Cancer Center, CNIO.
 Correspondence address: jdopazo@cnio.es
 
 
FatiGO (http://fatigo.bioinfo.cnio.es) is a simple but powerful procedure for extracting Gene Ontology terms that are found (upon application of a statistical test) to be significantly over- or under-represented in sets of genes within the context of genome-scale experiments (DNA microarray, proteomics, etc.).
 Long abstract
 
 
 
 |  
| B-19  Current status of the GENIA Corpus: an Annotated Corpus in Molecular Biology Domain Tomoko Ohta1, Jin-Dong Kim2, Yuka Tateisi, Masayoshi  Tsuruoka, Jun'ichi Tsujii
 1okap@is.s.u-tokyo.ac.jp, CREST, JST; 2jkdim@is.s.u-tokyo.ac.jp, University of Tokyo
 Correspondence address: okap@is.s.u-tokyo.ac.jp
 
 
GENIA corpus 3.0p and 3.01, consisting of 2,000 MEDLINE abstracts, have been released with linguistically rich annotations including sentence boundaries, term boundaries, term classifications, semi-structured coordinated clauses, recovered ellipsis in terms, part-of-speech, etc. This poster is intended to provide the current status of the GENIA corpus.
 Long abstract
 
 
 
 |  
| B-20  The automatic discovery of structural principles describing protein fold space Adrian P Cootes1, Michael J.E. Sternberg2, Stephen H Muggleton
 1a.cootes@ic.ac.uk, Imperial College; 2m.sternberg@ic.ac.uk, Imperial College
 Correspondence address: a.cootes@ic.ac.uk
 
 
The rapid increase in protein structures produced by structural-genomics projects will make it increasingly difficult to analyse and understand the distribution of proteins in fold space. We have applied a machine-learning strategy to automatically determine the structural principles describing 45 folds.
 Long abstract
 
 
 
 |  
| B-21  A Constructional Approach to Extraction Cornelia M. Verspoor1, George J. Papcun2, Kari Sentz
 1verspoor@lanl.gov, Los Alamos National Laboratory; 2gjp@lanl.gov, Los Alamos National Laboratory
 Correspondence address: verspoor@lanl.gov
 
 
We present a prototype implementation of a system for extracting protein/gene
  interactions from biological literature which is motivated by the theory of
  Construction Grammar.  CG provides a powerful framework for combining domain-specific
  terminology management with patterns incorporating generic linguistic
  structural constraints.
 Long abstract
 
 
 
 |  
| B-22  Discovery of Analog Enzymes in Thiamin Biosynthesis by Anticorrelation Enrique Morett1, J. Korbel2, K. Emmanuvel  Rajan1, G. Saab-Rincon1, L. Olvera1, M. Olvera1, B. Snel2, S. Schmidt2, and P. Bork2.
 1emorett@ibt.unam.mx, 1Instituto de Biotecnologia, Universidad Nacional Autonoma de Mexico, AP 510-3, Cuernavaca, Mor. 62250, Mexico; 2European Molecular Biology Laboratory, Meyerhofstrasse 1, Heidelberg 69117, Germany.
 Correspondence address: emorett@ibt.unam.mx
 
 
Prediction of gene function is one of the most challenging tasks in genomic science when there is no clear sequence similarity to annotated genes. Here we present a new method, called Anticorrelation of Gene Presence, to predict gene function. Using this method we identified four new genes involved in thiamin biosynthesis.
 Long abstract
 
 
 
 |  
| B-23  EST based method to identify differentially expressed gene clusters along chromosomes Karine Megy1, Stephane Audic2, Francois Enault, Jean-Michel Claverie
 1km369@cam.ac.uk, University of Cambridge; 2audic@igs.cnrs-mrs.fr, CNRS
 Correspondence address: audic@igs.cnrs-mrs.fr
 
 
We developed a method based on a statistical analysis of Expressed Sequence Tags (ESTs) to evaluate the positional clustering of differentially expressed genes. Human chromosomes 20, 21 and 22 were analysed with this method and showed clusters of specifically expressed genes.
 Long abstract
 
 
 
 |  
| B-24  Surfing data sources in drug discovery Dennis Madsen1
 1dnnm@novonordisk.com, Novo Nordisk
 Correspondence address: dnnm@novonordisk.com
 
 
A general-purpose data integration tool has been developed that allows several data sources to be queried simultaneously. Queries are restricted to specific types such as project or metabolite name. The hits are displayed with links to the originating data source and an option to use the hit as the next query.
 Long abstract
 
 
 
 |  
| B-25  Novel members of the C12/C19 cysteine proteases identified through human genome mining efforts: primary characterization of selected genes Pierrat Benoit1, Bruengger Adrian2, Cai Richard, Gerhartz Bernd, Kossida Sophia, Nirmala Nanguniri, Worpenberg Susanne
 1benoit.pierrat@pharma.novartis.com, Novartis Institute of Biomedical Research; 2adrian.bruengger@pharma.novartis.com, Novartis Institute of Biomedical Research
 Correspondence address: benoit.pierrat@pharma.novartis.com
 
 
DUBs are cysteine proteases controlling the ubiquitination status of target proteins. Using genome mining tools, we have conducted searches for new human DUB members leading to the identification of 11 new sequences. Here we report on their in silico annotation and discuss the primary functional characterization of selected members.
  
 Long abstract
 
 
 
 |  
| B-26  Mapping and Visual Exploration of GPCR Classification Hierarchy in Interpro and GPCRDB System Yanwei Niu1, Xiangyun Wang2, Yockey, Anastasia Christianson, Guang R. Gao
 1niu@capsl.udel.edu, Department of ECE, University of Delaware, USA; 2Xiangyun.Wang@astrazeneca.com, EST Informatics Wilmington, Astra Zeneca PLC
 Correspondence address: niu@capsl.udel.edu
 
 
Interpro and GPCRDB are two GPCR classification systems. Using data mining techniques, we compared the two systems family by family at each level of the classification hierarchy and established a mapping between them. We introduce a novel visualization tool that allows the two systems to be compared directly and easily.
 Long abstract
 
 
 
 |  
| B-27  Discovering biological knowledge from gene expression using association rules P. Carmona-Saez1, M. Chagoyen2, A. Rodriguez, O. Trelles, J.M. Carazo and  A. Pascual-Montano
 1pcarmona@cnb.uam.es, National Center of Biotechnology. Madrid; 2monica@cnb.uam.es, National Center of Biotechnology. Madrid
 Correspondence address: pascual@cnb.uam.es
 
 
We describe the application of the association rule discovery technique to find relevant relations between different gene attributes and experimental conditions in microarray expression datasets. This method can be used to extract interesting and very diverse biological information. The method is implemented in the EngeneTM software package, which is freely available upon request at http://www.engene.cnb.uam.es
 Long abstract
 
 
 
 |  
| B-28  Human and Mouse expression maps from in silico expression profiles Alia BenKahla1, Ralf Herwig2, Hans Lehrach, Marie-Laure Yaspo
 1kahla@molgen.mpg.de, Max Planck Institute for Molecular Genetics; 2herwig@molgen.mpg.de, Max Planck Institute for Molecular Genetics
 Correspondence address: kahla@molgen.mpg.de
 
 
We present the strategy used to extract the "in silico expression profiles" of human and mouse genes (an EST mining approach) and the data describing differentially expressed genes, disease-related genes, and clusters of genes potentially involved in a common cellular function. A comparison of gene expression between orthologs will also be presented.
 Long abstract
 
 
 
 |  
| B-29  New Datasets for Structural Data Mining Studies. Carmen K. Chu1, Merridee A. Wouters2
 1cchu@cse.unsw.edu.au, Computational Biology and Bioinformatics Program, Victor Chang Cardiac Research Institute; 2m.wouters@victorchang.unsw.edu.au, Computational Biology and Bioinformatics Program, Victor Chang Cardiac Research Institute
 Correspondence address: m.wouters@victorchang.unsw.edu.au
 
 
We compared the sequence-derived representative dataset PDB_SELECT with the structural database SCOP. Some folds remain overrepresented in PDB_SELECT. After filtering, we obtain a subset of unique protein fold representatives: approximately ¼ of the original PDB_SELECT 25% list. We also discuss using unique representatives of SCOP folds as a representative dataset. 
 Long abstract
 
 
 
 |  
| B-30  An architecture for a modularized gene information retrieval and summarization tool: Bioretrieve Anton Bergheim1, Sheila Rock2
 1anton@cs.wits.ac.za, University of the Witwatersrand; 2sheila@cs.wits.ac.za, University of the Witwatersrand
 Correspondence address: anton@cs.wits.ac.za
 
 
The ability to process natural language based information
computationally is becoming a necessity for the geneticist. We present here an architecture for Bioretrieve,  a computational tool for the management of the extremely large body of knowledge that exists about genes. Designed in a modular fashion and
employing an open-source approach, it has advantages over existing monolithic systems.
 Long abstract
 
 
 
 |  
| B-31  SEMA, A semantic literature annotator Alex Garcia1, Cleary John2, Mark A. Ragan, Yi-Ping Phoebe Chen
 1a.Garcia@imb.uq.edu.au, Institute for Molecular Bioscience; 2jcleary@reeltwo.com, Reel Two
 Correspondence address: a.garcia@imb.uq.edu.au
 
 
We are using a machine-learning algorithm implemented in GO-KDS to complement SwissProt literature citation fields for each database entry. SEMA organizes this new relevant information, then builds a conceptual navigable map that is presented to the user as a flat or hyperbolic tree. This map allows redefining queries over the same database or over other information sources.
 Long abstract
 
 
 
 |  
| B-32  Non-negative matrix factorization for gene expression and scientific texts analysis A. D. Pascual-Montano1, P. Carmona-Saez2, M. Chagoyen and J.M. Carazo
 1pascual@cnb.uam.es, National Center of Biotechnology. Madrid. Spain; 2pcarmona@cnb.uam.es, National Center of Biotechnology. Madrid. Spain
 Correspondence address: pascual@cnb.uam.es
 
 
We describe the application of the Non-negative Matrix Factorization (NMF) technique to reduce dimensionality and to find local patterns hidden in gene expression data sets and in the scientific literature. The results show the potential of this new machine learning technique to find relevant biological information.
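
The abstract does not state which NMF algorithm was used. As a minimal sketch, the code below uses the widely known multiplicative update rules for the squared-error objective, factorising a non-negative matrix V into non-negative factors W and H. The helper names and the toy matrix are hypothetical, and plain Python lists are used only to keep the example self-contained.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate v (n x m, non-negative) by w (n x rank) times h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def squared_error(v, w, h):
    approx = matmul(w, h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra))

# Toy "genes x conditions" matrix with two rough blocks (hypothetical values).
v = [[5, 5, 0], [4, 4, 0], [0, 0, 3], [0, 1, 4]]
w1, h1 = nmf(v, rank=2, iterations=1)
w, h = nmf(v, rank=2, iterations=200)
print(squared_error(v, w1, h1), squared_error(v, w, h))
```

Because every update multiplies by a non-negative ratio, the factors stay non-negative, which is what allows the columns of W to be read as additive local patterns.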
 Long abstract
 
 
 
 |  
| B-33  GENAW: GEnetic Network Analysis Workbench for microarray raw data Pan-Gyu Kim1, Kyung Shin Lee2, Seon-Hee Park, Mi Young Shin, Hwan-Gue Cho
 1pgkim@pearl.cs.pusan.ac.kr, Department of Computer Science, Pusan National University; 2kslee@pearl.cs.pusan.ac.kr, Department of Computer Engineering, Pusan National University
 Correspondence address: pgkim@pearl.cs.pusan.ac.kr
 
 
We have developed the GENAW (GEnetic Network Analysis Workbench for microarray raw data) system, which automatically produces a network from raw expression data. GENAW accepts the data formats of various commercial tools and provides a range of visualization tools. We tested it on yeast cell cycle data from Stanford University, and the results were reasonable.
 Long abstract
 
 
 
 |  
| B-34  Multi-class protein fold classification using an integrative machine learning approach Aik Choon Tan1, David Gilbert2
 1actan@brc.dcs.gla.ac.uk, Bioinformatics Research Centre, Department of Computing Science, University of Glasgow;
 2drg@brc.dcs.gla.ac.uk, Bioinformatics Research Centre, Department of Computing Science, University of Glasgow
 Correspondence address: actan@brc.dcs.gla.ac.uk
 
 
We devised a novel approach to integrate rules induced from multi-class and unbalanced data sets, and demonstrate its usefulness for multi-class protein fold classification on a data set containing 700 examples for 27 SCOP folds.  We show that this approach increases the sensitivity of the classifiers and yields more useful classifiers.
 Long abstract
 
 
 
 |  
| B-35  C. elegans microarray data seen through a novel nonmetric  multidimensional scaling method Y-h. Taguchi1, Y. Oono2
 1tag@granular.com, Department of Physics, Chuo University; 2y-oono@uiuc.edu,  Department of Physics, UIUC
 Correspondence address: tag@granular.com
 
 
 C. elegans microarray data are analysed by a novel nonmetric
multidimensional scaling method that is maximally nonmetric.  The genes are 
embeddable in 3D.  Their annotations are consistent with their positions 
in this space.  A method to compute the 3D coordinates 
directly from the microarray data is also developed.
 Long abstract
 
 
 
 |  
| B-36   Non-metric  analysis of temporal patterns captured in microarray data Y-h. Taguchi1, Y. Oono2
 1tag@granular.com, Department of Physics, Chuo University; 2y-oono@uiuc.edu, Department of Physics, UIUC
 Correspondence address: tag@granular.com
 
 
A nonmetric multidimensional scaling analysis of the gene activity response 
of cell cycle-synchronized human fibroblasts to serum [Iyer et 
al.  Science  283, 83-87 (1999)] automatically gives a ring-like
gene arrangement along which the expression level peak rotates. This
unambiguously demonstrates the power of a nonlinear data mining method.
 Long abstract
 
 
 
 |  
| B-37  Extracting Transcription Factor Interactions from Medline Abstracts Marc Light1, Robert Arens, Vladimir Leontiev, Meredith Patterson, Xinying Qiu, Hudong Wang
 1marc-light@uiowa.edu, University of Iowa
 Correspondence address: marc-light@uiowa.edu
 
 
Staying abreast of research on transcription factors (TFs) is
currently a difficult task for biologists.  We are building a system that will extract TF interactions from Medline abstracts automatically.  To date, we have annotated a corpus for TF interactions and evaluated a number of component technologies.
 Long abstract
 
 
 
 |  
| B-38  Construction of the plant gene index system based on tissue-categorized EST sets Seung-Jae Noh1, Cheol-Goo Hur2, Sung-Ho Goh, Ho-Jin Chung, Kyoung-Oak Choi
 1sjnoh@kribb.re.kr, Korea Research Institute of Bioscience and Biotechnology; 2hurlee@kribb.re.kr, Korea Research Institute of Bioscience and Biotechnology
 Correspondence address: sjnoh@kribb.re.kr
 
 
Our plant gene index system, based on stackPACK EST clustering with tissue categorization, contains valuable information about 150,000 consensus sequences obtained from 9 principal plant model organisms. The information can be browsed, searched, and downloaded through a user-friendly web interface at http://plant.pdrc.re.kr/new_korea/genepool/Plant/index.html
 Long abstract
 
 
 
 |  
| B-39  Regression analysis in optimal gene selection for DNA microarray analysis Si-Ho Yoo1, Sung-Bae Cho2
 1bonanza@candy.yonsei.ac.kr, Yonsei University; 2sbcho@cs.yonsei.ac.kr, Yonsei University
 Correspondence address: bonanza@candy.yonsei.ac.kr
 
 
We propose a new gene selection method based on forward selection in regression analysis. This method reduces redundant information about cancer that may be present in the subset of selected genes. The method achieves a high accuracy of 90.3% on colon cancer data.
 Long abstract
 
 
 
 |  
| B-40  BioInfoCallboratory: Towards an Agent-Assisted Web-Based Collaboration Environment for Bioinformatics Yan Chen1, Yi-Ping Phoebe Chen2
 1y52.chen@student.qut.edu.au, Queensland University of Technology; 2p.chen@qut.edu.au, Queensland University of Technology
 Correspondence address: y52.chen@student.qut.edu.au
 
 
Collaboration in a web-based bioinformatics environment requires intelligent support to assist human-computer interaction. BioInfoCallboratory is an agent-assisted, web-based environment for supporting bioinformatics research. It facilitates sophisticated interactions such as matchmaking based on common interests, internet-spanning data mining for bioinformatics data, and event alerts for interested parties.
 Long abstract
 
 
 
 |  
| B-41  Pathogenic archaea - do they exist? Neil Saunders1, Ricardo Cavicchioli2, Paul M.G. Curmi, Torsten Thomas
 1neil.saunders@unsw.edu.au, The University of New South Wales; 2r.cavicchioli@unsw.edu.au, The University of New South Wales
 Correspondence address: neil.saunders@unsw.edu.au
 
 
We have developed a rapid, automated search strategy for the detection of contaminating sequence from putative novel pathogenic archaea in human EST sequence data.  The system has general application to the detection of microbial pathogens and will be available at http://psychro.bioinformatics.unsw.edu.au.
 Long abstract
 
 
 
 |  
| B-42  Structural Classification in the Gene Ontology Cliff Joslyn1, Susan Mniszewski2, Andy Fulmer, Gary Heaton
 1joslyn@lanl.gov, Los Alamos National Laboratory; 2smm@lanl.gov, Los Alamos National Laboratory
 Correspondence address: joslyn@lanl.gov
 
 
We present the Gene Ontology Clusterer (GOC), which structurally
classifies the GO based on pseudo-distances between comparable nodes
in posets, in conjunction with scoring algorithms, to rank-order the
GO nodes with respect to a set of requested genes. We will also share
lessons we've learned about working with the GO.
 Long abstract
 
 
 
 |  
| B-43  Extracting informative genes with negative correlation for accurate cancer classification Hong-Hee Won1, Sung-Bae Cho2
 1cool@candy.yonsei.ac.kr, Yonsei University; 2sbcho@cs.yonsei.ac.kr, Yonsei University
 Correspondence address: cool@candy.yonsei.ac.kr
 
 
We define two negatively correlated ideal gene vectors that represent the patterns of the classes well, and extract two significant gene subsets (SGSs) based on similarity to the two ideal genes. We train neural network classifiers with the SGSs and combine them. The ensemble classifier produces the best recognition rates: 97.1% for Leukemia, 87.1% for Colon, and 92.0% for Lymphoma.
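
The gene selection step can be sketched as follows, under assumptions the abstract does not confirm: the ideal vectors are taken to be binary class indicators (one the complement of the other, so they are perfectly negatively correlated), and similarity is measured with the Pearson correlation. The function names, gene names and expression values are hypothetical. The neural network classifiers and their combination are not shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_gene_subsets(expression, labels, top_k):
    """Pick the top_k genes most similar to each of two ideal gene vectors."""
    ideal_a = [1.0 if lab == 1 else 0.0 for lab in labels]  # high in class 1
    ideal_b = [1.0 - value for value in ideal_a]            # high in class 0
    score_a = {gene: pearson(vals, ideal_a) for gene, vals in expression.items()}
    score_b = {gene: pearson(vals, ideal_b) for gene, vals in expression.items()}
    subset_a = sorted(score_a, key=score_a.get, reverse=True)[:top_k]
    subset_b = sorted(score_b, key=score_b.get, reverse=True)[:top_k]
    return subset_a, subset_b

# Hypothetical expression values for six samples (three per class).
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "g1": [9, 8, 9, 1, 2, 1],  # high in class 1
    "g2": [1, 2, 1, 8, 9, 9],  # high in class 0
    "g3": [5, 4, 6, 5, 4, 6],  # uninformative
    "g4": [7, 8, 6, 3, 2, 4],  # moderately high in class 1
}
subset_a, subset_b = select_gene_subsets(expression, labels, top_k=1)
print(subset_a, subset_b)
```

Each subset would then be used to train its own classifier, so the two classifiers see complementary views of the class difference.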
 Long abstract
 
 
 
 |  
| B-44  Issues and principles in the analysis of large genomic datasets. Francis Clark1, Susan Lilley2
 1fc@maths.uq.edu.au, Advanced Computational Modelling Centre, University of Queensland, Australia.; 2s364202@student.uq.edu.au, School of Information Technology & Electrical Engineering, University of Queensland, Australia.
 Correspondence address: fc@maths.uq.edu.au
 
 
Development of research analysis pipelines often involves working
with poorly understood data to answer questions that are,
initially, simplistic. This poster gives an overview of some strategies and
best practices that may be employed in such work, including
handling and appraisal of the data, choice of appropriate
thresholds, extrapolation, and checking for reasonableness.
 Long abstract
 
 
 
 |  
| B-45  Hierarchical classification of cDNA libraries for gene expression analysis Bumjin Kim1, Sanghyuk Lee2, Hyunjung Lee, Young-Ah Shin, Euiju Jung, Pora Kim
 1unikbj@ewha.ac.kr, Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, KOREA; 2sanghyuk@ewha.ac.kr, Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, KOREA
 Correspondence address: uandikbj@hotmail.com
 
 
Approximately 8,200 human cDNA libraries in dbEST were hierarchically classified into four gene expression categories – tissue, pathology, developmental stage, and sex. A web-based application for profiling gene expression using the resulting database is available at http://genome.ewha.ac.kr/EODB/.
 Long abstract
 
 
 
 |  
| B-46  Detection of implicit protein-protein interactions from literature Tomonori Izumitani1, Frederic Tingaud2, Hirotoshi Taira, Eisaku Maeda
 1izumi@cslab.kecl.ntt.co.jp, NTT Communication Science Laboratories; 2tingaud@cslab.kecl.ntt.co.jp, NTT Communication Science Laboratories
 Correspondence address: izumi@cslab.kecl.ntt.co.jp
 
 
In this study, we propose a method to detect explicit or implicit
protein-protein interactions from text data. It was applied to the detection of interactions between yeast proteins. The results indicate
that the putative interactions detected by the method may include true interactions that have not yet been identified experimentally.
 Long abstract
 
 
 
 |  
| B-47  Comparison of intra-molecular disulphide bonding arrangements between disulphide-rich and -poor proteins in the Protein Data Bank Gerald Hartig1, Tran Trung Tran2, Mark Smythe
 1g.hartig@imb.uq.edu.au, Institute for Molecular Bioscience; 2tran@doctor.com, Protagonist Pty Ltd
 Correspondence address: g.hartig@imb.uq.edu.au
 
 
Intra-molecular disulphide bonds (IDSB) are an important determinant of a protein’s 3D conformation.  This work describes the differences in IDSB arrangements between disulphide-rich and -poor proteins.  A naturally occurring partition of 25.2 residues / IDSB was used, revealing differences in PDB headers, SCOP folds, IDSB bonding patterns and loop lengths.
 Long abstract
 
 
 
 |  
| B-48  Intimately Incorporated NLP System Adapted for Bio-Text Mining Young-Sook Hwang1, Hae-Chang Rim2, Kyoung-MePark, Ki-Joong Lee, Hong-Woo Chun
 1yshwang@nlp.korea.ac.kr, Korea Univ.; 2rim@nlp.korea.ac.kr, Korea Univ.
 Correspondence address: yshwang@nlp.korea.ac.kr
 
 
BioNLPro provides the base for a robust bio-text mining system. It is an intimately integrated NLP system consisting of core NLP modules adapted to the peculiarities of bio-text: a POS tagger, a biological term recognizer, a grammatical relation tagger based on chunking, and a biological event extractor.
 Long abstract
 
 
 
 |  
| B-49  Data to Diamonds: Multivariate Datamining Leads to Concise Gene Rob Dunne1, Glenn Stone2
 1Rob.Dunne@csiro.au, CSIRO ; 2Glenn.Stone@csiro.au, CSIRO
 Correspondence address: Rob.Dunne@csiro.au
 
 
CSIRO Bioinformatics has developed an analysis methodology
based on generalized linear models coupled
with a specialized Bayesian variable selection technique.
This methodology is capable of producing parsimonious predictors of
classification targets, of numeric targets using Gaussian, Poisson or Gamma
regressions, and of survival targets using Cox's proportional hazards regression.
 Long abstract
 
 
 
 |  
| B-50  Detection of Program Source Code Plagiarism Using Genomic Sequence Alignments Methodology Eun-Mi Kang1, Hwan-Gue Cho2, Young-Min Kang
 1emkang@pearl.cs.pusan.ac.kr, Pusan National University; 2hgcho@pusan.ac.kr, Pusan National University
 Correspondence address: emkang@pearl.cs.pusan.ac.kr
 
 
We propose a new method for detecting plagiarism by exploiting genomic sequence alignment techniques. The system extracts linear sequences of keywords from the source code flow and computes local alignments to detect local similarity to the original sources. The experimental results show that this approach is more powerful than fingerprint matching.
 Long abstract
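As a rough illustration of the idea in B-50, the sketch below reduces two source fragments to keyword sequences and scores them with a Smith-Waterman local alignment. The keyword set, the tokenizer, and the scoring parameters (`match`, `mismatch`, `gap`) are assumptions made for this example; the poster does not specify them, and this is not the authors' implementation.

```python
import re

# Hypothetical keyword set; the poster does not list the keywords it uses.
KEYWORDS = {"int", "for", "while", "if", "else", "return"}


def keyword_sequence(source):
    """Reduce source text to the ordered list of keywords it contains."""
    return [tok for tok in re.findall(r"[A-Za-z_]+", source) if tok in KEYWORDS]


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score between two token lists."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


original = "int f(int n) { for (;;) { if (n) return n; else n = n - 1; } }"
suspect = "int g(int k) { while (k) { k--; } for (;;) { if (k) return k; else k = k - 1; } }"
score = local_alignment_score(keyword_sequence(original), keyword_sequence(suspect))
```

Renaming identifiers or inserting an extra loop changes the text but leaves long runs of the keyword sequence intact, which is why a local alignment can still score the two fragments as similar.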
 
 
 
 |  
| B-51  Computational comparative analysis framework at the Centre for Bioinformatics and Biological Computing. M Bellgard1, A Hunter2, D Schibeci
 1m.bellgard@murdoch.edu.au, CBBC, Murdoch University; 2a.hunter@cbbc.murdoch.edu.au, CBBC, Murdoch University
 Correspondence address: m.bellgard@murdoch.edu.au
 
 
The CBBC conducts research in computational biology ranging from comparative genomic sequence analysis and microarray and proteomic data analysis to novel algorithms and software. The CBBC is developing a comparative analysis framework incorporating audit trailing of analysis, open source activities and distributed resource management. We present an overview of this framework.
 Long abstract
 
 
 
 |  
| B-52  Extraction of patterns in each domain of G-protein-coupled receptors Jeongho Huh1, Chungoo Park2, Dong Soo Jung, Hong Gil Nam, Jiin Choi, Young Bock Lee
 1artist3@postech.ac.kr, Division of Molecular Life Sciences, Pohang University of Science and Technology; 2madreach@bric.postech.ac.kr, Biological Research Information Center(BRIC), Pohang University of Science and Technology
 Correspondence address: artist3@postech.ac.kr
 
 
Detecting local functional sequence patterns is well suited to GPCR sequence analysis. We attempted to extract patterns that are specific to GPCR subtypes. To extract the patterns, we applied different rules to each of the three GPCR domains. Consequently, we obtained specific patterns of GPCR clans with high frequency and high specificity.
 Long abstract
 
 
 
 |  
| B-53  Pathway data mining: tissue specificity and potential cross talks between pathways Yu-Tai Wang1, Ueng-Cheng Yang2, Yung-Wen Deng, Cheng-Min Wei, Kai-Lung Tang, Der-Ming Liou.
 1ytwang@ym.edu.tw, Institute of Biochemistry, National Yang-Ming University, Taiwan; 2yang@ym.edu.tw, Bioinformatics Research Center, National Yang-Ming University, Taiwan
 Correspondence address: ytwang@ym.edu.tw
 
 
A pathway is a way to present the mechanism behind a biological phenomenon. We have developed methods to integrate pathway-related information. By querying this integrated database, users will be able to look up tissue-specific pathways and discover possible cross talk between pathways. The query results can be output in graphic form.
 Long abstract
 
 
 
 |  
| B-54  GHMM and HMMEd: A toolkit for Hidden Markov Models Wasinee Rungsarityotin1, Alexander Schliep2
 1rungsari@molgen.mpg.de, Max Planck Institute for Molecular Genetics; 2schliep@molgen.mpg.de, Max Planck Institute for Molecular Genetics
 Correspondence address: schliep@molgen.mpg.de
 
 
We have developed and implemented a library for a general Hidden Markov Model (GHMM) to assist in designing a topology and visualizing parameters for an HMM. The tool has been used in solving problems such as identification of circular permutation with Profile HMMs.
GHMM and HMMEd are freely available at http://sourceforge.net/projects/ghmm/.
 Long abstract
 
 
 
 |  
| B-55  Efficiently finding regulatory elements using correlation with gene expression Hideo Bannai1, Shunsuke Inenaga2, Ayumi Shinohara, Masayuki Takeda, Satoru Miyano
 1bannai@ims.u-tokyo.ac.jp, Human Genome Center, Institute of Medical Science, University of Tokyo; 2s-ine@i.kyushu-u.ac.jp, Department of Informatics, Kyushu University
 Correspondence address: bannai@ims.u-tokyo.ac.jp
 
 
We present an efficient algorithm for detecting putative regulatory 
elements in the upstream sequences of genes, using expression data
obtained from microarrays. We are able to find the optimal pattern, 
most correlated with the expression levels of the genes, in time
linear in the total length of the upstream sequences.
 Long abstract
 
 
 
 |  
| B-56  Discovering useful patterns from DNA microarray experiment with large-scale multifactor design by genetic algorithm and permutation test Ju Han Kim1, Tae Su Chung2, Jihun Kim, Ji Yeon Park, Hye Won Lee, Jihoon Kim, Mingoo Kim
 1juhan@snu.ac.kr, Seoul National University Human Genome Research Institute; 2epiai@korea.com, Seoul National University Human Genome Research Institute
 Correspondence address: juhan@snu.ac.kr
 
 
We present a method of discovering useful patterns from DNA microarray experiments with a large-scale multifactor design and no replication, using a genetic algorithm. A permutation test on the distance measures between the observations and the discovered pattern identified statistically significant multifactor gene expression patterns with simple biological interpretations.
 Long abstract
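The permutation test mentioned in B-56 can be sketched as follows. The abstract does not say which distance measure or permutation scheme was used, so the Euclidean distance, the shuffling of one gene's values across conditions, and the names `euclidean` and `permutation_p_value` are all assumptions made for illustration.

```python
import random


def euclidean(x, y):
    """Euclidean distance between two equally long numeric sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def permutation_p_value(observed, pattern, n_perm=2000, seed=0):
    """Estimate how often a shuffled profile lies as close to the pattern
    as the observed profile does (small values suggest a real match)."""
    rng = random.Random(seed)
    d_obs = euclidean(observed, pattern)
    shuffled = list(observed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between values and conditions
        if euclidean(shuffled, pattern) <= d_obs:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical profile over eight conditions and a candidate step pattern.
profile = [0, 0, 0, 0, 1, 1, 1, 1]
step_pattern = [0, 0, 0, 0, 1, 1, 1, 1]
p_value = permutation_p_value(profile, step_pattern)
```

Because the null distribution is built from the data themselves, this kind of test needs no replicate measurements, which fits the unreplicated design described in the abstract.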
 
 
 
 |  
| B-57  QTL analysis for outcrossing family data using genetic algorithm and simulated EM algorithm Reiichiro Nakamichi1, Satoru Miyano2
 1rei-naka@ims.u-tokyo.ac.jp, Human Genome Center, Institute of Medical Science, the University of Tokyo; 2miyano@ims.u-tokyo.ac.jp, Human Genome Center, Institute of Medical Science, the University of Tokyo
 Correspondence address: rei-naka@ims.u-tokyo.ac.jp
 
 
We propose a new method of quantitative trait loci (QTL) mapping using a genetic algorithm (GA) with a simulated EM algorithm. It detects QTL without a highly organized experimental cross and is applicable to human genetics. Simulation studies showed high performance of our method in cases not supported by traditional gene mapping.
 Long abstract
 
 
 
 |  
| B-58  Hierarchical-Partitioning: A New Clustering Framework for Gene Expression Data Analysis Alan Wee-Chung Liew1, Hong Yan2, Lap Keung Szeto
 1itwcliew@cityu.edu.hk, Dept of Computer Engineering and Information Technology, City University of Hong Kong; 2ityan@cityu.edu.hk, Dept of Computer Engineering and Information Technology, City University of Hong Kong
 Correspondence address: itwcliew@cityu.edu.hk
 
 
We introduce a novel hierarchical-partitioning clustering algorithm for gene expression data analysis, which combines features of both hierarchical and partitioning-based clustering. Our algorithm performs a successive binary subdivision of the data into smaller and smaller partitions hierarchically, until no further splitting of a (parent) partition into two smaller (children) partitions is possible.
 Long abstract
 
 
 
 |  
| B-59  P-quasi complete linkage clustering method for gene-expression profiles based on distribution analysis Shigeto Seno1, Reiji Teramoto2, Yoichi Takenaka, Hideo Matsuda
 1s-senoo@ist.osaka-u.ac.jp, Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University; 2teramoto@sumitomopharm.co.jp, Genomic Science Laboratories, Research Division, Sumitomo Pharmaceuticals
 Correspondence address: s-senoo@ist.osaka-u.ac.jp
 
 
We propose a new clustering method with the following two features. 
First, this method exploits a new similarity measure based on distribution 
of gene expressions. Second, this method leverages the P-quasi complete linkage 
algorithm for describing clusters. The synergy of the two features provides 
more informative clustering than traditional ones.
 Long abstract
 
 
 
 |  
| B-60  New tools for exploring noncoding RNA-mediated regulatory networks S.Stanley1
 1S.Stanley@imb.uq.edu.au, IMB
 Correspondence address: S.Stanley@imb.uq.edu.au
 
 
We present a method for extracting minimal complete sets of exactly repeated sequences from genomes, and then extracting subsets with the potential to produce primary sequence-dependent RNA regulatory signals and regulatory networks. The work initially focuses on intronic sequences, in which we can examine clustering of matched sequences by functional groups.  Results are presented for S. cerevisiae.
 Long abstract
 
 
 
 |  
| B-61  DESCRIBER: Graphical Relational Models for Collaborative Filtering in Microarray Data Mining William H. Hsu1, Roby Joehanes2, Prashanth Boddhireddy
 1bhsu@cis.ksu.edu, Kansas State University; 2robbyjo@cis.ksu.edu, Kansas State University
 Correspondence address: bhsu@cis.ksu.edu
 
 
This poster presents DESCRIBER, a system that uses graphical models to represent relational data in computational genomics portals (cf. myGrid), integrating descriptive data models for microarray data mining and extending the information retrieval capabilities of indices such as ResearchIndex.  The objective is to provide collaborative filtering (CF) over data, metadata, source code (cf. OpenBio), and experimental documentation.
 Long abstract
 
 
 
 |  
| B-62  A Software Toolkit for Learning Dynamic Graphical Models of Gene Regulatory Structure from Microarray Data William H. Hsu1, Youping Deng2, J. Clare Nelson, Judith L. Roe
 1bhsu@cis.ksu.edu, Kansas State University; 2ypdeng@ksu.edu, Kansas State University
 Correspondence address: bhsu@cis.ksu.edu
 
 
We present BNJ, an experimental Java-based software toolkit for learning network models of gene regulation from microarray data.  We survey current research issues in learning the structure of graphical models, outline the components of BNJ used in modeling regulatory dynamics of S. cerevisiae, and present preliminary results and current research directions.
 Long abstract
 
 
 
 |  
| B-63  Computational prediction of macrophage specific regulatory network Brendan Tse1, Timothy Ravasi2, Christine Wells, Yi-Ping Phoebe Chen, David Hume
 1s371293@student.uq.edu.au, University of Queensland; 2t.ravasi@imb.uq.edu.au, University of Queensland
 Correspondence address: s371293@student.uq.edu.au
 
 
This project aims to create an analytical pipeline that links existing pattern discovery tools so that transcriptional element pattern predictions can be performed automatically and simultaneously across multiple species, directly from microarray experimental results. The system may be used as a means to map transcriptional pathways to macrophage-specific regulatory networks.
 Long abstract
 
 
 
 |  
| B-64  Protein Superfamily Clustering using Biomedical Text Mining via the Information Bottleneck Method Sahng-Joon Auh1, Jae-Hong Eom2, Byoung-Hee Kim, Byoung-Tak Zhang
 1sjauh@bi.snu.ac.kr, Biointelligence Laboratory; 2jheom@bi.snu.ac.kr, Biointelligence Laboratory
 Correspondence address: jheom@bi.snu.ac.kr
 
 
We present a novel implementation of protein superfamily clustering using biomedical literature via the recently introduced information bottleneck method, which shows good performance in document clustering. We test our method on 1866 Saccharomyces cerevisiae proteins in COGs (Clusters of Orthologous Groups of proteins) by NCBI (National Center for Biotechnology Information).
 Long abstract
 
 
 
 |  
| B-65  A Bayesian HMM algorithm for the identification of gene families Richard Boys1, Daniel Henderson2
 1richard.boys@ncl.ac.uk, University of Newcastle upon Tyne; 2d.a.henderson@open.ac.uk, The Open University
 Correspondence address: richard.boys@ncl.ac.uk
 
 
We describe an algorithm that identifies families of genes with
similar nucleotide patterns and hopefully similar function. The
approach is quite general in terms of its flexibility with respect to
the number of different families that may be present and the
complexity of the structure within each family.
 Long abstract
 
 
 
 |  
| B-66  Bio-Linux: An integrated bioinformatics solution for the EG community Dan Swan1, Bela Tiwari2, Dawn Field
 1dswan@ceh.ac.uk; 2btiwari@ceh.ac.uk
 Correspondence address: dswan@ceh.ac.uk
 
 
Bio-Linux is an integrated, bioinformatics-centred, research platform. By providing both standard favourite and cutting edge bioinformatics tools on a Linux-based system, it combines the benefits of being powerful, configurable, and easily updateable, with the ease of use and potential for software integration required for the handling and analysis of biological data.
 Long abstract
 
 
 
 |  
	| Data Visualisation
 |  
| C-1  Automated Construction of Comparative Maps between Zebrafish, Human, Rat and Mouse Jedidiah Mathis1, Victor Ruotti2, Jeff Nie, Dan Chen, John Postlethwait, Monte Westerfield, Michael Thomas, Michael Carvan, Peter Tonellato
 1jmathis@mcw.edu, Medical College of Wisconsin; 2vruotti@mcw.edu, Medical College of Wisconsin
 Correspondence address: jmathis@mcw.edu
 
 
Radiation hybrid maps, coupled with the mapping of expressed sequence tags and their organization into UniGene clusters, have revolutionized the way comparative maps are built and maintained. We have used publicly available rat, mouse, human, and zebrafish data to build completely integrated comparative maps.
 Long abstract
 
 
 
 |  
| C-2  GET3D, A Genomic Exploration Tool in 3D John Gill1
 1john.gill@monash.edu.au, Victorian Bioinformatics Consortium, Monash University
 Correspondence address: john.gill@med.monash.edu.au
 
 
The GET3D (Genomic Exploration Tool in 3D) software tool is a complementary
product to the CAS (Categorised Annotation Set) system. Through 3D visualization
and interaction techniques it allows for the manipulation of the CAS dataset, including
the highlighting and exploration of data relationships, and for adding new information and
relationships.
 Long abstract
 
 
 
 |  
| C-3  Application of Q-Gene software for the quantitation on Nipah virus in experimental animals using real-time PCR L.I. Pritchard1, Y. Kaku2, G. Crameri, B.T. Eaton, D.B. Boyle
 1ian.pritchard@csiro.au, AAHL CSIRO; 2, National Institute of Animal Health, Tokyo, Japan
 Correspondence address: ian.pritchard@csiro.au
 
 
Quantitative real-time PCR represents a highly sensitive and powerful technique for the high-throughput analysis of virus load and gene expression. We used the Q-Gene software (Muller et al., 2002) to expedite the statistical analysis, graphical presentation and evaluation of the real-time PCR quantitation of Nipah virus in experimental animals.
 Long abstract
 
 
 
 |  
| C-4  Comparing Patterns in Gene Expression in Longitudinal Array Experiments Using a Novel Algorithm, TAPiR: Time-course Algorithm for Pattern Recognition Catherine Campbell1, Raj Lingam2, Yang Fann
 1campbelc@ninds.nih.gov, NIH-NINDS; 2lingamr@ninds.nih.gov, NIH-NINDS
 Correspondence address: campbelc@ninds.nih.gov
 
 
TAPiR is a novel algorithm for identifying, clustering and visualizing time-course microarray experiments based on either fold change or t-test p-value thresholds. TAPiR assigns letters to specific patterns of change that can be oriented along the time-course to form “words” that can then be sorted alphabetically to identify similar clusters. 
 Long abstract
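The letter-and-word idea in C-4 can be illustrated with a small sketch. The abstract does not give TAPiR's alphabet or thresholds, so the letters `U`, `D` and `N`, the two-fold threshold, and the function names here are hypothetical; only the fold-change variant is shown, and expression values are assumed to be positive.

```python
def change_word(profile, fold=2.0):
    """Encode each step of a time course as U (up), D (down) or N (no change)."""
    letters = []
    for earlier, later in zip(profile, profile[1:]):
        ratio = later / earlier  # assumes strictly positive expression values
        if ratio >= fold:
            letters.append("U")
        elif ratio <= 1.0 / fold:
            letters.append("D")
        else:
            letters.append("N")
    return "".join(letters)


def group_by_word(profiles):
    """Group gene names by their word, with the words in alphabetical order."""
    groups = {}
    for gene, profile in profiles.items():
        groups.setdefault(change_word(profile), []).append(gene)
    return dict(sorted(groups.items()))


# Hypothetical expression values at four time points.
clusters = group_by_word({
    "geneA": [1, 2, 4, 4],
    "geneB": [2, 4, 8, 8],
    "geneC": [8, 4, 4, 1],
})
```

Sorting the words alphabetically places genes with the same early behaviour next to each other, since words that share a prefix sort together.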
 
 
 
 |  
| C-5  Correlation analysis as a preprocessing tool in clustering of time-course gene expression timecourse data Christopher Bowman1, Richard Baumgartner2, Stephanie Booth
 1Christopher.Bowman@nrc-cnrc.gc.ca, Institute for Biodiagnostics; 2Richard.Baumgartner@nrc-cnrc.gc.ca, Institute for Biodiagnostics
 Correspondence address: Christopher.Bowman@nrc-cnrc.gc.ca
 
 
We apply correlation analysis, a tool developed for the analysis of functional magnetic resonance images, to microarray timecourse experiments.  Although fMRI and microarray timecourses have little in common physically, the data analysis tasks are quite similar, and correlation analysis is shown to be a useful preprocessing tool to apply prior to clustering gene expression timecourses.
 Long abstract
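One plausible reading of the preprocessing step in C-5 is sketched below: each gene's timecourse is correlated with a reference timecourse, and only genes whose correlation magnitude exceeds a threshold are passed on to clustering. The use of a single reference profile, the Pearson coefficient, the 0.8 threshold, and the function names are assumptions; the abstract does not describe the exact procedure.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_correlated(timecourses, reference, threshold=0.8):
    """Keep genes whose timecourse tracks (or mirrors) the reference."""
    return {
        gene: tc
        for gene, tc in timecourses.items()
        if abs(pearson(tc, reference)) >= threshold
    }


# Hypothetical data: a rising reference and four gene timecourses.
reference = [0, 1, 2, 3]
kept = select_correlated(
    {"a": [1, 2, 3, 4], "b": [4, 3, 2, 1], "c": [1, 1, 1, 1], "d": [1, -1, 1, -1]},
    reference,
)
```

Filtering in this way removes flat or unrelated timecourses before clustering, so the clustering step works on a smaller and less noisy set of genes.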
 
 
 
 |  
| C-6  Uncovering Hidden Linkages among Disparate Information Sources Edy S. Liongosari1, Mitu Singh2
 1edy.s.liongosari@accenture.com, Accenture Technology Labs; 2mitu.singh@accenture.com, Accenture Technology Labs
 Correspondence address: edy.s.liongosari@accenture.com
 
 
The Knowledge Discovery Tool (KDT) uses a knowledge modeling approach to intelligently extract and integrate a large set of disconnected biomedical information. Its unique user interface allows its users to see how the entities are linked together, uncovers hidden linkages and highlights certain unusual links that might be worth exploring.
 Long abstract
 
 
 
 |  
| C-7  Linguistic profiling of genome sequences. The Sequence identifier position end set cardinal: an estimate of linear sequence complexity: algorithms and genome profiling applications. Christophe Lefevre1
 1chris.lefevre@med.monash.edu.au, Victorian Bioinformatics Consortium
 Correspondence address: chris.lefevre@med.monash.edu.au
 
 
The sequence identifier end set cardinal is proposed as a new estimator of linear sequence linguistic complexity. A scanning window algorithm to compute this value is presented and profiles obtained with genomic sequences are discussed.
 Long abstract
 
 
 
 |  
| C-8  GENOME-WIDE HAPLOTYPE STRUCTURE VISUALIZATION AND ANALYSIS IN MOUSE Tim Wiltshire1, Serge Batalov 2, Mathew Pletcher, R.J.Mural, M.D.Adams, C.F.Fletcher
 1timw@gnf.org, GNF; 2batalov@gnf.org, GNF
 Correspondence address: timw@gnf.org
 
 
SNPview, the interactive navigator for the individual SNPs, SSLPs, alleles and haplotypes projected to the genomic axis, is available on-line at http://www.gnf.org/SNP/. Large but discrete regions of the genome are not very polymorphic between particular strain pairs, and thus cannot easily be interrogated for natural genetic variations influencing QTLs.
 Long abstract
 
 
 
 |  
| C-9  Molecular Modeling using Virtual Reality with Force Feed Back Hiroshi Mizushima1, Hiroshi Tanaka2, Masaaki Hatsuta, Daisuke Arai, Hiroshi Nagata
 1hmizushi@ncc.go.jp, National Cancer Center Research Institute; 2tanaka@tmd.ac.jp, Tokyo Medical Dental University
 Correspondence address: hmizushi@ncc.go.jp
 
 
We developed a computer aided molecular modeling system using virtual reality technologies. Although it is still a prototype, the most characteristic function of the system is enabling its user to “touch” and “feel” the electrostatic potential field of a protein or a drug molecule.
 Long abstract
 
 
 
 |  
| C-10  BIRCH - A portable and comprehensive bioinformatics platform Brian Fristensky1
 1frist@cc.umanitoba.ca, University of Manitoba
 Correspondence address: frist@cc.umanitoba.ca
 
 
BIRCH is a resource of integrated
programs and databases for molecular biology, unified through the GDE
graphic interface. The BIRCH framework is designed for semi-automated
installation and customization on Unix systems, and integration of
locally-installed software and databases into BIRCH.
http://home.cc.umanitoba.ca/~psgendb
 Long abstract
 
 
 
 |  
| C-11  BioViz: Brassica Arabidopsis Comparative Genome Browser - The application of Scalable Vector Graphics to comparative genomics Christopher T Lewis1, Andrew Sharpe2, Stephen Karcz, Isobel AP Parkin, Derek Lydiate
 1LewisCT@agr.gc.ca, Agriculture and Agri-food Canada; 2sharpea@agr.gc.ca, Agriculture and Agri-food Canada
 Correspondence address: LewisCT@agr.gc.ca
 
 
SVG has enabled a visually appealing application for the visual comparison of
Brassica napus and Arabidopsis thaliana.  SVG overcomes two key drawbacks of
current web-based genome browsers: fixed displays and frequent page reloads.
The Brassica Arabidopsis Comparative Genome Browser is available online at
http://www.brassica.ca.
 Long abstract
 
 
 
 |  
| C-12  STING MILLENNIUM SUITE v.3 and JAVA PROTEIN DOSSIER: a novel concept in data visualization and analysis of the protein structure/function relationship Goran Neshich1, Roberto Togawa2, Walter Rocchia, Adauto L. Mancini, Paula R. Kuser, Michel E. B. Yamagishi, Alexandre Alvaro, Christian Baudet and Roberto H. Higa
 1neshich@cnptia.embrapa.br, EMBRAPA/CNPTIA; 2togawa@cenargen.embrapa.br, EMBRAPA/CENARGEN
 Correspondence address: neshich@cnptia.embrapa.br
 
 
STING Millennium (SMS) and Java Protein Dossier (JPD) make a powerful duo for the structural analysis of macromolecules. SMS is a web based set of programs and databases for analysis of protein structures, while JPD provides an abundant collection of physical and chemical descriptors/parameters. SMS/JPD v.3 is available at http://www.cbi.cnptia.embrapa.br
 Long abstract
 
 
 
 |  
| C-13  A Visualization Framework to Assist in the Selection of SNP Markers for Association Studies of Complex Diseases Francisco M. De La Vega1, Hadar Avi-Itzhak2
 1delavefm@appliedbiosystems.com, Applied Biosystems; 2AviitzHI@appliedbiosystems.com, Applied Biosystems
 Correspondence address: delavefm@appliedbiosystems.com
 
 
We developed a framework to visualize SNPs, haplotype blocks, and genes across chromosomal physical maps and their relationship with linkage disequilibrium maps. This visualization is aimed at the cost-effective selection of SNP markers for disease association studies as a function of the profile of LD obtained on reference population samples.
 Long abstract
 
 
 
 |  
| C-14  Metabolic Control Analysis of Gene-knockout Escherichia coli Based on the Inverse Flux Analysis with Experimental Verification Md. Aminul Hoque1, Khandaker Al Zaid Siddiquee2, Kazuyuki Shimizu
 1aminul@sfc.keio.ac.jp, Institute for Advanced Bioscience, Keio University, Tsuruoka, 997-0035, Japan; 2, Department of Biochemical Engineering and Science, Kyushu Institute of Technology,
 Correspondence address: aminul@sfc.keio.ac.jp
 
 
Inverse Flux Analysis (IFA) of Escherichia coli showed that when pyk was knocked out, the flux through ppc and the TCA cycle pathway increased and the acetate production flux decreased, whereas acetate production increased when ppc was knocked out. The experimental data agreed well with the IFA results.
 Long abstract
 
 
 
 |  
	| Databases
 |  
| D-1  GeneView: A Dynamic Gene Annotation System and Its Application to Microarray Data Analysis Xiang Yao1, Heng Dai2, Bin Tian, David Zhao, Albert Leung, Simon Smith, and Jackson Wan
 1xyao@prdus.jnj.com, Johnson and Johnson PRD; 2hdai1@prdus.jnj.com, Johnson and Johnson PRD
 Correspondence address: xyao@prdus.jnj.com
 
 
We have developed a system that monitors various data sources, dynamically extracts gene information, comprehensively matches genes, and integrates them into a central database by categories, such as pathway, genetic mapping, phenotype, expression profile, domain structure, protein interaction, disease association, and references. The system achieves high performance when querying a large batch of genes together.
 Long abstract
 
 
 
 |  
| D-2  http://elm.eu.org - ELM Resource for prediction of functional sites in proteins Rune Linding1, ELM Consortium2
 1linding@embl.de, EMBL; 2info@elm.eu.org, ELM Consortium
 Correspondence address: linding@embl.de
 
 
ELM is a resource for predicting functional sites in eukaryotic proteins. Putative functional sites are identified by conventional methods, such as patterns (regular expressions) or hidden Markov models. To improve the predictive power, context-based rules and logical filters will be developed and applied to reduce the number of false positives.
 Long abstract
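The two stages described in D-2, a pattern scan followed by a context filter, can be sketched as below. The motif used is the well-known N-glycosylation pattern N-{P}-[ST]-{P}, chosen only as a familiar example and not taken from ELM. The context rule (discarding hits that start inside user-supplied masked regions) and the function names are hypothetical stand-ins for the filters the abstract says are still to be developed.

```python
import re

# N-{P}-[ST]-{P} written as a regular expression. The lookahead lets
# overlapping matches be reported as separate hits.
MOTIF = re.compile(r"(?=(N[^P][ST][^P]))")


def scan(sequence):
    """Return (start, matched_text) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(sequence)]


def filter_by_context(hits, masked_regions):
    """Drop hits that start inside any half-open (start, end) masked region."""
    def is_masked(position):
        return any(start <= position < end for start, end in masked_regions)

    return [hit for hit in hits if not is_masked(hit[0])]


# Hypothetical protein fragment and a hypothetical masked region.
fragment = "MKNGSAANPTLNKTQ"
raw_hits = scan(fragment)
kept_hits = filter_by_context(raw_hits, masked_regions=[(10, 15)])
```

Short patterns like this one match by chance in many sequences, which is why a second, context-aware filtering stage matters for reducing false positives.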
 
 
 
 |  
| D-3  FlyMine: An integrated database for Drosophila and Anopheles genomics Gos Micklem1, Andrew Varley2, Richard Smith, Rachel Lyne
 1gos@gen.cam.ac.uk, University of Cambridge; 2ajv12@cam.ac.uk, University of Cambridge
 Correspondence address: ajv12@cam.ac.uk
 
 
FlyMine is a project to build an integrated database of genomic, expression and protein data for Drosophila and Anopheles. Data are stored with massive redundancy, and arbitrary queries are allowed through a web interface or Java API.  Queries are rewritten in real time to make use of redundant tables.
 Long abstract
 
 
 
 |  
| D-4  InterPro, a protein functional classification resource Nicola Mulder1, InterPro Consortium2
 1mulder@ebi.ac.uk, EBI; 2interpro@ebi.ac.uk, EBI
 Correspondence address: mulder@ebi.ac.uk
 
 
InterPro is an integrated protein signature resource for predicting protein families, domains and functional sites. It incorporates data from PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIR Superfamilies and the structure-based SUPERFAMILY. InterPro classifies 80% of SWISS-PROT/TREMBL and provides links to the proteins, methods, specialised protein family resources and structural information.
 Long abstract
 
 
 
 |  
| D-5  dbZach: An Integrative Toxicogenomic Supportive Relational Database System Lyle D Burgoon1, Paul C Boutros2, Edward Dere, Shane Doran, Shraddha Pai, Raeka Aiyar, Jigger Vakharia, Rebecca Rotman, Tim Zacharewski
 1burgoonl@msu.edu, Michigan State University; 2 Michigan State University
 Correspondence address: burgoonl@msu.edu
 
 
The dbZach System is a microarray information management system meant for local installation in labs or larger, enterprise environments.  The dbZach database includes four core subsystems, and six modular subsystems.  The system also includes GUI data mining tools, and an API.  Source code will be available soon at: http://dbzach.fst.msu.edu.
 Long abstract
 
 
 
 |  
| D-6  MyMED -  An Internal XML Relational Database Implementation of MEDLINE Citations K Lewis1, CW Hogue2
 1lewis@mshri.on.ca, Samuel Lunenfeld Research Institute, Mt Sinai Hospital, Dept of Biochemistry, University of Toronto; 2hogue@mshri.on.ca, Samuel Lunenfeld Research Institute, Mt Sinai Hospital, Dept of Biochemistry, University of Toronto
 Correspondence address: lewis@mshri.on.ca
 
 
MyMED is an internal relational XML database implementation of MEDLINE citations. The MyMED database is necessary to execute text mining algorithms and complex text searches in a fast, secure manner. Data is stored in a DB2 database that is enabled for the XML Extender and Text Information Extender.
 Long abstract
 
 
 
 |  
| D-7  RAT GENOME DATABASE - RGD - DISEASE ORIENTED RESEARCH RESOURCE Aubrey Hughes1, Jedediah Mathis2, Mary Shimoyama, Milton Datta, Simon Twigger, Charles W.Wang, Nataliya Nenasheva, Dean Pasko, Norberto de la Cruz, Victor Ruotti, Susan Bromberg, Chin-Fu Chen, Rajni Nigam, Gopal Gopinathrao, Angela Zuniga-Myer and Peter Tonellato
 1ahughes@mcw.edu, Medical College of Wisconsin; 2jmathis@mcw.edu, Medical College of Wisconsin
 Correspondence address: ahughes@mcw.edu
 
 
The Rat Genome Database (RGD) provides research groups access to rat genomic and genetic data, including annotated sequence, related to particular diseases. The Disease Oriented Research Resource (DORR) will help RGD answer the need of
the rat community for access to curated data of interest.
rgd.mcw.edu
 Long abstract
 
 
 
 |  
| D-8  RAT GENOME DATABASE - RGD - MAPPING DISEASE ONTO THE GENOME Aubrey Hughes1, Jedediah Mathis2, Mary Shimoyama, Norberto de la Cruz, Charles W. Wang, Nataliya Nenasheva, Dean Pasko, Jiali Chen, Lan Zhao, Chunyu Fan, Wenhua Wu, Chin-Fu Chen, Rajni Nigam, Gopal Gopinathrao, Angela Zuniga-Meyer, Susan Bromberg, Jessica Ginster, Anne Kwitek-Black, Janan Eppig, Lois Maltais, Donna Maglott, Greg Schuler, Simon Twigger, Howard Jacob and Peter Tonellato
 1ahughes@mcw.edu, Medical College of Wisconsin; 2jmathis@mcw.edu, Medical College of Wisconsin
 Correspondence address: ahughes@mcw.edu
 
 
The Rat Genome Database (RGD) is a disease centric resource of comprehensively curated data from rat genetic and genomic research. RGD's goal is to provide information that will aid researchers in using the rat as a model organism for human disease studies. RGD is available at rgd.mcw.edu .
 Long abstract
 
 
 
 |  
| D-9  Submission tools for EMBL-Bank, EMBL-Align and SWISS-PROT databases Vincent Lombard1, Mary ann Tuli, Robert Vaughan, Minna Lehvaslaiho, Weimin Zhu and Rolf Apweiler
 1lombard@ebi.ac.uk, EBI
 Correspondence address: lombard@ebi.ac.uk
 
 
 Webin, Webin-Align and SPIN are the web-based tools for submitting nucleotide sequences, nucleotide sequence alignments and protein sequences, respectively, to the EMBL-Bank, EMBL-Align and SWISS-PROT databases. These tools guide you through a sequence of WWW forms allowing interactive submission. All the information required to create a database entry is collected during this process. All submission tools are available at http://www.ebi.ac.uk/Submissions/index.html
 Long abstract
 
 
 
 |  
| D-10  UniProt: Universal Protein Databases for Protein Sequences and Function Allyson Williams1, Maria Jesus Martin2, Claire O'Donovan, Daniel Barrell, Alexander Fedotov, Rolf Apweiler
 1allyson@ebi.ac.uk, EMBL - EBI; 2martin@ebi.ac.uk, EMBL - EBI
 Correspondence address: allyson@ebi.ac.uk
 
 
The UniProt Consortium (European Bioinformatics Institute, Swiss Institute of Bioinformatics, and the Protein Information Resource) was created to merge Swiss-Prot, TrEMBL and PIR database activities into UniProt, a comprehensive resource of protein sequences and function. UniProt has three layers: protein sequence archive, protein knowledgebase, and non-redundant reference (NREF) databases.
 Long abstract
 
 
 
 |  
| D-11  PRIME: automatically extracted PRotein Interactions and Molecular Information databasE. Asako Koike1, Yoshiyuki Kobayashi2, Toshihisa Takagi
 1akoike@ims.u-tokyo.ac.jp, Human Genome Center, The Institute of Medical Science, Univ. of Tokyo; 2yashi@ls.hitachi.co.jp, Life Science Group, Hitachi, Ltd.
 Correspondence address: akoike@ims.u-tokyo.ac.jp
 
 
PRIME (http://prime.ontology.ims.u-tokyo.ac.jp/) is an integrated database covering the major completely sequenced eukaryotes. It contains protein-protein/gene/compound interaction data extracted by natural language processing, domain information, structural information, protein kinase classification, and ortholog tables among organisms. Comparison and prediction of pathways are also available through an automatic pathway graphic image interface.
 Long abstract
 
 
 
 |  
| D-12  Automating Data Collection And Categorisation Using The CAS Software John Gill1
 1john.gill@monash.edu.au, Victorian Bioinformatics Consortium, Monash University
 Correspondence address: john.gill@med.monash.edu.au
 
 
CAS is a data integration system that allows for the creation of categories and integrated data sets. Each category consists of a set of attributes and methods, which act as an object item. Incoming data, obtained manually or through the automated data collection component, is linked to category attributes according to user-defined rules and conditions.
 Long abstract
 
 
 
 |  
| D-13  Building a Database of Protein Structure Using a Geographic Model based on Topological Consistency Sung-Hee Park1, Keun Ho Ryu2, Byeong-Jin Jeong, Hyeon S. Son
 1shpark@dblab.chungbuk.ac.kr, Chungbuk National University; 2khryu@dblab.chungbuk.ac.kr, Chungbuk National University
 Correspondence address: shpark@dblab.chungbuk.ac.kr
 
 
We propose protein structure modeling using a geographic model and build a structure database that includes thematic information and the geometry of proteins. In the modeling, geometry is represented by spatial types and thematic information includes the physico-chemical data. We formulate queries that retrieve topological relationships between structural elements using spatial operators.
 Long abstract
 
 
 
 |  
| D-14  Version Management of a Genomic Sequence Database Using Active Rules and Temporal Concepts Sung-Hee Park1, Keun Ho Ryu2, Byeong-Jin Jeong, Hyeon S. Son
 1shpark@dblab.chungbuk.ac.kr, Chungbuk National University; 2khryu@dblab.chungbuk.ac.kr, Chungbuk National University
 Correspondence address: shpark@dblab.chungbuk.ac.kr
 
 
We propose a model of sequence versions that tracks changes to the same piece of DNA using a time stamp attribute in a temporal data model, together with a mechanism for managing sequence versions in a sequence database by applying trigger rules (Event-Condition-Action) in an active database system.
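As an illustration only, the Event-Condition-Action idea can be sketched with an SQLite trigger driven from Python. The table names, column names, and the `update_sequence` helper are hypothetical and are not taken from the authors' system; the sketch only shows how an update event, a change condition, and an archiving action might fit together with time stamps.

```python
import sqlite3


def make_versioned_db():
    """Create an in-memory database with a hypothetical versioning trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE sequence (
        acc        TEXT PRIMARY KEY,
        residues   TEXT NOT NULL,
        valid_from TEXT NOT NULL          -- time stamp of the current version
    );
    CREATE TABLE sequence_history (
        acc        TEXT NOT NULL,
        residues   TEXT NOT NULL,
        valid_from TEXT NOT NULL,
        valid_to   TEXT NOT NULL          -- time stamp at which it was superseded
    );
    -- Event:     an UPDATE that sets the residues column
    -- Condition: the residues actually changed
    -- Action:    archive the old version with its validity interval
    CREATE TRIGGER archive_old_version
    AFTER UPDATE OF residues ON sequence
    WHEN OLD.residues <> NEW.residues
    BEGIN
        INSERT INTO sequence_history
        VALUES (OLD.acc, OLD.residues, OLD.valid_from, NEW.valid_from);
    END;
    """)
    return conn


def update_sequence(conn, acc, residues, timestamp):
    """Store a new version; the trigger archives the previous one if it differs."""
    conn.execute(
        "UPDATE sequence SET residues = ?, valid_from = ? WHERE acc = ?",
        (residues, timestamp, acc),
    )
```

With this layout, the current version always lives in `sequence`, and every superseded version can be recovered from `sequence_history` by its validity interval.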
 Long abstract
 
 
 
 |  
| D-15  GCC: a database system for immune cells transcriptomes Andrea Splendiani1, C. Vizzardelli, N. Pavelka, M. Pelizzola, M. Capozzoli, O. Beretta, F. Granucci, P. Ricciardi-Castagnoli
 1andrea.splendiani@unimib.it, Univ. Milano Bicocca
 Correspondence address: andrea.splendiani@unimib.it
 
 
GCC is a database for immune cell transcriptomes. It is based on a database system (built using open-source technologies) that allows for data storage and intelligent retrieval. It is targeted at the Affymetrix platform, allows for MIAME-compliant experimental annotation, and supports the adoption of ontologies.
 Long abstract
 
 
 
 |  
| D-16  dbSTR: A Database for Short Tandem Repeats Haifeng Liu1, Loo Nin Teo2, Eric Yap, Linda Gan, Hui Min Wu, Sock Hoon Ng, Adrian Eng, Loo See Teo, Keng Wah Chao
 1lhaifeng@dso.org.sg, DSO National Laboratories, Singapore; 2tloonin@dso.org.sg, DSO National Laboratories, Singapore
 Correspondence address: lhaifeng@dso.org.sg
 
 
dbSTR is a repository of short tandem repeats (microsatellites) whose polymorphisms have either been predicted using machine learning or verified using wet-lab methods.  These STRs could be useful markers for high resolution linkage and association studies.  dbSTR is freely available at http://www.dbstr.org.
 Long abstract
 
 
 
 |  
| D-17  Integration and representation of heterogeneous metabolic databases for the analysis of metabolism: BIOSILICO Jin Sik Kim1, Ji Hoon Jun2, Yong Wook Kim, Sujin Chae, Mira Roh, Yong-Ho In and Sang Yup Lee
 1jskim@mail.kaist.ac.kr, KAIST; 2gene2@bioinfomatix.com, Bioinfomatix Inc.
 Correspondence address: jskim@mail.kaist.ac.kr
 
 
BIOSILICO is a web-based database system that facilitates the search and analysis of metabolic pathways. BIOSILICO allows efficient retrieval of all available information on enzymes, compounds, reactions and pathways by integrating the heterogeneous metabolic databases and generates well-designed view pages showing retrieved data in a systematic way for easy understanding.
 Long abstract
 
 
 
 |  
| D-18  GlycoSuiteDB: A curated relational database of glycoprotein glycan structures Hiren J. Joshi1, Sarah Jarvis2, Jonathan W Arthur, Mathew J. Harrison, Marc R. Wilkins, Nicolle H. Packer, Catherine A. Cooper
 1hirenj@proteomesystems.com, Proteome Systems; 2sjarvis@proteomesystems.com, Proteome Systems
 Correspondence address: Jonathan.Arthur@proteomesystems.com
 
 
GlycoSuiteDB is a relational database of published glycan structures designed to assist researchers in the analysis of glycans. GlycoSuiteDB can be accessed from http://www.glycosuite.com
 Long abstract
 
 
 
 |  
| D-19  SNP PrimerPicker Yip-Kuen Lau1, Ching-Fun Lau2, Henry Yiu-Hang Fu, Hong Xue
 1henryfu@ust.hk, Applied Genomics Center,  Hong Kong Bioinformatics Center,  ParmacoGenetics Ltd,  Department of Biochemistry, Hong Kong University of Science and Technology; 2carolau@ust.hk, Applied Genomics Center,  Hong Kong Bioinformatics Center,  ParmacoGenetics Ltd,  Department of Biochemistry, Hong Kong University of Science and Technology
 Correspondence address: hxue@ust.hk
 
 
SNP PrimerPicker is a software system for designing primers. Sequences in a record are aligned with the source using bl2seq. The system uses Primer3 to find primers and blastcl3 to compare similarity with the chromosomes. The components are integrated to make the procedure convenient. It is web accessible at http://bcz099.ust.hk/primerpicker/.
 Long abstract
 
 
 
 |  
| D-20  Melbourne Brain Genome Project Seong-Seng Tan1, Lavinia Hyde2, Masters C, Gunnersen J, Kenshole B, Job C, Augustine C, Boon W-M, Brown M, Scott HS
 1s.tan@hfi.unimelb.edu.au, Howard Florey Institute of Experimental Medicine and Physiology; 2hyde@wehi.edu.au, Walter and Eliza Hall Institute
 Correspondence address: hyde@wehi.edu.au
 
 
The Melbourne Brain Genome Project is an Internet resource for studying gene expression as measured by serial analysis of gene expression, in both normal mice and specific mouse models.  These models mimic neurodegenerative human diseases. The resource includes  tools developed to analyse this data and is available at http://www.mbgproject.org.  
 Long abstract
 
 
 
 |  
| D-22  Assembly as part of a DNA sequence management system Zmasek, C. M.1, Lapp, H.2, Ching, K.; Wiltshire, T.; Fletcher, C.; Orth, A.
 1czmasek@gnf.org, Genomics Institute of the Novartis Research Foundation; 2hlapp@gnf.org, Genomics Institute of the Novartis Research Foundation
 Correspondence address: czmasek@gnf.org
 
 
We describe a tool to manage assembly processes. The system has two main advantages: [i] it is based on "assembly projects", which contain all the relevant data of an assembly and so allow re-assembly at a later point with, for example, additional input sequences; [ii] all user interaction is through a GUI.
 Long abstract
 
 
 
 |  
| D-23  Yeast Protein Interactomes: The Novel Platform and Value-Added Database Chung-Yen Lin1, Chi-Shang Cho2, Chen-Zen Lo, Chao A. Hsiung
 1cylin@nhri.org.tw, National Health Research Institutes; 2vecstar@nhri.org.tw, National Health Research Institutes
 Correspondence address: cylin@nhri.org.tw
 
 
Using PHP, MySQL and Linux, we constructed a database of S. cerevisiae (15,000 entries) and putative C. elegans protein interactions. This database can help annotate novel proteins through their interacting partners, and can also suggest suitable candidates to narrow the scale of further high-throughput screening experiments.
 Long abstract
 
 
 
 |  
| D-24  Free public services from the European Bioinformatics Institute, the European Molecular Biology Laboratory outstation. Nicola Harte1, Rodrigo Lopez2,  Karyn Duggan, Rob Harper, Asif Kibria, Adam Lowe, Gulam Patel, Sharmila Pillai, Emmanuel Quevillon, Stephen Robinson, Ville Silventoinen
 1nharte@ebi.ac.uk, EMBL-EBI; 2rls@ebi.ac.uk, EMBL-EBI
 Correspondence address: nharte@ebi.ac.uk
 
 
The EBI provides free, publicly available bioinformatics services for the scientific community. These can be divided up into the following categories: data submissions processing, biological database production, access to query, analysis and retrieval systems, ftp downloads, training and education and user support. These services are available at: http://www.ebi.ac.uk/services.
 Long abstract
 
 
 
 |  
| D-25  KEGG API: A new web service for accessing the KEGG database Shuichi Kawashima1, Toshiaki Katayama2, Yoko Sato, Minoru Kanehisa
 1shuichi@kuicr.kyoto-u.ac.jp, Bioinformatics Center, Institute for Chemical Research, Kyoto University; 2k@bioruby.org, Bioinformatics Center, Institute for Chemical Research, Kyoto University
 Correspondence address: shuichi@kuicr.kyoto-u.ac.jp
 
 
KEGG API is a new web service for accessing the KEGG database. Using
the APIs in the local program, the user can retrieve various
information about genes, pathways, chemical compounds etc. stored in
the latest versions of the KEGG database. KEGG API is available at
http://www.genome.ad.jp/kegg/soap/.
 Long abstract
 
 
 
 |  
| D-26  A technology for integration of databases with common subject domains Maria Samsonova1, Andrei Pisarev2, Maxim Blagov
 1samson@spbcas.ru, SPbSPU; 2pisarev@spbcas.ru, SPbSPU
 Correspondence address: samson@spbcas.ru
 
 
We present a novel approach to the integration of distributed molecular biology information resources, which consists of the design of an adaptive natural language interface and the application of multiagent technology. Our approach permits integration of any databases that have a common subject domain. The implemented prototype is available at http://urchin.spbcas.ru/NLP/NLP.htm.
 Long abstract
 
 
 
 |  
| D-27  JPIPE, A pipeline module for JEMBOSS Alex Garcia1, Leyla J. Garcia2, Mark A. Ragan, Yi-Ping Phoebe Chen
 1a.Garcia@imb.uq.edu.au, Institute for Molecular Bioscience; 2leyla.garcia@unisabana.edu.co, U. de la Sabana
 Correspondence address: a.garcia@imb.uq.edu.au
 
 
We present a module (JPIPE) that allows EMBOSS users to build analysis pipelines under the JEMBOSS GUI. JPIPE is a flexible workflow system that complements JEMBOSS. A tracking system is a part of JPIPE so the user is able to recreate pipelines, compare results at a particular point of the workflow and administer ongoing jobs.
 Long abstract
 
 
 
 |  
| D-28  An integrative searchable database for Bioinformatics tools, algorithms and software: pBIRD Sumeet Muju1, Catherine Campbell2, KaiIng Chow, Yang Fann
 1mujus@ninds.nih.gov, NIH-NINDS; 2campbelc@ninds.nih.gov, NIH-NINDS
 Correspondence address: mujus@ninds.nih.gov
 
 
pBIRD is a fully searchable, web-based system designed to identify, catalog and describe a broad range of available bioinformatics software. The system is designed to upload, store, track and retrieve information about bioinformatics software products deposited in the database from both internal and external sources.
 Long abstract
 
 
 
 |  
| D-29  A genetic polymorphism object model and XML implementation: Biological Variation Markup Language. Greg Tyrelle1, Garry C. King2
 1greg@kinglab.unsw.edu.au, UNSW; 2garry@kinglab.unsw.edu.au, UNSW
 Correspondence address: greg@kinglab.unsw.edu.au
 
 
As molecular genotyping technologies accelerate there is an increasing need to communicate precise information on polymorphism data in machine-readable format. We have developed a hierarchical object model and XML implementation called Biological Variation Markup Language (BVML) to facilitate exchange between genotyping laboratories and distributed databases.  
 Long abstract
 
 
 
 |  
| D-30  updateBASE : Real-time automatic updating system of biological databases under the client-server environment. Sujin Chae1, Mira Roh2, Ji-Hoon Jun, Geunwoo Lee, Yong-ho In
 1sujin@bioinfomatix.com, Bioinfomatix Inc.; 2mrroh@bioinfomatix.com, Bioinfomatix Inc.
 Correspondence address: sujin@bioinfomatix.com
 
 
We developed the updateBASE system which provides real-time, automatic updating of biological databases under the client-server architecture. Using this system, an annotator can get up-to-date database sources and get more confident annotation results. The updateBASE will play an effective supporting role in predicting functions and relationships of unknown sequences.
 Long abstract
 
 
 
 |  
| D-31  CleanBank: a database of sequence artifacts Hanne Volpin1, Eitan Rubin2
 1hanne@agri.gov.il, Bioinformatics, Agricultural Research Organization, Bet Dagan, Israel; 2Eitan.Rubin@weizmann.ac.il, Bioinformatics and Biological Computing, Weizmann Institute of Science, Rehovot, Israel
 Correspondence address: Eitan.Rubin@weizmann.ac.il
 
 
CleanBank is a database that documents suspected artifacts found in sequences and/or their annotation in the international sequence databases.  The artifacts are either reported by researchers, or identified by curated algorithms.  Current algorithms detect E. coli and vector contamination.   For a detailed description and a preview, see http://bip.weizmann.ac.il/MIW/CleanBank/index.html
 Long abstract
 
 
 
 |  
| D-32  Optimizing Genome Interval Overlap Queries Using an R-Tree Index Hilmar Lapp1, Chris Mungall2, Scott Cain, Lincoln Stein
 1hlapp@gnf.org, GNF; 2cjm@fruitfly.org, University of California, Berkeley
 Correspondence address: hlapp@gnf.org
 
 
We present a solution to the huge variance problem that has plagued B-tree supported genome interval overlap queries. Our approach is based on translating the overlap query into a two-dimensional point-in-box geometric query supported by an R-tree index.
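The translation can be illustrated with a minimal sketch, assuming inclusive integer coordinates and one common formulation of the mapping (the abstract does not spell out the exact one used). Each feature interval (start, end) becomes a point in the plane; a query interval (qs, qe) overlaps exactly those points with start <= qe and end >= qs, which is a rectangular box. The linear scan below merely stands in for the R-tree lookup, and all function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]            # (start, end), inclusive, start <= end
Box = Tuple[Tuple[int, int], Tuple[int, int]]


def query_box(query: Interval, lo: int, hi: int) -> Box:
    """Box in the (start, end) plane holding every interval that overlaps query.

    lo and hi are assumed bounds on all coordinates (e.g. chromosome extent).
    """
    qs, qe = query
    # start must lie in [lo, qe]; end must lie in [qs, hi]
    return (lo, qe), (qs, hi)


def point_in_box(point: Interval, box: Box) -> bool:
    """Axis-aligned containment test, the primitive an R-tree index answers."""
    (x0, x1), (y0, y1) = box
    return x0 <= point[0] <= x1 and y0 <= point[1] <= y1


def find_overlaps(features: List[Interval], query: Interval,
                  lo: int = 0, hi: int = 10**9) -> List[Interval]:
    """Return features overlapping query via the point-in-box formulation.

    A real system would hand the box to an R-tree instead of scanning.
    """
    box = query_box(query, lo, hi)
    return [f for f in features if point_in_box(f, box)]
```

The point of the reformulation is that both conditions become range constraints on indexed coordinates, so a two-dimensional index can prune on them jointly rather than filtering one bound after scanning on the other.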
 Long abstract
 
 
 
 |  
| D-33  A Pathway DB: Annotating Signal Transduction Pathways with bio-processes using hierarchical multi-layered structures. Ken Ichiro Fukuda1, Yuki Yamagata2, Toshihisa Takagi
 1fukuda-cbrc@aist.go.jp, CBRC, AIST; 2snowfox@hgc.jp, BIRD, JST
 Correspondence address: fukuda-cbrc@aist.go.jp
 
 
A database that formalizes Signal Transduction Pathway knowledge from the scientific literature is presented. The database focuses on annotating pathways or sub-pathways according to their related biological processes. Every process and element in a pathway has a pointer to ontologies, such as GO, and one can search for (sub-)pathways and molecules by using them.
 Long abstract
 
 
 
 |  
| D-34  ANTIMIC: A database of antimicrobial peptides Manisha Brahmachary1, Judice L.Y. Koh, Mohammad Asif Khan, Seah Seng Hong, Tin Wee Tan, Vladimir Bajic
 1manisha@lit.org.sg, Institute of Infocomm Research
 Correspondence address: manisha@lit.org.sg
 
 
ANTIMIC is a specialized database dedicated to antimicrobial peptides. It contains analysis tools that can help wet-lab scientists determine the family and function of a putative new antimicrobial peptide, and that can also assist in the design of artificial antimicrobial peptides.
 Long abstract
 
 
 
 |  
| D-35  OrthoDisease: A Human Disease Ortholog Database Kevin O Brien1, Isabelle Westerlund, Erik Sonnhammer
 1kevobr@mbox.ki.se, Karolinska Institutet
 Correspondence address: kevobr@mbox.ki.se
 
 
We report the construction of OrthoDisease, a novel database built by using the Inparanoid program to analyze a list of disease genes derived from the Mendelian Inheritance in Man database. Our database is accessible online at orthodisease.cgb.ki.se and can be searched by disease/gene/protein name or EC/MIM number.
 Long abstract
 
 
 
 |  
| D-36  BASE - a free microarray database system Carl Troein1, Johan Vallon-Christersson2, Lao Saal, Jari Häkkinen
 1carl@thep.lu.se, Dept. of Theor. Phys., Lund University; 2johan.vallon-christersson@onk.lu.se, Dept. of Oncology, Lund University
 Correspondence address: carl@thep.lu.se
 
 
BASE is a free microarray database system with a clean and
intuitive web interface. It manages biomaterials, array
production and raw data with images. Analysis tools are
included, and users can provide new tools through a plugin
interface. The BASE web site is http://base.thep.lu.se/.
 Long abstract
 
 
 
 |  
| D-37  MARS: Mutation Analysis Reporting System for Human Genetic Disease Byeong-Chul Kang1, Jun-Hyung Park2, In-Joo Kim, Hyo-Myung Kim, Hee-Kyung Park, and Cheol-Min Kim
 1bckang@pusan.ac.kr, Interdisciplinary Program of Bioinformatics, Graduate School, Pusan National University; 2jhaprk98@pusan.ac.kr, Busan Genome Center, College of Medicine, Pusan National University
 Correspondence address: bckang@pusan.ac.kr
 
 
MARS is an intelligent diagnosis system for human genetic disease. MARS consists of databases of human genetic disease information and a mutation detection system. The first release of MARS contains genetic information on the MECP2 gene, and its website is available at http://www.genome.re.kr/mars/
 Long abstract
 
 
 
 |  
| D-38  SDPS: Small Disulphide-bonded Proteins Structural database Lesheng Kong1, Shoba Ranganathan2
 1lesheng@bic.nus.edu.sg, National University of Singapore; 2shoba@bic.nus.edu.sg, National University of Singapore
 Correspondence address: lesheng@bic.nus.edu.sg
 
 
The SDPS database is a comprehensive structural database of small disulphide-bonded proteins. It is enriched with a number of new features that cannot be easily accessed through public databases. The database aims to facilitate research on small disulphide-bonded proteins, especially on disulphide connectivity features. The SDPS database can be accessed freely at http://origin.bic.nus.edu.sg/sdps.
 Long abstract
 
 
 
 |  
| D-39  BioPAX - Biological Pathway Data Exchange Format BioPAX Group1
 1pax@cbio.mskcc.org, BioPAX
 Correspondence address: jluciano@biopathways.org
 
 
BioPAX (http://biopax.org)
is a new community-based initiative to address the growing need for a unified
framework for sharing pathway information.  Several groups are
participating in BioPAX to develop a data exchange format that will allow
communication between existing pathway databases and facilitate deposition of
data into a common public repository.
 Long abstract
 
 
 
 |  
| D-40  Mouse Genome Informatics: Integration Nexus for Mammalian Biology B Sinclair1, JA Blake2, M Ringwald, CJ Bult, JA Kadin, JE Richardson, JT Eppig, Mouse Genome Informatics Group
 1bobs@informatics.jax.org, Jackson Laboratory; 2jblake@informatics.jax.org, Jackson Laboratory
 Correspondence address: jblake@informatics.jax.org
 
 
The Mouse Genome Informatics (MGI) databases provide access to comprehensive, integrated, experimental data for the laboratory mouse in the domains of sequence, expression, gene function (GO), molecular variation, phenotype, inbred strain characterization, homology, and tumor biology.  MGI provides the definitive mouse gene index.  MGI can be accessed at http://www.informatics.jax.org/.
 Long abstract
 
 
 
 |  
| D-41  An Effective Query Method for DNA Sequence Jiyuan An1, Yi-Ping Phoebe Chen2
 1j.an@qut.edu.au, Queensland University of Technology; 2p.chen@qut.edu.au, Queensland University of Technology
 Correspondence address: j.an@qut.edu.au
 
 
Edit distance is the typical measure used when searching for similar DNA sequences. In some cases, however, time warping distance is more appropriate for measuring the similarity between two DNA sequences. In this paper we propose a query method based on time warping distance, applied to coded DNA sequences.
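For readers unfamiliar with the measure, a minimal sketch of the classic dynamic-programming time warping distance follows. The numeric coding of bases (`CODE`) is purely an assumption made for illustration; the abstract does not say how the sequences are coded, and the function names are hypothetical.

```python
from typing import List

# Hypothetical numeric coding of bases; the authors' coding is not specified.
CODE = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode(seq: str) -> List[int]:
    """Turn a DNA string into a numeric series using the assumed coding."""
    return [CODE[base] for base in seq.upper()]


def time_warping_distance(x: List[int], y: List[int]) -> float:
    """Classic O(len(x) * len(y)) dynamic time warping with |a - b| local cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    # d[i][j] = cost of the best warping path aligning x[:i] with y[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # x[i-1] matched again
                                 d[i][j - 1],      # y[j-1] matched again
                                 d[i - 1][j - 1])  # both advance
    return d[n][m]
```

Unlike edit distance, warping lets one element align with a run of repeated elements at no extra cost, so under this coding "ACGT" and "AACCGGTT" are at distance 0.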
 Long abstract
 
 
 
 |  
| D-42  Generating Database Technologies and Simulations for Branching Structure Applications Yi-Ping Phoebe Chen1
 1p.chen@qut.edu.au, Queensland University of Technology
 Correspondence address: p.chen@qut.edu.au
 
 
This research will investigate the ways in which biologists analyse data in their plant studies, and how these requirements may be expressed through a visual query language that allows researchers to directly address the plant characteristics that are of interest. 
 Long abstract
 
 
 
 |  
| D-43  A dimensional data warehouse for biological data Tore Eriksson1, Katsuki Tsuritani2
 1tore.eriksson@po.rd.taisho.co.jp, Taisho Pharmaceuticals, Co., Ltd.; 2k.tsuritani@po.rd.taisho.co.jp, Taisho Pharmaceuticals, Co., Ltd.
 Correspondence address: tore.eriksson@po.rd.taisho.co.jp
 
 
Data warehousing using dimensional modeling and star schema data marts was applied to biological information. The data marts capture biological relationships like gene--protein, as well as numerical data in the intersection between dimensions, for example expression data. Focus was also on building automated extraction and transformation of a wide range of public data sources.
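To make the star schema idea concrete, here is a minimal sketch using SQLite from Python. Every table, column, and value is hypothetical and chosen only for illustration; it shows a fact table of expression values sitting at the intersection of a gene dimension (which also carries the gene--protein relationship) and a sample dimension.

```python
import sqlite3


def build_mart():
    """Build a tiny in-memory data mart with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    -- Dimension tables describe the entities.
    CREATE TABLE dim_gene (
        gene_key INTEGER PRIMARY KEY,
        symbol   TEXT NOT NULL,
        protein  TEXT                 -- gene--protein relationship
    );
    CREATE TABLE dim_sample (
        sample_key INTEGER PRIMARY KEY,
        tissue     TEXT NOT NULL
    );
    -- The fact table holds numeric measurements at the intersection.
    CREATE TABLE fact_expression (
        gene_key   INTEGER NOT NULL REFERENCES dim_gene(gene_key),
        sample_key INTEGER NOT NULL REFERENCES dim_sample(sample_key),
        level      REAL NOT NULL
    );
    INSERT INTO dim_gene VALUES (1, 'geneA', 'proteinA'), (2, 'geneB', 'proteinB');
    INSERT INTO dim_sample VALUES (1, 'liver'), (2, 'liver'), (3, 'brain');
    INSERT INTO fact_expression VALUES
        (1, 1, 2.0), (1, 2, 4.0), (1, 3, 1.0), (2, 1, 5.0);
    """)
    return conn


def mean_expression(conn, symbol, tissue):
    """Average expression of one gene in one tissue via a star join."""
    row = conn.execute(
        """
        SELECT AVG(f.level)
        FROM fact_expression AS f
        JOIN dim_gene   AS g ON g.gene_key   = f.gene_key
        JOIN dim_sample AS s ON s.sample_key = f.sample_key
        WHERE g.symbol = ? AND s.tissue = ?
        """,
        (symbol, tissue),
    ).fetchone()
    return row[0]
```

The design choice is that every analytical question becomes a join from the central fact table out to a few small dimension tables, which keeps queries uniform as new public sources are loaded.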
 Long abstract
 
 
 
 |  
| D-44  SeqHound: biological sequence and structure database as a platform for bioinformatics research Katerina Michalickova1, Hao Lieu2, Gary D. Bader, Michel Dumontier, Doron Betel, Ruth Isserlin, Christopher W.V. Hogue
 1katerina@mshri.on.ca, Samuel Lunenfeld Research Institute and Department of Biochemistry, University of Toronto; 2lieu@mshri.on.ca, Samuel Lunenfeld Research Institute
 Correspondence address: katerina@mshri.on.ca
 
 
SeqHound is a resource containing daily updated Entrez databases and 3-D structural data.   It holds links to similar sequences,  taxonomy, complete genomes, functional annotation, structural domains and literature.  SeqHound is accessible directly through a C API and via a web server through PERL, Bioperl, C or C++ APIs.
 Long abstract
 
 
 
 |  
| D-45  GENA - Genomics Array Database Gavin Kennedy1
 1gavin.kennedy@csiro.au, CSIRO
 Correspondence address: gavin.kennedy@csiro.au
 
 
The GENA Genomics Array database stores data generated by the microarray process as well as information describing the experimental conditions. GENA provides structured queries to extract meaningful data that support comparative analysis of gene expression ratios. GENA is unique in its capacity to mine gene expression data from several perspectives.
 Long abstract
 
 
 
 |  
| D-46  ScriptSure: A Non Redundant View of the Human Transcriptome Jarret Glasscock1, Warren Gish2
 1jglassco@sapiens.wustl.edu, Washington University; 2gish@watson.wustl.edu, Washington University
 Correspondence address: jglassco@sapiens.wustl.edu
 
 
The goal of the ScriptSure project is to create a database that gives an accurate, comprehensive representation of the human transcriptome. ScriptSure provides a non-redundant representation of the transcript data, provides high quality (genomic) sequence, and alleviates problems associated with current approaches to representing transcript data. http://sapiens.wustl.edu/ScriptSure
 Long abstract
 
 
 
 |  
| D-47  Designing XML and XML Schemas for Bioinformatics using UML Philip Burton1, Russel Bruhn2
 1pjburton@ualr.edu, University of Arkansas at Little Rock; 2rebruhn@ualr.edu, University of Arkansas at Little Rock
 Correspondence address: pjburton@ualr.edu
 
 
The Unified Modeling Language (UML) can be used to display Bioinformatic data objects and their relationships graphically. The first step in the process is done at the conceptual level, allowing domain experts like biologists to participate. In this paper, we sketch the process of creating an XML document from scratch.
 Long abstract
 
 
 
 |  
	| Functional Genomics
 |  
| F-1  Bioinformatics Tools to Support siRNA Technology Fran Lewitter1, Bingbing Yuan2, Markus Hossbach,  Thomas Tuschl, George Bell, Robert Latek
 1lewitter@wi.mit.edu, Biocomputing Group, Whitehead Institute, Cambridge MA; 2yuan@wi.mit.edu, Biocomputing Group, Whitehead Institute, Cambridge MA
 Correspondence address: lewitter@wi.mit.edu
 
 
We have built a first-generation computational tool for
siRNA selection (http://jura.wi.mit.edu/bioc/siRNA)
which implements sophisticated selection algorithms to identify siRNAs with a high probability of specifically silencing the target gene. We have also designed a prototype database called sirBank  to be a repository for siRNA molecules known to silence target genes. 
 Long abstract
 
 
 
 |  
| F-2  Cancer-Specific Alternative Splicing is prevalent in the Human Genome Qiang Xu1, Christopher Lee2
 1qxu@chem.ucla.edu, Molecular Biology Institute, Department of Chemistry and Biochemistry, UCLA; 2leec@mbi.ucla.edu, Molecular Biology Institute, Department of Chemistry and Biochemistry, UCLA
 Correspondence address: qxu@chem.ucla.edu
 
 
We found strong evidence (p<0.01) of cancer-specific splice variants in 316 human genes through a genome-wide analysis of human expressed sequences.  The majority of these genes have functions associated with cancer.  For a large number of cancer-associated genes, it appears to be the normal form, rather than the cancer form, that was previously uncharacterized.
 Long abstract
 
 
 
 |  
| F-3  Identification of Novel Two-partner Secretion Family in Burkholderia pseudomallei Annapoorna Nimaggadda1, Sheila Nathan2, Rahmah Mohamed
 1anulins@yahoo.com, Universiti Kebangsaan Malaysia; 2sheila@pkrisc.cc.ukm.my, Universiti Kebangsaan Malaysia
 Correspondence address: sheila@pkrisc.cc.ukm.my
 
 
Filamentous hemagglutinin belongs to the Two Partner Secretion (TPS) family. Sequence analysis indicated the presence of filamentous hemagglutinin (FhaB) and its transporter (FhaC) in an operon in Burkholderia pseudomallei. Motif recognition and phylogenetic analysis using N-J and PAM matrix methods provide more information on the functionality of the operon.
 Long abstract
 
 
 
 |  
| F-5  BRIDGE - Building a Bioinformatics Resource for the Integration of heterogeneous Data from Genomic Explorations into a platform for Systems Biology Alexander Goesmann1, Folker Meyer2, D. Bartels, L. Krause, B. Linke, O. Rupp, A. Pühler
 1Alexander.Goesmann@Genetik.Uni-Bielefeld.DE, Center for Genome Research, Bielefeld University; 2fm@Genetik.Uni-Bielefeld.DE, Center for Genome Research, Bielefeld University
 Correspondence address: Alexander.Goesmann@Genetik.Uni-Bielefeld.DE
 
 
We describe our concept for the integration of heterogeneous data into a platform for systems biology. We have implemented a Bioinformatics Resource for the Integration of heterogeneous Data from Genomic Explorations (BRIDGE) and illustrate the usability of our approach as a platform for systems biology with two sample applications.
 Long abstract
 
 
 
 |  
| F-6  A Reconstruction Algorithm from Expression Data for Sparse Noninteracting Gene Networks Ilaria Mogno1, Lorenzo Farina2, Salvatore Monaco
 1mogno@dis.uniroma1.it, DIS, Universita di Roma La Sapienza; 2lorenzo.farina@uniroma1.it, DIS, Universita di Roma La Sapienza
 Correspondence address: mogno@dis.uniroma1.it
 
 
We propose an algorithm that reconstructs gene networks from expression data. It addresses the problem of "small" available data sets by assuming some reasonable, biologically consistent hypotheses. We also evaluate the algorithm's performance on artificial problems.
 Long abstract
 
 
 
 |  
| F-7  Extraction of Pathways Involved in Microarray Time Course Experiments Christine Steinhoff1, Tobias Mueller2, Hannes Luz, Martin Vingron
 1christine.steinhoff@molgen.mpg.de, Max Planck Institute; 2Tobias.Mueller@biozentrum.uni-wuerzburg.de, Biocenter, University Würzburg
 Correspondence address: christine.steinhoff@molgen.mpg.de
 
 
We present a procedure that integrates knowledge of multiple biological databases to recover potentially involved pathways from time course microarray experiments. Starting with a refined new clustering algorithm we group similarly behaving genes, followed by an integrated analysis of common transcription factor binding patterns, functional categories and biologically verified pathways. 
 Long abstract
 
 
 
 |  
| F-8  Computational Discovery of Gene Modules and Regulatory Networks Georg K. Gerber1, Ziv Bar-Joseph2, Tong Ihn Lee, François Robert, D. Benjamin Gordon, Ernest Fraenkel, Itamar Simon, Tommi S. Jaakkola, Richard A. Young, David K. Gifford
 1georg@mit.edu, Massachusetts Institute of Technology, Laboratory for Computer Science; 2georg@mit.edu, Massachusetts Institute of Technology, Laboratory for Computer Science
 Correspondence address: georg@mit.edu
 
 
We present an algorithm for combining genome-wide expression and protein-DNA binding data to discover co-regulated modules of genes and associated regulatory networks.  Our algorithm operates on discovered networks to label transcription factors as activators or repressors, identify patterns of combinatorial regulation, and uncover sub-networks for biological processes in Saccharomyces cerevisiae.
 Long abstract
 
 
 
 |  
| F-9  Non-conserved alternative splicing of human and mouse genes I. Artamonova1, M. Gelfand2, A. Mironov, R. Nurtdinov.
 1irena@humgen.siobc.ras.ru, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya 16-10, Moscow, 117997, Russia; 2gelfand@ig-msk.ru, State Scientific Center GosNIIGenetika, 1st Dorozhny 1, Moscow 113545, Russia
 Correspondence address: irena@humgen.siobc.ras.ru
 
 
We analyzed conservation of alternative splicing patterns in pairs of orthologous genes from the human and mouse genomes. Our results demonstrate considerable diversity of alternative splicing in these genomes: at least half of alternatively spliced genes have species-specific isoforms. Orthologs with non-conserved isoforms may play a role in species-specific development.
 Long abstract
 
 
 
 |  
| F-10  ProDB - Bioinformatics support for high throughput proteomics Andreas Wilke1, Christian Rueckert2, Sebastian Kespoh, Martina Mahne, Andrea T. Hueser, Folker Meyer
 1andreas.wilke@genetik.uni-bielefeld.de, Bielefeld University, Institute for Genome Research, Germany; 2christian.rueckert@genetik.uni-bielefeld.de, Int. NRW Grad. School in Bioinformatics and Genome Research, Bielefeld University, Germany
 Correspondence address: andreas.wilke@genetik.uni-bielefeld.de
 
 
To cope with the need for automated data conversion, storage, and analysis in 
the field of proteomics, the open source system ProDB was developed. The system 
handles data conversion from different mass spectrometer software, automates data
analysis, and will allow the annotation of MS spectra. 
 Long abstract
 
 
 
 |  
| F-11  Functional Annotation in the Twilight Zone using Machine Learning Ali Al-Shahib1, David Gilbert2
 1alshahib@dcs.gla.ac.uk, University of Glasgow; 2drg@dcs.gla.ac.uk, University of Glasgow
 Correspondence address: alshahib@dcs.gla.ac.uk
 
 
In functional genomics, the large number of genes of unknown function is a widespread concern.  We think one contributing factor is the uncertainty of alignments with low sequence similarity (the twilight zone).  Our work involves the development of a rule-based system that allows us to accurately assign functional annotation in the twilight zone.
 Long abstract
 
 
 
 |  
| F-12  Predicting Co-Complexed Protein Pairs Using Genomic and Proteomic Data Integration Lan V. Zhang1, Sharyl L. Wong2, Oliver D. King, Frederick P. Roth
 1lan_zhang@student.hms.harvard.edu, Harvard Medical School; 2sharyl_wong@student.hms.harvard.edu, Harvard Medical School
 Correspondence address: lan_zhang@student.hms.harvard.edu
 
 
We took a probabilistic decision tree approach to predict
co-complexed pairs (CCPs) of proteins by integrating high-throughput
interaction datasets with other characteristics of gene/protein pairs.
Our method made more sensitive and specific predictions than
high-throughput interaction screens, and is also promising in detecting
unknown CCPs.
 Long abstract
 
 
 
 |  
| F-13  Protein Function Prediction Using Probabilistic Protein Interaction Networks Debra S. Goldberg1, Sharyl Wong2, Frederick P. Roth
 1debg@hms.harvard.edu, Harvard Medical School; 2sharyl_wong@student.hms.harvard.edu, Harvard Medical School
 Correspondence address: debg@hms.harvard.edu
 
 
To improve protein function prediction from high-throughput, error-prone data, we compute a posterior probability for the validity of each interaction.  These edge weights are based on the experimental data and how well each observation fits the expected network topology.  Our function predictions compare favourably to previously published methods.
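As a rough illustration of what a posterior edge weight could look like, the sketch below applies Bayes' rule to independent pieces of evidence about one interaction. This is a generic naive-Bayes formulation assumed for illustration, not the authors' model; the function name and all probabilities are hypothetical.

```python
from typing import Iterable, Tuple


def interaction_posterior(prior: float,
                          evidence: Iterable[Tuple[float, float]]) -> float:
    """Posterior probability that an interaction is real.

    prior    -- assumed prior probability that a reported interaction is true
    evidence -- pairs (P(observation | true), P(observation | false)),
                treated as conditionally independent (a simplifying assumption)
    """
    p_true = prior
    p_false = 1.0 - prior
    for like_true, like_false in evidence:
        p_true *= like_true
        p_false *= like_false
    return p_true / (p_true + p_false)
```

For example, with a prior of 0.5, a single observation that is four times as likely under a true interaction as under a false one, `interaction_posterior(0.5, [(0.8, 0.2)])`, gives a weight of about 0.8, and a second such observation raises it further.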
 Long abstract
 
 
 
 |  
| F-14  GOPArcII - new features of the GeneOntology and Pathways Architecture Daniela Bartels1, Alexander Goesmann2, Oliver Rupp, Folker Meyer
 1Daniela.Bartels@Genetik.Uni-Bielefeld.DE, Center for Genome Research, Bielefeld University; 2Alexander.Goesmann@Genetik.Uni-Bielefeld.DE, Center for Genome Research, Bielefeld University
 Correspondence address: Daniela.Bartels@Genetik.Uni-Bielefeld.DE
 
 
We present GOPArcII, a new version of our comprehensive, 
open source framework for the integration of functional
classifications and metabolic pathways.
GOPArcII is based on a relational database. 
It enables a user to search and handle data like genome 
data from the perspective of functional categories and
metabolic pathways.
 Long abstract
 
 
 
 |  
| F-15  On the Sequence Pattern Distribution in Splice Junctions. An Analysis Using Information Theoretic and Machine Learning Christina Zheng1, Virginia R de Sa2, Michael Gribskov, T. Murlidharan Nair
 1nair@sdsc.edu, UCSD SDSC; 2desa@cogsci.ucsd.edu, UCSD
 Correspondence address: nair@sdsc.edu
 
 
The computational recognition of precise splice junctions is a challenge faced in the analysis of newly sequenced genomes.  To understand the sequence signatures at the splice junctions, we performed a comparative analysis using both neural network based calliper randomization and information theoretic feature selection approaches.
 Long abstract
 
 
 
 |  
| F-16  LOC3D: annotate sub-cellular localization for protein structures Rajesh Nair1, Burkhard Rost2
 1nair@cubic.bioc.columbia.edu, Columbia University; 2rost@columbia.edu, Columbia University
 Correspondence address: nair@cubic.bioc.columbia.edu
 
 
LOC3D is both a database and a web server for predicting the sub-cellular localization of eukaryotic proteins of known structure. Localization is predicted using a combination of four different methods: prediction of nuclear localization signals, sequence homology to proteins with known localization, automatic text analysis of SWISS-PROT keywords, and neural networks.
 Long abstract
 
 
 
 |  
| F-17  Identification of putative insulin binding motifs of the insulin receptor Steve Bottomley1, Jessica Mitchell2, Brian Plewright, Erik Helmerhorst
 1S.Bottomley@curtin.edu.au, Western Australian Biomedical Research Institute, School of Biomedical Sciences, Curtin University of Technology; 2MITCHEJM@ses.curtin.edu.au, Western Australian Biomedical Research Institute, School of Biomedical Sciences, Curtin University of Technology
 Correspondence address: S.Bottomley@curtin.edu.au
 
 
Overlapping 9-mer and 15-mer peptides covering the insulin receptor alpha-subunit sequence were synthesised and measured for their ability to specifically bind 125I-insulin. The insulin-binding sequences were analysed to identify putative insulin-binding regions of the receptor and insulin-binding motifs, and to develop a preliminary insulin-binding scoring matrix.
 Long abstract
 
 
 
 |  
| F-18  A Functional Annotation Project for Novel and Uncharacterised Genes William Wilson1, Emily Hodges2, Ivana Novak, Claes Wahlestedt, Christer Höög, Boris Lenhard
 1bill.wilson@cgb.ki.se, Karolinska Institute; 2emily.hodges@cgb.ki.se, Karolinska Institute
 Correspondence address: bill.wilson@cgb.ki.se
 
 
Annotation of novel genes is a challenge to genome sequencing efforts. We streamlined the process for genes with novel protein-coding domains by integrating web-based databases and annotation tools with gene data from diverse sources. We show examples of how our approach enhances experimental design and leads to accurate gene annotation.
 Long abstract
 
 
 
 |  
| F-19  Interacting Determinants of Migraine Susceptibility Rod Lea1, Lyn Griffiths2
 1r.lea@griffith.edu.au, Genomics Research Centre, Griffith University; 2l.griffiths@griffith.edu.au, Genomics Research Centre, Griffith University
 Correspondence address: r.lea@griffith.edu.au
 
 
It is likely that multiple genetic variants interact to confer susceptibility to complex disease.  We have shown that functional variants in the MTHFR and ACE genes interact to increase risk of migraine.
 Long abstract
 
 
 
 |  
| F-20  Computational analysis of stop codon readthrough in D. melanogaster Misaki Sato1, Hitomi Umeki2, Rintaro Saito, Akio Kanai, Masaru Tomita
 1s00457ms@sfc.keio.ac.jp, Laboratory for Bioinformatics, Institute for Advanced Biosciences, Keio University; 2t01513hu@sfc.keio.ac.jp, Laboratory for Bioinformatics, Institute for Advanced Biosciences, Keio University
 Correspondence address: s00457ms@sfc.keio.ac.jp
 
 
We constructed a system that lists candidate readthrough genes based on the existence of a “protein motif” in the 3’ UTR. Using this system, we extracted 85 candidates in Drosophila melanogaster and found features in those sequences that are known to have an effect on readthrough events.
 Long abstract
 
 
 
 |  
| F-21  GOODIES: Gene Ontology-based Data-mining Tool for Biological Interpretation and Functional Classification on a Group of Biological Entities Sung Geun Lee1, Wan Seon Lee2
 1sglee@istech21.com, Bioinformatics Unit, ISTECH Inc.; 2konan@istech21.com, Bioinformatics Unit, ISTECH Inc.
 Correspondence address: yskim@istech21.com
 
 
GOODIES is a Gene Ontology-based data-mining tool for effective functional classification. Given biological entities, GOODIES classifies them along their annotational attributes and selects optimal GO candidate terms from combinatorially many choices for overall biological interpretation, with intuitive visualization. The major applications of GOODIES include biologically-oriented cluster analysis and functional categorization.  
 Long abstract
 
 
 
 |  
| F-22  Generation and Clustering of Phylogenetic Profiles for automatic Functional Annotation of Proteins Yen-Chen Steven Huang1, Vic Arcus, Ted Baker, Shaun Lott, Patricia Riddle, Chris Triggs
 1yc.huang@auckland.ac.nz, The Centre for Molecular Biodiscovery,  University of Auckland
 Correspondence address: yc.huang@auckland.ac.nz
 
 
Phylogenetic profile analysis assigns functional clues to proteins in a manner that is independent of sequence similarity. We present an improved algorithm that constructs the phylogenetic profiles of proteins based on the unambiguous Smith-Waterman Alignment algorithm. We used MetaCyc metabolic pathway clusters to estimate the prediction accuracy of the method.
 Long abstract
 
 
 
 |  
| F-23  A Whole-genome Analysis of Transcription Factor Binding Site Data. Caroline Finnerty1, Dr. James McInerney2
 1caroline.s.finnerty@may.ie, Bioinformatics and Pharmacogenomics Laboratory; 2james.o.mcinerney@may.ie, Bioinformatics and Pharmacogenomics Laboratory
 Correspondence address: caroline.s.finnerty@may.ie
 
 
It is widely accepted that our complexity as a species results from the regulation
of our genes. Our approach is to analyse, on a genome-wide scale, the upstream
regions of human genes with particular emphasis on transcription factor binding
sites. The ultimate goal is to infer expression pattern from sequence.
 Long abstract
 
 
 
 |  
| F-24  In the search of genomic clusters of human co-expressed genes using microarray gene expression data. Johannes Olson1, Per Broberg2, Krzysztof Pawlowski
 1Johannes.EXT.Olson@astrazeneca.com, AstraZeneca; 2Per.Broberg@astrazeneca.com, AstraZeneca
 Correspondence address: Krzysztof.Pawlowski@astrazeneca.com
 
 
Co-expression of adjacent human genes was investigated using the GeneLogic library. We analyzed average and maximum expression profile correlation for gene pairs within sliding genomic windows for normal or diseased tissue samples. Significance estimates used randomized gene sets. Co-expressed clusters were analyzed for gene duplication, subcellular localization, functional themes, and inter-species conservation.
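The sliding-window statistic described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the window defined by gene count rather than base pairs, and the toy profiles are all assumptions, and the randomized-gene-set significance step is not shown.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def window_scores(profiles, window_size):
    """For each window of adjacent genes, return (start, average r, maximum r).

    `profiles` is a list of expression profiles ordered by genomic position;
    `window_size` is the number of adjacent genes per window (at least 2).
    """
    scores = []
    for start in range(len(profiles) - window_size + 1):
        window = profiles[start:start + window_size]
        r_values = [pearson(a, b) for a, b in combinations(window, 2)]
        scores.append((start, sum(r_values) / len(r_values), max(r_values)))
    return scores


# Hypothetical toy data: four genes in chromosomal order, three samples each.
toy_profiles = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],   # rises with gene 0
    [3.0, 2.0, 1.0],   # falls as gene 0 rises
    [1.0, 3.0, 2.0],
]
toy_scores = window_scores(toy_profiles, window_size=2)
```

Comparing each window's score against the same statistic computed on shuffled gene orders would be one way to obtain the significance estimates the abstract mentions.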
 Long abstract
 
 
 
 |  
| F-25  LacplantCyc: a Pathway / Genome Database for Lactobacillus plantarum as the model for Lactic Acid Bacteria. Frank H.J. van Enckevort1,2, Bas Teusink1,3, Roland J. Siezen1,2,3
 Frank.van.Enckevort@nizo.nl, Bas.Teusink@nizo.nl, Roland.Siezen@nizo.nl
 1NIZO food research, Ede, The Netherlands; 2Centre for Molecular and Biomolecular Informatics, University of Nijmegen, The Netherlands; 3Wageningen Centre for Food Sciences, Wageningen, The Netherlands.
 Correspondence address: Frank.van.Enckevort@nizo.nl
 
 
Lactobacillus plantarum is a versatile lactic acid bacterium that is encountered in various niches. LacplantCyc is a newly created pathway/genome database predicted from the annotated complete genome sequence of L. plantarum WCFS1 (PNAS 2003;100:1990), using the PathoLogic software from PathwayTools (Peter Karp). Manual editing and experimental verification are in progress. 
 Long abstract
 
 
 
 |  
| F-26  Reconstruction of Genetic Networks from Gene Expression Perturbation Data Using a Boolean Model Ronald Taylor1
 1ronald.taylor@uchsc.edu, U of Colorado
 Correspondence address: ronald.taylor@uchsc.edu
 
 
The use of Boolean models is explored in 
reconstruction of the topology of genetic transcriptional 
networks, employing gene expression data from
simulated perturbations. The construction and employment 
of a software suite for such exploration is described.
Results are compared for different simulated topologies,
inference methods, and amounts of noise.
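As a rough illustration of the idea, a Boolean network can be simulated under single-gene knockouts and the genes whose steady state changes can be recorded. The sketch below is not the software suite described in the abstract; the three-gene cascade, the function names, and the synchronous update scheme are assumptions made for the example. It recovers which genes lie downstream of each perturbed gene, not the direct edges.

```python
def steady_state(rules, initial, knocked_out=None, max_steps=50):
    """Synchronously update a Boolean network until it stops changing.

    `rules` maps each gene to a function of the current state dict.
    A knocked-out gene is clamped to False at every step.
    """
    state = dict(initial)
    if knocked_out is not None:
        state[knocked_out] = False
    for _ in range(max_steps):
        new_state = {gene: rule(state) for gene, rule in rules.items()}
        if knocked_out is not None:
            new_state[knocked_out] = False
        if new_state == state:
            return state
        state = new_state
    raise RuntimeError("no fixed point reached (the network may oscillate)")


def downstream_sets(rules, initial):
    """For each gene, list the other genes whose steady state changes on knockout."""
    wild_type = steady_state(rules, initial)
    result = {}
    for gene in rules:
        mutant = steady_state(rules, initial, knocked_out=gene)
        result[gene] = sorted(
            g for g in rules if g != gene and mutant[g] != wild_type[g]
        )
    return result


# Hypothetical cascade: A is constitutively on, A activates B, B represses C.
toy_rules = {
    "A": lambda s: True,
    "B": lambda s: s["A"],
    "C": lambda s: not s["B"],
}
toy_initial = {"A": True, "B": False, "C": False}
toy_downstream = downstream_sets(toy_rules, toy_initial)
```

Here knocking out A changes both B and C, knocking out B changes only C, and knocking out C changes nothing, which is consistent with the chain A, B, C. Adding noise to the simulated steady states would be the natural next step for comparing inference methods.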
 Long abstract
 
 
 
 |  
| F-27  Scalable multi-processor application for gene expression profile clustering Andrey Ptitsyn1
 1ptitsyaa@pbrc.edu, Pennington Biomedical Research Center
 Correspondence address: ptitsyaa@pbrc.edu
 
 
We report a scalable multi-processor application for clustering gene expression profiles. The program implements the flexible clustering algorithms developed at PBRC. It is written using the MPI standard and has been tested on the IBM AIX Parallel Environment.
 Long abstract
 
 
 
 |  
| F-28  Detection of Global and Gene Specific Translational Control Signals in mRNAs Chris M Brown1, Grant Jacobs2, Mark Dalphin and Peter Stockwell
 1chris.brown@otago.ac.nz, University of Otago; 2gjacobs@bioinfotools.com, Bioinfotools
 Correspondence address: chris.brown@otago.ac.nz
 
 
We wish to detect signals in mRNAs that influence their translation. These include signals that modulate translation efficiency, mRNA stability or its localisation. To do this we have developed the TransTerm database (http://transterm.otago.ac.nz/) of mRNA regions and regulatory elements. We have detected novel elements and are testing them in vivo.
 Long abstract
 
 
 
 |  
| F-29  PATIKA: Pathway Analysis Tool for Integration and Knowledge Acquisition Emek Demir1, Ozgun Babur2, Ugur Dogrusoz, Attila Gursoy, Asli Ayaz, Gurcan Gulesir, Gurkan Nisanci, Rengul Cetin-Atalay, and Mehmet Ozturk
 1emek@cs.bilkent.edu.tr, BCBI, Bilkent University, Ankara, Turkey; 2babur@cs.bilkent.edu.tr, BCBI, Bilkent University, Ankara, Turkey
 Correspondence address: ugur@cs.bilkent.edu.tr
 
 
PATIKA is an ongoing research and development project for collaborative construction and analysis of cellular pathways. Our software tool provides an integrated, multi-user environment for visualizing and manipulating networks of cellular events. PATIKA is available at http://www.patika.org.
 Long abstract
 
 
 
 |  
| F-30  Prediction of a full length gene from partial sequence Chung, Myungguen1, Cho, Sooyoung2, Ban, hyojeong ; Kim, hyun and Lee Youngseek
 1aobo@ihanyang.ac.kr, Hanyang University; 2singylu@hanmail.net, Hanyang University
 Correspondence address: aobo@chollian.net
 
 
We obtained 3’-end partial cDNA sequences for which no homologs were found in ‘nr’ using BLAST. We predicted full-length genes from the partial sequences and cloned the full-length genes using the predicted sequences.
 Long abstract
 
 
 
 |  
| F-31  Design of Antisense Oligonucleotides Alistair M. Chalk1, Erik L.L. Sonnhammer2
 1alistair.chalk@cgb.ki.se, CGB, Karolinska Institute; 2esr@algol.cgb.ki.se, CGB, Karolinska Institute
 Correspondence address: alistair.chalk@cgb.ki.se
 
 
Antisense oligonucleotides are an important tool for gene-knockdown approaches in functional genomics. We assess the usefulness of current approaches predicting accessibility and/or efficacy using a database of known results. A set of utilities developed for AO and siRNA design (control design, specificity, site selection) is available at http://sonnhammer.cgb.ki.se.
 Long abstract
 
 
 
 |  
| F-32  Detection of natural antisense transcripts conserved between human and mouse Par Engstrom1, Hidenori Kiyosawa2, Claes Wahlestedt, Yoshihide Hayashizaki, Boris Lenhard
 1par.engstrom@cgb.ki.se, Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden; 2kiyosawa@rtc.riken.go.jp, RIKEN Bio Resource Center, RIKEN Tsukuba Institute, Tsukuba, Japan
 Correspondence address: par.engstrom@cgb.ki.se
 
 
We devised an automated computational procedure to detect pairs of overlapping and oppositely directed human transcriptional units (sense-antisense pairs) equivalent to a large set of putative mouse sense-antisense pairs previously predicted by cDNA mapping. We found support in the human transcriptome for a significant proportion of mouse sense-antisense pairs.
 Long abstract
 
 
 
 |  
| F-33  A modular software platform integrating the processing and bioinformatic analysis of proteomics data Soeren Schandorff1, Hans Jespersen2, C. H. Ahrens, M. Damsbo,  S. Larsen, B. K. Ramsgaard, E. T. Nielsen, G. Thorvil, J. P. Kristensen, K. P. Budin, J. Matthiesen, P. Venø, J. C. Brønd, T. Topaloglou, P. T. Ruhoff
 1schandorff@mdsdenmark.com, MDS Denmark; 2hjespersen@mdsdenmark.com, MDS Denmark
 Correspondence address: cahrens@mdsdenmark.com
 
 
We have developed an integrated software platform that addresses the data generation and handling, data verification/quality control, and bioinformatic analysis steps of large-scale proteomics projects. Experimental data is integrated with computationally enriched data from public and proprietary databases, enabling protein isoform distinction, protein-protein interaction analysis, pathway analysis and text mining. 
 Long abstract
 
 
 
 |  
| F-34  LION Target Engine 1.0: an Enterprise Platform for Target Identification and Validation S. Bernauer, Z. Bilkic, N. Bojunga, T. Brostroem, D. Croft, N. Delhomme, A. Denagbe, L. Ehrlich, K. Fries, C. Girardot, M. Goeschl, M. Gumbel, J. Hermanns, C. Kaestner, C. Katz, U. Keck, H.-P. Keck, R. Kern, G. Kurapkat, P. Lederer, D. Leon, S. Marcel, S. Markel, B. Markus, J.E.M. Meyer, E. Minch, J. Mistry, C. Muench, S. I O'Donoghue1, C. Ohr, S. Richter, H.-J. Roemming, R. Russ, S. Schaefer, A. Schafferhans, T. Schlegl, T. Schlueter, A. Schmidt, O. Schmidt, D. Schulz, A. Sooky, A. Sergienko, F. Spangenberg, J. Suckow, B. Sulzer, C. Suter-Crazzolara, A. Tarasenko, E. Vatcheva, H. Voss, M. Weindel, G. Zhang
 1sean.odonoghue@lionbioscience.com, LION bioscience AG, Waldhoferstr. 98, 69123 Germany
 Correspondence address: sean.odonoghue@lionbioscience.com
 
 
LION Target Engine is a user-friendly system designed to streamline the identification and validation of targets in an enterprise environment. The system includes components for sequence registration and curation; sequence analysis; gene and protein index; text mining; pathways and interaction networks; 3D structures; TaqMan and microarray data handling; and target tracking.
 Long abstract
 
 
 
 |  
| F-35  siRNA Design Tool: A Functional Genomics Accelerator Natasha Levenkova1, Qingjuan Gu2, John J. Rux
 1nlevenkov@wistar.upenn.edu, The Wistar Institute; 2qingjuan@wistar.upenn.edu, The Wistar Institute
 Correspondence address: rux@wistar.upenn.edu
 
 
Small interfering RNA (siRNA) is used in functional genomics applications to produce “knock-down” cells.   The siRNA design tool scans a target gene for candidate siRNA sequences that satisfy user-adjustable rules.  Selected candidates are then screened to identify those siRNA sequences that match only the gene of interest.
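A minimal sketch of this two-stage workflow is given below. It is not the tool's actual code: the specific rules (a 21-nt window, an "AA" prefix, a GC-content range) are example defaults chosen for illustration, and the exact-substring specificity check is a simple stand-in for whatever database search the real tool performs.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def find_candidates(target, length=21, prefix="AA", gc_min=0.30, gc_max=0.60):
    """Return (position, subsequence) pairs that satisfy the adjustable rules."""
    target = target.upper()
    hits = []
    for pos in range(len(target) - length + 1):
        window = target[pos:pos + length]
        if window.startswith(prefix) and gc_min <= gc_fraction(window) <= gc_max:
            hits.append((pos, window))
    return hits


def unique_to_target(candidates, other_transcripts):
    """Keep candidates that do not occur verbatim in any other transcript."""
    others = [t.upper() for t in other_transcripts]
    return [
        (pos, seq) for pos, seq in candidates
        if not any(seq in other for other in others)
    ]


# Hypothetical toy input, with short windows so the example is easy to follow.
toy_target = "GGAACGTTAATTTTAAGCGC"
toy_candidates = find_candidates(toy_target, length=6, gc_min=0.3, gc_max=0.7)
toy_specific = unique_to_target(toy_candidates, ["TTAACGTTGG"])
```

In the toy input, two windows pass the rules, and one of them is then discarded because it also occurs in the second, unrelated sequence.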
 Long abstract
 
 
 
 |  
| F-36  Statistical Analysis of Arabidopsis T-DNA-flanking sequences Hyung Seok Choi1
 1gnie@lycos.co.kr, Seoul National University
 Correspondence address: shchoe@snu.ac.kr
 
 
T-DNA use in plant functional genomics is based on the hypothesis that T-DNA inserts randomly into plant genomes. To test this, we analyzed 120,000 T-DNA flanking sequences from the SIGnAL database. Of the total 29,084 Arabidopsis genes, approximately 70% have >1 insert, whereas 8,760 (30%) genes are still left without any.
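The quoted percentages can be checked directly from the two counts given in the abstract. The variable names below are illustrative only.

```python
total_genes = 29_084    # Arabidopsis genes considered (from the abstract)
genes_without = 8_760   # genes with no T-DNA insert (from the abstract)

genes_with = total_genes - genes_without
pct_with = 100 * genes_with / total_genes
pct_without = 100 * genes_without / total_genes

print(f"with insert:    {genes_with} genes ({pct_with:.1f}%)")
print(f"without insert: {genes_without} genes ({pct_without:.1f}%)")
```

This gives 20,324 genes (69.9%) with an insert and 8,760 genes (30.1%) without, in line with the rounded figures of 70% and 30%.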
 Long abstract
 
 
 
 |  
| F-37  The Yeast Interactome -- analysis and evaluation of diverse sources of information. Jeremiah J Faith1, Ravi Sachidanandam2
 1faith@cshl.org, Cold Spring Harbor Laboratory; 2sachidan@cshl.org, Cold Spring Harbor Laboratory
 Correspondence address: faith@cshl.org
 
 
A network analysis of the protein-protein interactions in yeast reveals distinct clusters.  Some of the clusters are due to functional groups, while most are due to methods of detection; interactions detected by yeast two-hybrid tend to cluster proteins into groups that are different from the clusters due to mass spectrometry. We quantify these differences and discuss implications. 
 Long abstract
 
 
 
 |  
| F-38  cSAGE and the Serial Analysis of Gene Expression in Arabidopsis thaliana Christopher T Lewis1, Stephen Robinson2, Tony Kusalik, Isobel AP Parkin
 1LewisCT@agr.gc.ca, Agriculture and Agri-food Canada; 2RobinsonS@agr.gc.ca, Agriculture and Agri-food Canada
 Correspondence address: LewisCT@agr.gc.ca
 
 
cSAGE is an open-source application written in C to provide an efficient
mechanism for extracting SAGE tags and assigning matches to DNA sequences.  It
has been used for a cold tolerance experiment in A. thaliana involving 3
libraries with more than 180,000 tags.  See http://homepage.usask.ca/ctl271/csage
for more information.
 Long abstract
 
 
 
 |  
| F-39  Reconstructing Genome Architectures by End Sequence Profiling: Applications to Tumor Genomes Ben Raphael1, Pavel Pevzner2, Stas Volik, Colin Collins
 1braphael@ucsd.edu, University of California, San Diego; 2ppevzner@cs.ucsd.edu, University of California, San Diego
 Correspondence address: braphael@ucsd.edu
 
 
We describe a computational approach to the reconstruction of the architecture of a rearranged genome based on data from end sequence profiling experiments.  We apply our techniques to the reconstruction of the genome of a human MCF7 tumor cell.
 Long abstract
 
 
 
 |  
| F-40  Phosphoregulators: Protein kinases and Protein phosphatases of mouse Alistair RR Forrest1, Timothy Ravasi2, Darrin Taylor, Rohan Teasdale, RIKEN GER Group Members, and Sean Grimmond
 1a.forrest@imb.uq.edu.au, IMB; 2t.ravasi@imb.uq.edu.au, IMB
 Correspondence address: a.forrest@imb.uq.edu.au
 
 
We describe the identification and classification of the complement of protein kinases and phosphatases in mouse. We also present preliminary results from a functional screen of these proteins, coupling sequence based classification with high throughput functional screens.
 Long abstract
 
 
 
 |  
| F-41  BioinformatIQ: Integrating devices, data types, and bioinformatic analysis in an information management system for proteomics F. Keith Junius1, P. Bizannes, P. Doggett, M. Harrison, B. Srinivasan, E. Shaw, M. Traini, W. McDonald, and Marc R. Wilkins
 1Keith.Junius@proteomesystems.com, Proteome Systems
 Correspondence address: Keith.Junius@proteomesystems.com
 
 
BioinformatIQ® is an integrated system for handling the information needs of proteomics from sample preparation, through automation of instrumentation, to protein identification and characterization. This informatics platform for proteomics is demonstrated through application to the proteomic analysis of human plasma. More information on BioinformatIQ® can be found at http://www.proteomesystems.com
 Long abstract
 
 
 
 |  
| F-42  Visualizing and Exploring Linked Functional Genomic Data Sets in YETI: Yeast Exploration Tool Explorer Richard J. Orton1, William I. Sellers2, Dietlind L. Gerloff
 1Richard.Orton@ed.ac.uk, University of Edinburgh; 2W.I.Sellers@lboro.ac.uk, University of Loughborough
 Correspondence address: d.gerloff@ed.ac.uk
 
 
YETI is a novel bioinformatics tool for integrated visualization and analysis of functional genomic data from the yeast Saccharomyces cerevisiae. YETI 1.0 consists of three fully inter-linked sections allowing users to explore the “genomic” (e.g. chromosomal location), and “proteomic” (e.g. associated protein-protein interactions) context of multiple proteins of interest simultaneously.
 Long abstract
 
 
 
 |  
	| Genome Annotation
 |  
| E-1  In silico prediction of UTR repeats using clustered EST data Stefan Rensing1, Daniel Lang2, Ralf Reski
 1stefan.rensing@biologie.uni-freiburg.de, University of Freiburg, Plant Biotechnology; 2daniel.lang@biologie.uni-freiburg.de, University of Freiburg, Plant Biotechnology
 Correspondence address: stefan.rensing@biologie.uni-freiburg.de
 
 
Three approaches for the in silico prediction of UTR repeats have been used on a test data set, resulting in the detection of sequence stretches in ~5% of the input sequences during clustering and reduction in size of large clusters. Seven of those putative repeats have been proven to be repetitive in vivo by Southern blot analysis.
 Long abstract
 
 
 
 |  
| E-2  Using proteomics to mine genome sequences Jonathan W Arthur1, Marc R Wilkins2
 1jonathan.arthur@proteomesystems.com, Proteome Systems Ltd; 2marc.wilkins@proteomesystems.com, Proteome Systems Ltd
 Correspondence address: jonathan.arthur@proteomesystems.com
 
 
We present a hypothesis-independent method for identifying 
the region of a genome coding for a protein sequence using 
proteomic information. The method can be used to identify
novel genes that were not found by other annotation 
techniques. It is demonstrated using theoretical and 
experimental data sets from prokaryotic and eukaryotic organisms. 
 Long abstract
 
 
 
 |  
| E-3  Global insights into protein complexes through integrated analysis of the interactome and knockout lethality Harukazu Suzuki1, Rintaro Saito2, Yoshihide Hayashizaki
 1harukazu@gsc.riken.go.jp, RIKEN Genomic Sciences Center; 2, RIKEN Genomic Sciences Center
 Correspondence address: harukazu@gsc.riken.go.jp
 
 
We have developed a new interaction generality measure (IG2), which can be used to computationally assess the reliability of interactome data. We performed an integrated analysis using a comprehensive phenotype dataset and an IG2-treated interactome dataset from yeast, which yielded global insights into the biological features of the protein complexes.
 Long abstract
 
 
 
 |  
| E-4  An evaluation of new criteria for CpG islands in the human genome as gene markers Patrick, Yong Wang1, Frederick, C. Leung2
 1wangyong@hkusua.hku.hk, HKU, Dept of Zoology; 2fcleung@hkucc.hku.hk, HKU, Dept of Zoology
 Correspondence address: wangyong@hkusua.hku.hk
 
 
Using the new criteria for CpG islands introduced by Takai and Jones, we investigated several association types between CpG islands and genes to further establish the importance of CpG islands as gene markers. Our investigation gave us a useful tool for evaluating the accuracy of gene annotation in human chromosomes. 
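For orientation, the check below sketches how a candidate region might be tested against CpG island thresholds. The defaults used here (length of at least 500 bp, GC content of at least 55%, observed/expected CpG ratio of at least 0.65) are the values commonly cited for the Takai and Jones criteria; they are not stated in this abstract, so treat them, and the function names, as assumptions to be adjusted.

```python
def cpg_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Observed/expected CpG = (number of CpG * N) / (number of C * number of G)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return n, gc_fraction, obs_exp


def is_cpg_island(seq, min_length=500, min_gc=0.55, min_obs_exp=0.65):
    """True if the sequence meets all three thresholds (assumed defaults)."""
    n, gc_fraction, obs_exp = cpg_stats(seq)
    return n >= min_length and gc_fraction >= min_gc and obs_exp >= min_obs_exp


# Hypothetical toy sequences.
rich = "CG" * 300    # 600 bp, entirely CpG
poor = "AT" * 300    # 600 bp, no C or G
short = "CG" * 100   # CpG-rich but only 200 bp
```

A genome-wide scan would apply the same test to sliding windows and merge adjacent passing windows, after which the resulting islands could be compared with annotated gene starts.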
 Long abstract
 
 
 
 |  
| E-5  cDNA2Genome: A TOOL FOR MAPPING AND ANNOTATING cDNAS Coral del Val1, Karl-Heinz Glatting2, S.Suhai
 1c.delval@dkfz.de, Department of Molecular Biophysics DKFZ, German Cancer Research Center; 2glatting@dkfz.de, Department of Molecular Biophysics DKFZ, German Cancer Research Center
 Correspondence address: c.delval@dkfz.de
 
 
cDNA2Genome is a web tool for automatic high-throughput mapping and characterization of cDNAs. It uses already existing annotation data and improves them when possible in the case of ESTs, proteins and mRNAs. It is focussed on the determination of the cDNA exon-intron structure. The final result of cDNA2Genome is an XML file with all information obtained by the task.
 Long abstract
 
 
 
 |  
| E-6  Identifying and Annotating Disease Specific Rat Genome Sequences Jedidiah Mathis1, Mary Shimoyama2, Aubrey Hughes, Norberto Dela Cruz, Charles Wang, Simon Twigger, Michael Jensen-Seamen, Michelle Feldmann, Artur Rangel Filho, Jozef Lazar, Howard Jacob, Peter Tonellato
 1jmathis@mcw.edu, Medical College of Wisconsin; 2shimoyma@mcw.edu, Medical College of Wisconsin
 Correspondence address: jmathis@mcw.edu
 
 
Disease-related Genomic Regions Of Interest (GROI) were submitted to the Rat Genome Sequencing Consortium for prioritized sequencing. These areas were analyzed to determine whether greater coverage facilitated denser and more accurate annotation. Functional annotation of genes in these regions was achieved using data mining of public databases and manual curation.
 Long abstract
 
 
 
 |  
| E-7  Assembly and finishing tools for repeated and polymorphic genomes Martti T. Tammi1, Erik Arner2, Ellen Kindlund, Björn Andersson
 1martti.tammi@cgb.ki.se, Center for Genomics and Bioinformatics, Karolinska Institutet; 2erik.arner@cgb.ki.se, Center for Genomics and Bioinformatics, Karolinska Institutet
 Correspondence address: martti.tammi@cgb.ki.se
 
 
DNPTrapper is a graphical tool specifically designed for finishing shotgun assemblies containing complex repeated regions. DNPTrapper allows visualization of sequences and sequence features (e.g. DNPs, mate-pairs, repeat boundaries, chromatograms) in horizontal and vertical representation, followed by manual and semi-automatic manipulation, which greatly simplifies finishing.
 Long abstract
 
 
 
 |  
| E-8  e-PROTEIN: A Distributed Pipeline for Structure-based Proteome Annotation using Grid Technology Keiran Fleming1, Liam McGuffin2, Stefano Street, Andreas Kahari, Tim Massingham, Steven Newhouse, James Cuff, Ewan Birney, Soren Sorenson, Christine Orengo, John Darlington, David Jones, Janet Thornton, Michael Sternberg
 1k.fleming@imperial.ac.uk, Imperial College London; 2l.mcguffin@cs.ucl.ac.uk, University College London
 Correspondence address: k.fleming@imperial.ac.uk
 
 
The e-Protein project aims to provide structure-based annotations of proteins in the major genomes by linking resources via Grid technology at 3 sites: Imperial College London, University College London, and the EBI. At the end of the first 6 months we have established a pre-prototype using GLOBUS/Grid technology.
 Long abstract
 
 
 
 |  
| E-9  Gene Ontology Toolkit: A Comprehensive Software Package for Working with Gene Ontology Jing Ding1, Jun Xu2, Andy W. Fulmer
 1dingjing@iastate.edu, Iowa State University; 2xu.j.1@pg.com, The Procter and Gamble Company
 Correspondence address: fulmer.aw@pg.com
 
 
Gene Ontology Toolkit is a Java GUI for working with the Gene Ontology (GO). Its three integrated components (editor, annotator and merger) enable biologists to visualize and extend GO, annotate gene products with GO terms, and merge the extensions and/or the annotations with new releases of GO or others’ extensions.
 Long abstract
 
 
 
 |  
| E-10  Assembly and Annotation of the Leptospira borgpetersenii serovar Hardjobovis Genome Sequence. Annette McGrath1, John Davis2, Peter J Wilson, Dieter Bulach, Torsten Seemann, John Davies, Ross Coppel, Ben Adler, Elizabeth S Kuczek
 1annette@agrf.org.au, Australian Genome Research Facility; 2john@agrf.org.au, Australian Genome Research Facility
 Correspondence address: annette@agrf.org.au
 
 
We have completed the assembly of the Leptospira borgpetersenii serovar hardjobovis genome, which contains a large chromosome (CI) of 3,614,529 base pairs and a smaller chromosome (CII) of 317,585 base pairs. The annotation of CII is complete, and the annotation of CI is in progress.
 Long abstract
 
 
 
 |  
| E-11  Towards the Bovine Ensembl Sean M. McWilliam1, Wes Barris2, Brian P. Dalrymple
 1sean.mcwilliam@csiro.au, CSIRO Livestock Industries, Brisbane, Australia; 2wes.barris@csiro.au, CSIRO Livestock Industries, Brisbane, Australia
 Correspondence address: sean.mcwilliam@csiro.au
 
 
We have implemented the Ensembl genome sequence database and interface for handling the annotation of the bovine genome. Initial efforts have focussed on display of the annotation of small and micro RNAs and linking to the SNP database, IBISS.
 Long abstract
 
 
 
 |  
| E-12  Annotation of non-coding RNA molecules in cattle genomic sequences Brian Dalrymple1, Sean McWilliam2, Wes Barris, Pradeep Tokachichu
 1Brian.Dalrymple@csiro.au, CSIRO Livestock Industries; 2Sean.McWilliam@csiro.au, CSIRO Livestock Industries
 Correspondence address: Brian.Dalrymple@csiro.au
 
 
To identify members of known families of RNAs a combination of BLAST and INFERNAL is used with the RNA covariance models in Rfam. To identify potential new RNAs a combination of genome specific BLAST and QRNA is used. The results of analysis of bovine BAC-end sequences will be shown.
 Long abstract
 
 
 
 |  
| E-13  ASmodeler: Gene modeling of alternative splicing events from genomic alignment of mRNA and ESTs Namshin Kim1,2, Seokmin Shin2, Sanghyuk Lee
 1deepreds@hanmail.net, Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, KOREA; 2sshin@snu.ac.kr, School of Chemistry, Seoul National University
 Correspondence address: deepreds@hanmail.net
 
 
ASmodeler is a novel web-based utility to find gene models of alternative splicing events from genomic alignment of mRNA and ESTs. It can be used as a transcript assembly program, an EST clustering utility, and a method of comparative gene modeling. ASmodeler is available at http://genome.ewha.ac.kr/ASmodeler/.
 Long abstract
 
 
 
 |  
| E-14  Protein domain extraction by quasi-convex set functions HwaSeob Yun1, Casimir Kulikowski2, Ilya Muchnik
 1seabee@cs.rutgers.edu, Rutgers University; 2kulikows@cs.rutgers.edu, Rutgers University
 Correspondence address: seabee@cs.rutgers.edu
 
 
We present a fast, fully automatic procedure for protein domain extraction from single query sequences without pre-calculation of domain statistics. Combinatorial clustering of domains from BLAST hits using quasi-convex set functions, followed by domain parsing and pattern discovery permits highly efficient (polynomial time) whole genome functional annotation.
 Long abstract
 
 
 
 |  
| E-15  Development of a Web-based Genome Annotation System and Two Analysis Tools Hongseok Tae1, Hyeweon Nam2, Daesang Lee, Kiejung Park
 1hstae@smallsoft.co.kr, Dept. of Microbiology, Kyungpook National University; 2hwnam@smallsoft.co.kr, Information Technology Institute, SmallSoft Co., Ltd.
 Correspondence address: hstae@smallsoft.co.kr
 
 
Our web-based genome annotation system has major modules such as gene prediction, homology search, promoter analysis, motif analysis, gene ontology analysis, annotation databases, and a genome browser that shows all the information for a genome. We have also developed motif analysis and gene prediction programs based on HMMs.
 Long abstract
 
 
 
 |  
| E-16  New approach to build models for predicting prokaryotic genes Chungoo Park1, Mihwa Park2, Jongwon Chang, Jeongho Huh, Dong Soo Jung, Hong Gil Nam, Young Bock Lee, Jiin Choi, Seungsik Yoo, Jaewoo Kim
 1madreach@bric.postech.ac.kr, Biological Research Information Center, Pohang University of Science and Technology; 2bfpark@posdata.co.kr, Solution Development Research Institute, POSDATA
 Correspondence address: madreach@bric.postech.ac.kr
 
 
We propose a new method for increasing gene prediction accuracy without using information about known genes. To increase the accuracy, we used additional learning data obtained through a phylogenetic concept. Tests on 3 complete prokaryotic genomes performed with the GLIMMER program demonstrate the ability of the new approach to detect additional genes.
 Long abstract
 
 
 
 |  
| E-17  A flexible model for promoter motifs Wei-Mou Zheng1
 1zheng@itp.ac.cn, Inst. Theor. Phys., Academia Sinica
 Correspondence address: zheng@itp.ac.cn
 
 
A general and flexible multi-motif model is proposed for promoter motif analysis based on dynamic programming. By extending the Gibbs sampler to dynamic programming and introducing temperature, an efficient algorithm is developed for searching for motifs in promoters. The algorithm is tested with plant promoters.
 Long abstract
 
 
 
 |  
| E-18  Analysis of human herpesvirus genomes based on COGs and Phylogenomics Chang-Jin Shin1, Cheol-Min Kim2, Byeong-Chul Kang, Jun-Hyung Park, Dong-Hoon Shin, Ok-Kyung Ham, Yoon-Jung Choi, In-Joo Kim, Choon-Hwan Lee, Cheol-Min Kim
 1teragene@pusan.ac.kr, Busan Genome Center, College of Medicine, Pusan National University; 2kimcm@pusan.ac.kr, Busan Genome Center, College of Medicine, Pusan National University
 Correspondence address: teragene@pusan.ac.kr
 
 
The aim of this study is to develop a suitable procedure to predict the function of viral genes. To overcome the limitations of conventional similarity-based searches, HHV (human herpesvirus) genomes were analyzed by COG and phylogenomic methods. This will provide a practical method to predict the function of new genes in viral genomes.
 Long abstract
 
 
 
 |  
| E-19  Automated Gene Ontology annotation for anonymous sequence data Steffen Hennig1, Detlef Groth2, Hans Lehrach
 1hennig@molgen.mpg.de, MPI for Molecular Genetics, Berlin; 2dgroth@molgen.mpg.de, MPI for Molecular Genetics, Berlin
 Correspondence address: hennig@molgen.mpg.de
 
 
The unified vocabulary of terms provided by the Gene Ontology consortium has become a standard tool in annotation of genes and their products. We present a web service available at http://goblet.molgen.mpg.de, which allows annotation of anonymous cDNA or protein sequences by GO terms.
 Long abstract
 
 
 
 |  
| E-20  IBISS - the interactive bovine in Silico SNP database. Rachel Hawken1, Wes Barris2, Brian Dalrymple
 1Rachel.Hawken@csiro.au, CSIRO Livestock Industries; 2Wes.Barris@csiro.au, CSIRO Livestock Industries
 Correspondence address: Rachel.Hawken@csiro.au
 
 
A bovine in Silico SNP database has been constructed.  Contigs of ‘unique bovine sequences’ were established which were treated as model mRNAs. A comprehensive web interface has been developed which highlights putative identity of each contig, putative SNPs, location of predicted intron-exon boundaries, and genome mapping data for each model mRNA. 
 Long abstract
 
 
 
 |  
| E-21   The Encyclopedia of Life (EOL) Project Phil Bourne1, Wilfred Li2, Baldridge, K.; Baru, C.; Byrnes, R.; Clingman, E.; Cotofana, C.; Ferguson, C.; Fountain, A.; Greenberg, J.; Jermanis, D.;  Matthews, J.; Miller, M.; Mitchell, J.; Mosley, M.; Pekurovsky, D.; Quinn, G.B.; Reyes, V.; Rowley, J.; Shindyalov, I.; Smith, C.; Stoner, D.; Veretnik, S.
 1bourne@sdsc.edu, San Diego Supercomputer Center; 2wilfred@sdsc.edu, San Diego Supercomputer Center
 Correspondence address: bourne@sdsc.edu
 
 
The Encyclopedia of Life Project (EOL; http://www.eolproject.info) aims
to use Grid computing resources to catalog the complete proteome of
every living species in a flexible, powerful reference system using a
scalable protein annotation pipeline. Recognized protein sequences are
assigned putative functional annotations and structure assignments, and
are cross-referenced to other data sources.
 Long abstract
 
 
 
 |  
| E-22  Sabiá - System for Automated Bacterial Integrated Annotation Ana Tereza R. Vasconcelos1, Roger Paixao2, Rangel C. Souza, Luiz Gonzaga, Gisele C. da Costa, Frank J. A. Barrientos, Marcelo T. dos Santos and Darcy F. de Almeida.
 1atrv@lncc.br, Laboratorio Nacional de Computacao Cientifica; 2roger@lncc.br, Laboratorio Nacional de Computacao Cientifica
 Correspondence address: atrv@lncc.br
 
 
A new tool called System for Automated Bacterial Integrated Annotation (SABIA) was developed for the assembly and annotation of bacterial genomes. The system automatically performs assembly analysis, ORF identification/analysis, and extragenic region analysis. Genome assembly data and automatic contig annotation data are also available in the same working environment.
 Long abstract
 
 
 
 |  
| E-23  Identification of putative transcription factor binding sites conserved across orthologous human, mouse and rat sequences Alex Gout1, Tim Beissbarth2, Joelle Michaud, Catherine Carmichael, Matthew Ritchie, Gordon
 Correspondence address: gout@wehi.edu.au
 
 
The patterns of transcription factor binding sites (TFBSs) within upstream regions of differentially expressed genes identified via microarray analysis may help identify regulatory genetic networks. We have thus created a database of putative TFBSs conserved across orthologous human, mouse and rat genes through the use of MAVID, Match and Transfac.
 Long abstract
 
 
 
 |  
| E-24  An Automated Procedure to Create a Protein Structure Family Database and Application to Whole-Genome Annotation Kenneth J Kelly1
 1kjk@chemcomp.com, Chemical Computing Group Inc
 Correspondence address: kjk@chemcomp.com
 
 
We present a fully automated procedure to create a protein structure family database, along with a corresponding homology searching algorithm based on a combined E-value/Z-score approach. Whole-genome structure-based annotation tests on several completely sequenced genomes have demonstrated results comparable to PSI-BLAST.
 Long abstract
 
 
 
 |  
| E-25  Disease Ontology  - Unifying Bioinformatics and Clinical Medicine Patricia A. Dyck1, Rex L. Chisholm2
 1p-dyck@northwestern.edu, Northwestern University; 2r-chisholm@northwestern.edu, Northwestern University
 Correspondence address: p-dyck@northwestern.edu
 
 
The Disease Ontology is a hierarchical controlled vocabulary created to represent human disease. The ontology was created to enable database curation of disease-gene associations. All terms in the ontology also map to billing codes for the purpose of medical record mining. The ontology is available at: http://sourceforge.net/projects/diseaseontology/.
 Long abstract
 
 
 
 |  
	| Microarrays
 |  
| A-1  Binned-Intensity Normalization Algorithm for Single-Dye Microarrays Gene Cutler1
 1cutler@tularik.com, Tularik Inc
 Correspondence address: cutler@tularik.com
 
 
To generate meaningful mRNA expression ratios, data from separate arrays or probes must be normalized.  Median normalization performs adequately only when differences between data sets are linear.  To cope with non-linearities and noisy data, I have implemented a binned-intensity normalization algorithm which outperforms simple median normalization.
 Long abstract
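
The abstract for A-1 does not spell out the binning scheme, so the following is only a minimal sketch of one plausible reading: spots are ranked by mean log intensity, split into equal-sized bins, and each bin's median log-ratio is subtracted from its members. The function name, the `n_bins` parameter, and the equal-count binning are assumptions for illustration, not the author's implementation.

```python
import math
from statistics import median


def binned_intensity_normalize(reference, sample, n_bins=10):
    """Return log2(sample/reference) ratios centered within intensity bins.

    Hypothetical sketch: assumes all intensities are positive and that
    bins hold roughly equal numbers of spots.
    """
    log_ratio = [math.log2(s / r) for r, s in zip(reference, sample)]
    mean_log = [(math.log2(s) + math.log2(r)) / 2
                for r, s in zip(reference, sample)]
    # Rank spots from dimmest to brightest.
    order = sorted(range(len(log_ratio)), key=lambda i: mean_log[i])
    normalized = [0.0] * len(log_ratio)
    n = len(order)
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not members:
            continue
        # Center each bin on its own median, so an intensity-dependent
        # bias is removed piecewise rather than with one global offset.
        center = median(log_ratio[i] for i in members)
        for i in members:
            normalized[i] = log_ratio[i] - center
    return normalized
```

With `n_bins=1` this reduces to plain median normalization, which makes the two approaches easy to compare on the same data.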
 
 
 
 |  
| A-2  Overcoming Confounded Controls in the Analysis of Gene Expression Data from Microarray Experiments Soumyaroop Bhattacharya1, Dang Duc Long2, James Lyons-Weiler
 1bhattacharyas@msx.upmc.edu, Benedum Oncology Informatics Center, University of Pittsburgh; 2Dang_Long@student.uml.edu, Center for Bioinformatics and Computational Biology, University of  Massachusetts Lowell
 Correspondence address: bhattacharyas@msx.upmc.edu
 
 
The robust clustering of some normal samples within tumor groups and the robust clustering of other normal samples in a separate, 'normal' group indicate the confounding of control samples. Our approach uses the maximum difference subset algorithm (MDSS) and bootstrap validation, which evaluates the difference in mean expression between two groups.
 Long abstract
 
 
 
 |  
| A-3  Gene Ontology Driven Classification of Gene Expression Patterns Claudio Lottaz1, Renate Kirschner2, Stefan Bentink, Christian Hagemeier and Rainer Spang
 1Claudio.Lottaz@molgen.mpg.de, Max-Planck-Institute for Molecular Genetics, Berlin; 2r.kirschner@charite.de, Medical Center Charité, Berlin
 Correspondence address: r.kirschner@charite.de
 
 
We propose structuring the analysis of microarray data according to biological knowledge in order to provide an intuitive and biologically meaningful rationale for computational classification results. To this end, we rely on the Gene Ontology to attribute genes to biological aspects and on standard machine learning methods for classification.
 Long abstract
 
 
 
 |  
| A-4  Improving the reliability of transcriptomics data; The effect of quenching on RNA transcription profiles Bart Pieterse1, Renger H. Jellema2, Mariët J. van der Werf
 1Pieterse@voeding.tno.nl, Wageningen Centre for Food Sciences; 2jellema@voeding.tno.nl, TNO Nutrition and Food Research
 Correspondence address: jellema@voeding.tno.nl
 
 
We validated a quenching method for harvesting micro-organisms from liquid cultures for gene expression studies. The transcription profiles of quenched L. plantarum WCFS1 cells were compared with those of cells that were harvested by alternative methods. PCA and hierarchical clustering of the resulting transcriptomics data show a clear effect of this quenching method on the transcription profiles.
 Long abstract
 
 
 
 |  
| A-5  Statistical Promoter Regulatory Element Analysis of cDNA Microarray Data For the Prediction of cAMP Responsive Genes Lyle D Burgoon1, Ken Y Kwan, Tim Zacharewski2
 1 Dept of Pharmacology & Toxicology; 2 Institute for Environmental Toxicology, National Food Safety & Toxicology Center
 Correspondence address: burgoonl@msu.edu
 
 
We have developed a method for predicting transcription factor responsive genes by combining response element prediction with cDNA microarray data.  Our Statistical Promoter REgulatory Element (SPREE) application program identified cAMP responsive elements.  SPREE output was combined with cDNA microarray data to design an SVM model for predicting cAMP responsive genes.
 Long abstract
 
 
 
 |  
| A-6  The RNA Abundance Database and its Annotation Web-Forms. Elisabetta Manduchi1, G.R. Grant, Hongxian He, J. Liu, M.D. Mailman, A. Pizarro, P.L. Whetzel, C.J. Stoeckert Jr.
 1manduchi@pcbi.upenn.edu, Center for Bioinformatics, University of Pennsylvania
 Correspondence address: manduchi@pcbi.upenn.edu
 
 
RAD and its web-based annotation forms form a system aimed at the collection, organization, and exchange of all relevant information pertaining to gene expression array (and SAGE) studies. The richness of the information captured and the use of ontologies make RAD a very powerful infrastructure for querying and analysis (http://www.cbil.upenn.edu/RAD3).
 Long abstract
 
 
 
 |  
| A-7  MPRIME: Efficient Large Scale Multiple Primer Design for Customized Microarrays Eric Rouchka1, Nigel Cooper2, Abdelnaby Khalyfa
 1eric.rouchka@louisville.edu, University of Louisville; 2nigelcooper@louisville.edu, University of Louisville
 Correspondence address: eric.rouchka@louisville.edu
 
 
MPrime is a system for efficiently creating large sets of PCR primer pairs for use in designing products for custom cDNA microarrays. MPrime has allowed us to effectively design custom neurodegenerative microarray chips for the human, rat, and mouse genomes. MPrime is available at: http://kbrin.a-bldg.louisville.edu/Tools/MPrime/
 Long abstract
 
 
 
 |  
| A-9  Estimation of oncogenes by Bayesian inverse modeling of gene-expression patterns Mathaeus Dejori1, Martin Stetter2
 1mathaeus.dejori.external@mchp.siemens.de, Technical University of Munich; 2stetter@siemens.com, Siemens AG
 Correspondence address: mathaeus.dejori.external@mchp.siemens.de
 
 
We train a Bayesian network to represent statistical dependencies between
gene-expression levels from DNA-microarray datasets. The trained network is
used to predict the effect of pathologically altered expression levels on
the global expression pattern (inverse modeling). Using this ability, we
can predict new genes involved in pathogenesis.
 Long abstract
 
 
 
 |  
| A-10  Using scale-free topology to estimate critical genes from regulatory networks Mathaeus Dejori1, Martin Stetter2
 1mathaeus.dejori.external@mchp.siemens.de, Technical University of Munich; 2stetter@siemens.com, Siemens AG
 Correspondence address: mathaeus.dejori.external@mchp.siemens.de
 
 
We present a method for estimating key regulatory genes of genetic networks by analyzing their network topology. In networks learned from childhood leukemia microarray datasets we find a small
number of genes such as POU2AF1 that may contribute to B-cell tumorigenesis.
 Long abstract
 
 
 
 |  
| A-11  Using bayesian network learning to model yeast transcriptional response to nitrogen oxide Jingchun Zhu1, Joe DeRisi2
 1jzhu@itsa.ucsf.edu, UCSF; 2joe@derisilab.ucsf.edu, UCSF
 Correspondence address: jzhu@itsa.ucsf.edu
 
 
We used a Bayesian network learning technique to analyze microarray transcriptional response profiles of yeast to nitrogen oxide. Using gene clusters as network nodes, the learned transcription response networks are consistent with the proposed biological hypotheses. The model also revealed a previously unknown link between galactose input and an fzf1-dependent cluster.
 Long abstract
 
 
 
 |  
| A-12  GEPAS, a web-based resource for microarray gene expression data analysis Javier Herrero1, Fatima Al-Shahrour2, Ramon Diaz-Uriarte, Alvaro Mateos, Juan M. Vaquerizas, Javier Santoyo, Joaquin Dopazo
 1jherrero@cnio.es, CNIO; 2falshahrour@cnio.es, CNIO
 Correspondence address: jdopazo@cnio.es
 
 
GEPAS is a web-based pipeline for microarray gene expression profile analysis, freely available at http://gepas.bioinfo.cnio.es. The most commonly used tools for the processing and management of different functional genomics data are included in GEPAS as interconnected modules that exchange information in a user-friendly manner.
 Long abstract
 
 
 
 |  
| A-13  Robust k-means Clustering of Gene Expression Chris Benner1, Dimitri Petrov2, Yong-Chuan Tao, Karine G. Le Roch, Garret Hampton, Elizabeth A. Winzeler, Jiayu Liao, Guangzhou Zou, Peter Schultz, Yingyao Zhou
 1cbenner@gnf.org; 2dpetrov@gnf.org
 Correspondence address: zhou@gnf.org
 
 
The existing k-means clustering algorithm for gene expression data suffers from uncertainty and ambiguity. Our robust k-means clustering algorithm demonstrates how both variations in data sources and the intrinsic indeterminacy of clustering procedures can be overcome, and that reliable, informative, and optimal clustering results can be achieved.
 Long abstract
 
 
 
 |  
| A-14  Quantitative microarray spot profile optimization: A systematic evaluation of buffer/slide combinations D P Kreil1, R P Auburn, L A Meadows, S Russell, G Micklem
 1ISMB03@Kreil.Org, University of Cambridge
 Correspondence address: ISMB03@Kreil.Org
 
 
Selection of spotting buffer and slide chemistry is critical for the reliability of microarray hybridization experiments. Comparisons, however, have tended to be subjective and thus unsuitable for systematic study. We present a novel approach that objectively assesses spot morphology and variance by measuring the average (variance) of radial spot-pixel-intensity profiles. It is successfully demonstrated by comparing over 24x6 buffer/slide combinations.
 Long abstract
 
 
 
 |  
| A-15  Module Networks: Identifying Regulatory Modules and their Condition Specific Regulators from Gene Expression Data Eran Segal1, Aviv Regev2, Dana Pe'er, Michael Shapira, David Botstein, Daphne Koller, Nir Friedman
 1eran@cs.stanford.edu, Stanford; 2ARegev@CGR.Harvard.edu, CGR
 Correspondence address: eran@cs.stanford.edu
 
 
We present a probabilistic method for identifying regulatory modules from gene expression data. Our procedure identifies modules of coregulated genes, their regulators and the conditions under which regulation occurs, generating testable hypotheses in the form ‘regulator X regulates module Y under conditions W’. We present microarray experiments supporting three novel predictions, suggesting regulatory roles for previously uncharacterized proteins.
 Long abstract
 
 
 
 |  
| A-16  Integrated Storage For Microarray Experimental Data Supawan Prompramote1, Yi-Ping Phoebe Chen2, Frederic Maire
 1s.prompramote@student.qut.edu.au, Centre for Information Technology Innovation, Faculty of Information Technology, Queensland University of Technology; 2p.chen@qut.edu.au, Centre for Information Technology Innovation, Faculty of Information Technology, Queensland University of Technology
 Correspondence address: s.prompramote@student.qut.edu.au
 
 
The use of different terminologies and structures in microarray databases is limiting the sharing of data and the collating of results between laboratories. We have proposed an integrated information management architecture for microarray experimental data that will focus on addressing these problems.
 Long abstract
 
 
 
 |  
| A-17  Implementation of BASE for microarray data analysis at ACGT Microarray facility, Pretoria, South Africa Daniel F. Theron1, David K. Berger2, Sanushka Naidoo, Fourie Joubert
 1danie.theron@fabi.up.ac.za, University of Pretoria; 2dberger@postino.up.ac.za, University of Pretoria
 Correspondence address: danie.theron@fabi.up.ac.za
 
 
This concise guide demonstrates the implementation of BASE as a microarray data analysis pipeline. The BioArray Software Environment is an open-source platform for archiving, analyzing, and visualizing microarray data. The data for this demonstration compare gene expression between wild-type and mutant Arabidopsis plants; the analysis identified 86 differentially expressed genes.
 Long abstract
 
 
 
 |  
| A-18  Estimating Gene Networks by Bayesian Networks from Microarrays and Biological Knowledge Seiya Imoto1, Tomoyuki Higuchi2, Takao Goto, Kousuke Tashiro, Satoru Kuhara, Satoru Miyano
 1imoto@ims.u-tokyo.ac.jp, University of Tokyo; 2higuchi@ism.ac.jp, The Institute of Statistical Mathematics
 Correspondence address: imoto@ims.u-tokyo.ac.jp
 
 
We propose a statistical method for estimating a gene network based on Bayesian networks from microarray data together with biological knowledge, including protein-protein interactions, protein-DNA interactions, binding site information, and the existing literature. Our method can automatically optimize the balance between microarray data and biological knowledge.
 Long abstract
 
 
 
 |  
| A-19  A new FDR algorithm for differential expression analysis of microarray data Gregory Grant1, Elisabetta Manduchi2, Christian Stoeckert
 1ggrant@pcbi.upenn.edu, CBIL; 2manduchi@pcbi.upenn.edu, CBIL
 Correspondence address: ggrant@pcbi.upenn.edu
 
 
PaGE is an algorithm we have developed at CBIL which uses statistical methods to assign discrete patterns to gene expression data. This poster will highlight the new implementation (version 5.0), with an improved, novel FDR algorithm based on the Westfall and Young minP stepdown distributions and a new interface.
 Long abstract
 
 
 
 |  
| A-20  GenMAPP and MAPPFinder 2.0: Tools for Viewing and Analyzing Genomic and Proteomic Data Using Gene Ontology and Biological Pathways Kam D. Dahlquist1, Scott W. Doniger2, Nathan Salomonis, Karen Vranizan, Steven C. Lawlor, and Bruce R. Conklin
 1kadahlquist@vassar.edu, Department of Biology, Vassar College; 2sdoniger@gladstone.ucsf.edu, Gladstone Institute of Cardiovascular Disease
 Correspondence address: kadahlquist@vassar.edu
 
 
GenMAPP is designed for viewing expression data on biological pathways. GenMAPP automatically color-codes the genes according to criteria supplied by the user. MAPPFinder matches expression data to the Gene Ontology and indicates whether there is a significant over-representation of genes meeting the user’s criterion for each GO term.  http://www.GenMAPP.org.
 Long abstract
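
The A-20 abstract does not state which statistic MAPPFinder uses to judge over-representation. As an illustration only, a common choice for this kind of question is the one-sided hypergeometric tail probability, sketched below with hypothetical argument names.

```python
from math import comb


def overrepresentation_p(total_genes, term_genes, selected_genes, selected_in_term):
    """P(X >= selected_in_term) when X is hypergeometric.

    total_genes      : genes measured on the array
    term_genes       : genes annotated with the GO term
    selected_genes   : genes meeting the user's criterion
    selected_in_term : selected genes that carry the GO term
    Illustrative sketch; not taken from the MAPPFinder source.
    """
    upper = min(term_genes, selected_genes)
    denom = comb(total_genes, selected_genes)
    # Sum the probability of seeing this many or more annotated genes
    # in a random draw of the same size.
    tail = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, selected_genes - k)
        for k in range(selected_in_term, upper + 1)
    )
    return tail / denom
```

A small p-value suggests the term is enriched among the selected genes; when many GO terms are tested, the p-values would still need a multiple-testing correction.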
 
 
 
 |  
| A-21  Decision-tree approach to the classification of prostate tissue samples using microarray gene expression data Changqing Ma1, Rajiv Dhir2, Jianhua Luo, George Michalopoulos, Michael Becich, John Gilbertson
 1chmst40@pitt.edu, University of Pittsburgh; 2dhirr@MSX.UPMC.EDU, University of Pittsburgh
 Correspondence address: chmst40@pitt.edu
 
 
A decision-tree learning approach was applied to classify three types of prostate tumor tissue samples using microarray gene-expression data. In leave-one-out cross-validation (LOOCV), results were comparable to those obtained by applying SVMs or the weighted voting method to this dataset. Furthermore, human-understandable models from decision-tree learning correctly predicted sample classes in previously published prostate tumor datasets.
 Long abstract
 
 
 
 |  
| A-22  Common transcription factor binding sites in the regulatory regions of a cluster of genes statistically linked to the hox gene HB24 Mar Bellido1, Whipple Neely2, Fan W, Beppu L, Zhao LP, Radich JP
 1mbellido@fhcrc.org, Fred Hutchinson Cancer Research Center; 2whipple@fhcrc.org, Fred Hutchinson Cancer Research Center
 Correspondence address: mbellido@fhcrc.org
 
 
The hox gene HB24 encodes a protein expressed in CD34 cells which plays a role in T-cell activation. Using a statistical approach based on regression analysis we identified 29 genes co-regulated with the HB24 gene. The analysis of the regulatory regions of these genes revealed common transcription factor binding sites.
 Long abstract
 
 
 
 |  
| A-23  BioRag - Bio Resource for Array Genes: An Online Resource for Analyzing and interpreting Microarray data Ritu Pandey1, Raghavendra K Guru2, David W Mount
 1ritu@u.arizona.edu, Bioinformatics, Arizona Cancer Center, University of Arizona; 2graghave@cs.arizona.edu, Bioinformatics, Arizona Cancer Center, University of Arizona
 Correspondence address: ritu@u.arizona.edu
 
 
BioRag (Bio Resource for Array Genes, at http://www.biorag.org) is an interactive platform for analyzing microarray results and developing a biological interpretation of them. Differential gene expression patterns can be interpreted using tools that mine and extract a variety of biological relationships captured in this integrative resource.
 Long abstract
 
 
 
 |  
| A-24  QUINTET: An R-based unified cDNA microarray data analysis system with graphical user interface Tae-Hoon Chung1, Cheol-Goo Hur2, Sun Yong Park, Hyo Soo Lee
 1thcng@kribb.re.kr, KRIBB; 2hurlee@kribb.re.kr, KRIBB
 Correspondence address: thcng@kribb.re.kr
 
 
We present QUINTET, an R-based unified cDNA microarray data analysis system with a graphical user interface. It can seamlessly perform five principal categories of data analysis: data quality assessment, faulty spot filtering and normalization, identification of differentially expressed genes, clustering of gene expression profiles, and classification of samples.
 Long abstract
 
 
 
 |  
| A-25  Combining Bayesian Network Model with Promoter Element Detection for Estimating Gene Networks from Gene Expression Data Yoshinori Tamada1, SunYong Kim2, Hideo Bannai, Seiya Imoto, Kousuke Tashiro, Satoru Kuhara, Satoru Miyano
 1tamada@kuicr.kyoto-u.ac.jp, Laboratory of Biological Information Network, Bioinformatics Center, Institute for Chemical Research, Kyoto University; 2sunk@ims.u-tokyo.ac.jp, Laboratory of DNA Information Analysis,  Human Genome Center, Institute of Medical Science, University of Tokyo
 Correspondence address: tamada@kuicr.kyoto-u.ac.jp
 
 
We developed a statistical method for estimating gene networks
and detecting promoter elements simultaneously.
Estimating a gene network from cDNA microarray data alone
is likely to produce misdirected edges.
Our method overcomes this problem by integrating microarray
data and DNA sequence information into Bayesian network
estimation.
 Long abstract
 
 
 
 |  
| A-26  Design of the custom whole-genome malaria oligonucleotide array Serge Batalov1, Elizabeth A. Winzeler2
 1batalov@gnf.org, GNF; 2winzeler@scripps.edu, TSRI/GNF
 Correspondence address: batalov@gnf.org
 
 
To study the transcriptome of the malaria parasite, we designed a custom no-mismatch
oligonucleotide array containing 260,596 25mer single-stranded probes from predicted
coding sequence (including mitochondrion and plastid genome sequences) and 106,630
probes from non-coding sequence. In addition, 124,957 probes from Plasmodium yoelii
contigs are included on the array.
 Long abstract
 
 
 
 |  
| A-27  Better Affymetrix Estimates Mark Reimers1
 1mark.reimers@cgb.ki.se, Karolinska
 Correspondence address: mark.reimers@cgb.ki.se
 
 
This poster shows improvements on the best current estimates of gene abundance from Affymetrix raw data, obtained by compensating for spatial heterogeneity and by assessing individual probe quality and background before applying a robust fitting method.
 Long abstract
 
 
 
 |  
| A-28  A New Method of Block/Spot Indexing with Maximal epsilon-Regularity Point Set for Microarray Image Analysis Hee-Jeing Jin1, Ho-Youl Jung2, Hyun-Kyung Lee, Choon-Hwan Lee, Hwan-Gue Cho
 1hjjin@pearl.cs.pusan.ac.kr, Pusan National University; 2hyjung@ngri.re.kr, National Genome Research Institute
 Correspondence address: hjjin@pearl.cs.pusan.ac.kr
 
 
It is very difficult to automatically analyze microarray images due to several problems such as spot position variation. We propose a novel block and spot indexing algorithm using maximal epsilon-regularity. The time complexity of our algorithm is O(n^2), where n is the number of cells.
 Long abstract
 
 
 
 |  
| A-29  A Novel Feature Selection Method using Evolving Supervised Clustering and Applications for Gene Expression Data Modeling Nikola Kasabov1, Liang Goh2
 1nkasabov@aut.ac.nz, KEDRI; 2liang.goh@aut.ac.nz, KEDRI
 Correspondence address: liang.goh@aut.ac.nz
 
 
The method combines the tasks of classification and feature selection by using the clusters obtained in ECOS to further extract specific features for each cluster. The method overcomes the problem that the signal-to-noise ratio method has when data of the same class are spread over several clusters.
 Long abstract
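
For context on A-29, the signal-to-noise ratio method it mentions is commonly defined per gene as the difference of class means divided by the sum of class standard deviations. The sketch below assumes that definition (the abstract does not give the formula) and uses hypothetical function names; it does not implement the ECOS-based method itself.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """(mean_a - mean_b) / (sd_a + sd_b) for one gene.

    Assumes at least two samples per class and non-zero total spread;
    a gene that is constant in both classes would divide by zero.
    """
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_genes(expr_a, expr_b):
    """Rank genes by |SNR|, largest first.

    expr_a, expr_b: dicts mapping gene name -> list of expression values
    for the samples of each class (hypothetical input layout).
    """
    scores = {g: signal_to_noise(expr_a[g], expr_b[g]) for g in expr_a}
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
```

The weakness the abstract points to is visible in the formula: if one class falls into several well-separated clusters, its standard deviation is inflated and the score of an otherwise informative gene shrinks.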
 
 
 
 |  
| A-30  GeneAnnot: Annotation of high-density oligonucleotide arrays and their linking with GeneCards. Vered Chalifa-Caspi1, Itai Yanai2, Ron Ophir, Michael Shmoish, Hila Benjamin-Rodrig, Naomi Rosen, Pavel Kats, Marilyn Safran, Orit Shmueli and Doron Lancet.
 1vered.caspi@weizmann.ac.il, Weizmann Institute of Science; 2Iyanai@wisemail.weizmann.ac.il, Weizmann Institute of Science
 Correspondence address: vered.caspi@weizmann.ac.il
 
 
The availability of entire genomic sequences enables matching the short probe sequences of oligonucleotide arrays to their annotated gene representations. Here, we present a framework for estimating the sensitivity and specificity of gene-representing probe sets, and for integrating this information with the comprehensive genome and transcriptome repositories of the GeneCards database suite.
 Long abstract
 
 
 
 |  
| A-31  Reliable feature extraction from mechanically spotted two-color microarrays. Yuching Lai1, Greg Tyrelle2, Daniel Di Giusto, Garry C. King
 1yuching@kinglab.unsw.edu.au, UNSW; 2greg@kinglab.unsw.edu.au, UNSW
 Correspondence address: yuching@kinglab.unsw.edu.au
 
 
 An intrinsic weakness – spot inhomogeneity – can be turned into a strength by using pixel correlation methods to reliably extract red/green ratios, identify dye-selective quenching and cull spots by data quality.  We compare our methods to established approaches.  
 Long abstract
 
 
 
 |  
| A-32  Target selection for the custom oligonucleotide array by clustering experimentally determined and computationally predicted transcript sets in mouse Serge Batalov1
 1batalov@gnf.org, GNF
 Correspondence address: batalov@gnf.org
 
 
Custom oligonucleotide array design aims to effectively interrogate
the largest possible non-redundant set of transcripts under a physical size constraint.
More than 200,000 publicly available and proprietary mouse transcript
sequences were clustered. The custom chip was subsequently used extensively
to profile expression in more than 70 different tissues.
 Long abstract
 
 
 
 |  
| A-33  Normalization of cDNA Microarray Data Using R-Language Sang Cheol Kim1, In Uk Hwang2, In Young Kim¹, Sunho Lee³, Hyun Cheol Chung¹, Sun Young Rha¹, Byung Soo Kim²
 1kimsc77@yonsei.ac.kr, Brain Korea 21 Project for Medical Science, Cancer Metastasis Research Center, Yonsei University College of Medicine; 2mzhwang@yonsei.ac.kr, Applied Statistics, Yonsei University
 Correspondence address: kimsc77@yonsei.ac.kr
 
 
We have developed NOM-R, a user-friendly R-based software program that implements the normalization procedures of Yang et al. The program is convenient for biologists to use in microarray data normalization, and it can also handle repeated intensity values and dye-swap experiments simultaneously.
 Long abstract
 
 
 
 |  
| A-34  Expression profiling and analysis of transcription factors for neuronal differentiation from stem cells Dong Mi Shin1, Joon Ik Ahn2, Ki Hwan Lee, Young Seek Lee, Yong Sung Lee
 1dongmishin@yahoo.com, Hanyang University; 2joonic@gaiagene.com, Hanyang University
 Correspondence address: dongmishin@yahoo.com
 
 
To identify transcription factors that may play an important role in the differentiation of stem cells to neurons, high-throughput gene expression experiments and computational analysis were performed. Our results suggest many candidate transcription factors, including novel ones as well as those previously known to be involved in differentiation signaling.
 Long abstract
 
 
 
 |  
| A-35  Latin Square Design to Gene Expression Experiments Tetsutaro Hamano1, Akira Ohide2, Masaru Sekijima, Kazuto Nishio, Masahiro Takeuchi, Yasuhiro Fujiwara
 1hamanot@pharm.kitasato-u.ac.jp, Division of Biostatistics, Kitasato University Graduate School, Japan; 2a-ohide@ankaken.co.jp, Applied Biology Division, Kashima Laboratory, Mitsubishi Chemical Safety Institute, Japan
 Correspondence address: hamanot@pharm.kitasato-u.ac.jp
 
 
We used a randomized Latin square design to block experimental variation and to estimate the effect of tamoxifen in a human breast cancer cell line assay. Orthogonal decomposition of a gene expression map based on the experimental design is expected to elucidate treatment effects as well as systematic error components.
 Long abstract
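
To make the design in A-35 concrete, here is a small sketch of how a randomized Latin square can be generated: start from a cyclic square and shuffle its rows, columns, and treatment labels. The function name and the example labels are hypothetical, and this simple scheme does not sample uniformly from all Latin squares of a given size.

```python
import random


def randomized_latin_square(treatments, seed=None):
    """Return an n x n layout where each treatment appears exactly once
    in every row and every column.

    Rows and columns would stand for the two blocking factors of the
    experiment (the abstract does not say which factors were blocked).
    """
    rng = random.Random(seed)
    n = len(treatments)
    symbols = list(treatments)
    rng.shuffle(symbols)
    rows = list(range(n))
    cols = list(range(n))
    rng.shuffle(rows)
    rng.shuffle(cols)
    # (r + c) mod n is a cyclic Latin square; permuting rows, columns,
    # and labels keeps the Latin property.
    return [[symbols[(r + c) % n] for c in cols] for r in rows]
```

For example, `randomized_latin_square(["dose0", "dose1", "dose2", "dose3"], seed=7)` gives a 4 x 4 assignment in which every dose is balanced across both blocking factors.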
 
 
 
 |  
| A-36  Quality measures for Affymetrix data Ken Simpson1
 1ksimpson@wehi.edu.au, The Walter and Eliza Hall Institute
 Correspondence address: ksimpson@wehi.edu.au
 
 
We present several methods (qualitative and quantitative) for
determining the quality of hybridizations to Affymetrix GeneChips.  In
particular, we make an attempt to quantify the effect of hybridization
artifacts on estimates of gene expression.
 Long abstract
 
 
 
 |  
| A-37  Statistical Characterization of Supervised Learning and Gene Selection Algorithms for Gene Expression Analysis Eisaku Maeda1, Ichiro Takemasa2, Tomonori Izumitani, Hirotoshi Taira, Kenichi Matsubara, Morito Monden
 1maeda@cslab.kecl.ntt.co.jp, NTT Communication Science Laboratories; 2alfa-t@sf6.so-net.ne.jp, Graduate School of Medicine, Osaka University
 Correspondence address: maeda@cslab.kecl.ntt.co.jp
 
 
We focused on histopathological phenotype prediction in colorectal carcinoma from microarray expression data, and statistically investigated the prediction performance of various combinations of classification techniques and gene selection methods. The results demonstrate that the detected marker genes and the prediction accuracy depend strongly on the combination employed.
 Long abstract
 
 
 
 |  
| A-38  Automation of cDNA microarray image analysis Jin Hyuk Kim1, Hye Young Kim2, Tae Sung Park, Ki Woong Kim, Young Seek Lee, and Yong Sung Lee
 1jhkim1@hanyang.ac.kr, Hanyang University College of Medicine; 2hykim121@hanyang.ac.kr, Hanyang University College of Medicine
 Correspondence address: jhkim1@hanyang.ac.kr
 
 
To automate microarray image analysis, several processes were developed for detecting spots, filtering bad spots, and generating HTML reports. The system can analyze many microarray images without user intervention and can therefore serve as one stage of a high-throughput pipeline.
 Long abstract
 
 
 
 |  
| A-39  Compensation of scanner before robust M regression normalization in cDNA microarray Hye Young Kim1, Jin Hyuk Kim2, Yong Sung Lee, Young Seek Lee, Tae Sung Park, Ki Woong Kim, and Hyun Ju Chang
 1hykim121@hanyang.ac.kr, Hanyang University College of Medicine; 2jhkim1@hanyang.ac.kr, Hanyang University College of Medicine
 Correspondence address: hykim121@hanyang.ac.kr
 
 
In cDNA microarray experiments, the conversion of the amount of fluorescence to image intensity during scanning must be handled carefully to determine the gene expression ratio. We developed a reverse scanning method for the microarray image and applied robust M regression to normalize the data from the compensated image.
 Long abstract
 
 
 
 |  
| A-40  OligoDesign: Design of LNA oligonucleotides for gene expression arrays Niels Tolstrup1, Peter S. Nielsen and Sakari Kauppinen
 1tolstrup@exiqon.com, Exiqon
 Correspondence address: tolstrup@exiqon.com
 
 
OligoDesign is a web service for the design of DNA/LNA mixmer
oligonucleotides. The OligoDesign software features
recognition and filtering of the target sequence by
genome-wide BLAST analysis. It includes routines for
prediction of melting temperature, self-annealing and
secondary structure for LNA-substituted oligonucleotides.
The OligoDesign program is freely accessible at
http://lnatools.com/
 Long abstract
 
 
 
 |  
| A-41  Hierarchical Clustering of Gene Expression Data with the Agglomerative Information-Bottleneck Method Byoung-Hee Kim1, Kyu-Baek Hwang2, Jung-Ho Chang, Byoung-Tak Zhang
 1bhkim@bi.snu.ac.kr, Biointelligence Lab, Seoul National University; 2kbhwang@bi.snu.ac.kr, Biointelligence Lab, Seoul National University
 Correspondence address: bhkim@bi.snu.ac.kr
 
 
By applying double clustering with the agglomerative information-bottleneck method to the NCI60 cell lines, the correspondence between gene expression patterns and the ostensible origins of the tumours was verified. By computing 'entropy', mutual information, and its variation over several stages of clustering, an appropriate number of clusters could be estimated.
 Long abstract
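The entropy and mutual information mentioned in this abstract can be illustrated with a minimal sketch. It assumes the data have already been summarised as a joint probability table over two discrete variables (for example, gene clusters and sample clusters); the function names are illustrative and are not taken from the authors' software.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """Mutual information I(X;Y), in bits, from a joint probability table.

    `joint` is a list of rows whose entries sum to 1.
    """
    px = [sum(row) for row in joint]            # marginal of X (rows)
    py = [sum(col) for col in zip(*joint)]      # marginal of Y (columns)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi
```

Tracking how much mutual information is lost at each merge is one plausible way to judge where to stop an agglomerative procedure; whether the authors use exactly this criterion is not stated in the short abstract.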
 
 
 
 |  
| A-42  A multivariate method for comparison of microarray data from different platforms Aedin C Culhane1, Guy Perriére2, Desmond G. Higgins
 1A.Culhane@ucc.ie, University College Cork; 2perriere@biomserv.univ-lyon.fr, Universite Claude Bernard
 Correspondence address: A.Culhane@ucc.ie
 
 
We describe a powerful method for comparison and visualisation of gene expression data from different microarray platforms.  Co-inertia analysis (CIA) is a multivariate method that identifies co-relationships in multiple datasets.  The genes from each dataset, which define these trends, can be identified.  Further details: http://bioinfo.ucc.ie. 
 Long abstract
 
 
 
 |  
| A-43  Exact Power Under Independence for the False Discovery Rate in Gene Expression Array Experiments Lawrence Hunter1, Deborah H. Glueck2, Anis Karimpour-Fard and Keith E. Muller
 1Larry.Hunter@uchsc.edu, U. Colorado School of Medicine; 2, U. Colorado School of Medicine
 Correspondence address: Larry.Hunter@uchsc.edu
 
 
The false discovery rate (Benjamini & Hochberg, 1995) is widely used
 for multiple comparison problems, including gene expression array
 studies. For independent, but not necessarily identically distributed
 test statistics, we derive the joint probability distribution of the
 number of total and false rejections, and thereby provide methods for
 exact small sample power and sample size.
 Long abstract
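For readers unfamiliar with the false discovery rate procedure cited above, the following is a minimal sketch of the Benjamini and Hochberg (1995) step-up rule. It shows only the rejection rule, not the exact power calculation derived in the poster; the function name and the default level q are illustrative choices.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return the indices of hypotheses rejected at FDR level q.

    Step-up rule: find the largest rank k with p_(k) <= k * q / m,
    then reject the k hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k = rank  # keep the largest rank that passes
    return sorted(order[:k])
```

Because the rule is step-up, a p-value that fails its own threshold can still be rejected when a larger p-value passes a later threshold.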
 
 
 
 |  
| A-44  Application of Stellar Photometry To The Analysis of Microarray Images Mahyar Sabripour1, Christopher I. Amos2, Kevin Coombes
 1msabripo@mdanderson.org, Department of Epidemiology, The University of Texas M. D. Anderson Cancer Center; 2camos@request.mdacc.tmc.edu, Department of Epidemiology, The University of Texas M. D. Anderson Cancer Center
 Correspondence address: msabripo@mdanderson.org
 
 
Improvements in quantifying spots can directly impact the identification of
genes critical to the development and progression of cancer. We utilize a
stellar photometric model, the Moffat function, to analyze cDNA membrane microarray
images. In the current setting, we fit the Moffat function to cDNA spots on the array.
 Long abstract
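As a point of reference, the radially symmetric Moffat profile used in stellar photometry can be written as I(r) = I0 * (1 + (r/alpha)^2)^(-beta). The sketch below evaluates this profile and its full width at half maximum; the parameter names are conventional, and the fitting procedure used in the poster is not reproduced here.

```python
import math

def moffat(r, amplitude, alpha, beta):
    """Moffat profile intensity at radial distance r from the spot centre."""
    return amplitude * (1.0 + (r / alpha) ** 2) ** (-beta)

def moffat_fwhm(alpha, beta):
    """Full width at half maximum of a Moffat profile."""
    return 2.0 * alpha * math.sqrt(2.0 ** (1.0 / beta) - 1.0)
```

In a spot-quantification setting, amplitude, alpha and beta would be estimated per spot by least-squares fitting to the pixel intensities.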
 
 
 
 |  
| A-45  An automatic and unbiased GA for finding the most discriminant gene sets on Microarray Han-Yu Chuang1, Hwa-Sheng Chiu2, Huai-Kuang Tsai, Cheng-Yan Kao
 1r90002@csie.ntu.edu.tw, Bioinfo Lab., Department of CSIE, National Taiwan University; 2r91031@csie.ntu.edu.tw, Bioinfo Lab., Department of CSIE, National Taiwan University
 Correspondence address: cykao@csie.ntu.edu.tw
 
 
A multi-objective genetic algorithm-based approach, combining univariate and multivariate techniques, is proposed to find optimal gene sets for sample classification on gene expression data automatically and without bias. Eight genes with 93% LOOCV accuracy under KNN were selected as the optimal predictive gene set for the colon cancer dataset.
 Long abstract
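The leave-one-out cross-validation (LOOCV) accuracy of a k-nearest-neighbour (KNN) classifier, the fitness measure quoted above, can be sketched as follows. This is a generic illustration assuming Euclidean distance and majority voting; the data layout and function names are hypothetical and the genetic algorithm itself is not shown.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training samples.

    `train` is a list of (feature_vector, label) pairs.
    """
    def sq_dist(vec):
        return sum((a - b) ** 2 for a, b in zip(vec, query))
    neighbours = sorted(train, key=lambda item: sq_dist(item[0]))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

def loocv_accuracy(samples, k=3):
    """Hold out each sample in turn and classify it from all the others."""
    correct = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if knn_predict(rest, vector, k) == label:
            correct += 1
    return correct / len(samples)
```

In a gene-selection setting, each feature vector would contain the expression values of the candidate gene subset only, so the accuracy scores that subset.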
 
 
 
 |  
| A-46  MGraph: graphical models for microarray data analysis Junbai Wang1, Ola Myklebost2, Eivind Hovig, Norwegian Radium Hospital, j.e.hovig@labmed.uio.no
 1junbaiw@radium.uio.no, Norwegian Radium Hospital; 2olam@radium.uio.no, Norwegian Radium Hospital
 Correspondence address: junbaiw@radium.uio.no
 
 
MGraph is a MATLAB toolbox that applies graphical models to solve problems in microarray data analysis. Through its graphical interface, MGraph allows users to predict genetic regulatory networks with a graphical Gaussian model and to quantify the effects of different experimental treatment conditions on gene-expression profiles with a graphical log-linear model.
 Long abstract
 
 
 
 |  
| A-47  Elucidating Patterns within Patterns: A Post-Processing Step in Promoter Sequence Analysis Jessica Mar1, Alvis Brazma2
 1jess@ebi.ac.uk, European Bioinformatics Institute; 2brazma@ebi.ac.uk, European Bioinformatics Institute
 Correspondence address: jess@ebi.ac.uk
 
 
SPEXS is an algorithm that extracts statistically overrepresented patterns in a set of sequences. SPEXS output generally contains too many significant patterns for a user to survey in detail, hence a post-processing step designed to group these patterns into key clusters is helpful. We present approaches to isolate these clusters.
 Long abstract
 
 
 
 |  
| A-48  Infectomic Analysis of Cryptococcus Infections Using DNA Microarray Ambrose Jong1, Timothy Triche2, Steven H-M Chen, Sheng-He Huang
 1ajong@chla.usc.edu, Childrens Hospital Los Angeles/University of Southern California; 2ttriche@chla.usc.edu, Childrens Hospital Los Angeles/University of Southern California
 Correspondence address: shhuang@hsc.usc.edu
 
 
Our laboratories have been keenly interested in infectomic analysis of transcription profiles in human BMEC infected with C. neoformans. We performed a time-course study of C. neoformans infection using oligonucleotide microarrays. We have found dynamic changes in transcription profiles of cytokines that are important for pathogenesis of Cryptococcus meningitis. 
 Long abstract
 
 
 
 |  
	| New Frontiers
 |  
| L-1  HyBrow: A Hypothesis Space Browser Nigam Shah1, Stephen Racunas2, Nina V. Fedoroff
 1nigam@psu.edu, Penn State University; 2sar147@psu.edu, Penn State University
 Correspondence address: nigam@psu.edu
 
 
HyBrow is a prototype computer system comprising an event-based ontology for biological processes, an associated database and programs to perform hypothesis evaluation using a wide variety of available data. We demonstrate the feasibility of HyBrow, using the galactose metabolism gene network in Saccharomyces cerevisiae as our test system, for ranking alternative hypotheses in an automated manner.
 Long abstract
 
 
 
 |  
| L-2  Phylogenetic footprinting of co-expressed genes by Tree-Gibbs sampling Stefan Van Yper 1, Olivier Thas 2, Jean-Pierre Ottoy and Wim Van Criekinge
 1Stefan@biomath.rug.ac.be, Department of Applied Mathematics, Biometrics and Process Control, Ghent University; 2olivier.thas@rug.ac.be, Department of Applied Mathematics, Biometrics and Process Control, Ghent University
 Correspondence address: Stefan@biomath.rug.ac.be
 
 
Using site/motif Gibbs sampling, transcription factor binding sites can be found by analysing either the promoter sequences of co-expressed genes or the promoter sequences of orthologous genes. Tree-Gibbs sampling combines both data sources in one algorithm. This makes additional information available, improving both the speed and the accuracy of motif finding.
 Long abstract
 
 
 
 |  
| L-3  Statewide Bioinformatics in Kentucky Eric Rouchka1, Nigel Cooper2
 1eric.rouchka@louisville.edu, University of Louisville; 2nigelcooper@louisville.edu, University of Louisville
 Correspondence address: eric.rouchka@louisville.edu
 
 
The KBRIN bioinformatics core is attempting to create a Kentucky-wide network of bioinformatics expertise.  This venture has led to the identification of knowledge- and compute-based resources.  The core seeks to improve bioinformatics knowledge through research and the creation of bioinformatics courses, certificates, and degrees.  The core web site is: 
 http://www.kbrin.louisville.edu/about/bioinform_core.html
 Long abstract
 
 
 
 |  
| L-4  Developing Analysis and Visualization Tools for Lead Discovery Dimitri Petrov1, Shumei Jiang2, Andrey Santrosyan, Hayk Asatryan, Kaisheng Chen, Chris Benner, Robert Downs, John Isbell, Yingyao Zhou
 1dpetrov@gnf.org; 2sjiang@gnf.org
 Correspondence address: zhou@gnf.org
 
 
The Genomics Institute of the Novartis Research Foundation (GNF) is developing data analysis and visualization tools on top of a web-based informatics system for its lead discovery biomedical research. Recent tools include dose-response data fitting and visualization, structural similarity-based compound hierarchical clustering, ring component-based compound diversity analysis, and LCMS data visualization.
 Long abstract
 
 
 
 |  
| L-5  Using Web Services as part of the 2D gel analysis workflow Nataliya Sklyar1, Matthias Berth 2, Dirk Lewerentz
 1sklyar@informatik.uni-leipzig.de, University of Leipzig; 2berth@decodon.com , DECODON GmbH
 Correspondence address: sklyar@informatik.uni-leipzig.de
 
 
We present the integration of web services into Delta2D, an end-user 2D gel analysis application. With Delta2D's web services plugins, users can access data from external sources and display it alongside the image analysis results in a uniform way. We describe the plugin architecture and first experiences in its use.
 Long abstract
 
 
 
 |  
| L-6  A DNA-based Theorem Proving by Resolution Refutation Ji-Yoon Park1, In-Hee Lee2, Young-Gyu Chai, Byoung-Tak Zhang
 1scolaswhite@hotmail.com, Dept. of Biochemistry and Molecular Biology, Hanyang University; 2ihlee@bi.snu.ac.kr, Biointelligence Laboratory School of Computer Science and Engineering, Seoul National University
 Correspondence address: scolaswhite@hotmail.com
 
 
Theorem proving is a method involving logical reasoning and has a variety of applications, including diagnosis and decision-making. Resolution refutation is a general technique for proving the validity of a theorem given a set of axioms and rules. We implement theorem proving by resolution refutation using DNA molecular reactions.
 Long abstract
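Resolution refutation itself can be illustrated in software with a minimal propositional sketch: the negated goal is added to the axioms, and the theorem holds if the empty clause can be derived. The clause encoding below (sets of string literals, with "~" marking negation) is an assumption made for illustration and says nothing about how the DNA implementation encodes clauses.

```python
def negate(literal):
    """Return the complementary literal ("P" <-> "~P")."""
    return literal[1:] if literal.startswith("~") else "~" + literal

def resolve(c1, c2):
    """Return all resolvents of two clauses (frozensets of literals)."""
    resolvents = set()
    for lit in c1:
        if negate(lit) in c2:
            resolvents.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return resolvents

def refutes(clauses):
    """True if the empty clause is derivable, i.e. the set is unsatisfiable."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        items = list(known)
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                for resolvent in resolve(items[i], items[j]):
                    if not resolvent:      # empty clause: contradiction found
                        return True
                    new.add(resolvent)
        if new <= known:                   # nothing new can be derived
            return False
        known |= new
```

For example, with axioms P, P implies Q, and Q implies R, the goal R is proved by adding the clause ~R and deriving the empty clause.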
 
 
 
 |  
| L-7  G-language Genome Analysis Environment Version 2 Yohei Yamada1, Kazuharu Arakawa2, Ryo Hattori, Yusuke Kobayashi, Hayataro Kouchi, Atsuko Kishi, Masaru Tomita
 1skipper@g-language.org, Keio University, Department of Environmental Information; 2gaou@g-language.org, Keio Institute for Advanced Biosciences
 Correspondence address: skipper@g-language.org
 
 
Version 2 of the G-language Genome Analysis Environment is
developed to gain further speed and integrity through the object-oriented architecture.  Powered by the front-end "inspire" based on Flash/HTML and the database
system "bluebird", the new environment aims to achieve greater flexibility and integrity.
 Long abstract
 
 
 
 |  
| L-8  Integrated data Modeling of Protein Structures by using a fact constellation model based on a XML Mediated Warehouse System RongHua Li1, Sung-Hee Park2, Kwang Su Jeong, Keun Ho Ryu
 1lrh@dblab.chungbuk.ac.kr, Chungbuk University; 2shpark@dblab.chungbuk.ac.kr, Chungbuk University
 Correspondence address: ksjeong@dblab.chungbuk.ac.kr
 
 
This paper describes integrated protein structure modeling using a fact constellation model and represents the model in XML in order to store and query highly complex protein data in an XML-mediated warehouse system. Complex queries used during analysis are performed with XML query processing.
 Long abstract
 
 
 
 |  
| L-9  Using XML-RPC for Distributed BLAST -Desterilizing idle resources- Yong Wook Kim1, Keun Woo Lee, Hee Won, Yong-Ho In
 1yongari@bioinfomatix.com, Bioinfomatix, Inc.
 Correspondence address: yongari@bioinfomatix.com
 
 
Most personal computers run the MS Windows operating system, and their computing power is used only for simple tasks. We use these idle resources as member nodes of a cluster. XML-RPC provides a straightforward way to distribute BLAST and allows machines with heterogeneous operating systems to join the distributed BLAST system.
 Long abstract
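The XML-RPC pattern described above can be sketched with Python's standard library: a worker exposes a function over HTTP, and a coordinator calls it with a chunk of query sequences. This is only an illustration of the remote call; `search_chunk` is a hypothetical placeholder that counts FASTA records, where a real worker would run BLAST on the chunk, and the poster's actual implementation is not described in the short abstract.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def search_chunk(fasta_text):
    """Placeholder worker task: counts FASTA records in the chunk.

    A real worker would run BLAST on the chunk here and return the hits.
    """
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))

def start_worker(host="127.0.0.1", port=0):
    """Start an XML-RPC worker in a background thread (port 0 = any free port)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(search_chunk, "search_chunk")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def submit(server, fasta_text):
    """Coordinator side: send one chunk to a worker and return its result."""
    host, port = server.server_address
    with ServerProxy(f"http://{host}:{port}/") as proxy:
        return proxy.search_chunk(fasta_text)
```

Because XML-RPC runs over plain HTTP with XML payloads, the worker and coordinator can run on different operating systems, which is the property the abstract relies on.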
 
 
 
 |  
| L-10  MineLink: A novel information integration framework for Life Sciences Tanveer Syeda-Mahmood1, Bhooshan Kelkar2
 1stf@almaden.ibm.com, IBM Almaden Research; 2bkelkar@us.ibm.com, IBM Life Sciences
 Correspondence address: stf@almaden.ibm.com
 
 
MineLink is a novel information integration framework that can pull together life sciences data and analytic applications from disparate sources.  It specifies a design methodology for automatically integrating individual components, be they data sources, processors, data miners or visualization components without the need for explicit programming. It addresses both syntactic and semantic aspects of information integration.
 Long abstract
 
 
 
 |  
| L-11  TransMiner: Biotransformation Prediction Minesh Upadhyaya1, Imran Shah2, Daniel McShan, Weiming Zhang
 1minesh.upadhyaya@uchsc.edu, UCHSC; 2imran.shah@uchsc.edu, UCHSC
 Correspondence address: minesh.upadhyaya@uchsc.edu
 
 
We developed TransMiner, a symbolic computational approach for
inferring  novel biochemical functions. We have developed a novel
sub-graph isomorphism-based algorithm to search the detailed
representations of known biotransformations to induce biocatalytic
"rules". These symbolic biocatalytic rules represent the simplified
functions of enzymes and can be used to infer novel biochemistry. 
 Long abstract
 
 
 
 |  
| L-13  Gene-Protein networks in Drosophila Melanogaster Inigo San Gil 1, Kevin White2, Joel Bader, Tong-Ruei Li
 1inigo.sangil@yale.edu, Yale University; 2kevin.white@yale.edu, Yale University
 Correspondence address: inigo.sangil@yale.edu
 
 
The poster shows a gene network based on a map of interactions between proteins and genes in Drosophila melanogaster.  The map is based on cross-correlations of genome-wide time courses of gene expression and yeast two-hybrid interactions. Results show a new, rich network of connections between genes and proteins.
 Long abstract
 
 
 
 |  
| L-14  Performing in silico experiments on the Grid using myGrid Robert Stevens1, Tom Oinn2, Peter Li
 1robert.stevens@cs.man.ac.uk, University of Manchester; 2tmo@ebi.ac.uk, European Bioinformatics Institute
 Correspondence address: robert.stevens@cs.man.ac.uk
 
 
myGrid aims to exploit Grid technology and provide high-level services and middleware that make it suitable for bioinformatics. myGrid uses resource discovery and workflow enactment services that allow scientists to perform in silico experiments over bioinformatics resources. Services are provided to support the scientific method, notably provenance management, change notification and personalization.
 Long abstract
 
 
 
 |  
	| Phylogeny and Evolution
 |  
| G-1  The Evaluation of Different Approaches to Infer Positive Selection Sites Li Jia1, Tao Jiang2, Michael Clegg
 1lijia@cs.ucr.edu, University of California; 2jiang@cs.ucr.edu, University of California
 Correspondence address: lijia@cs.ucr.edu
 
 
It is important to infer positive selection sites associated with a given gene family.  Different approaches have been proposed to detect positive selection at single amino acid sites.  We evaluated the performance of these approaches so that researchers can choose a method appropriate to the circumstances of their study.
 Long abstract
 
 
 
 |  
| G-2  CompMapper: An Automatic Pipeline to Define Conserved Segments between Genomes Systematically Fu Lu1, Zhenyuan Wang2,  Xiangqun Holly Zheng, Wenyan Zhong,  Fei Zhong, Richard Mural
 1fu.lu@celera.com, Celera Genomics; 2jack.wang@celera.com, Celera Genomics
 Correspondence address: fu.lu@celera.com
 
 
To overcome the limitations of comparative mapping using orthologous genes, we have developed a new paradigm and systematic approach to define conserved synteny between human and mouse directly from genomic sequence. The automatic pipeline should be applicable to compare any species with complete or draft genome sequence and within an appropriate phylogenetic distance.
 Long abstract
 
 
 
 |  
| G-3  Evolutionary analysis of long terminal repeats of human endogenous retroviruses Artamonova I.1
 1irena@humgen.siobc.ras.ru, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya 16-10, Moscow, 117997, Russia
 Correspondence address: irena@humgen.siobc.ras.ru
 
 
We analyzed the locations of HERV-K LTRs in the human genome. Their distribution is not uniform among human chromosomes or among different regions of one chromosome. The majority of HERV-K LTRs are clustered. Positions of LTR clusters correlate with the Giemsa segmentation of human chromosomes. Most clusters are observed in GC-rich regions.
 Long abstract
 
 
 
 |  
| G-5  Development of SCAR marker for the Discrimination of Bang-Poong species of Herbal Medicine Mi Young Lee1, Byong Seob Ko2, Seong Mi Hong, Jung Eun Kim, Sung Jin Lee
 1mylee2020@hanmail.net, Korea Institute of Oriental Medicine; 2bsko@kiom.re.kr, Korea Institute of Oriental Medicine
 Correspondence address: mylee2020@hanmail.net
 
 
Bang-Poong species is one of the most important species of herbal medicine. We have applied a molecular approach to developing SCAR markers. We show that this methodology can be applied to dried herbal medicine. 
 Long abstract
 
 
 
 |  
| G-6  Novel Heterogeneous Maximum Likelihood Methods for The Detection of Adaptive Evolution. Jennifer Commins1, Dr. James O. McInerney 2
 1jennifer.commins@may.ie, NUI, Maynooth; 2james.o.mcinerney@may.ie, NUI, Maynooth
 Correspondence address: jennifer.commins@may.ie
 
 
Maximum Likelihood methods are popular for analysing sequences to detect adaptive evolution. We have designed new methods for robustly inferring the evolutionary history of extant sequences and identifying signatures of adaptive evolution, performing analyses of multiple sequence alignments in ways closer to biological reality than existing methods. Available at http://bioinf.may.ie/likelihood.
 Long abstract
 
 
 
 |  
| G-7  Modelling change in codon substitution, using serially sampled sequence data. Matthew Goode1, Allen Rodrigo2
 1m.goode@auckland.ac.nz, University of Auckland; 2a.rodrigo@auckland.ac.nz, University of Auckland
 Correspondence address: m.goode@auckland.ac.nz
 
 
We describe a method for modeling changes in codon substitution in populations where evolution can be measured, e.g. rapidly evolving viral populations. Our model extends Nielsen and Yang's codon model to allow parameters associated with selection and the proportion of selected sites to vary over time.
 Long abstract
 
 
 
 |  
| G-8  Evolution of Toll genes from the perspective of transcriptional regulatory regions Rajakumar Sankula1, Narayanan Perumal2, Lang Li
 1rsankula@iupui.edu, School of Informatics, Indiana University Purdue University Indianapolis ; 2nperumal@iupui.edu, School of Informatics, Indiana University Purdue University Indianapolis
 Correspondence address: nperumal@iupui.edu
 
 
We performed a phylogenetic analysis of Toll gene evolution across insects, plants, and mammals using their transcriptional regulatory regions. This analysis is based on a unique approach employing the frequency of “evolutionarily” informative transcription factor binding sites.  Interestingly, the resultant phylogeny was similar to the protein-based phylogeny.
 Long abstract
 
 
 
 |  
| G-11  PyPop: A framework for large-scale population genomics analysis Alex Lancaster1, Mark P. Nelson2, Richard M. Single; Diogo Meyer; Glenys Thomson
 1alexl@socrates.berkeley.edu, UC Berkeley; 2, UC Berkeley
 Correspondence address: alexl@socrates.berkeley.edu
 
 
PyPop (Python for Population Genetics) is a suite of programs for the
analysis of multi-locus population genetic data; outputs are stored in
XML and can be transformed into other data formats.  PyPop will be
made freely available under the GNU GPL at:
http://allele5.biol.berkeley.edu/pypop/
 Long abstract
 
 
 
 |  
| G-12  Bayesian Population Genetics and the Human History of China Michael Black 1, Cheryl Wise 1, Wei Wang, Alan Bittles
 1m.black@ecu.edu.au, Centre for Human Genetics, Edith Cowan University, Perth, Western Australia
 Correspondence address: m.black@ecu.edu.au
 
 
The history of China is a record of migration, population admixture and community endogamy.  In a comparison of current Bayesian and "Classical" methodologies, we found these factors to be prevalent in the genetic structure of eight Chinese ethnic populations.  We conclude that Bayesian models incorporating multiple-system data and historical factors are required for future studies.
 Long abstract
 
 
 
 |  
| G-13  A hybrid clustering approach to genome-scale recognition of protein families Timothy J. Harlow1, J. Peter Gogarten2, Mark A. Ragan
 1t.harlow@imb.uq.edu.au, Institute for Molecular Bioscience, University of Queensland; 2gogarten@uconn.edu, University of Connecticut
 Correspondence address: m.ragan@imb.uq.edu.au
 
 
We develop a hybrid approach to recognizing protein families among multi-genomic datasets based on Markov and single-linkage clustering of normalised pairwise BLASTP bit scores. We present results for all proteins from 114 microbial genomes, and illustrate its utility by recognizing orthologs and paralogs of rotary motor ATP synthetase F1 subunits.
 Long abstract
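The single-linkage step of such a clustering can be sketched with a union-find structure: any two proteins whose normalised score reaches a threshold end up in the same family, directly or through a chain of intermediates. The input format, the threshold value and the function name are assumptions made for illustration; the Markov clustering step and the authors' score normalisation are not shown.

```python
def single_linkage(pair_scores, threshold):
    """Group items linked by any chain of pairs scoring >= threshold.

    `pair_scores` maps (item_a, item_b) tuples to a normalised similarity.
    Returns a sorted list of sorted clusters.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for (a, b), score in pair_scores.items():
        root_a, root_b = find(a), find(b)   # registers both items
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for item in list(parent):
        clusters.setdefault(find(item), []).append(item)
    return sorted(sorted(members) for members in clusters.values())
```

Single linkage is permissive by design, since one strong hit is enough to join two groups, which is presumably why the abstract pairs it with a second clustering method.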
 
 
 
 |  
| G-14  Computing accurate phylogenies from gene-order data Jijun Tang1, Bernard M.E. Moret2
 1jtang@cs.unm.edu, University of New Mexico; 2moret@cs.unm.edu, University of New Mexico
 Correspondence address: jtang@cs.unm.edu
 
 
DCM-GRAPPA is a method for phylogeny reconstruction based on gene-order
data; it scales gracefully to one thousand genomes, using a day or two
of computation and producing highly accurate results (within 1%).
GRAPPA and DCM-GRAPPA are available in source form at
http://www.cs.unm.edu/~moret/GRAPPA/
 Long abstract
 
 
 
 |  
| G-15  EVOLVE: a toolkit for statistical molecular evolutionary analysis of genomes Gavin Huttley1, Alex Isaev2, Andrew Butterfield, Edward Lang, Cath Lawrence
 1gavin.huttley@anu.edu.au, ANU; 2Alexander.Isaev@maths.anu.edu.au, ANU
 Correspondence address: gavin.huttley@anu.edu.au
 
 
The number of genes and species for which data are now available presents an opportunity for statistically powerful examinations of molecular evolutionary processes. We will present a description of the functionality and performance of EVOLVE (cbis.anu.edu.au/software), a high performance computing package for phylogeny-based maximum likelihood modeling and hypothesis testing.
 Long abstract
 
 
 
 |  
| G-16  Analyses using novel Markov models of substitution support a significant role for germline methylation in male biased evolution. Matthew J Wakefield1, Gavin A Huttley2, Alexander Isaev, Andrew Butterfield, Edward Lang & Cath Lawrence
 1Matthew.Wakefield@kangaroo.genome.org.au, Centre for Bioinformation Science, The Australian National University; 2Gavin.Huttley@anu.edu.au, Centre for Bioinformation Science, The Australian National University
 Correspondence address: matthew.wakefield@kangaroo.genome.org.au
 
 
We have constructed a novel Markov model of dinucleotide substitution, including parameters for methylation and strand, using the EVOLVE toolkit. Our analysis supports a significant contribution of differential methylation in the germline to elevated male mutation rates, demonstrating the utility of EVOLVE (http://cbis.anu.edu.au/software/) in developing new sequence evolution models.
 Long abstract
 
 
 
 |  
| G-17  BOSS: Boxes of Sequence Similarity Robert Flegg1, Malcolm Simons2
 1robert.flegg@med.monash.edu.au, GeneType Pty. Ltd., Fitzroy Vic 3065, Australia and Victorian Bioinformatics Consortium, PO Box 53, Monash University, Clayton Vic 3800, Australia; 2mjsimons@optusnet.com.au,
 Correspondence address: robert.flegg@med.monash.edu.au
 
 
Alignments display a mosaic pattern due to recombination.  Existing programs use a broad window to find major events.  As a result, they miss the finer detail where multiple events occur at different points in different lineages.  BOSS uses pairwise comparison and a short sliding window to analyse this mosaic structure.
 Long abstract
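The general idea of a pairwise sliding-window comparison can be sketched as below: identity between two aligned sequences is computed in short windows, so that local switches in similarity become visible. This is a generic illustration, not the BOSS algorithm; the window size, the step and the treatment of gap characters as ordinary symbols are assumptions.

```python
def window_identity(seq_a, seq_b, window=10, step=1):
    """Fraction of identical positions in each window of two aligned sequences.

    Returns a list of (window_start, identity) pairs.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for start in range(0, len(seq_a) - window + 1, step):
        pairs = zip(seq_a[start:start + window], seq_b[start:start + window])
        matches = sum(1 for x, y in pairs if x == y)
        profile.append((start, matches / window))
    return profile
```

A sharp drop in identity between adjacent windows, seen for one sequence pair but not for others, is the kind of local signal that a broad window would average away.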
 
 
 
 |  
| G-18  Determining the Eukaryote Phylogeny Gayle Philip1, James McInerney2
 1gayle.k.philip@may.ie, National University of Ireland, Maynooth; 2james.o.mcinerney@may.ie, National University of Ireland, Maynooth
 Correspondence address: gayle.k.philip@may.ie
 
 
The relationship of nematodes to arthropods and vertebrates can be described by the Coelomata and Ecdysozoa hypotheses.
Our aim was to test these hypotheses by finding the supertree that best described the relationship of orthologous, single gene family trees from ten eukaryotic taxa. Our results support the traditional Coelomata hypothesis.
 Long abstract
 
 
 
 |  
| G-19  Evolutionary analysis of single nucleotide polymorphism distribution in duplicated gene pairs of Arabidopsis thaliana Brad Chapman1, Andrew Paterson2
 1chapmanb@uga.edu, University of Georgia; 2paterson@uga.edu, University of Georgia
 Correspondence address: chapmanb@uga.edu
 
 
Whole genome duplication has played a major role in the structuring of the Arabidopsis thaliana genome. We examined single nucleotide polymorphism (SNP) variation in duplicate genes retained after these duplication events. We compare SNP accumulation in duplicates and singletons with respect to their effect on protein evolution.
 Long abstract
 
 
 
 |  
| G-20  MICROSATELLITE REPEATS IN PLANTS Chandri N Yandava1, Roger Pennell2, Kenneth Feldmann, Peter Mascia, Richard Flavell, William Kimmerly
 1cyandava@ceres-inc.com, Ceres Inc; 2rpennell@ceres-inc.com, Ceres Inc
 Correspondence address: cyandava@ceres-inc.com
 
 
The classes AAC and AAG are present in higher numbers in Arabidopsis, whereas AAT, AGG and CCG are more abundant in rice. Repeats with A and T bases (except AAT) are more frequent in Arabidopsis, whereas repeats with G and C bases are more frequent in rice, which has a high GC content.
 Long abstract
 
 
 
 |  
| G-21  Phylogeny of DNA Methyltransferases recognizing GATC and related DNA sequences. Richard D. Morgan1
 1morgan@neb.com, New England Biolabs
 Correspondence address: morgan@neb.com
 
 
DNA modification plays important roles in nucleic acid metabolism. Methyltransferases modifying the related DNA sequences GATC and GANTC are close homologs. We present a phylogeny of these enzymes as an example of how DNA sequence recognition has evolved, and predict how to evolve further specificities in vitro
 Long abstract
 
 
 
 |  
| G-22  Freeing Phylogenies from Alignments Michael Höhl1, Isidore Rigoutsos2, Mark Ragan
 1m.hoehl@imb.uq.edu.au, Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia.; 2rigoutso@us.ibm.com, Computational Biology Center, IBM Thomas J Watson Research Center
 Correspondence address: m.hoehl@imb.uq.edu.au
 
 
To free phylogenies from alignments we present two approaches based on
pattern discovery using TEIRESIAS:  the first one computes distances
from patterns, where distance is defined analogously to distances on
alignments.  The second approach transforms patterns into character
data, meaning that we do not have to explicitly extract relevant
properties.
 
 Long abstract
 
 
 
 |  
| G-23  Genome phylogenies based on the mean normalized BLASTP score Robert G. Beiko1, Robert L. Charlebois2, Mark A. Ragan
 1rbeiko@science.uottawa.ca, Dept. of Biology, University of Ottawa, 30 Marie Curie, Ottawa, ON, K1N 6N5, Canada; Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia; 2rlcharl@neurogadgets.com, GenomeAtlantic, 1721 Lower Water St., Suite 401, Halifax, NS, B3J 1S5, Canada; Dept. of Biology, University of Ottawa, 30 Marie Curie, Ottawa, ON, K1N 6N5, Canada; NeuroGadgets Inc., www.neurogadgets.com; Evolutionary Biology Program, Canadian Institute for Advanced Research
 Correspondence address: rbeiko@science.uottawa.ca
 
 
Individual genes often fail to reproduce similar trees, sometimes with disparities so serious as to question the entire concept of organismal phylogeny for prokaryotes. However, trees produced from bulk genomic signal display topologies that are largely consistent with accepted taxonomy, suggesting that the underlying mode of prokaryotic evolution is clonal.
 Long abstract
 
 
 
 |  
| G-24  Evidence that rice, and other cereals, are ancient aneuploids Klaas Vandepoele1, Cedric Simillion2, Stephane Rombauts and Yves Van de Peer
 1klpoe@gengenp.rug.ac.be, University of Gent, VIB dep. Plant Systems Biology; 2cesim@gengenp.rug.ac.be, University of Gent, VIB dep. Plant Systems Biology
 Correspondence address: strom@gengenp.rug.ac.be
 
 
By analyzing the genome of Oryza sativa (rice), we show that a substantial fraction (15%) of all rice genes is found in duplicated segments. However, detailed analysis shows that rice is not an ancient polyploid as previously suggested, but an ancient aneuploid that has experienced partial genome duplication, approximately 70 million years ago.
 Long abstract
 
 
 
 |  
| G-25  Phylogeny and Evolution of Human Cathepsins Veronika Stoka1, Vito Turk2
 1veronika.stoka@ijs.si , J. Stefan Institute; 2Vito.Turk@ijs.si, J. Stefan Institute
 Correspondence address: veronika.stoka@ijs.si
 
 
Cathepsins are proteolytic enzymes of lysosomal origin. According to their catalytic mechanism, they can be classified as cysteine, serine or aspartic proteases. In this work we investigated the phylogeny and evolution of human cathepsins.
 Long abstract
 
 
 
 |  
| G-26  EVA: Examining foodborne virus evolution using the Enteric Virus Analysis tool Graham Etherington1, Ian Roberts2, Jo Dicks,  Vic Rayward-Smith
 1Graham.Etherington@bbsrc.ac.uk, John Innes Centre; 2Ian.Roberts@bbsrc.ac.uk, Institute of Food Research
 Correspondence address: Graham.Etherington@bbsrc.ac.uk
 
 
EVA (Enteric Virus Analysis) is an analysis tool that brings together existing and novel computational techniques within a single integrated software environment. Here we describe the design of EVA and its use in examining the evolution of an emerging group of foodborne viruses.
 Long abstract
 
 
 
 |  
| G-27  LUMBERJACK:  A Heuristic Tool For Sequence Alignment Exploration And Phylogenetic Inference Carolyn J. Lawrence1, Christian M. Zmasek2, R. Kelly Dawe, Russell L. Malmberg
 1carolyn@plantbio.uga.edu, Department of Plant Biology, University of Georgia Athens; 2czmasek@gnf.org, Genomics Institute of the Novartis Research Foundation
 Correspondence address: czmasek@gnf.org
 
 
LumberJack is a phylogenetic tool intended to facilitate sampling treespace to find likely tree topologies quickly and to map phylogenetic signal onto regions of an alignment in a heuristic manner.
 Long abstract
 
 
 
 |  
| G-28  Identifying and genetic relationships within the Arisaema determined using microsatellite markers Byong Seob Ko1, Mi Young Lee2, Seong Jin Lee, Jeong Eun Kim, Seong Mi Hong, Young seung Ju
 1bsko@kiom.re.kr, Korea Institute of Oriental Medicine; 2mylee2020@hanmail.net, Korea Institute of Oriental Medicine
 Correspondence address: mylee2020@hanmail.net
 
 
Microsatellite technology rapidly reveals highly polymorphic fingerprints and thus identifies genetic markers.  In combination with oligonucleotides of arbitrary sequence, 5' anchored oligonucleotides based on simple sequence repeats were used in PCR to produce Arisaema DNA fingerprints.
 Long abstract
 
 
 
 |  
	| Predictive Methods
 |  
| H-1  Classification of Virus Risk Types Using Kernel-Based Classifiers Je-Gun Joung1, Sirk June Augh2, Byoung-Tak Zhang
 1jgjoung@bi.snu.ac.kr, Graduate Program in Bioinformatics, Seoul National University, Korea; 2sjaugh@cbit.snu.ac.kr, Center for Bioinformation Technology, Seoul National University, Korea
 Correspondence address: jgjoung@bi.snu.ac.kr
 
 
Classifying virus risk types is important for understanding the mechanisms of infection. We propose a machine learning approach to classify HPV risk types. Our approach is based on a kernel method that provides efficient computation. In our experiments, the string kernel-based classifier correctly predicted four unknown HPV types.
 Long abstract
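One common string kernel is the k-spectrum kernel, which measures the similarity of two sequences as the inner product of their k-mer count vectors. The sketch below is offered as a generic example of a string kernel; the short abstract does not say which kernel the authors used, and the choice of k is an assumption.

```python
from collections import Counter

def kmer_counts(sequence, k):
    """Count every overlapping substring of length k."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def spectrum_kernel(seq_a, seq_b, k=3):
    """k-spectrum kernel: inner product of the two k-mer count vectors."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items() if kmer in counts_b)
```

A kernel matrix built from such values over all pairs of sequences can be passed to any kernel-based classifier, which never needs an explicit feature vector for each sequence.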
 
 
 
 |  
| H-2  Studies of the transcriptional regulation of the genes coding for the novel IL28A,B and IL29 protein family:
Illustration of an in silico approach applicable on a genomic scale William Krivan1, Brian Fox2, Emily Cooper, Teresa Gilbert, Frank Grant, Betty
 Correspondence address: krivan@zgi.com
 
 
We use the novel IL28A,B and IL29 protein family to illustrate an approach 
to the computational identification and characterization of putative 
transcriptional regulatory regions. Our technique consists of 
a combination of phylogenetic footprinting and detection of statistically 
significant clusters of binding sites and can be applied on a genomic scale.
 Long abstract
 
 
 
 |  
| H-3  Prediction of Protein Function from Primary Structure Paul J. Tan1, Vladimir Brusic2, Asif M. Khan, Judice L.Y. Koh, Seng-Hong Seah
 1tjtan@i2r.a-star.edu.sg, I2R; 2vladimir@i2r.a-star.edu.sg, I2R
 Correspondence address: tjtan@i2r.a-star.edu.sg
 
 
An approach was developed for predicting the presence of a specific functional effect for active peptides. It involved multiple steps: a) collection of protein sequences from multiple sources, b) data cleaning and functional annotation, c) definition of basic structure-function unit groups, and d) prediction of protein function by an intelligent agent.
 Long abstract
 
 
 
 |  
| H-4  HMM Frameworks for Nuclear Receptor Binding Sites Albin Sandelin1, Wyeth Wasserman2
 1albin.sandelin@cgb.ki.se, Karolinska Institutet; 2wyeth@cmmt.ubc.ca, University of British Columbia
 Correspondence address: albin.sandelin@cgb.ki.se
 
 
Nuclear Receptors (NR) control diverse programs of gene expression.  These transcription factors bind in homo- and heterodimeric forms to complex target sequences.  Due to variable spacing and orientation of half-sites, standard profile models are inadequate. We construct an HMM framework for the prediction and classification of NR binding sites.
 Long abstract
 
 
 
 |  
| H-6  The model representation of the mode of action of combination therapy of chloroquine, puritine and ascorbic acid Onimisi Hassan Bello1
 1hassanbello2001@yahoo.com, bima tutorial outfit
 Correspondence address: hassanbello2001@yahoo.com
 
 
The action of the combined therapy of chloroquine, puritine and ascorbic acid has baffled biochemists for some time. It is believed to potentiate its action through perturbation of the lipid bilayer surrounding the cells, but further studies to reveal the genetic conditions have not yielded many results.
 Long abstract
 
 
 
 |  
| H-7  Species-Specific Protein Sequence and Fold Optimizations Michel Dumontier1, Katerina Michalickova2, Christopher W.V. Hogue
 1micheld@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave.,Toronto, Ontario, Canada M5G 1X5; 2katerina@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave.,Toronto, Ontario, Canada M5G 1X5
 Correspondence address: micheld@mshri.on.ca
 
 
The ability of each and every organism to adapt to its particular environmental niche is of fundamental importance to its survival and proliferation.  We have identified species-specific protein sequence and fold optimizations, which we exploited to generate predictive scoring functions.  These scoring functions will be used in future species-specific protein identification and optimization experiments.
 Long abstract
 
 
 
 |  
| H-8  An evolving approach to finding Schemas for protein secondary structure prediction Huang,Hsiang Chi1
 1illusion@iii.org.tw, Institute for Information Industry
 Correspondence address: illusion@iii.org.tw
 
 
A genetic algorithm has been applied to predict building schemas of protein secondary structure. Although the average Q3 obtained in this work is not the highest reported, some fundamental and useful building schemas of protein secondary structure have been found. 
 Long abstract
 
 
 
 |  
| H-9  Inhibitors of Glycogen Synthase Kinase-3beta and Cyclin-Dependent Kinases Modelled by 3D-QSAR Using a Novel Alignment Method Based on Electrostatic Potentials. Mahindra Makhija 1, Erik Helmerhorst2
 1M.Makhija@exchange.curtin.edu.au, Western Australian Biomedical Research Institute, School of Biomedical Sciences, Curtin University of Technology, Bentley, WA 6102, Australia; 2E.Helmerhorst@curtin.edu.au, Western Australian Biomedical Research Institute, School of Biomedical Sciences, Curtin University of Technology, Bentley, WA 6102, Australia
 Correspondence address: M.Makhija@exchange.curtin.edu.au
 
 
The ability of paullones and aloisines to inhibit glycogen synthase kinase-3beta and cyclin-dependent kinases was well predicted by 3D-QSAR using CoMSIA in conjunction with a novel alignment method based on electrostatic potentials.  The predictive value of this approach may lead to the development of better therapeutics for Alzheimer’s disease.
 Long abstract
 
 
 
 |  
| H-10  REMmatch program: finding potential hormone responsive elements Nikolai Aksenov1, Alex Lyakhovich2
 1nikolay.aksenov@plantphys.umu.se, Umea University; 2alexlyak@umich.edu, University of Michigan
 Correspondence address: nikolay.aksenov@plantphys.umu.se
 
 
The REMmatch program is designed for preliminary screening of any sequence
database for potential hormone responsive elements (HREs) and is comparable
with other known programs (TESS, Match, etc.). REMmatch is available
at http://www.math.wisc.edu/~karp/REMmatch.exe
 Long abstract
 
 
 
 |  
| H-11  Prediction of New Regulatory Properties For Proteins Sharing Different Functional Motifs Alex Lyakhovich1, Anatoly Karp2
 1alexlyak@umich.edu, University of Michigan; 2, University of Wisconsin-Madison
 Correspondence address: alexlyak@umich.edu
 
 
We suggest a new algorithm that allows prediction of additional regulatory functions for proteins containing different structural motifs. This algorithm was successfully applied to a set of proteins containing ubiquitin motifs, for which we could also show regulation by certain protein kinases.
 Long abstract
 
 
 
 |  
| H-12  Predicting Synthetic Lethality Sharyl L. Wong1, Lan O. Zhang2, Amy H. Tong, Debra S.
 Correspondence address: sharyl_wong@student.hms.harvard.edu
 
 
We successfully predicted synthetic lethal gene pairs in Saccharomyces cerevisiae. Using probabilistic decision trees, we integrated multiple data types including correlated  mRNA  expression, physical interaction, protein function, and sequence homology.  
Our predictions may help identify redundant genes and pathways and may better our 
understanding of genetic robustness.
 Long abstract
 
 
 
 |  
| H-13  PSORT-B: A Web-based Tool for Bacterial Subcellular Localization Prediction Jennifer L. Gardy1, Cory A. Spencer2, Fiona S.L. Brinkman
 1jlgardy@sfu.ca, Dept. of Molecular Biology and Biochemistry, Simon Fraser University; 2cspencer@sprocket.org, Dept. of Molecular Biology and Biochemistry, Simon Fraser University
 Correspondence address: jlgardy@sfu.ca
 
 
We present PSORT-B (http://www.psort.org) - a subcellular localization predictor with a measured accuracy of 97% for Gram-negative bacteria. Issues including the handling of proteins resident at multiple localizations, the importance of the “unknown” result and the development of a Gram-positive version are discussed. Selected whole genome analysis is also presented. 
 Long abstract
 
 
 
 |  
| H-14  Low-budgetary scheme for differentiation and DNA quantity investigation in blood lymphocytes of patients with chronical tonsillitis Boris V.Shilov1, Dmitry A.Dolgun2
 1bvshilov@hotbox.ru, SSMU; 2, SSMU
 Correspondence address: bvshilov@hotbox.ru
 
 
The cell type to which each nucleus belongs was determined in order to estimate the number of atypical lymphocytes in patients with chronic tonsillitis and to analyze the life cycle of those cells, based on a classification of transformed cells stained according to Romanowsky-Giemsa. The image segmentation process was adapted to the conditions of low-budget science.
 Long abstract
 
 
 
 |  
| H-16  New features for microRNA gene finding Uwe Ohler1, Chris Burge2, David Bartel
 1ohler@mit.edu, MIT; 2cburge@mit.edu, MIT
 Correspondence address: ohler@mit.edu
 
 
MicroRNAs are a class of tiny RNA genes excised from precursor hairpin
structures. We identified a highly specific conserved motif upstream
of precursors that might be involved in miRNA transcription, and
describe how much this motif, plus features of general upstream and
downstream conservation, aid in miRNA gene finding.
 Long abstract
 
 
 
 |  
| H-17  Long time scale simulations of Molecular Systems Benjamin Gladwin1, Dr Thomas Huber2
 1gladwin@maths.uq.edu.au, University of Queensland Department of Mathematics.; 2huber@maths.uq.edu.au, University of Queensland Department of Mathematics.
 Correspondence address: gladwin@maths.uq.edu.au
 
 
Modelling large bio-molecules is still primarily limited to the simulation of short timeframes, which in many cases are not biologically significant. The goal of this project is to use the optimisation of Hamiltonian paths
to enable calculation of the behaviour of molecular systems over large time frames.
 Long abstract
 
 
 
 |  
| H-18  Identification of PKC Phosphorylation Sites on AC7 using a Directed Bioinformatics Approach. Eric J. Nelson1, John VanHoven2, Vlad Verkhusha, Tonny deBeer, and Boris Tabakoff.
 1eric.nelson@uchsc.edu, Univ. of Colorado; 2john.vanhoven@uchsc.edu, Univ. of Colorado
 Correspondence address: eric.nelson@uchsc.edu
 
 
A bioinformatics approach is presented that can bypass in some instances the traditional means of phosphopeptide mapping a radiologically labeled target protein to identify PKC phosphorylation sites. This directed bioinformatics approach utilizes comparative sequence analysis, molecular modeling, and machine learning techniques to assist in the discovery of PKC phosphorylation sites.
 Long abstract
 
 
 
 |  
| H-19  Evaluating the Predictability of RNA Pseudoknots J. Reeder1, R. Giegerich2
 1robert@techfak.uni-bielefeld.de, Bielefeld University; 2jreeder@techfak.uni-bielefeld.de, Bielefeld University
 Correspondence address: robert@techfak.uni-bielefeld.de
 
 
We define a new class of pseudoknots and present algorithms for thermodynamic folding of RNA secondary structures including such pseudoknots. Their time and space complexities are O(n^4) and O(n^2), respectively. We also compute the best structure guaranteed to contain a pseudoknot, and the most tightly knotted substructure. An extensive evaluation of Pseudobase is performed.
 Long abstract
 
 
 
 |  
| H-20  A sequence-independent strategy for the prediction of prokaryotic promoters Pierre-Etienne Jacques1, Sebastien Rodrigue2, Jocelyn Beaucher, Jean-François Jacques, Luc Gaudreau, Jean Goulet and Ryszard Brzezinski
 1pierre-etienne.jacques@hermes.usherb.ca, Universite de Sherbrooke; 2, Universite de Sherbrooke
 Correspondence address: pierre-etienne.jacques@hermes.usherb.ca
 
 
Our strategy is based on the biological fact that promoters are localized in the upstream regulatory regions of genes. The likelihood that a particular sequence is a bona fide promoter can be evaluated from its mismatch distribution among the various areas of the genome.
 Long abstract
 
 
 
 |  
| H-21  NetOGlyc 3.0: Prediction of mucin type O-glycosylation sites from sequence and  sequence-derived features. Karin Julenius1, Ramneek Gupta, Kristoffer Rapacki, Lars Juhl Jensen, Søren Brunak
 1kj@cbs.dtu.dk, Center for Biological Sequence Analysis, BioCentrum-DTU
 Correspondence address: kj@cbs.dtu.dk
 
 
NetOGlyc 3.0 is a predictor of mucin type O-glycosylation sites,
predicting from the protein sequence alone.
NetOGlyc 3.0 shows much better generalization behaviour (the ability of the network to correctly predict for completely new examples) than its predecessor and is available at www.cbs.dtu.dk/services/NetOGlyc/
 Long abstract
 
 
 
 |  
| H-22  Genomics of Vertebrate Splicing Regulatory Elements Gene Yeo1, Shawn Hoon2, Chris Burge
 1geneyeo@mit.edu, MIT; 2shawnh@fugu-sg.org, IMCB, Singapore
 Correspondence address: geneyeo@mit.edu
 
 
We find differences in the distribution and conservation of vertebrate splicing regulatory elements and relevant trans-factors given the availability of large-scale genomic data for Homo sapiens, Mus musculus and Fugu rubripes.  
 Long abstract
 
 
 
 |  
| H-23  Support Vector Machine Approach for Cancer Detection using Amplified Fragment Length Polymorphism (AFLP) Screening Method Waiming KONG1, Lawrence THAM2, Kee Yew Wong, Patrick Tan, Keng Wah CHOO
 1KONG_WAI_MING@nyp.gov.sg, Nanyang Polytechnic; 2wm_tham@hotmail.com, Nanyang Polytechnic
 Correspondence address: kongwm@hotmail.com
 
 
We investigated the novel use of Amplified Fragment Length Polymorphism screening in the diagnosis and classification of cancers using a set of 58 gastric tumor and 16 normal genomic DNA samples.  
The results show that an SVM can be used to differentiate cancer from non-cancer tissues with high accuracy.  
 Long abstract
 
 
 
 |  
| H-25  An automated protocol for membrane protein prediction and annotation Melissa J. Davis1, Zheng Yuan, Shane Fashang Zhang and Rohan D. Teasdale
 1m.davis@imb.uq.edu.au, Institute of Molecular Biosciences
 Correspondence address: m.davis@imb.uq.edu.au
 
 
In order to annotate the membrane organization of whole-proteome datasets, we have developed a consensus annotation protocol automated for high-throughput analysis. This protocol predicts the presence of transmembrane domains, GPI modifications and signal peptides. These features are combined to generate a prediction of membrane organization.
 Long abstract
 
 
 
 |  
| H-26  A Mathematical Model for Protein Folding Yi Fang1, Warren Kaplan2
 1yi@maths.anu.edu.au, CBiS, ANU; 2w.kaplan@garvan.org.au, Garvan Institute
 Correspondence address: yi@maths.anu.edu.au
 
 
We mimic the major geometric features of the native structures of globular proteins: compactness, hydrophobic core, and smaller surface area.  We hypothesize that the native structure of a globular protein
should minimize all three of the above geometric features simultaneously and 
coherently among all conformations satisfying a relaxed steric condition.
 Long abstract
 
 
 
 |  
| H-27  Reparametrizing loop entropy weights: effect on DNA melting curves Ralf Blossey1, Enrico Carlon2
 1blossey@bioinf.uni-sb.de, IRI; 2carlon@lusi.uni-sb.de, IRI
 Correspondence address: blossey@bioinf.uni-sb.de
 
 
We report an analysis of melting curves for genomic DNA. Our in-house
software employs novel estimates for the weights of loop entropy factors. As test cases, we studied D. discoideum and synthetic sequences inserted in a linearized plasmid to compare with experiment. We find that the cooperativity parameter may be one order of magnitude larger than its consensus value.
 Long abstract
 
 
 
 |  
| H-28  Computational Analysis of Homeodomain Protein Interaction Interfaces Christopher Warren1, Mary Brezinski, Aseem Ansari
 1clwarren2@wisc.edu, University of Wisconsin - Madison
 Correspondence address: clwarren2@wisc.edu
 
 
Ubx and Exd are homeodomain transcription factors in Drosophila that cooperatively bind DNA to regulate cell differentiation.  Using FADE, we discovered an interaction between these proteins that is nearly absent in the human homologs.  Through chemical analysis we find that this interaction may increase binding in fly, but we do not expect this in human.
 Long abstract
 
 
 
 |  
| H-29  Identifying Bacterial Outer Membrane Proteins using Frequent Subsequences - A Data Mining Approach Rong She1, Fei Chen2, Ke Wang, Martin Ester, Jennifer L. Gardy, Fiona S.L. Brinkman
 1rshe@cs.sfu.ca, School of Computing Science, Simon Fraser University; 2fchena@cs.sfu.ca, School of Computing Science, Simon Fraser University
 Correspondence address: jlgardy@sfu.ca
 
 
Outer membrane proteins (OMPs) of bacteria are medically important as drug targets. We developed two OMP predictors based on frequent subsequences studied in data mining. Both significantly outperformed the state-of-the-art method and one also produced explicit patterns of OMPs that can be used for further biological analysis.
 Long abstract
 
 
 
 |  
| H-30  A New Hybrid Haplotype Inference Method based-on Maximum Likelihood Estimation Ho-Youl Jung1, Gil-Mi Ryu2, Jee-Yeon Heo, Ju-Young Lee, Hyo-Mi Kim, Jong-Keuk Lee, Chan Park, Bermseok Oh, and Kuchan Kimm
 1hyjung@ngri.re.kr, National Genome Research Institute; 2gmryu@ngri.re.kr, National Genome Research Institute
 Correspondence address: hyjung@ngri.re.kr
 
 
This article presents a hybrid method that can identify the individual's haplotype from the given genotypes. Our method combines statistical and computational approaches in order to increase the accuracy. The individuals' haplotypes are resolved by considering the MLE (maximum likelihood estimation) in the process of computing the frequencies of the common haplotypes.
 Long abstract
 
 
 
 |  
| H-31  Prediction of snoRNAs in Human Genome Sagara Jun-Ichi1, Asai Kiyoshi2, Nakamura Shugo, Kenmochi Naoya
 1jun@ni.aist.go.jp, CBRC; 2, CBRC
 Correspondence address: jun@ni.aist.go.jp
 
 
We predict snoRNAs in the human genome using several methods for sequence analysis. We also develop a Predicted Human Intron database produced from exons predicted by Gene Decoder, which is a gene-finding technology based on HMMs. We show the results of prediction of snoRNAs and the databases of human introns.
 Long abstract
 
 
 
 |  
| H-32  PIVS: Protein-protein interaction inference and visualization system using sequence-based homology search with DIP and BIND Ki-Bong Kim1, Mi-Kyung Lee2, Seo Hwajung
 1kbkim@bioinfo.smallsoft.co.kr, SmallSoft Co., Ltd.; 2mklee@bioinfo.smallsoft.co.kr, SmallSoft Co., Ltd.
 Correspondence address: kbkim@bioinfo.smallsoft.co.kr
 
 
We developed a system, PIVS, for predicting the function and interactions of an unknown protein sequence and for visualizing its protein-protein interaction map. In addition, it offers integrated genomic and motif/domain-related information concerning the unknown input protein sequence.
 Long abstract
 
 
 
 |  
| H-33  Correlated Feature Extraction for Classification of Microarray and Mass Spectroscopy Data Christopher Bowman1, Richard Baumgartner2, Ray Somorjai
 1Christopher.Bowman@nrc-cnrc.gc.ca, Institute for Biodiagnostics; 2Richard.Baumgartner@nrc-cnrc.gc.ca, Institute for Biodiagnostics
 Correspondence address: Christopher.Bowman@nrc-cnrc.gc.ca
 
 
We present a novel correlation-based technique for unsupervised feature extraction in large datasets.  The algorithm selects features based on their redundancy and, unlike PCA, preserves the spatial information in the data, allowing one to easily interpret the extracted features.  We demonstrate that classification accuracy on the reduced feature data is comparable to that on the original data.
 Long abstract
 
 
 
 |  
| H-34  Approaches for Predicting Protein-Protein Interaction Residues from Amino Acid Sequences Changhui Yan1, Vasant Honavar2, Drena Dobbs
 1chhyan@iastate.edu, Iowa State University, IA, USA; 2honavar@cs.iastate.edu, Iowa State University, IA, USA
 Correspondence address: chhyan@iastate.edu
 
 
We have used support vector machines and Naive Bayes methods to classify protein surface residues into interface residues and non-interface residues based on the sequence neighbors of target residues. The results showed that both methods are able to successfully discover and use sequence neighbor features predictive of functional properties to identify interface residues.
 Long abstract
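
To make "sequence neighbors of target residues" in H-34 concrete, the sketch below extracts a fixed-width window around every residue, which could then be encoded and fed to a classifier. The window half-width, the padding symbol and the function name are assumptions for illustration; the abstract does not specify them.

```python
def neighbor_windows(sequence, half_width=4, pad="-"):
    """Return one window per residue: the residue plus half_width neighbors
    on each side, padded at the sequence ends (assumed encoding)."""
    padded = pad * half_width + sequence + pad * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(sequence))]
```

Each window is centered on its target residue, so the label of the central residue (interface or non-interface) can be paired with the window as one training example.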
 
 
 
 |  
| H-35  Determination of sub-cellular localization of membrane proteins Kevin C Miranda1, Rajith Aturaliya2, Melissa Davis, Zheng Yuan, Cameron Flegg and Rohan Teasdale
 1k.miranda@imb.uq.edu.au, University of Queensland; 2r.aturaliya@imb.uq.edu.au, University of Queensland
 Correspondence address: k.miranda@imb.uq.edu.au
 
 
The RIKEN Representative Protein Set was subdivided into groups based on signal peptide and transmembrane domain combination.  Functional analysis using InterPro and SCOP was performed on three sub-groups: type I and II transmembrane and secreted proteins.  The sub-cellular localization of type II proteins was computationally predicted and tested in vivo.  
 Long abstract
 
 
 
 |  
| H-36  A Statistical Model of Protein Sequences in Interaction Networks and Its Solution via Gibbs Sampling David J. Reiss1, Benno Schwikowski2, Andrew F. Siegel, Stanley Fields
 1dreiss@systemsbiology.org, Institute for Systems Biology; 2benno@systemsbiology.org, Institute for Systems Biology
 Correspondence address: dreiss@systemsbiology.org
 
 
We describe a novel statistical model of protein sequences and interaction networks that utilizes discriminative and informed priors, and apply it via a Gibbs sampling algorithm to the experimental SH3 domain-peptide interaction network of Tong et al (2001). Our results reveal that such interaction networks can, to a large degree, be modelled by this technique.
 Long abstract
 
 
 
 |  
| H-37  Ligand specificity of proteases and Kinases: an application to IC50 prediction on a large scale Shandar Ahmad1, Koji Kitajima2, Akinori Sarai
 1shandar@bse.kyutech.ac.jp, Department of Biochemical Engineering and Science, Kyushu Institute of Technology, Japan; 2,
 Correspondence address: sarai@bse.kyutech.ac.jp
 
 
We have attempted neural-network-based predictions of inhibition
coefficient (IC50) from the SMILES of ligands for kinases and proteases
in the protein-ligand interaction database, ProLINT. Our method is
useful for a large scale filtering of ligands in drug-design. We have
also attempted to develop ligand fragment-signature for proteins in ProLINT.
 Long abstract
 
 
 
 |  
| H-38  EPP: Eukaryotic Promoter Prediction system using an efficient training approach Sang-Soo Yeo1, Sung-Kwon Kim2, Jung-Won Rhee, Kyoung-Rak Na
 1ssyeo@alg.cse.cau.ac.kr, Chung-Ang University, Seoul, Republic of Korea; 2skkim@cau.ac.kr, Chung-Ang University, Seoul, Republic of Korea
 Correspondence address: ssyeo@alg.cse.cau.ac.kr
 
 
EPP is a eukaryotic promoter prediction system. In EPP, after the training set is divided into many clusters, each cluster is separately trained to make a decision model. This approach enhances the sensitivity and specificity of EPP. EPP is available at http://epp.cau.ac.kr
 Long abstract
 
 
 
 |  
| H-39  Multiplexed SBE primer design for highly polymorphic loci. Greg Tyrelle1, Daniel Di Giusto2, Garry C. King
 1greg@kinglab.unsw.edu.au, UNSW; 2daniel@kinglab.unsw.edu.au, UNSW
 Correspondence address: daniel@kinglab.unsw.edu.au
 
 
Single base extension (SBE) is the most widely used SNP genotyping method. When multiplex SBE (MSBE) is applied to highly polymorphic regions, hybridisation to variable DNA may occur. We have developed an MSBE primer design algorithm that increases the coverage of MSBE for these regions by up to 20%.
 Long abstract
 
 
 
 |  
| H-40  Genefiler: High throughput genetic analysis. Raw data to analysed results with one click Paul Matthews1, I Findlay2, D Mouradov, BK Mulcahy
 1paul@agrf.org.au, AGRF; 2Ian@agrf.org.au, AGRF
 Correspondence address: paul@agrf.org.au
 
 
Current genotyping analysis packages provide basic genetic marker interpretation. However, many applications often require further manual specialised analysis, which is labour intensive and expensive. We developed Genefiler, software providing massive genotyping capability, giving diagnostic results from raw data, comprehensive project management, intuitive GUI and a flexible modular format.
 Long abstract
 
 
 
 |  
| H-41  QSAR Analysis of Transcription Factors Akinori Sarai1, Samuel Selvaraj2, Michael M. Gromiha, Hidetoshi Kono
 1sarai@bse.kyutech.ac.jp, KIT; 2sel_emi@yahoo.co.uk, Bharathidasan University
 Correspondence address: sarai@bse.kyutech.ac.jp
 
 
We have analyzed the relationship between structure and function (activity) of transcription factors, based on two approaches: a knowledge-based approach, utilizing structural data of protein-DNA complexes, and computer simulations. We have examined the roles of structural deformation of DNA and cooperativity in target recognition, and predicted targets in the yeast genome.
 Long abstract
 
 
 
 |  
| H-42  Improved Approach to Protein Identification Using Peptide Mass Fingerprint Won-A Joo1, Kap-Soon Noh2, Chan-Wha Kimm
 1wajoo0824@hanmail.net, Graduate School of Life Sciences and Biotechnology; 2, Graduate School of Life Sciences and Biotechnology
 Correspondence address: cwkim@korea.ac.kr
 
 
Peptide mass fingerprinting (PMF) has been a useful method for rapid and high-throughput protein identification. In our study, we compared software frequently used to identify the proteins of Homo sapiens and Halobacterium salinarum. These comparisons could lead to a more effective algorithm for protein identification in each species using PMF.
 Long abstract
 
 
 
 |  
| H-43  Optimizing the location and the number of the maximal scoring subsequences with constrained segment lengths with MaxSubSeq Piero Fariselli1, Pier Luigi Martelli2, Ivan Rossi and Rita Casadio
 1piero@lipid.biocomp.unibo.it, Department of Biology CIRB, University of Bologna; 2 Department of Biology CIRB, University of Bologna
 Correspondence address: piero@lipid.biocomp.unibo.it
 
 
 We describe a general dynamic programming-like algorithm (MaxSubSeq) specifically designed to optimise the number and length of segments with constrained length in a protein sequence. Our algorithm is independent of the underlying predictive method and is available through the web interface at http://gpcr.biocomp.unibo.it 
 Long abstract
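
The kind of problem H-43 addresses can be sketched with a small dynamic program: given per-residue scores from some predictor, pick non-overlapping segments whose lengths lie in a fixed range so that the summed score is maximal. This is an illustrative formulation only, not the MaxSubSeq algorithm itself; the function name, the score input and the rule that segments may abut are assumptions.

```python
def best_segments(scores, min_len, max_len):
    """Choose non-overlapping segments with min_len <= length <= max_len
    that maximize the total score. Returns (total, [(start, end), ...])
    with end exclusive. Adjacent segments are allowed (assumption)."""
    n = len(scores)
    prefix = [0.0]
    for s in scores:
        prefix.append(prefix[-1] + s)

    best = [0.0] * (n + 1)   # best[i]: optimum over the first i residues
    choice = [0] * (n + 1)   # 0: residue i-1 uncovered, else segment length
    for i in range(1, n + 1):
        best[i] = best[i - 1]
        for length in range(min_len, min(max_len, i) + 1):
            cand = best[i - length] + prefix[i] - prefix[i - length]
            if cand > best[i]:
                best[i] = cand
                choice[i] = length

    # Trace back from the end to recover the chosen segments.
    segments = []
    i = n
    while i > 0:
        if choice[i] == 0:
            i -= 1
        else:
            segments.append((i - choice[i], i))
            i -= choice[i]
    segments.reverse()
    return best[n], segments
```

The running time is O(n * (max_len - min_len + 1)), since each position considers every allowed segment length once.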
 
 
 
 |  
| H-44  Finding transcription regulatory elements, using transcription factor data base and genome comparison. Hiroshi Mizushima1, Kozo Kawahara2, Mitsuru Takatsu, Teruhiko Yoshida
 1hmizushi@ncc.go.jp, National Cancer Center Research Institute; 2kkawahara@w-fusion.co.jp, World Fusion Co.
 Correspondence address: hmizushi@ncc.go.jp
 
 
We have developed a Web-based system for searching transcriptional regulatory elements in human genome sequences using the Transcription Factor Database (TFDB, which we have been maintaining) and genome comparison. Combined with microarray data, we will report common regulatory elements among genes with similar expression.
 Long abstract
 
 
 
 |  
| H-45  PathMiner: de novo Metabolic Pathway Synthesis McShan, D.1, Upadhyaya, M.2, Imran Shah
 1Daniel.McShan@uchsc.edu, UCHSC; 2Minesh.Upadhayaya@uchsc.edu, UCHSC
 Correspondence address: Daniel.McShan@uchsc.edu
 
 
This poster presents PathMiner, a computational framework for exploring metabolic pathways. To make inferences about pathways we abstract metabolism as
a state-space in which compounds are points and biotransformations are state-transitions. In this poster, we discuss applications of PathMiner to two quite
different biological problems. 
 Long abstract
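
The state-space abstraction in H-45 can be illustrated with a toy search in which compounds are nodes and biotransformations are directed edges. Plain breadth-first search is used here purely for illustration; the abstract does not say which search strategy PathMiner uses, and the function name and reaction list format are hypothetical.

```python
from collections import deque


def find_pathway(reactions, source, target):
    """Breadth-first search over (substrate, product) pairs.
    Returns the shortest list of compounds from source to target, or None."""
    graph = {}
    for substrate, product in reactions:
        graph.setdefault(substrate, []).append(product)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None
```

For example, with the toy reactions `[("glucose", "G6P"), ("G6P", "F6P"), ("F6P", "FBP")]`, searching from `"glucose"` to `"FBP"` returns the four compounds in order.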
 
 
 
 |  
| H-46  Polymorphism Prediction Angela Baldo1, J. Labate2
 1abaldo@pgru.ars.usda.gov, USDA ARS PGRU; 2jl265@cornell.edu, USDA ARS PGRU
 Correspondence address: abaldo@pgru.ars.usda.gov
 
 
Single nucleotide polymorphism (SNP) distribution has been demonstrated to be nonrandom in the genomes of animals and plants.  While genetic variation is traditionally expected in noncoding regions, additional genetic features have been correlated with SNPs.  We investigate whether this information might be used to predict regions of polymorphism.
 Long abstract
 
 
 
 |  
| H-47  On the Correspondence between Scoring Matrices and Binding Site Sequence Distributions Jan E. Gewehr1, Jan T. Kim2, Thomas Martinetz
 1gewehr@bio.informatik.uni-muenchen.de, Institute for Computer Science, Ludwig-Maximilians-University Munich, Theresienstr. 39, D-80333 Munich, Germany; 2kim@inb.uni-luebeck.de, Institute for Neuro- and Bioinformatics, University of Luebeck, Seelandstr. 1a, D-23569, Germany
 Correspondence address: gewehr@bio.informatik.uni-muenchen.de
 
 
Using maximum likelihood estimation, we analyze the correspondence 
between popular scoring matrix classifiers for binding site prediction  
and specific probability distributions of binding site sequences. 
For unknown distributions, the binding matrix is a good choice
since it achieves maximal specificity under the constraint that all known 
sequences are classified correctly.
 Long abstract
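
As background for H-47, the sketch below builds a generic log-odds scoring matrix from known binding sites and sets the classification threshold to the lowest score among those sites, so that every known site is accepted. This is a standard log-odds construction, not the "binding matrix" defined by the authors; the pseudocount, the uniform background and all function names are assumptions.

```python
from math import log2

ALPHABET = "ACGT"


def log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """One dict per position mapping base -> log2(frequency / background)."""
    total = len(sites) + pseudocount * len(ALPHABET)
    matrix = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        matrix.append({
            base: log2((column.count(base) + pseudocount) / total / background)
            for base in ALPHABET
        })
    return matrix


def score(matrix, seq):
    """Sum of per-position log-odds scores for seq."""
    return sum(col[base] for col, base in zip(matrix, seq))


def threshold_accepting_all(matrix, sites):
    """Highest threshold that still classifies every known site as a site."""
    return min(score(matrix, s) for s in sites)
```

Raising the threshold above this value would reject at least one known site, which is the trade-off between sensitivity and specificity that the abstract refers to.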
 
 
 
 |  
| H-48  In silico detection of CpG-island in plants Stephane Rombauts1, Kobe Florquin2, Rouze Pierre and Yves Van de Peer
 1strom@gengenp.rug.ac.be, University of Gent, dep. Plant Systems Biology; 2koflo@gengenp.rug.ac.be, University of Gent, dep. Plant Systems Biology
 Correspondence address: strom@gengenp.rug.ac.be
 
 
CpG-islands are considered evolutionary remnants linked to the regulation of genes. Compared to animal systems, plants have a higher number of genes encoding DNA-methyltransferases that show a broader specificity. In this work we explored the compositional landscape surrounding promoters that would enable the identification of distinct CpG or CpNpG-islands.
 Long abstract
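
The "compositional landscape" in H-48 is typically described by two window statistics: GC content and the observed/expected CpG ratio. The sketch below computes both for a single window. The function names are hypothetical, and the abstract does not state which statistics or cut-offs the authors used for plants.

```python
def gc_content(seq):
    """Fraction of G and C bases in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)
```

Sliding these two functions along a promoter region gives a profile in which candidate islands appear as runs of windows with high values for both statistics.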
 
 
 
 |  
| H-49  Predicting Chemical Carcinogenesis: Problem Representation Governs Model Performance Douglas W. Bristol1
 1bristol@niehs.nih.gov, NIH-NIEHS
 Correspondence address: bristol@niehs.nih.gov
 
 
The predictive performance and comprehensibility of 54 chemical-carcinogenicity models, generated by three Predictive-Toxicology Challenges, was evaluated using ROC convex-hull analysis. Models from problem representations that used a mix of attributes, reflecting interactions between chemical and biological features, clearly outperformed those derived using only attributes of chemical structure or biological features.
 Long abstract
 
 
 
 |  
| H-50  Predicting accuracy of comparative gene finders using evolutionary models Vladimir Pavlovic 1, Lingang Zhang 2, Charles Cantor and Simon Kasif
 1vladimir@cs.rutgers.edu, Rutgers University; 2zlg@bu.edu, Boston University
 Correspondence address: vladimir@cs.rutgers.edu
 
 
Comparative computational analysis can lead to improved identification
of genes, while relying on a given pair of genomes, such as human
and mouse.  We propose a formal way to select an optimal pair of genomes
by linking Markov models of molecular evolution to comparative HMMs and
studying their prediction accuracy.
 Long abstract
 
 
 
 |  
| H-51  The Prion Paradox: Infection or Polymerization? Jan C. Biro1
 1Jan.biro@kbh.ki.se, Homulus Informatics
 Correspondence address: jan.biro@kbh.ki.se
 
 
There is a weak but significant similarity between the prion protein (PrP) and some transcription factors and Zn-finger proteins. A molecular model of the Cu++-binding monomeric (normal, cytoplasmic) PrPC and the Cu++-stabilized polymeric (scrapie - pathogenic) PrPSc is presented and is called the CUPRION model.
 Long abstract
 
 
 
 |  
	| Sequence Comparison
 |  
| I-1  The distance function for computing the continuous distance of biopolymer sequences G.H. Hakobyan1, T.V. Margaryan2
 1gaghakob@ysu.am, YSU; 2
 Correspondence address: gaghakob@ysu.am
 
 
In some applications of sequence comparison theories the actual items
to be compared are not successions of discrete elements, but
"continuous" functions of a continuous argument. The present paper
aims to construct a "continuous" distance function with the help
of the given "distance" matrix D.
 Long abstract
 
 
 
 |  
| I-2  Species-Specific Substitution Matrices Michel Dumontier1, Christopher W.V. Hogue2
 1micheld@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave.,Toronto, Ontario, Canada M5G 1X5; 2hogue@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave.,Toronto, Ontario, Canada M5G 1X5
 Correspondence address: micheld@mshri.on.ca
 
 
We derived and tested novel species-specific substitution matrices (SSSMs) for sequence alignment.  Our results show increased alignment accuracy in the 20-30% sequence identity range, but decreased alignment length as compared to popular sequence alignment programs such as PSI-BLAST (using BLOSUM) using CASA’s SCOP based alignment test sets.
 Long abstract
 
 
 
 |  
| I-3  Secondary structure interpretation of genetic sequence variation in Plasmodium falciparum cell surface antigens Stanley Adoro1, Roseangela Nwuba, Chiaka Anumudu, Mark Nwagwu
 1stanleyadoro@hotmail.com, University of Ibadan
 Correspondence address: stanleyadoro@hotmail.com
 
 
We have analyzed genetic and amino acid residue variation of Plasmodium falciparum cell surface antigens (merozoite surface proteins 1 and 2; circumsporozoite protein; stevor and rifin) in the context of their known or predicted secondary structures. Locations of structural motifs suggest the presence of functional domains or antigenic epitopes.
 Long abstract
 
 
 
 |  
| I-4  IMPROVING SEQUENCE ASSEMBLIES USING HIGH-QUALITY OVERLAPS Michael Roberts1, James Yorke2, Brian Hunt, Wayne Hayes, Aleksey Zimin, Cevat Ustun,  Paul Havlak
 1tri@ipst.umd.edu, University of Maryland; 2yorke@ipst.umd.edu, University of Maryland
 Correspondence address: wayne@cs.toronto.edu
 
 
Finishing a genome costs as much as initial assembly. Since
initial assembly recovers about 95% of the genome, gaining just a
few extra percent initially can save tens of millions of
dollars in finishing costs. We present computational techniques for improving
"overlaps" that result in up to 5% additional sequence during initial assembly.
 Long abstract
 
 
 
 |  
| I-5  The Bielefeld University Bioinformatics Server Alexander Sczyrba1, Jan Krueger2, Robert Giegerich
 1asczyrba@techfak.uni-bielefeld.de, Bielefeld University, Germany; 2jkrueger@techfak.uni-bielefeld.de, Bielefeld University, Germany
 Correspondence address: asczyrba@techfak.uni-bielefeld.de
 
 
The Bielefeld University Bioinformatics Server (BiBiServ),
http://bibiserv.techfak.uni-bielefeld.de, supports Internet-based
collaborative research and education in bioinformatics.  Currently, 15
software tools and various educational media are available. These
include tools from different areas such as Genome Comparison,
Alignments, Primer Design, RNA Structures, and Evolutionary
Relationships. In 2002, approximately 14,000 users per month used
BiBiServ services, with a rising tendency.
 Long abstract
 
 
 
 |  
| I-6  Evolutionary significance of G1/S checkpoint among Eukaryotes Keng Hwa Tan1, Pawan Dhar2
 1eric@bii.a-star.edu.sg, BioInformatics Institute; 2pk@bii.a-star.edu.sg, BioInformatics Institute
 Correspondence address: eric@bii.a-star.edu.sg
 
 
The G1/S checkpoint plays a pivotal role during the early stages of the cell cycle.  Progression through the G1/S boundary is controlled by a series of regulators.  To explore the roles of these regulators and identify targets of CDKs, bioinformatics analyses are being performed on DNA and protein sequences from different eukaryotic organisms.
 Long abstract
 
 
 
 |  
| I-7  Efficient algorithms for sequence comparison and overlap detection - Correcting errors in shotgun sequence reads Martti T. Tammi1, Erik Arner2, Ellen Kindlund, Björn Andersson
 1martti.tammi@cgb.ki.se, Center for Genomics and Bioinformatics, Karolinska Institutet; 2erik.arner@cgb.ki.se, Center for Genomics and Bioinformatics, Karolinska Institute
 Correspondence address: martti.tammi@cgb.ki.se
 
 
We developed a rapid approximate pattern matching algorithm and a linear time algorithm for multiple alignment construction, with no previous pairwise matching of sequences required. These are implemented in a program for shotgun sequence error correction able to correct 99% of sequencing errors. This is significantly better than previous methods.
 Long abstract
 
 
 
 |  
| I-8  Parallel Implementation of Hmm-pfam on EARTH platform Using THREADED-C Weirong Zhu1, Yanwei Niu2, Jizhu Lu, Guang R. Gao
 1weirong@capsl.udel.edu, University of Delaware; 2niu@capsl.udel.edu, University of Delaware
 Correspondence address: weirong@capsl.udel.edu
 
 
A parallel HMM-pfam is implemented on EARTH - an
event-driven, fine-grain, multi-threaded program execution model. It demonstrated
a significant performance improvement over a PVM-based version. On a
cluster of 128 dual-CPU nodes, the execution time of a representative testbench
is reduced from 15.9 hours to 4.3 minutes.
 Long abstract
 
 
 
 |  
| I-9  Human and Mouse Genome Comparison Using Genome-Wide Unique Sequences Ben-Yang Liao1, Yu-Jung Chang2, Jan-Ming Ho and Ming-Jing Hwang
 1liaoby@gate.sinica.edu.tw, Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan; 2yjchang@iis.sinica.edu.tw, Institute of Information Science, Academia Sinica, Taipei, Taiwan
 Correspondence address: yjchang@iis.sinica.edu.tw
 
 
We used coexistent genome-wide unique sequences in the human and mouse genomes to recognize their homologous regions. The resulting syntenic map revealed more than 400 conserved segments covering more than 90% of both genomes. This alignment-free method is capable of comparing two mammalian genomes within hours on one personal computer.
 Long abstract
 
 
 
 |  
| I-10  Long-range correlation in protein sequences and its implication Kazuhito Shida1, Makoto Ikeda2, Atsuo Kasuya
 1shida@cir.tohoku.ac.jp, CIR Tohoku University; 2ikeda@imr.edu, CIR Tohoku University
 Correspondence address: shida@cir.tohoku.ac.jp
 
 
Long-range correlations in
the amino-acid sequences of natural proteins
are extracted from GenBank sequences.
Some bigrams
with gaps of more than one letter turned out to be clearly over-represented.
These data may improve the assessment of
alignment quality, phylogeny analyses,
and perhaps database searches.
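The counting step behind such an analysis can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the over-representation measure (observed pair frequency divided by the product of single-residue frequencies) is an assumption chosen for simplicity.

```python
from collections import Counter


def gapped_bigram_counts(sequences, gap):
    """Count residue pairs (a, b) where b lies gap + 1 positions after a."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - gap - 1):
            counts[(seq[i], seq[i + gap + 1])] += 1
    return counts


def over_representation(sequences, gap):
    """Observed pair frequency divided by the frequency expected if the
    two positions were independent (an assumed, simple null model)."""
    pair_counts = gapped_bigram_counts(sequences, gap)
    residue_counts = Counter("".join(sequences))
    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())
    ratios = {}
    for (a, b), n in pair_counts.items():
        expected = (residue_counts[a] / total_residues) * (
            residue_counts[b] / total_residues
        )
        ratios[(a, b)] = (n / total_pairs) / expected
    return ratios
```

A ratio well above 1 for a given gap would flag that gapped bigram as a candidate for over-representation; a real study would also need a significance test.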
 Long abstract
 
 
 
 |  
| I-11  Implementing the Smith-Waterman Algorithm on a Reconfigurable Computer Gianpaolo Gioiosa1, David Kearney2
 1GIOGY001@students.unisa.edu.au, University of South Australia; 2David.Kearney@unisa.edu.au, University of South Australia
 Correspondence address: GIOGY001@students.unisa.edu.au
 
 
The Smith-Waterman algorithm is a dynamic-programming algorithm that finds the optimal local alignment between two biological sequences.  The algorithm was implemented on a field-programmable gate array (FPGA) in order to investigate the advantages and disadvantages of using reconfigurable computers and associated tools for computationally intensive sequence analysis applications.
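For reference, the recurrence that such hardware accelerates can be sketched in software. The version below is a plain Python illustration with a linear gap penalty; the scoring values are assumptions, and nothing here reflects the FPGA design itself.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for the best local alignment.

    Scoring parameters are illustrative defaults, not values from the poster.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; every cell is clamped at zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Each cell depends only on its upper, left, and upper-left neighbours, so cells on the same anti-diagonal can be computed in parallel, which is what makes the algorithm attractive for reconfigurable hardware.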
 Long abstract
 
 
 
 |  
| I-12  Multiple alignments of sequences and structures using T-Coffee Orla O'Sullivan1, Desmond Higgins2, Cedric Notredame
 1ojos@student.ucc.ie, University College Cork; 2 University College Cork
 Correspondence address: ojos@student.ucc.ie
 
 
T-Coffee is a novel method for multiple sequence alignment that allows you to combine heterogeneous sources of data to produce very accurate alignments. In this poster we look at the effects of mixing sequence and structural information using two structural alignment programs, SAP and FUGUE, in combination with T-Coffee.
 Long abstract
 
 
 
 |  
| I-13  Correlation between antisense activity and RNA secondary structure Li Liao1, Zhongwei Li2
 1lliao@cis.udel.edu, University of Delaware; 2zli@fau.edu, Florida Atlantic University
 Correspondence address: lliao@cis.udel.edu
 
 
Correlation between activity of antisense oligonucleotides and local structural features of target RNAs is studied. Statistical analysis showed that high activity of antisense oligonucleotides is more likely to occur in regions of target RNA having hairpin or multi-branched loops with flanking stems. 
 Long abstract
 
 
 
 |  
| I-14  Alexa: an improved EST and genomic sequence alignment tool Miao Zhang1, Warren Gish2
 1mzhang@sapiens.wustl.edu, Department of Genetics and Department of Biomedical Engineering, Washington University; 2gish@watson.wustl.edu, Department of Genetics , Washington University
 Correspondence address: mzhang@sapiens.wustl.edu
 
 
To better align EST and genomic sequences, we developed a tool named Alexa, which incorporates a splice site model into the recursive dynamic programming equation.
For reduced memory consumption and increased speed, Alexa can be guided by an input file produced by WU-BLASTN. Alexa is available at http://sapiens.wustl.edu/alexa.
 Long abstract
 
 
 
 |  
| I-15  ASAD-A Sequence Attribute Display tool Keith Satterley1
 1keith@wehi.edu.au, The Walter and Eliza Hall Institute of Medical Research
 Correspondence address: keith@wehi.edu.au
 
 
ASAD is A Sequence Attribute Display tool. It is written as a set of Excel macros. It builds on the familiar Excel interface to allow for flexible and efficient display of attributes (hydrophobicity etc.) using similar colours and styles for one sequence or multiple aligned sequences. ASAD is available at ftp://ftp.wehi.edu.au/pub/biology/ASAD. 
 Long abstract
 
 
 
 |  
| I-16  Detection of false positive results from PSI-BLAST N. Faux1, M. Cameron2, M. Garcia de la Banda, J.C. Whisstock
 1noel.faux@med.monash.edu.au, Department of Biochemistry and Molecular Biology. Monash University; 2mcam@csse.monash.edu.au, School of Computer Science and Software Engineering. Monash University
 Correspondence address: noel.faux@med.monash.edu.au
 
 
PSI-BLAST is a sensitive and fast database search algorithm; however, false positives can taint the final results. We are currently investigating ways of detecting when the matrix generated by PSI-BLAST no longer describes the original query sequence or family.
 Long abstract
 
 
 
 |  
| I-17  Use of motif analysis programs to identify putative regulatory elements in the orthologous human promoters of co-regulated bovine genes Amonida Zadissa1, John McEwan2, Chris Brown
 1amonida@sanger.otago.ac.nz, Biochemistry Department, Otago University; 2john.mcewan@agresearch.co.nz, AgResearch
 Correspondence address: amonida@sanger.otago.ac.nz
 
 
We aim to identify putative regulatory elements in the orthologous human promoters of co-regulated bovine genes. Promoters were extracted and motif prediction was performed with MEME. The MatInspector Professional software and BioPerl programs were used to assess the results. Novel putative elements were identified in the human promoters.
 Long abstract
 
 
 
 |  
| I-18  LAGAN2: Probabilistic Global Alignment of DNA Under Multiple Conservation Models Chuong B. Do1, Michael Brudno2, Serafim Batzoglou
 1chuong.do@stanford.edu, Stanford University; 2brudno@cs.stanford.edu, Stanford University
 Correspondence address: chuong.do@stanford.edu
 
 
LAGAN2 is a probabilistic method for limited area global nucleotide alignment of distantly related species.  The algorithm incorporates multiple conservation models for protein-coding, non-coding, and unconstrained alignments, approximate logarithmic gap penalties, and Hidden Markov Model based training of alignment parameters through expectation-maximization.  Additional information is available at http://lagan.stanford.edu.
 Long abstract
 
 
 
 |  
| I-19  Frequency enumeration of DNA subsequences from large-scale sequences using linear codes Yoichi Takenaka1, Hideo Matsuda2
 1takenaka@ist.osaka-u.ac.jp, Department of Bioinformatic Engineering, Osaka Univ. Japan; 2matsuda@ist.osaka-u.ac.jp, Department of Bioinformatic Engineering, Osaka Univ. Japan
 Correspondence address: takenaka@ist.osaka-u.ac.jp
 
 
Frequency enumeration of DNA subsequences is an important technique in sequence analysis. The algorithm is simple, but it requires an enormous amount of memory. We propose an enumeration method that uses linear codes to reduce the memory required. With a binary (31,26) Hamming code, the method requires 1/60 of the usual memory space.
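To make the memory issue concrete, here is a sketch of the simple baseline only: a direct-address table with one counter per possible subsequence of length k. The 2-bit base encoding and function names are assumptions for illustration; the linear-code compression itself is not shown because the abstract does not describe it.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit encoding per base


def kmer_index(kmer):
    """Map a k-mer to its position in the direct-address table."""
    index = 0
    for base in kmer:
        index = (index << 2) | CODE[base]
    return index


def count_kmers(seq, k):
    """Count every length-k subsequence using a table of 4**k counters."""
    table = [0] * (4 ** k)          # memory grows exponentially with k
    mask = (1 << (2 * k)) - 1
    index = 0
    valid = 0                       # consecutive unambiguous bases seen so far
    for base in seq:
        code = CODE.get(base)
        if code is None:            # e.g. 'N': restart the rolling window
            index, valid = 0, 0
            continue
        index = ((index << 2) | code) & mask
        valid += 1
        if valid >= k:
            table[index] += 1
    return table
```

The table holds 4**k counters regardless of sequence length, so k = 16 already needs about 4.3 billion entries; this is the cost a compressed enumeration scheme aims to reduce.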
 Long abstract
 
 
 
 |  
| I-20  SeqFreq: A Statistical Repetitive Motif Discovery Tool Roger Craig1, Li Liao2, Javier Garcia-Frias, Adam Marsh
 1rcraig@eecis.udel.edu, University of Delaware; 2lliao@cis.udel.edu, University of Delaware
 Correspondence address: lliao@cis.udel.edu
 
 
SeqFreq is a repetitive motif discovery tool for finding repeats in both intragenomic and intergenomic sequences. SeqFreq uses a numerical suffix tree method to enumerate repetitive n-mers. Currently, intragenomic n-mer repeats of multiple bacterial genomes and their statistical distributions have been analyzed.
 Long abstract
 
 
 
 |  
| I-21  DSC: Efficient Primer design algorithm with partial order graphs Yu-Cheng Huang1, Ming-Hui Jin2, Cheng-Yan Kao
 1r91021@csie.ntu.edu.tw, NTU. Computer Science and Information Engineering Department; 2jinmh@db.csie.ntu.edu.tw, Bioinformatics Research Center, National Taiwan University, Taiwan
 Correspondence address: jinmh@db.csie.ntu.edu.tw
 
 
A novel method called DSC (Difference String Comparison) is proposed for speeding up the primer finding procedure. The DSC method represents DNA sequences as a partially ordered graph while preserving all of the sequence information.
 Long abstract
 
 
 
 |  
| I-22  Analysis of corona virus genome sequences Jingchu Luo1
 1luojc@pku.edu.cn, Centre of Bioinformatics, Peking University
 Correspondence address: luojc@pku.edu.cn
 
 
ClustalW analysis of 27 corona virus genome sequences reveals that 9 SARS virus sequences are identical. Multiple sequence alignment was also performed on the different groups of corona virus genome sequences and coding products to find conserved and divergent regions. All analysis results are available at ftp://ftp.cbi.pku.edu.cn/pub/sars/analysis/.
 Long abstract
 
 
 
 |  
| I-23  BlastNP: A new sequence similarity searching and visualization method Jan C. Biro1
 1jan.biro@kbh.ki.se, Homulus Informatics
 Correspondence address: jan.biro@kbh.ki.se
 
 
An alternative method to TblastX, known as blastNP, has been developed. Nucleic acids in database and query sequences were translated into overlapping protein-like sequences (overlappingly translated sequences, OTSs) before searching with blastP. Thus, each nucleic acid sequence is represented by a single “protein-like” sequence (instead of three reading frames).
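One way to read the overlapping translation step is sketched below. It assumes that a codon is read starting at every nucleotide position and translated with the standard genetic code; the abstract does not spell out these details, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def overlapping_translate(dna):
    """Translate the codon starting at every position into one sequence.

    Unknown codons (e.g. containing 'N') become 'X'; stop codons become '*'.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(len(dna) - 2)
    )
```

Under this reading, a nucleotide sequence of length n yields a single protein-like string of length n - 2 that interleaves all three forward reading frames.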
 Long abstract
 
 
 
 |  
| I-24  A Shannon entropy-based filter improves the detection of high quality profile-profile alignments in remote-homologous searching. Emidio Capriotti1, Ivan Rossi2, Piero Fariselli and Rita Casadio
 1emidio@biocomp.unibo.it, CIRB Biocomputing Unit & BioDec srl; 2ivan@biocomp.unibo.it, CIRB Biocomputing Unit & BioDec srl
 Correspondence address: ivan@biocomp.unibo.it
 
 
An analysis of the quality of the profile-profile alignments generated by a BASIC-like algorithm shows that Shannon entropy can be used to filter out most of the bad high-scoring alignments, enhancing the algorithm's reliability in the detection of remote homology.
When entropy filtering is used, the best-scoring alignments are comparable to those obtained by the CE structural alignment algorithm.
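The quantity involved can be sketched as follows: the Shannon entropy of each profile column, averaged over the profile. How the entropy is thresholded or combined with the alignment score is not stated in the abstract, so only the entropy calculation is shown, and the function names are hypothetical.

```python
import math


def column_entropy(freqs):
    """Shannon entropy, in bits, of one profile column.

    `freqs` holds non-negative residue counts or frequencies;
    they are normalized here so they need not sum to one.
    """
    total = sum(freqs)
    entropy = 0.0
    for f in freqs:
        if f > 0:
            p = f / total
            entropy -= p * math.log2(p)
    return entropy


def mean_profile_entropy(profile):
    """Average column entropy over a profile (a list of columns)."""
    return sum(column_entropy(col) for col in profile) / len(profile)
```

A fully conserved column has entropy 0, while a column with 20 equally likely residues has entropy log2(20), about 4.32 bits, so low-information profiles stand out by their high mean entropy.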
 Long abstract
 
 
 
 |  
	| Structural Biology
 |  
| J-1  Analyzing Protein Structure-Function Correlations Using Statistical Geometry Majid Masso1, Iosif Vaisman2
 1mmasso@gmu.edu, George Mason University; 2ivaisman@gmu.edu, George Mason University
 Correspondence address: mmasso@gmu.edu
 
 
An approach based on computational geometry is used to elucidate structural changes in an HIV-1 protease monomer caused by dimerization and inhibitor binding. A comprehensive mutational analysis of HIV-1 protease is also performed using this method and reveals a strong structure-function correlation.
 Long abstract
 
 
 
 |  
| J-2  http://globplot.embl.de - Exploring protein sequences for globularity and disorder Rune Linding1, Rob Russell2, Victor Neduva, Toby Gibson
 1linding@embl.de, EMBL; 2russell@embl.de, EMBL
 Correspondence address: linding@embl.de
 
 
We present a new tool for the discovery of unstructured, or disordered, regions within proteins. GlobPlot (http://globplot.embl.de) is a web service that allows the user to plot the tendency within the query protein for order/globularity and disorder.
 Long abstract
 
 
 
 |  
| J-3  Protein unfolding governed by geometric and steric principles Howard J Feldman1, Christopher WV Hogue2
 1feldman@mshri.on.ca, Samuel Lunenfeld Research Institute, Mount Sinai Hospital; 2hogue@mshri.on.ca, Samuel Lunenfeld Research Institute, Mount Sinai Hospital
 Correspondence address: feldman@mshri.on.ca
 
 
Using a novel physics-based approach, complete protein unfolding pathways for five distinct folds were computed and compared with published molecular dynamics and in vitro experiments.  Agreement was good and computation was reduced by three orders of magnitude, suggesting that unfolding pathways of small globular proteins are largely constrained by geometry and sterics.
 Long abstract
 
 
 
 |  
| J-4  Semi-Automated Homology Modeling Using A Modified TRADES Algorithm Michel Dumontier1, Howard J. Feldman2, Christopher W.V. Hogue
 1micheld@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave.,Toronto, Ontario, Canada M5G 1X5; 2feldman@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave.,Toronto, Ontario, Canada M5G 1X5
 Correspondence address: micheld@mshri.on.ca
 
 
We present a modified version of the TRADES algorithm for homology modeling of protein sequences that have only weak similarity to an experimentally determined protein structure. It generates realistic, all-atom models of non-idealized geometry by incorporating backbone-dependent rotamers; reasonable bond lengths, bond angles, and torsion angles; and minimized electrostatic and van der Waals forces.
 Long abstract
 
 
 
 |  
| J-5  Role of Long-range Interactions in the Transition and Folded Native states of Two-state Proteins M. Michael Gromiha1, S. Selvaraj2
 1michael-gromiha@aist.go.jp, CBRC, AIST, Tokyo, Japan; 2 Department of Physics, Bharathidasan University, Tiruchirapalli, India
 Correspondence address: michael-gromiha@aist.go.jp
 
 
We have proposed a novel parameter for a protein, long-range order (LRO), derived from the long-range contacts in its structure. LRO correlates very well with experimental protein folding rates. The short- and medium-range non-bonded energy, long-range contacts, and helical/strand tendency are the major determinants of transition state structures of two-state proteins.
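A contact-based parameter of this kind can be sketched as below. The abstract does not give the formula, so everything specific here is an assumption for illustration: contacts are counted between C-alpha atoms within 8 Å, only pairs more than 12 residues apart in sequence count as long-range, each pair is counted once, and the total is normalized by chain length.

```python
import math


def long_range_order(ca_coords, min_separation=12, cutoff=8.0):
    """Number of long-range contacts per residue.

    ca_coords: list of (x, y, z) C-alpha coordinates in Angstroms.
    min_separation and cutoff are assumed values, not taken from the poster.
    """
    n = len(ca_coords)
    contacts = 0
    for i in range(n):
        # Only pairs with |i - j| > min_separation are long-range.
        for j in range(i + min_separation + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                contacts += 1
    return contacts / n
```

A fully extended chain gives a value of 0, while compact folds that bring sequence-distant residues together give larger values.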
 Long abstract
 
 
 
 |  
| J-6  Homology modeling of dihydropteroate synthase from Plasmodium falciparum T de Beer1, F Joubert2, AI Louw
 1tjaart.de.beer@tuks.co.za, University of Pretoria; 2fjoubert@postino.up.ac.za, University of Pretoria
 Correspondence address: tjaart.de.beer@tuks.co.za
 
 
A homology model was constructed for the DHPS-PPPK bifunctional enzyme of Plasmodium falciparum. This enzyme plays a vital part in folate synthesis and is targeted by current, failing therapies. The LUDI/ACD and NCI databases were scanned against the model to identify new potential inhibitors.
 Long abstract
 
 
 
 |  
| J-7  3D circle fitting of leucine-rich repeat proteins Purevjav Enkhbayar1, Norio Matsushima2, Mitsuru Osaki
 1penkh@chem.agr.hokudai.ac.jp, Hokkaido University; 2matusima@sapmed.ac.jp, Sapporo Medical University
 Correspondence address: penkh@chem.agr.hokudai.ac.jp
 
 
Three-dimensional circle fitting using atomic coordinates was performed on all known structures of leucine-rich repeat (LRR)-containing proteins. The results indicate that there is a regular relationship between the radius of the LRR arc and the rotation angle about the central axis of the arc per repeating unit.
 Long abstract
 
 
 
 |  
| J-8  Protein Folding Simulations by Combining Tabu Search with Genetic Algorithms Based on HP Model Tianzi Jiang1, Qinghua Cui2, Guihua Shi, Songde Ma
 1jiangtz@nlpr.ia.ac.cn, Chinese Academy of Sciences; 2qhcui@nlpr.ia.ac.cn, Chinese Academy of Sciences
 Correspondence address: jiangtz@nlpr.ia.ac.cn
 
 
A hybrid algorithm combining a genetic algorithm and tabu search is presented. The hybrid algorithm can be successfully applied to protein folding based on a hydrophobic-hydrophilic (HP) model. A protein structure database is also created. The results indicate that in all cases our method works better than a genetic algorithm or tabu search alone.
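The objective that such a search minimizes can be sketched for the common HP lattice formulation. The sketch assumes a 2D square lattice and an energy of -1 for every pair of hydrophobic residues that are lattice neighbours but not adjacent in the chain; the abstract does not state which lattice or energy function was used.

```python
def hp_energy(sequence, path):
    """Energy of an HP-model conformation on a 2D square lattice.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    path: list of (x, y) lattice points, one per residue, in chain order.
    """
    if len(sequence) != len(path):
        raise ValueError("sequence and path must have the same length")
    if len(set(path)) != len(path):
        raise ValueError("path must be self-avoiding")
    for (x1, y1), (x2, y2) in zip(path, path[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    position = {point: i for i, point in enumerate(path)}
    energy = 0
    for i, (x, y) in enumerate(path):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts each neighbouring pair once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy
```

A genetic algorithm or tabu search would propose candidate paths and use a function like this as the fitness to minimize.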
 Long abstract
 
 
 
 |  
| J-9  Relationship of the enzyme functional classes with the statistical attributes of their secondary structures Rekha Iyer1, Sudhir Kumar2
 1rekha_siyer@yahoo.com, Department of Biology, Arizona State University; 2s.kumar@asu.edu, Center for Evolutionary Functional Genomics, Arizona Biodesign Institute, and the Department of Biology, Arizona State University
 Correspondence address: rekha_siyer@yahoo.com
 
 
In order to explore the relationship of protein function with secondary structural attributes, we compared simple statistical attributes of secondary structural elements in three major enzyme classes. We tested the usefulness of these inferences in predicting the enzyme functional class using neural networks. The strategy presented may help facilitate broad functional annotation of unknown proteins.
 Long abstract
 
 
 
 |  
| J-10  The analysis and prediction of protein-protein interacting sites. Asako Koike1, Toshihisa Takagi2
 1akoike@ims.u-tokyo.ac.jp, Human Genome Center, The Institute of Medical Science, Univ. of Tokyo; 2takagi@ims.u-tokyo.ac.jp, Human Genome Center, The Institute of Medical Science, Univ. of Tokyo
 Correspondence address: akoike@ims.u-tokyo.ac.jp
 
 
We developed a prediction method for protein interaction sites using sequence profiles and accessible surface area of neighboring residues, surface patches, other physicochemical characteristics, and support vector machines. The relationship between the prediction accuracy and the characteristic of protein-protein interaction sites is discussed.
 Long abstract
 
 
 
 |  
| J-11  Incorporating Sequence and Biochemical Information in TOPS models - For Biologically Significant Pattern Matching and Pattern Discovery in Protein Mallika Veeramalai1, David Gilbert2, David R Westhead
 1mallika@dcs.gla.ac.uk, Bioinformatics Research Centre, Dept. of Computing Science, University of Glasgow; 2drg@dcs.gla.ac.uk,
 Correspondence address: mallika@dcs.gla.ac.uk
 
 
Incorporating sequence and biochemical features in TOPS (Topological Models of Protein Structures) is significant for pattern matching and pattern discovery in protein structures. Interesting results would be valuable for efforts to predict protein structure and function from sequences. These problems remain key challenges of direct relevance to projects in structural and functional genomics. TOPS is available at http://www.tops.leeds.ac.uk
 Long abstract
 
 
 
 |  
| J-12  A bioinformatic toolbox for postprocessing of MASCOT results and its application to the proteome of Halobacterium salinarum Carolina Garcia-Rizo1, Cristian Klein2, Pfeiffer, Siedler, Oesterhelt
 1rizo@biochem.mpg.de, Max-Planck Institute Biochemistry; 2klein@biochem.mpg.de, Max-Planck Institute Biochemistry
 Correspondence address: rizo@biochem.mpg.de
 
 
We present a bioinformatic toolbox for postprocessing of proteomic data obtained by peptide fingerprint analysis. This toolbox (i) increases the reliability of protein identifications, (ii) detects additional annotation information and (iii) corrects or validates start codon assignments by gene finders. The toolbox was developed for and applied to the proteome of Halobacterium salinarum strain R1 (http://www.halolex.mpg.d).
 Long abstract
 
 
 
 |  
| J-13  Theoretical prediction of the feasibility of identifying membrane proteins by MALDI-TOF Carolina Garcia-Rizo1, Cristian Klein2, Pfeiffer, Siedler, Oesterhelt
 1rizo@biochem.mpg.de, Max-Planck Institute Biochemistry; 2klein@biochem.mpg.de, Max-Planck Institute Biochemistry
 Correspondence address: rizo@biochem.mpg.de
 
 
In the set of proteins of H. salinarum identified by MALDI-TOF peptide fingerprints, membrane proteins are severely underrepresented. Besides the technical problems encountered in their analysis by 2D gel electrophoresis, an ‘inherent data analysis problem’ is found by statistical analysis. The degree to which this effect aggravates problems in detection ratio of membrane proteins varies from organism to organism.
 Long abstract
 
 
 
 |  
| J-14  Correlated errors of neural network predictions improve fold recognition. Dariusz Przybylski1, Burkhard Rost2
 1dudek@cubic.bioc.columbia.edu, Columbia University; 2rost@columbia.edu, Columbia University
 Correspondence address: dudek@cubic.bioc.columbia.edu
 
 
We present a fold recognition method that uses predicted secondary structure and solvent accessibility in a way that significantly improves both specificity and sensitivity of fold assignments compared to PSI-BLAST. The method can readily be used for high quality/throughput database annotations.
 Long abstract
 
 
 
 |  
| J-15  Protein class recognition with neural networks Vadim Valuev1
 1valuev@bionet.nsc.ru, Institute of Cytology and Genetics
 Correspondence address: gease@mail.ru
 
 
There exist several classes of protein structure that are determined by the predominance of one element of secondary structure. Their recognition, starting from amino acid composition, gives some insight into the nature of this classification.
Class recognition methods are built by means of neural networks. The accuracy of recognition has reached 75%.
 Long abstract
 
 
 
 |  
| J-16  Quantifying similarities in proteins using features based on hydropathy distribution along the protein sequence Josef Panek1
 1j.panek@imb.uq.edu.au, University of Queensland
 Correspondence address: j.panek@imb.uq.edu.au
 
 
A computational approach is presented to explore relationships between the functional specificity of proteins and the hydropathy distribution in proteins. The approach uses features based on the hydrophobicity of amino acids to model the hydropathy distribution in proteins. A feature space is employed to identify functional protein families and the features that are specific to the families.
 Long abstract
 
 
 
 |  
| J-17  Protein structure comparison based on profiles of topological motifs Juris Viksna1, David Gilbert2, Gilleain Torrance
 1jviksna@cclu.lv, Institute of Mathematics and Computer Science, University of Latvia; 2drg@brc.dcs.gla.ac.uk, Bioinformatics Research Centre, Department of Computing Science, University of Glasgow
 Correspondence address: drg@brc.dcs.gla.ac.uk
 
 
We present a new approach to protein structure comparison using the existing TOPS database, which is based on graph representations. Instead of comparisons based on single patterns, we use profiles of patterns, which allows "negative" information to be captured indirectly. This leads to a significant increase in prediction accuracy.
 Long abstract
 
 
 
 |  
| J-18  Into protein universe - a global representation of the protein fold space Jingtong Hou1, Gregory E. Sims2, Chao Zhang, Sung-Hou Kim
 1JTHou@lbl.gov, UC Berkeley; 2gsims1997@yahoo.com, UC Berkeley
 Correspondence address: JTHOU@LBL.GOV
 
 
A global view of the “protein structure universe” was constructed. Such a representation reveals a high-level organization of the fold space that is intuitively interpretable, offering an interesting perspective on both the demography and the evolution of protein structures. The protein fold space is available at  http://pro.lbl.gov/~jingtong/foldspace.
 Long abstract
 
 
 
 |  
| J-19  IDPharmo:  A Virtual High Throughput Screening System Based on Structural Biology and Cheminformatics. Jeong Hyeok Yoon1, Jee Young Lee2, Won Seok Oh, Doo Ho Cho, Jae Min Shin
 1yoon@idrtech.com, IDRTech. Inc; 2jyoung@idrtech.com, IDRTech. Inc
 Correspondence address: yoon@idrtech.com
 
 
IDPharmo™ integrates programs that discover potential lead compounds in silico using protein structures and a 3D chemical database.  This tool was evaluated for the discovery of potential lead compounds for various disease targets such as HIV-RT, PDE5, HCV pol and DPP IV. We obtained new chemical entities with biological activities ranging from IC50 50 µM to 1 µM.
 Long abstract
 
 
 
 |  
| J-20  Differences in dynamics of dimeric and monomeric human prion protein revealed by molecular dynamics simulations Chie Motono1, Masakazu Sekijima2, Satoshi Yamasaki, Kiyotoshi Kaneko, and Yutaka Akiyama
 1c-motono@aist.go.jp, Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology; 2sekijima@cbrc.jp, Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology
 Correspondence address: c-motono@aist.go.jp
 
 
We performed molecular dynamics simulations on monomeric and dimeric forms of human prion protein (HuPrP) under various conditions (at 300 K, at 500 K, at acidic pH, or with the D178N mutation) for 10 ns to investigate the differences in the dynamics of each form. Most differences resulted from additional inter-subunit interactions of dimeric HuPrP.
 Long abstract
 
 
 
 |  
| J-21  Database of Three-Dimensional Exon Structures, SEDB Chesley Leslin1, Valentin Ilyin2, Alex Abyzov, Grigory Makarevich
 1chesley.leslin@verizon.net, Theoretical Molecular Biology and Bioinformatics Lab, Northeastern University, Boston, Massachusetts; 2valentin.ilyin@verizon.net, Theoretical Molecular Biology and Bioinformatics Lab, Northeastern University, Boston, Massachusetts
 Correspondence address: chesley.leslin@verizon.net
 
 
Expeditious mapping of exon borders and intron phase data onto structurally similar proteins presents a novel approach to studying protein structure evolution. We present SEDB, which allows researchers to examine and quickly identify where exon/intron boundaries are located and how these borders have possibly shaped current protein structure. http://glinka.bio.neu.edu/~cleslin/SEDB/SEDB.html
 Long abstract
 
 
 
 |  
| J-22  Contribution of Interhelical Weak Interactions to the Regulation of Protein-Gated Electron Transfer in the Membrane Milieu Ilan Samish1, Haim J. Wolfson2, Avigdor Scherz
 1ilan.samish@weizmann.ac.il, Weizmann Institute of Science; 2, Tel Aviv University
 Correspondence address: ilan.samish@weizmann.ac.il
 
 
The mechanism of protein-gated electron transfer between two quinones of photosystem II was investigated. Sequence and structural conservation of a high-packing motif including an intersubunit H-bond, combined with in silico and in vivo combinatorial mutagenesis and biophysical characterization of the H-bond donor, suggests that reversible dissociation of the bond regulates the gating.
 Long abstract
 
 
 
 |  
| J-23  Towards modulating protein-protein interactions: Clustering protein surfaces to identify biologically-relevant structural space to focus molecular design Stephen Long1, Mark Smythe2, Peter Adams, Darryn Bryant and Tran Trung Tran
 1sml@maths.uq.edu.au, School of Physical Sciences, Institute for Molecular Biosciences, The University of Queensland; 2M.Smythe@protagonist.com.au, Institute for Molecular Bioscience, The University of Queensland and Protagonist Pty. Ltd.
 Correspondence address: sml@maths.uq.edu.au
 
 
Identifying small molecules that modulate protein-protein interactions
continues to be a major challenge for drug discovery.
From a database of homologous protein-protein interactions,
datasets were extracted
representing the interaction region of pairs of proteins of these
complexes. The aim of this research is to cluster features
of each of these datasets.
 Long abstract
 
 
 
 |  
| J-24  Comparing Protein Structures with Constraints Su-Hyun Lee1, Jin-Hong Kim2, Geon-Tae Ahn, and Myung-Joon Lee
 1suhyun@sarim.changwon.ac.kr, Changwon National University, South Korea; 2avenue@ulsan.ac.kr, University of Ulsan, South Korea
 Correspondence address: suhyun@sarim.changwon.ac.kr
 
 
S4E, a protein structure comparison system using constraint technology, searches for
common substructures of secondary structure elements between two proteins given
in PDB format. For fast comparison, we developed an efficient algorithm for
constructing the compatibility graphs and applying constraint programming to
finding maximal common subgraphs.
 Long abstract
 
 
 
 |  
| J-25  Molecular basis of ligand recognition in the human glucocorticoid receptor Johannes R.G. von Langen1, Stephan Diekmann2, Alexander Hillisch
 1langen@imb-jena.de, IMB-Jena; 2diekmann@imb-jena.de, IMB-Jena
 Correspondence address: langen@imb-jena.de
 
 
The glucocorticoid receptor selectively recognises cortisol with high affinity. We built a homology model of this receptor on the basis of the progesterone receptor X-ray structure and simulated the binding of different steroids (estradiol, progesterone, testosterone, aldosterone and cortisol). Using molecular dynamics simulations and free energy calculations, we were able to reveal the molecular basis of ligand recognition.
 Long abstract
 
 
 
 |  
| J-26  Prediction of disulfide connectivity patterns in proteins Shih-Chieh Chen1, Chi-Hung Tsai2, Huai-Kuang Tsai, Cheng-Yan Kao
 1r90039@csie.ntu.edu.tw, Bioinfo Lab., Department of CSIE, National Taiwan University; 2d90008@csie.ntu.edu.tw, Bioinfo Lab., Department of CSIE, National Taiwan University
 Correspondence address: cykao@csie.ntu.edu.tw
 
 
We propose an approach to predict protein disulfide connectivity directly from the sequence. The proposed approach trains an SVR model and predicts the disulfide connectivity with Gabow’s algorithm. The experiments showed that the proposed method has an accuracy of 62%, which is promising for locating the disulfide bridges in proteins.
 Long abstract
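The pairing step in J-26 can be illustrated with a minimal sketch. Assuming each candidate cysteine pair has already received a bonding score (in the abstract these come from the SVR model), the connectivity pattern is the perfect pairing with the highest total score. The sketch below finds it by exhaustive enumeration, which is a simple stand-in for Gabow's matching algorithm and is only practical for a handful of cysteines. The residue positions and scores are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Pair = Tuple[int, int]


def best_pairing(cysteines: List[int],
                 score: Dict[FrozenSet[int], float]) -> Tuple[float, List[Pair]]:
    """Return (total score, pairs) of the highest-scoring perfect pairing.

    Exhaustive search: pair the first cysteine with every possible partner
    and recurse on the rest. Not Gabow's algorithm, just the same objective.
    """
    if len(cysteines) % 2 != 0:
        raise ValueError("an even number of cysteines is required")
    if not cysteines:
        return 0.0, []
    first, rest = cysteines[0], cysteines[1:]
    best_total, best_pairs = float("-inf"), []
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        sub_total, sub_pairs = best_pairing(remaining, score)
        total = score[frozenset((first, partner))] + sub_total
        if total > best_total:
            best_total, best_pairs = total, [(first, partner)] + sub_pairs
    return best_total, best_pairs


# Hypothetical cysteine positions and pairwise bonding scores.
positions = [11, 27, 45, 63]
scores = {
    frozenset((11, 27)): 0.2, frozenset((11, 45)): 0.9, frozenset((11, 63)): 0.1,
    frozenset((27, 45)): 0.3, frozenset((27, 63)): 0.8, frozenset((45, 63)): 0.4,
}
total, pairs = best_pairing(positions, scores)
print(pairs)
```

For proteins with many cysteines the number of pairings grows factorially, which is why a polynomial-time matching algorithm is used in practice.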
 
 
 
 |  
| J-27  Prediction of RNA Secondary Structures with XNAfold Yanming Zou1, Alan J. Hillier2, P. Scott Chandry
 1y.zou@pgrad.unimelb.edu.au, University of Melbourne; 2Alan.Hillier@foodscience.afisc.csiro.au, University of Melbourne
 Correspondence address: y.zou@pgrad.unimelb.edu.au
 
 
XNAfold is a JAVA-C hybrid program that takes an RNA sequence as input and predicts RNA secondary structure. The minimum free energy structure predicted by XNAfold matched the experimentally determined structure for 77 out of 133 different RNA molecules. XNAfold is freely available at http://www.student.unimelb.edu.au/~yanmz/index.html
 Long abstract
 
 
 
 |  
| J-28  Inside the beta sheet Charlotte Deane1
 1deane@stats.ox.ac.uk, Oxford University
 Correspondence address: deane@stats.ox.ac.uk
 
 
Beta sheets are one of the two common repeating elements found in protein structures. Despite their importance in structure they are not well or fully understood. 
Here we investigate the properties of beta sheets in order to better predict protein structure/folding and understand amyloid (and general aggregate) formation.
 Long abstract
 
 
 
 |  
| J-29  Prevalence and Molecular Characterisation of Campylobacter spp. from Free-Living Animals and Dairy Cattle Bijay Adhikari1, Joanne H Connolly2, Per Madie, Peter R Davies
 1bijayadhikari@hotmail.com, Department of Livestock Services, Kathmandu, Nepal; 2J.H.Connolly@massey.ac.nz, Institute of Veterinary, Animal and Biomedical Sciences, Massey University, Palmerston North, New Zealand
 Correspondence address: bijayadhikari@hotmail.com
 
 
Of the 290 samples collected, only Campylobacter jejuni was isolated. The highest isolation rate was from dairy cows (54%), followed by urban sparrows (40%), farm sparrows (38%), rodents (11%) and flies (9%). Molecular characterisation of campylobacters with PFGE provided 22 restriction patterns, of which 7 were common to more than one source.
 Long abstract
 
 
 
 |  
| J-30  Modelling the main cysteine proteinase of the SARS virus Shoba Ranganathan1, Victor Joo Chuan Tong2
 1shoba@bic.nus.edu.sg, National University of Singapore; 2victor@bic.nus.edu.sg, National University of Singapore
 Correspondence address: shoba@bic.nus.edu.sg
 
 
We present a structural model of the SARS main cysteine proteinase (MPro), based on the X-ray structure of the transmissible gastroenteritis (corona)virus. From the SARS genome, peptides representing real MPro substrates have been docked into the active site. MPro, with its docked ligands, provides important clues for designing proteinase inhibitors.
 Long abstract
 
 
 
 |  
| J-31  A New Efficient Conformation Search Method for ab initio Protein Folding Jae-Min Shin1, Dai Sig Im2, Byungkook Lee
 1jms@idrtech.com, IDR Tech. Inc.; 2idscom@idrtech.com, IDR Tech. Inc.
 Correspondence address: jms@idrtech.com
 
 
The Window Growth Evolutionary Algorithm (WGEA) has been developed for protein 3D structure prediction. In WGEA, locally favored structures, populated during the initial search stages, are likely to survive and produce more offspring structures, which lead to the final folded protein structures. By using RMS as a scoring function, WGEA successfully refolds many small proteins.
 Long abstract
 
 
 
 |  
| J-32  The Evolutionary Search for an RNA Common-Structural Grammar Jin-Wu Nam1, Je-Gun Joung2, Byoung-Tak Zhang
 1jwnam@bi.snu.ac.kr, Graduate Program In Bioinformatics, Seoul National University; 2jgjoung@bi.snu.ac.kr, Graduate Program In Bioinformatics, Seoul National University
 Correspondence address: jwnam@bi.snu.ac.kr
 
 
We developed a system that automatically learns the common grammar of RNA secondary structure. In this research, genetic programming has been applied to evolve function trees that can be converted into RNA structural grammars. We show the results of learning the common-structural grammar of tRNA and RNA pseudoknots.
 Long abstract
 
 
 
 |  
	| Systems Biology
 |  
| K-1  Regulation of Cytokines and G-protein gene expression by Cholera toxin Zafar Nawaz1, Bukhtiar H Shah2
 1zafarn1@hotmail.com, University of Karachi; 2, Aga Khan University
 Correspondence address: zafarn1@hotmail.com
 
 
The molecular mechanism of cholera toxin (CTX) action on Gαs and Gαq, inflammatory cytokines and nitric oxide gene regulation was studied in mouse intestinal epithelial cells.
 Long abstract
 
 
 
 |  
| K-2  Shc-dependent pathway is redundant but dominant in MAPK cascade activation by EGF receptors: a computational result Yunchen Gong1, Xin Zhao2
 1ygong@po-box.mcgill.ca, McGill University; 2zhao@macdonald.mcgill.ca, McGill University
 Correspondence address: ygong@po-box.mcgill.ca
 
 
MAPK is activated by EGFR via Shc-dependent and Shc-independent pathways. Exploring a mathematical model revealed redundancy and dominance of the Shc-dependent pathway. Its dominance results from the majority consumption of the common precursor. Results imply that organisms may use the longer pathway rather than the shorter alternative pathway for signal transduction.
 Long abstract
 
 
 
 |  
| K-3  The Biology of Ageing e-Science Integration and Simulation (BASIS) System Darren Wilkinson1, Tom Kirkwood2, Richard Boys, Colin Gillespie, Carole Proctor, Daryl Shanley
 1d.j.wilkinson@ncl.ac.uk, University of Newcastle upon Tyne, UK; 2tom.kirkwood@ncl.ac.uk, Newcastle upon Tyne, UK
 Correspondence address: d.j.wilkinson@ncl.ac.uk
 
 
The BASIS project (www.basis.ncl.ac.uk) will significantly extend the scope of current
integrative models of the ageing process and it will make these models
widely accessible. Accessibility will be achieved by developing a GRID
node where investigators can explore models and run simulations for
themselves.
 Long abstract
 
 
 
 |  
| K-4  Protein Feature Based Identification of Cell Cycle Regulated Proteins Ulrik de Lichtenberg1, Thomas Skøt Jensen2, Lars Juhl Jensen, Anders Fausbøll, Søren Brunak
 1ulrik@cbs.dtu.dk, Center for Biological Sequence Analysis, BioCentrum-DTU, The Technical University of Denmark; 2skot@cbs.dtu.dk, Center for Biological Sequence Analysis, BioCentrum-DTU, The Technical University of Denmark
 Correspondence address: ulrik@cbs.dtu.dk
 
 
DNA microarrays have been used extensively to identify cell cycle regulated genes; however, the overlap in the sets of genes identified by different groups is surprisingly small. We show that cell cycle regulated proteins can be identified via certain features, including protein phosphorylation, glycosylation, subcellular location and instability/degradation.
 Long abstract
 
 
 
 |  
| K-5  Gene-O-Matic: Regulatory Network Simulation in Multicellular Organisms Ute Platzer1, Rolf Lohaus2, Hans-Peter Meinzer
 1u.platzer@dkfz.de, German Cancer Research Centre;
 Correspondence address: u.platzer@dkfz.de
 
 
Gene-O-Matic is a graphical tool for the construction and simulation of regulatory networks in multicellular organisms. One developmental process that has been simulated successfully is the embryonic development of the worm Caenorhabditis elegans. Gene-O-Matic is available on request from the authors, or by e-mail to cello@dkfz.de. The project's website can be found at http://mbi.dkfz-heidelberg.de/projects/cellsim/.
 Long abstract
 
 
 
 |  
| K-6  Theoretical analysis of mutations and evolution of gene networks Alexander V. Ratushny1, Vitaly A. Likhoshvai2, Yuri G. Matushkin, Nikolay A. Kolchanov
 1ratushny@bionet.nsc.ru, Institute of Cytology and Genetics, SBRAS; 2likho@bionet.nsc.ru, Institute of Cytology and Genetics, SBRAS
 Correspondence address: ratushny@bionet.nsc.ru
 
 
A mathematical model simulating cholesterol biosynthesis in a cell and
its exchange with blood plasma cholesterol was used for computer
analysis of a mutational portrait and evolution of this gene
network. The graphic interface of the gene network and its
computer dynamic model can be accessed at http://wwwmgs.bionet.nsc.ru/mgs/gnw/gn_model/.
 Long abstract
 
 
 
 |  
| K-7  Discovering novel regulatory controls of budding yeast cell cycle by Reverse Engineering and Bayesian Network Modeling Yan Sun1, Pawan Dhar2
 1sunyan@bii.a-star.edu.sg, Bioinformatics Institute, 21 Heng Mui Keng Terrace, Singapore; 2pk@bii.a-star.edu.sg, Bioinformatics Institute, 21 Heng Mui Keng Terrace, Singapore
 Correspondence address: sunyan@bii.a-star.edu.sg
 
 
In this study, a Reverse Engineering and Bayesian Network Modeling (REBNM) approach has been used for inferring the cell cycle regulatory network from high-throughput gene expression data. The REBNM analyzer uses prior biological knowledge and a supervised classification scheme to group genes functionally for downstream processing by Bayesian network modeling.
 Long abstract
 
 
 
 |  
| K-8  Functional topology in a network of protein interactions Natasa Przulj1, Dennis Wigle2, Igor Jurisica
 1natasha@cs.toronto.edu, Department of Computer Science, University of Toronto, Canada; 2, Department of Surgery, University of Toronto, Canada
 Correspondence address: natasha@cs.toronto.edu
 
 
We systematically analyzed the S. cerevisiae protein-protein interaction
network using graph theoretic tools to determine structure-function
relationships. The constructed computational models describe and predict the
properties of functional groups, protein complexes, signaling pathways,
lethal and viable mutations, and proteins participating in genetic
interactions. Our models offer insight into the complex wiring underlying
cellular function.
 Long abstract
 
 
 
 |  
| K-9  Cellware: A Modeling and Simulation tool for Large Scale Biological Systems Sandeep Somani1, Chee Meng2, Li Ye, Anand Sairam, Zhu Hao, Mandar Chitre, Pawan Dhar
 1ssomani@bii.a-star.edu.sg, Bioinformatics Institute; 2cheemeng@bii.a-star.edu.sg, Bioinformatics Institute
 Correspondence address: ssomani@bii.a-star.edu.sg
 
 
A software tool for in-silico modeling and simulation of large scale biological networks is being developed. The emphasis is on using distributed and grid computing technologies, together with novel algorithms, to meet the computational challenge of simulating large scale networks.
 Long abstract
 
 
 
 |  
| K-10  Parameter Estimation for Biochemical Pathways using Swarm Algorithm Tan Chee Meng1, Sandeep Somani2
 1cheemeng@bii.a-star.edu.sg, Bioinformatics Institute; 2ssomani@bii.a-star.edu.sg, Bioinformatics Institute
 Correspondence address: cheemeng@bii.a-star.edu.sg
 
 
Because biological systems exhibit high robustness and operate over a broad range of kinetic parameters, it is important that optimization techniques capture this property of cells. In this study, we present an optimization technique called SWARM that is capable of detecting multiple optimal solutions.
 Long abstract
 
 
 
 |  
| K-11  A Novel Bayesian Network Model for the Study of Genetic Regulatory Networks Y. Zeng1, J. Garcia-Frias2
 1zeng@eecis.udel.edu, University of Delaware; 2jgarcia@eecis.udel.edu, University of Delaware
 Correspondence address: zeng@eecis.udel.edu
 
 
We propose the use of Bayesian networks (BNs) with continuous valued variables, modeled by Student distributions, to simulate the cellular regulatory mechanism. Experimental results show the robustness of the proposed approach, which outperforms existing schemes based on BNs with either discrete variables or continuous variables with a Gaussian distribution.
 Long abstract
 
 
 
 |  
| K-12  Population Genetic Structure in the South Pacific: Prospects for Identifying Disease Susceptibility Genes in New Zealanders with Polynesian Ancestry Rodney Lea1, Geoffrey Chambers2
 1Rod.Lea@vuw.ac.nz, Institute of Molecular Systematics; 2Geoff.Chambers@vuw.ac.nz, Institute of Molecular Systematics
 Correspondence address: Rod.Lea@vuw.ac.nz
 
 
Understanding population genetic structure has important implications for identifying disease susceptibility genes. Our study will utilise multi-locus genotype data from unlinked microsatellite markers to describe the genetic structure in different Polynesian (Maori, Samoan, Tongan), European and admixed populations.
 
 Long abstract
 
 
 
 |  
| K-13  Data Preprocessing Facilitates Metabolic Pathway Identification from Time Profiles Eberhard O. Voit1, Jonas S. Almeida2
 1VoitEO@MUSC.edu, Medical University of S. Carolina; 2almeidaj@musc.edu, Medical University of S. Carolina
 Correspondence address: VoitEO@MUSC.edu
 
 
The identification of metabolic pathway structure is a challenging problem that must be solved for the analysis of metabolic time profiles.  Twofold data preprocessing significantly speeds up this identification.  First, we model and smooth the data with an artificial neural network, and second, we replace differentials with estimated slopes.
 Long abstract
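The second preprocessing step in K-13 can be sketched as follows. One reading of "replacing differentials with estimated slopes" is that, once the time profile has been smoothed, the derivative at each time point is estimated numerically, so that each differential equation dX/dt = f(X) becomes a set of algebraic equations slope(t_k) ≈ f(X(t_k)) that can be fitted without integrating the system. The sketch below assumes already-smoothed data and uses plain finite differences; the function name and the sample profile are hypothetical, and the neural-network smoothing step is not shown.

```python
from typing import List, Sequence


def estimate_slopes(times: Sequence[float], values: Sequence[float]) -> List[float]:
    """Estimate dX/dt at every time point of a (smoothed) time profile.

    Interior points use a central difference; the two end points fall back
    to a one-sided difference.
    """
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two points and equal-length inputs")
    slopes = []
    for i in range(n):
        lo = max(i - 1, 0)        # left neighbour (or the point itself at the start)
        hi = min(i + 1, n - 1)    # right neighbour (or the point itself at the end)
        slopes.append((values[hi] - values[lo]) / (times[hi] - times[lo]))
    return slopes


# Hypothetical smoothed metabolite profile X(t) = t^2 sampled at t = 0, 1, 2, 3.
times = [0.0, 1.0, 2.0, 3.0]
profile = [0.0, 1.0, 4.0, 9.0]
slopes = estimate_slopes(times, profile)
print(slopes)
```

Finite differences amplify noise, which is presumably why smoothing comes first; slopes could equally be read off the smoothing function itself.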
 
 
 
 |  
| K-14  Towards more biological mutation operators in models of gene regulation James Watson1, Nicholas Geard2, Janet Wiles
 1jwatson@itee.uq.edu.au, University of Queensland; 2nic@itee.uq.edu.au, University of Queensland
 Correspondence address: jwatson@itee.uq.edu.au
 
 
Gene regulation is often studied through models of directed graphs.  Mutation operators applied to such networks impose limitations on how the models evolve.  A method to extract a regulation network from an artificial
nucleotide sequence is presented, and the impact of sequence-level
mutations on network-level structure is discussed.
 Long abstract
 
 
 
 |  
| K-15  BioPACS: BioPathway Automatic Convert System for Genomic Object Net Masao Nagasaki1, Atsushi Doi2, Hiroshi Matsuno, Satoru Miyano
 1masao@ims.u-tokyo.ac.jp, Human Genome Center, University of Tokyo; 2atsushi@ib.sci.yamaguchi-u.ac.jp, Faculty of Science, Yamaguchi University
 Correspondence address: matsuno@sci.yamaguchi-u.ac.jp
 
 
For the modeling and simulation of a biopathway, it is useful to select suitable information from public biopathway databases such as KEGG and BioCyc. We have developed a method to transform these pathway databases so that the converted biopathways can run on Genomic Object Net (http://www.GenomicObject.Net).
 Long abstract
 
 
 
 |  
| K-16  Models and Simulations in Systems Biology Joao Carlos Marques Magalhaes1, Cedric Gondro2
 1jcmm@bio.ufpr.br, Federal University of Parana; 2cgondro@pobox.une.edu.au, University of New England
 Correspondence address: genetics@sigex.com.br
 
 
Computational models that simulate complex adaptive systems offer an alternative where analytical handling is untenable. A relatively small set of objects and simple rules, computationally implemented via techniques such as genetic algorithms or genetic programming, can replicate such complexity. An example of this approach is available for download from http://www.sigex.com.br/genetics.
 Long abstract
 
 
 
 |  
| K-17  Stochastic Neural Network Models for Gene Regulatory Networks Tianhai Tian1, Kevin Burrage2
 1tian@maths.uq.edu.au, ACMC, University of Queensland; 2kb@maths.uq.edu.au, ACMC, University of Queensland
 Correspondence address: tian@maths.uq.edu.au
 
 
Stochastic models for studying genome dynamics are presented, obtained by introducing stochastic processes into neural network models. Poisson random variables are used to represent chance events in the processes of synthesis and degradation. Using an example network, we show how to study robustness and stability properties of gene expression patterns.
 Long abstract
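The use of Poisson random variables for synthesis and degradation in K-17 can be illustrated with a minimal sketch. It does not reproduce the authors' neural network model: it follows a single gene product whose synthesis and degradation counts in each time step are drawn from Poisson distributions. The rates, step size and function names are hypothetical, and the Poisson sampler is Knuth's multiplication method, which is adequate only for small means.

```python
import math
import random
from typing import List


def poisson(rng: random.Random, mean: float) -> int:
    """Draw a Poisson-distributed count (Knuth's method, small means only)."""
    if mean <= 0:
        return 0
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate(seed: int, steps: int, level: int = 0,
             synthesis_rate: float = 5.0, degradation_rate: float = 0.1,
             dt: float = 1.0) -> List[int]:
    """Follow the copy number of one gene product over fixed time steps."""
    rng = random.Random(seed)
    trajectory = [level]
    for _ in range(steps):
        made = poisson(rng, synthesis_rate * dt)
        # Degradation events scale with the current level and cannot exceed it.
        lost = min(level, poisson(rng, degradation_rate * level * dt))
        level = level + made - lost
        trajectory.append(level)
    return trajectory


trajectory = simulate(seed=1, steps=50)
print(trajectory[-5:])
```

Repeating the run with different seeds gives a spread of trajectories around the deterministic steady state (synthesis_rate / degradation_rate), which is one simple way to look at the robustness of an expression level.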
 
 
 
 |  
| K-18  Metabolic comparison of the in-silico phenotype-genotype relationship of Pseudomonas putida and Pseudomonas aeruginosa Vitor A.P. Martins dos Santos1, Miguel Godinho de Almeida2, Jeremy S. Edwards2, Kenneth N. Timmis1
 1vds@gbf.de, National Centre for Biotechnology Research; 2University of Delaware
 Correspondence address: vds@gbf.de
 
 
We report on an in-silico representation of Pseudomonas putida and Pseudomonas aeruginosa that describes their metabolic capacities within the scope of their environmental constraints. Using annotated genome sequence data, biochemical information and strain-specific knowledge, we analysed the cellular behaviour of these micro-organisms under a wide range of conditions relevant for both human health.
 Long abstract
 
 
 
 |  
| K-19  Functional Analysis of Mammalian Cell Cycle using A Computational Model of Hybrid Petri Net. Shuji Kotani1, Takashi Yoshioka2, Kaoru Takahashi, Akihiko Konagaya
 1shuji.kotani@nifty.ne.jp, RIKEN Genomic Sciences Center; 2yoshiokatks@nttdata.co.jp, NTT Data Corp
 Correspondence address: shuji.kotani@nifty.ne.jp
 
 
To describe the molecular mechanism of the mammalian cell cycle, we developed a new computational model by using Hybrid Petri Nets. Using the model, we predict how changes in gene products influence cell cycle progression. This model may be useful for exploring the relationship between gene functions and diseases.
 Long abstract
 
 
 
 |  
| K-20  CADLIVE for constructing a yeast cell cycle network Hiroyuki Kurata1
 1kurata@bse.kyutech.ac.jp, Kyushu Institute of Technology
 Correspondence address: kurata@bse.kyutech.ac.jp
 
 
CADLIVE is a powerful software suite with a GUI for constructing large-scale maps of complicated biochemical reaction networks. We constructed a biochemical map of the budding yeast cell cycle, which consists of 184 molecules and 152 reactions, and integrated postgenomic data into the map to predict novel pathways.
 Long abstract
 
 
 
 |  
| K-21  BSTLab: A Matlab Toolbox for Biochemical Systems Theory John Schwacke1, Eberhard O. Voit2
 1schwacke@musc.edu, Medical University of South Carolina; 2voiteo@musc.edu, Medical University of South Carolina
 Correspondence address: schwacke@musc.edu
 
 
To facilitate application of Biochemical Systems Theory (BST), we have begun development of BSTLab, a Matlab-based toolbox implementing functions common to BST-based studies. The toolbox automates common computations, permits expansion and customization, and includes functions needed to reformulate and transport models between Matlab and the Systems Biology Markup Language (SBML).
 Long abstract
 
 
 
 |  
| K-22  Comparative Metabolic Flux Analysis by MetaFluxNet Dong-Yup Lee1, Hongsoek Yun2, Sang-Yup Lee and Sunwon Park
 1dylee@pse.kaist.ac.kr, KAIST; 2hsyun@pse.kaist.ac.kr, KAIST
 Correspondence address: dylee@pse.kaist.ac.kr
 
 
MetaFluxNet is a program package for managing information on the metabolic reaction network and for quantitatively analyzing metabolic fluxes in an interactive and customized way. Using the feature of the comparative metabolic flux analysis supported in MetaFluxNet, one can design and evaluate various metabolically engineered in silico strains interactively. MetaFluxNet is available at http://mbel.kaist.ac.kr.
 Long abstract
 
 
 
 |  
| K-23  Ontologies in CellML: A Versatile Method to Describe Cellular Models Poul Nielsen1, Matt Halstead2, Autumn Cuellar, Michael Dunstan, David Bullivant, Peter Hunter
 1p.nielsen@auckland.ac.nz, University of Auckland; 2matt.halstead@auckland.ac.nz, University of Auckland
 Correspondence address: p.nielsen@auckland.ac.nz
 
 
CellML is an XML-based exchange language used to describe the underlying mathematics and topology of a wide variety of biological models. Knowledge implicitly associated with a model, however, is not normally included in the CellML representation. To address this problem, facilities for including ontologies have been added to CellML.
 Long abstract
 
 
 
 |  
| K-24  Bayesian inference for stochastic models of genetic networks Richard Boys1, Darren Wilkinson2, Tom Kirkwood, Wan Ng
 1richard.boys@ncl.ac.uk, University of Newcastle upon Tyne; 2d.j.wilkinson@ncl.ac.uk, University of Newcastle upon Tyne
 Correspondence address: richard.boys@ncl.ac.uk
 
 
This poster describes the detailed stochastic techniques used to
model regulatory networks, and the computational tools needed for
simulation and analysis. An overview is given of the modern
computationally intensive statistical techniques which can in
principle be used for carrying out Bayesian inference for the
parameters underlying these network models.
 Long abstract
 
 
 
 |  
| K-25  Is Segmentation a Robust Gene Network? Mark Reimers1
 1mark.reimers@cgb.ki.se, Karolinska
 Correspondence address: mark.reimers@cgb.ki.se
 
 
A major question in evolutionary developmental biology is whether conserved developmental networks are robust. We address this with a simulation study and an evolutionary comparison of gene networks.
 Long abstract
 
 
 
 |  
| K-26  On Metabolic Pathway Reconstruction from Gene Expression Data Cedric Gondro1, Brian P. Kinghorn2
 1genetics@sigex.com.br, University of New England; 2bkinghor@une.edu.au, University of New England
 Correspondence address: genetics@sigex.com.br
 
 
This work aims to infer metabolic pathways and other biological processes from data generated in kinetically simulated microarray experiments using evolutionary algorithms. A preliminary test has shown correct reconstruction of lac operon model parameters derived from simulated expression data collected following a perturbation in the level of lactose.
 Long abstract
 
 
 
 |  
| K-27  Modelling the Role of Small RNAs in Gene Regulation Nicholas Geard1, Janet Wiles2
 1nic@itee.uq.edu.au, The University of Queensland; 2j.wiles@itee.uq.edu.au, The University of Queensland
 Correspondence address: nic@itee.uq.edu.au
 
 
Small functional RNA molecules have been discovered to play an important role in the regulation of gene transcription.  The abstract model presented here uses a sequence-matching paradigm to generate regulatory networks that utilise multiple levels of transcriptional control to increase their computational power.  
 Long abstract
 
 
 
 |  
| K-28  High level properties of genetic regulatory network Kai Willadsen1, Janet Wiles2
 1kaiw@itee.uq.edu.au, University of Queensland; 2, University of Queensland
 Correspondence address: kaiw@itee.uq.edu.au
 
 
Abstract models of gene regulation date back to the development of the Random Boolean Network model in 1969. This class of models aims to investigate emergent properties of genetic regulatory networks with a view to better understanding high-level characteristics of the behaviours that these systems display.
 Long abstract
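As an illustration of the model class named in K-28, the following is a minimal sketch of a Random Boolean Network: each gene is on or off, reads a fixed number of randomly chosen input genes, and is updated synchronously through a random truth table. Because the state space is finite, every trajectory ends in a repeating cycle (an attractor), which is one of the emergent properties such models are used to study. The network size, connectivity and function names are hypothetical choices for the example, not taken from the poster.

```python
import random
from typing import List, Tuple

State = Tuple[int, ...]


def make_network(n_genes: int, k_inputs: int, seed: int):
    """Pick random input genes and a random truth table for every gene."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n_genes), k_inputs) for _ in range(n_genes)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)]
              for _ in range(n_genes)]
    return inputs, tables


def update(state: State, inputs: List[List[int]], tables: List[List[int]]) -> State:
    """Synchronously update all genes from the current state."""
    new_state = []
    for gene_inputs, table in zip(inputs, tables):
        index = 0
        for source in gene_inputs:
            index = (index << 1) | state[source]  # build the truth-table row
        new_state.append(table[index])
    return tuple(new_state)


def find_attractor(state: State, inputs, tables) -> Tuple[int, int]:
    """Return (transient length, cycle length) reached from a start state."""
    seen = {}
    step = 0
    while state not in seen:
        seen[state] = step
        state = update(state, inputs, tables)
        step += 1
    return seen[state], step - seen[state]


inputs, tables = make_network(n_genes=8, k_inputs=2, seed=0)
transient, period = find_attractor((0,) * 8, inputs, tables)
print(transient, period)
```

Sampling many start states and many random networks, and recording how attractor number and length vary with connectivity, is the kind of high-level analysis this class of models supports.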
 
 
 
 |  
| K-29  Emergent Models in Complex System Simulations of Genetic and Biochemical Networks Henk Stolk1, Kevin Gates2, Jim Hanan
 1hjs@maths.uq.edu.au, University of Queensland; 2keg@maths.uq.edu.au, University of Queensland
 Correspondence address: hjs@maths.uq.edu.au
 
 
Complex systems are simulated by interacting software agents at various hierarchical levels. Emergent macro-level models are derived from micro-level properties and behavior, for example by genetic programming algorithms. Micro-level models are also derived from macro-level behavior. The methodology can relate micro-level genetic and biochemical networks to macro-level gene expression patterns.
 Long abstract
 
 
 
 |  
| K-30  ISAWB : a Windows-based Integrated Workbench for Managing Sequence Information in Small Scale Hongseok Tae1, Hyeweon Nam2, Daesang Lee, Kiejung Park
 1hstae@smallsoft.co.kr, Dept. of Microbiology, Kyungpook National University; 2hwnam@smallsoft.co.kr, Information Technology Institute, SmallSoft Co., Ltd.
 Correspondence address: hstae@smallsoft.co.kr
 
 
ISAWB is a small workbench with a Windows GUI for managing sequence data and analyzing them to obtain processed information. It contains data management features and eleven analysis modules suggested by a Korean MOST project; each module is composed of a few C++ classes for reusability.
 Long abstract
 
 
 
 |  
| K-31  Multi-algorithm, multi-timescale cell simulation using E-Cell3 Kouichi Takahashi1, Kazunari Kaizu2, Bin Hu, Yohei Yamada, Masaru Tomita
 1shafi@sfc.keio.ac.jp, Institute for Advanced Biosciences, Keio University; 2t00220kk@sfc.keio.ac.jp, Institute for Advanced Biosciences, Keio University
 Correspondence address: yoyo@sfc.keio.ac.jp
 
 
The integration of sub-cellular models running on different types of
algorithms poses a significant computational challenge. A heat-shock
response model combining the Gillespie-Gibson stochastic algorithm and
deterministic equations, and a multi-timescale model with multiple ODE
components have been constructed. An implementation of the method is
available at http://www.e-cell.org/.
 Long abstract
 
 
 
 |  
| K-32  SPACE-BLAST:  Linux Cluster based Biological Sequence Parallel Processing system  Summarized by Gene Ontology Mihwa Park1, Jaewoo Kim2, Hyungsuk Won, Seungsik Yoo
 1bfpark@posdata.co.kr, POSDATA; 2jaewoo@posdata.co.kr, POSDATA
 Correspondence address: bfpark@posdata.co.kr
 
 
SPACE-BLAST (Super PArallel Computer Engine for BLAST) is a high performance bioinformatics system that implements NCBI’s BLAST system with low cost Linux cluster based parallel processing to search DNA sequences at high speed. In addition, Gene Ontology is applied to summarize the massive volume of BLAST search results. SPACE-BLAST is available at http://space-blast.posdata.co.kr. |  |