PLoS Computational Biology Late Breaking Poster Session

Public Library of Science Computational Biology (PLoS CB) Late Breaking Poster Session
A: RNA and Protein Structural Biology
B: Ontologies and NLP
C: Pathways, Networks, and Proteomics
D: Sequence Analysis, Phylogeny, and Evolution
E: Genomics and Gene Expression
F: Gene Regulation, microRNAs
G: Databases

A-1
Title: An algorithm to improve the selection of known protein fragments for loop structure prediction
Authors: Narcis Fernandez-Fuentes, Albert Einstein College of Medicine, narcis@fiserlab.org, Andras Fiser, Albert Einstein College of Medicine, andras@fiserlab.org
Abstract: We develop a new approach to identify fragments for loop modeling. Candidate loops are gauged using six parameters: (1) Geometry fitting; (2) Sequence similarity; (3) RMSD of the stems; (4) phi/psi angle probabilities; (5) Free energy in the new protein (contact potentials); and (6) Repulsive contacts or clashes.
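Of the six parameters listed, the RMSD of the stems is the easiest to write down explicitly. The following is a minimal sketch, assuming the stem atoms of the candidate fragment and the target are supplied as matched lists of (x, y, z) coordinates that are already in the same frame. The function name `rmsd` is hypothetical; the abstract does not say how the authors superpose or weight the stems, so no superposition is performed here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    Assumption: both lists hold (x, y, z) tuples in the same order and in
    the same reference frame; no optimal superposition is carried out.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Illustrative coordinates only, not taken from any real structure.
    target_stems = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    fragment_stems = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(target_stems, fragment_stems))
```

A lower value means the fragment's stems sit closer to the anchor residues of the target; how this term is combined with the other five parameters is not specified in the abstract.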

A-2
Title: RNAforester: A Tool for Comparing RNA Secondary Structures
Authors: Matthias Hoechsmann, International NRW Graduate School in Bioinformatics and Genome Research, mhoechsm@techfak.uni-bielefeld.de, Robert Giegerich, International NRW Graduate School in Bioinformatics and Genome Research, robert@techfak.uni-bielefeld.de
Abstract: Comparative analysis of coding regions, i.e. regions where the order of nucleotides codes for proteins, has been studied extensively. But what if the signal is not sequential? There are now numerous examples of RNA genes and motifs where the structure, rather than the sequence, determines the function (and many more are surely still unknown). Where selective pressure acts on the function, it is often the structure rather than the sequence that is conserved. For all its success, purely sequence-based comparative analysis reaches its limits when structural conservation is of interest. In contrast, RNAforester aligns the structure (and sequence) of RNA molecules. RNAforester is a command-line tool for comparing RNA secondary structures. It supports the computation of (local) pairwise and multiple alignments of structures based on the tree alignment model (Jiang et al. 1995) and the extensions and algorithms presented in Hoechsmann et al. 2003, 2004. The user interface follows the philosophy of the Vienna RNA Package (Hofacker et al. 1994), and RNAforester will be part of the forthcoming Vienna RNA Package Version 1.6. The online version of RNAforester and the source code distribution are available at http://bibiserv.techfak.uni-bielefeld.de/rnaforester.

A-3
Title: Prediction of catalytic residues in proteins using machine-learning techniques
Authors: Natalia Petrova, Ph.D. Student of Department of Biochemistry and Molecular Biology, and Protein Information Resource (PIR), Georgetown University Medical Center, Washington, DC, np6@georgetown.edu, Cathy Wu, Director of Protein Information Resource (PIR), Professor of Biochemistry and Molecular Biology and Oncology, Georgetown University Medical Center, Washington, DC, wuc@georgetown.edu
Abstract: We present a novel method for predicting catalytic residues in proteins using machine-learning techniques. Using a benchmark dataset of enzymes with known catalytic sites and a machine-learning attribute selection algorithm, we identified the residue features relevant to the prediction of catalytic residues.

A-4
Title: Using Machine Learning Tools to Assess the Druggability of Protein Surface Cavities
Authors: Murad Nayal, Columbia University / HHMI, mn216@columbia.edu, Barry Honig, Columbia University / HHMI, bh6@columbia.edu
Abstract: We developed a new method, SCREEN (Surface Cavity REcognition and EvaluatioN), for the identification of protein surface cavities. By computing a comprehensive set of cavity properties and using a Random Forests classification strategy, we predicted drug-binding cavities with 7.2% BER and 88.9% coverage. The important predictive properties were highlighted.
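The abstract does not expand "BER"; the sketch below assumes it denotes the balanced error rate, the mean of the false-negative and false-positive rates, which is a common reading for classifiers evaluated on imbalanced classes. The function name and the counts in the example are illustrative and are not taken from the poster.

```python
def balanced_error_rate(tp, fn, tn, fp):
    """Balanced error rate from confusion-matrix counts.

    Assumption: BER = 0.5 * (FN / (TP + FN) + FP / (TN + FP)).
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each class needs at least one example")
    false_negative_rate = fn / (tp + fn)
    false_positive_rate = fp / (tn + fp)
    return 0.5 * (false_negative_rate + false_positive_rate)


if __name__ == "__main__":
    # Hypothetical counts: 10 drug-binding cavities, 100 other cavities.
    print(balanced_error_rate(tp=8, fn=2, tn=90, fp=10))
```

Unlike plain accuracy, this measure is not dominated by the majority class, which matters when drug-binding cavities are rare among all surface cavities.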

A-5
Title: Fast Searches for RNA Structures Including Pseudoknots in Genomes Using Tree Decomposition

Authors: Yinglei Song, Department of Computer Science, University of Georgia, song@cs.uga.edu, Chunmei Liu, Department of Computer Science, University of Georgia, chunmei@cs.uga.edu, Russell Malmberg, Department of Plant Biology, University of Georgia, russell@plantbio.uga.edu, Fangfang Pan, Department of Plant Biology, University of Georgia, fpan@plantbio.uga.edu, Liming Cai, Department of Computer Science, University of Georgia, cai@cs.uga.edu
Abstract: To search genomes for specific RNA secondary structures we develop consensus profiles based upon a conformational graph followed by a tree decomposition. The time complexity is O(k**t N**2) where k and t are small integers. We used the algorithm to successfully search genomes for tmRNAs and telomerase RNAs.

A-6
Title: A New Sampling Algorithm for Simultaneous RNA Secondary Structure Prediction and Structural Alignment

Authors: Xing Xu, Washington University in St Louis, xingxu@ural.wustl.edu, Gary Stormo, Washington University in St Louis, stormo@ural.wustl.edu
Abstract: We present a new algorithm for simultaneous common RNA secondary structure prediction and structural alignment on two and multiple sequences. It iteratively samples the common structures and calculates the base-pairing and pairwise alignment probabilities. This algorithm is able to predict pseudoknots, and has shown promising results on many test sets.

A-7
Title: Negatively cooperative binding of melittin to neutral phospholipid vesicles

Authors: Francisco Torrens, Institut Universitari de Ciencia Molecular, Universitat de Valencia, Francisco.Torrens@uv.es
Abstract: The association of basic amphipathic peptides with neutral phospholipid membranes is studied with binding and partition models. Binding of native and modified melittin (Mel) to egg-yolk phosphatidylcholine (EPC) is studied by spectrofluorimetry and size-exclusion chromatography. Binding isotherms and Scatchard and Hill plots are presented for DNC/EPC-Mel.

A-8
Title: BALL (Biochemical ALgorithms Library) and BALLView—a multiplatform molecular viewer and modeling tool

Authors: Andreas Moll, Center for Bioinformatics, Saarland University, Saarbrücken, amoll@bioinf.uni-sb.de, Andreas Hildebrandt, Center for Bioinformatics, Saarland University, Saarbrücken, anhi@bioinf.uni-sb.de, Oliver Kohlbacher, Dept. for Simulation of Biological Systems, University of Tübingen, oliver.kohlbacher@uni-tuebingen.de, Hans-Peter Lenhof, Center for Bioinformatics, Saarland University, Saarbrücken, len@bioinf.uni-sb.de
Abstract: BALLView is a free molecular modeling and molecular graphics tool. It was created using the Biochemical ALgorithms Library (BALL) and provides OpenGL-based visualisation of molecular structures, molecular mechanics methods (minimization, MD simulation using the AMBER and CHARMM force fields), calculation and visualisation of electrostatic properties and a Python interface.

A-9
Title: Image descriptors for the recognition of protein active sites

Authors: Claudio Garutti, Department of Information Engineering, University of Padova, garuttic@dei.unipd.it, Concettina Guerra, Department of Information Engineering, University of Padova, Concettina.Guerra@dei.unipd.it
Abstract: We present a new approach to measure the similarity between shapes and exploit it to search for binding sites in related proteins. The approach uses geometric descriptors, based on a spin-image representation, that capture the distribution of neighboring points providing a powerful local shape characterization.

A-10
Title: Latent Factors in Protein Crystallization

Authors: Christian Cumbaa, Ontario Cancer Institute, ccumbaa@uhnresearch.ca, Igor Jurisica, Ontario Cancer Institute, juris@ai.utoronto.ca
Abstract: Protein X-ray crystallography begins with a search through high-dimensional space for a chemical environment (cocktail) promoting protein crystal growth. We analyze data from high-throughput crystallization trials (220 proteins X 1536 cocktails), using sparse matrix factorization and association-rule discovery to uncover latent variables influencing protein crystallization.

A-11
Title: Molecular Symmetry as Aid to Homo-oligomeric Protein Structure Determination by NMR, using Sparse Inter-molecular NOE Restraints

Authors: Shobha Potluri, Dartmouth College, potluri@cs.dartmouth.edu, Bruce Donald, Dartmouth College, brd@cs.dartmouth.edu, Chris Bailey-Kellogg, Dartmouth College, cbk@cs.dartmouth.edu
Abstract: Membrane proteins constitute 30% of the proteins in a genome and are necessary for key functions such as cell-cell interactions, energy transduction and cell signaling. A vast array of inherited and acquired diseases, including Alzheimer's, are caused by mutations in membrane proteins, and hence understanding their structure is vital. Yet, due to the challenges inherent in their isolation and stability, very limited structural information is currently available on membrane proteins. Several membrane proteins are homo-oligomeric in nature. We propose an algorithm that uses the symmetry of a homo-oligomer, the monomer structure and sparse intermolecular NOEs to predict structures and to assess the uncertainty associated with them as the restraints become sparser.

B-1
Title: UMLS Concept Indexing of OMIM Clinical Synopsis and Knowledge Extraction

Authors: Jing Chen, Department of Biomedical Engineering, University of Cincinnati, Jing.Chen@cchmc.org, Anil Jegga, Department of Pediatrics, University of Cincinnati and Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Anil.Jegga@cchmc.org, Bruce Aronow, Departments of Biomedical Engineering and Pediatrics, University of Cincinnati and Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Bruce.Aronow@cchmc.org
Abstract: Semantic interoperability between knowledge corpora in medicine and genomics/genetics will lead to advances in fundamental research and improved patient care. We present the preliminary results of parsing the clinical synopses of human inherited diseases (NCBI-OMIM), and indexing and associating them with UMLS concepts and biological pathways.

B-2
Title: Integrating Complex Biological Data Using Multiple Ontologies

Authors: Mary Shimoyama, Human and Molecular Genetics Center, Medical College of Wisconsin, shimoyma@mcw.edu, Victoria Petri, Human and Molecular Genetics Center, Medical College of Wisconsin, vpetri@mcw.edu, Dean Pasko, Human and Molecular Genetics Center, Medical College of Wisconsin, dpasko@mcw.edu, Wenhua Wu, Human and Molecular Genetics Center, Medical College of Wisconsin, wwu@mcw.edu, Jiali Chen, Human and Molecular Genetics Center, Medical College of Wisconsin, jlchen@mcw.edu, Nataliya Nenasheva, Human and Molecular Genetics Center, Medical College of Wisconsin, nnenashe@mcw.edu, Simon Twigger, Human and Molecular Genetics Center, Medical College of Wisconsin, simont@mcw.edu, Howard Jacob, Human and Molecular Genetics Center, Medical College of Wisconsin, jacob@mcw.edu
Abstract: The Rat Genome Database has developed an ontology-based data structure to integrate physiological data with environmental and experimental factors, as well as genetic and genomic information. Multiple ontologies facilitate the integration of complex biological information from the molecular level to the whole organism, and the development of data mining and presentation tools.

B-3
Title: A method for simultaneous gene selection in B-cell lymphoma from methylation and expression microarrays

Authors: Gerald Arthur, University of Missouri School of Medicine, arthurg@health.missouri.edu, Mihail Popescu, University of Missouri School of Medicine, PopescuM@missouri.edu, Farahnaz Rahmatpanah, University of Missouri School of Medicine, RahmatpanahF@health.missouri.edu, Ozy Sjahputera, University of Missouri School of Medicine, SjahputeraO@health.missouri.edu, James Keller, University of Missouri, KellerJ@missouri.edu, Huidong Shi, University of Missouri School of Medicine, ShiHu@health.missouri.edu, Charles Caldwell, University of Missouri School of Medicine, CaldwellC@health.missouri.edu
Abstract: Epigenetic regulation of gene transcription is important in cell differentiation and possibly in malignant transformation. A t-test-based algorithm for identifying genes simultaneously methylated and unexpressed in small B-cell lymphoma is presented. This method identifies genes involved in critical pathways and may provide new insights into lymphoma tumorigenesis.

B-4
Title: Integration of the Gene Ontology into an object-oriented architecture

Authors: Wenjin Zheng, Dept. Biostat. Bioinfo & Epidemiology/Med. Univ. South Carolina, zhengw@musc.edu, Daniel Shegogue, Dept. Biostat. Bioinfo & Epidemiology/Med. Univ. South Carolina, shegogue@musc.edu
Abstract: The static and dynamic events that occur during a GO process, "transforming growth factor-beta (TGF-beta) receptor complex assembly" (GO:0007181) have been captured in an object-oriented model. We demonstrate that the utility of GO terms can be enhanced by object-oriented technology.

B-5
Title: Mining short sequence elements out of the literature

Authors: Jonathan Wren, University of Oklahoma, Jonathan.Wren@OU.edu, William Hildebrand, University of Oklahoma Health Sciences Center, William-Hildebrand@ouhsc.edu, Ulrich Melcher, Oklahoma State University, umelcher@biochem.okstate.edu
Abstract: We report the development of a Markov Model algorithm able to accurately locate, classify and extract sequence data from large text databases. The algorithm was benchmarked against entries from two published databases and used to extract data from over 7.7 million MEDLINE records and 9,000 full-text articles.

B-6
Title: Steps to Automated Recognition of Useful Papers

Authors: Amith Reddy Gosukonda, Department of Computer Science, University of Missouri, Columbia, arg257@mizzou.edu, Toni Kazic, Department of Computer Science, University of Missouri, Columbia, toni@athe.rnet.missouri.edu
Abstract: When searching bibliographic databases, one commonly retrieves many uninteresting papers. For example, searches for papers reporting experimental biochemical data characterizing purified enzymes retrieve many other papers that describe populations, proposed drugs, and biosensors. Therefore, we have been exploring strategies to better filter papers of biochemical interest from bulk retrievals.

B-7
Title: Using Meta-Network to Analyze Networks of GO Functional Modules

Authors: Jie Wu, Student, jiewu@bu.edu, Zhenjun Hu, Research Associate, zjhu@bu.edu, Joseph Mellor, Student, mellor@bu.edu, Gul Dalgin, Student, sdalgin@bu.edu, Charles DeLisi, Prof., delisi@bu.edu
Abstract: We have developed a new meta-network methodology to construct, analyze and visualize the network of functional modules based on the interaction connectivity of the elements in the modules. Networks for Saccharomyces cerevisiae with GO terms as functional modules are available at http://visant.bu.edu/go_networks.htm

C-1
Title: Analysis of metastasis suppressor genes: comparative genomics and systems biology approaches

Authors: RangaChandra Gudivada, Department of Biomedical Engineering, University of Cincinnati, chandra_bio@yahoo.com, Anil Jegga, Department of Pediatrics, University of Cincinnati and Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, anil.jegga@cchmc.org, Bruce Aronow, Departments of Biomedical Engineering and Pediatrics, University of Cincinnati and Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, bruce.aronow@cchmc.org
Abstract: Metastatic suppressor genes are valuable prognostic, prophylactic and therapeutic markers. Recent reports on transcriptional regulation of metastasis suppressor genes offer a new paradigm for investigating mechanisms of down-regulation of these genes. We used computational approaches to identify potential regulatory regions and compile the associated pathways and protein interactions.

C-2
Title: Proteome Data Filtering and Classification using a Reversed Protein Database and Molecular Weight Information

Authors: Gun Wook Park, Korea Basic Science Institute, cancun@kbsi.re.kr, Kyung-Hoon Kwon, Korea Basic Science Institute, khoon@kbsi.re.kr, Jin Young Kim, Korea Basic Science Institute, jinyoung@kbsi.re.kr, Jeong Hwa Lee, Korea Basic Science Institute, leepurry@kbsi.re.kr, Sung-Ho Yun, Korea Basic Science Institute, labap@kbsi.re.kr, Seung Il Kim, Korea Basic Science Institute, ksi@kbsi.re.kr, Jong Shin Yoo, Korea Basic Science Institute, jongshin@kbsi.re.kr
Abstract: We propose a filtering method for human proteome analysis using tandem mass spectrometry. As a reference proteome dataset, the Pseudomonas putida KT2440 proteome was analyzed by considering the reversed sequence database and 1D-gel band position. It was used to classify the three groups of human plasma proteome.
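The reversed sequence database mentioned above can be illustrated with a short sketch. It assumes the protein database is in FASTA format and that each decoy entry is the residue-by-residue reversal of a real entry. The `REV_` prefix and the function names are hypothetical conventions chosen for this example, not details taken from the poster.

```python
from typing import Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def add_reversed_decoys(records, prefix="REV_"):
    """Yield every record followed by its reversed-sequence decoy."""
    for header, sequence in records:
        yield header, sequence
        yield prefix + header, sequence[::-1]


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    fasta_lines = [">prot1", "MKTAY", "IAK", ">prot2", "GAVLK"]
    for header, sequence in add_reversed_decoys(parse_fasta(fasta_lines)):
        print(f">{header}\n{sequence}")
```

Spectra that match only `REV_` entries can then be treated as likely false identifications when choosing filtering thresholds; how the authors combine this with the 1D-gel band position is not described in the abstract.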

C-3
Title: Data Analysis of the 3020 confirmed protein identifications by the Plasma Proteome Project

Authors: Gilbert S. Omenn, University of Michigan, gomenn@med.umich.edu, Rajasree Menon, University of Michigan, rajmenon@umich.edu, Marcin Adamski, University of Michigan, marcin_adamski@yahoo.com, Thomas Blackwell, University of Michigan, tblackw@umich.edu, Yin Xu, University of Michigan, yinxu@umich.edu, David J. States, University of Michigan, dstates@bioinformatics.med.umich.edu
Abstract: At the completion of the pilot phase of the Plasma Proteome Project, 3,020 distinct IPI proteins were identified with two or more peptides. Analyses of this dataset from several angles have been performed and are still ongoing.

C-4
Title: Confidence Estimation of Dose Response Data from High Content Imaging Experiments

Authors: Yong-Chuan Tao, Novartis Institutes for Biomedical Research, yong-chuan.tao@novartis.com, Elizabeth McWhinnie, Novartis Institutes for Biomedical Research, elizabeth.mcwhinnie@novartis.com, Yan Feng, Novartis Institutes for Biomedical Research, yan.feng@novartis.com
Abstract: Statistical methods for confidence interval estimation of dose-response parameters in the context of high content imaging experiments are described. The analysis demonstrates the benefit of high content imaging, i.e., the possibility of making strong statistical inferences based on the large number of cells scored at each given condition.

C-5
Title: Advantages of network-based approaches for the phylogenetic analysis of intragenomic repeat regions

Authors: Surya Saha, Mississippi State University, ss307@cse.msstate.edu, Susan Bridges, Mississippi State University, bridges@cse.msstate.edu, Daniel Peterson, Mississippi State University, dpeterson@pss.msstate.edu
Abstract: We have compared the effectiveness of traditional "tree based" methods and newer "network-based" computational methods in the phylogenetic analysis of intragenomic DNA repeat sequences. Our results suggest that only the network-based methods afford the plasticity required for meaningful classification of repetitive elements.

C-6
Title: New Probabilistic Graphical Models for Genetic Regulatory Networks Studies

Authors: Junbai Wang, Department of Biological Sciences, Columbia University, jw2256@columbia.edu, Leo Cheung, Cancer Research Center of Hawaii, University of Hawaii, lcheung@crch.hawaii.edu, Jan Delabie, Department of Pathology, Norwegian Radium Hospital, jan.delabie@labmed.uio.no
Abstract: This paper introduces two new probabilistic graphical models, a new Independence Graph Model and a new Gaussian Network Model, for reconstruction of genetic regulatory networks using DNA microarray data. The performances of both models were evaluated on four MAPK pathways in yeast and compared with several other commonly used models.

C-7
Title: Neighborhood Similarity in the Functional Interaction Network of Yeast

Authors: Shuye Pu, Center for Computational Biology, The Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5G 1X8, shuyepu@sickkids.ca, Shoshana Wodak, Center for Computational Biology, The Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5G 1X8, shoshana@sickkids.ca
Abstract: We defined two complementary measures of similarity between the neighborhoods of two genes/proteins in the network of functional interactions in yeast, and investigated how these measures are related to the sequence and functional similarity between these genes. This study will be extended to multiple organisms in the future.

C-8
Title: Clustering of Genes into Regulons using Integrated Modeling

Authors: Guang Chen, CBIL, PCBI, University of Pennsylvania, ggchen@pcbi.upenn.edu, Shane Jensen, Department of Statistics, The Wharton School, University of Pennsylvania, stjensen@wharton.upenn.edu, Christian Stoeckert, CBIL, PCBI, University of Pennsylvania, stoeckrt@pcbi.upenn.edu
Abstract: We present a Bayesian hierarchical model and MCMC implementation that integrates heterogeneous biological data (e.g. expression data, ChIP binding data) in a principled and robust fashion to discover regulatory networks. Our model overcomes intrinsic drawbacks of available methods and can be applied to any organism.

C-9
Title: Connectivity and Function in a Biochemical Network

Authors: Avanthi Mummaneni, Dept. of Computer Science, University of Missouri, Columbia, am0f3@mizzou.edu, Toni Kazic, Dept. of Computer Science, University of Missouri, Columbia, toni@athe.rnet.missouri.edu
Abstract: Studying the architecture of reactions in the Enzyme Nomenclature helps us understand the behavior of a biochemical network. The network is highly interconnected without using currency metabolites; the canonical textbook topology for pathways is absent; and all EC classes except class five are more heterogeneous than their canonical reactions.

C-10
Title: Paired t-test and unpaired t-test for selecting genes differentially expressed between tumor and normal samples

Authors: Howard Yang, NCI/NIH, yanghow@mail.nih.gov, Nan Hu, NCI/NIH, nhu@mail.nih.gov, Hua Su, NCI/NIH, ptaylor@mail.nih.gov, Philip Taylor, NCI/NIH, ptaylor@mail.nih.gov, Maxwell Lee, NCI/NIH, leemax@mail.nih.gov
Abstract: It is generally believed that a paired t-test detects more differentially expressed genes than an unpaired t-test. However, this was true only when p-values were above a threshold. It is critical to know the differences between the two methods and which to use for selecting differentially expressed genes.
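The difference between the two statistics can be made concrete with a small sketch. It assumes the unpaired test is Welch's unequal-variance form, since the abstract does not say which variant was used, and it computes only the t statistics, not p-values. The expression values are invented for illustration.

```python
import math
from statistics import mean, variance


def unpaired_t(x, y):
    """Two-sample t statistic (Welch form; an assumption, see lead-in)."""
    standard_error = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error


def paired_t(x, y):
    """Paired t statistic computed on the per-pair differences."""
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(x, y)]
    standard_error = math.sqrt(variance(diffs) / len(diffs))
    return mean(diffs) / standard_error


if __name__ == "__main__":
    # Hypothetical expression values for one gene in four matched patients.
    tumor = [5.1, 6.0, 7.2, 8.1]
    normal = [4.6, 5.4, 6.9, 7.5]
    print("paired:", paired_t(tumor, normal))
    print("unpaired:", unpaired_t(tumor, normal))
```

In this example the between-patient spread is large while the tumor-minus-normal difference is consistent, so the paired statistic is much larger than the unpaired one. When the pairing carries little shared variation, the paired test's smaller degrees of freedom can reverse that advantage.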

C-11
Title: Gene Expression Profiling of Rheumatoid Arthritis Tissue Reveals Signature of Disease Process and Progression

Authors: Paolo Martini, Serono Research Institute, paolo.martini@serono.com, Deanne Taylor, Serono Research Institute, deanne.taylor@serono.com, Gregg McAllister, Serono Research Institute, gregg.mcallister@serono.com, Jennifer Jackson, Serono Research Institute, jennifer.jackson@serono.com, Jadwiga Bienkowska, Serono Research Institute, jadwiga.bienkowska@serono.com, Robert Campbell, Serono Research Institute, robert.campbell@serono.com
Abstract: We have found that differentially expressed genes between three breakouts of 34 samples of rheumatoid arthritis and normal synovial tissues can be categorized by disease process using Gene Ontologies. We have found significant representation in categories such as inflammation, anti-apoptosis, homeostasis, and proliferation consonant with disease process and progression.

C-12
Title: Simple Outlier Removal Improves the Performance of Support Vector Machines as a Biomarker Selection Method

Authors: Richard Moffitt, Georgia Institute of Technology, gte394z@mail.gatech.edu, John Phan, Georgia Institute of Technology, gtg407s@mail.gatech.edu, May Wang, Georgia Institute of Technology, maywang@bme.gatech.edu
Abstract: Biomarker selection is an important step in translating gene microarray data into clinical application. This study investigates the effect of simple outlier removal on the performance of a biomarker selection method, Support Vector Machines (SVM). Results show that outlier removal increases the predictive power of SVM by decreasing noise.

C-13
Title: Computational Orthologous Prioritization (COP): A comparative genomic approach toward candidate gene prioritization for disease gene identification

Authors: Annie Chiang, The University of Iowa, achiang@eng.uiowa.edu, Terry Braun, The University of Iowa, terry-braun@uiowa.edu, Val Sheffield, The University of Iowa, val-sheffield@uiowa.edu, Thomas Casavant, The University of Iowa, tom-casavant@uiowa.edu
Abstract: One of the many challenges of finding genes involved in human diseases is the non-deterministic process of candidate gene selection for mutation screening. Here we describe a comparative genomic approach, Computational Orthologous Prioritization (COP), toward candidate gene prioritization for disease gene identification.

C-14
Title: Modeling the stability and patterns of protein interactions

Authors: Joshua Rest, University of Chicago, jrest@uchicago.edu, Geoffrey Morris, University of Chicago, gmorris@uchicago.edu, Richard Lusk, University of Chicago, lusk@uchicago.edu, Henry Horng-Shing Lu, National Chiao Tung University, hslu@stat.nctu.edu.tw, Wen-Hsiung Li, University of Chicago, wli@midway.uchicago.edu
Abstract: We propose a framework that predicts, based on protein-protein interaction studies, whether interactions are direct or indirect in stable complexes or whether they are direct but weak. Separating the interaction network by degree and pattern allows assessment of the network's robustness, structure and evolution.

D-1
Title: Global landscape of recent inferred Darwinian selection for Homo sapiens

Authors: Eric Wang, University of California, Irvine School of Medicine, tewang@uci.edu, Pierre Baldi, University of California, Irvine School of Information and Computer Science, pfbaldi@uci.edu, Robert Moyzis, University of California, Irvine School of Medicine, rmoyzis@uci.edu
Abstract: Using the 1.5-million genotype data from Hinds et al. 2005 (Perlegen) and the International Human Haplotype Map (HapMap), a probabilistic search for the landscape exhibited by positive Darwinian selection was conducted. 1.7% of the SNPs were found to exhibit the genetic architecture of selection.

D-2
Title: Parallel Algorithms for Finding Short Approximate non-Tandem Repeats

Authors: Min Qian, University of Connecticut, huang@engr.uconn.edu, Chun-Hsi Huang, University of Connecticut, huang@cse.uconn.edu
Abstract: Short approximate non-tandem repeats within biological sequences have been shown to relate to human hereditary anomalies. The formalized problem appears to be computation-heavy. In this work we investigate algorithms on a computational Grid to speed up the identification of such repeats by exploiting iterative independent operations not observed before.

D-3
Title: Detection of Vaccinia Virus Promoters Using Interpolated Context Models

Authors: Chunlin Wang, University of Alabama at Birmingham, wangcl@uab.edu, Elliot Lefkowitz, University of Alabama at Birmingham, elliotl@uab.edu
Abstract: Modeling nucleic acid motifs remains a significant problem in computational biology. We have developed interpolated context models (ICMs) to capture both compositional biases and inter-dependencies in motifs. ICMs have proven to be flexible and can greatly increase the predictive capability of models given training sets of sufficient size.

D-4
Title: Integrative bioinformatic approaches for functional analysis of non-synonymous single nucleotide polymorphisms

Authors: Sivakumar Gowrisankar, Department of Biomedical Engineering, University of Cincinnati, sivakumar.gowrisankar@cchmc.org, Jing Chen, Department of Biomedical Engineering, University of Cincinnati, jing.chen@cchmc.org, Anil Jegga, Department of Pediatrics and Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, University of Cincinnati, anil.jegga@cchmc.org, Bruce Aronow, Departments of Biomedical Engineering and Pediatrics and Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, University of Cincinnati, bruce.aronow@cchmc.org
Abstract: Mining public SNP databases is challenging and it is technically impossible to proceed with association studies for all SNPs. We present PolyDoms (http://polydoms.cchmc.org), a Web-based resource, for efficient SNP database mining, polymorphism annotation and prioritization of candidate genes and SNPs for further studies.

D-5
Title: SIMPROT: Using an empirically determined indel distribution in simulations of protein evolution
Authors: Andy Pang, Ontario Cancer Institute, University Health Network, e.tillier@utoronto.ca, Andrew Smith, Ontario Cancer Institute, University Health Network, e.tillier@utoronto.ca, Paulo Nuin, Ontario Cancer Institute, University Health Network, pnuin@uhnres.utoronto.ca, Elisabeth Tillier, Ontario Cancer Institute, University Health Network, Dept of Medical Biophysics, University of Toronto, e.tillier@utoronto.ca
Abstract: Simprot offers a new method of simulating protein sequence evolution, including insertion and deletion events and amino-acid substitutions. The statistical model is based on the empirical indel distribution determined by Qian and Goldstein. We have parameterized this distribution; it applies to sequences diverged by varying evolutionary times, providing flexibility in simulation conditions.

D-6
Title: Phylogibbs: A motif sampling algorithm that incorporates phylogeny

Authors: Rahul Siddharthan, Institute of Mathematical Sciences, CIT Campus, Taramani, Chennai, rsidd@imsc.res.in, Eric D. Siggia, Center for Studies in Physics and Biology, The Rockefeller University, siggia@eds1.rockefeller.edu, Erik van Nimwegen, Biozentrum, University of Basel, erik.vannimwegen@unibas.ch
Abstract: We present a new motif finding algorithm that takes the phylogeny of the input sequences into account and samples the space of all possible assignments of regulatory sites for multiple regulatory motifs. Extensive comparisons on synthetic and real data show it to be significantly more accurate than existing algorithms.

D-7
Title: Classifying bacterial species using base composition analysis

Authors: Christian Massire, Ibis Therapeutics, a division of Isis Pharmaceuticals, cmassire@isisph.com, Vanessa Harpin, Ibis Therapeutics, a division of Isis Pharmaceuticals, vharpin@isisph.com, Vivek Samant, Ibis Therapeutics, a division of Isis Pharmaceuticals, vsamant@isisph.com, Thomas A. Hall, Ibis Therapeutics, a division of Isis Pharmaceuticals, thall@isisph.com, Harold M. Levine, Ibis Therapeutics, a division of Isis Pharmaceuticals, hlevene@isisph.com, Rangarajan Sampath, Ibis Therapeutics, a division of Isis Pharmaceuticals, rsampath@isisph.com, David J. Ecker, Ibis Therapeutics, a division of Isis Pharmaceuticals, decker@isisph.com
Abstract: TIGER technology allows the quick and reliable identification of bacterial species, using only mass spectrometry-derived base compositions of selected loci. Here we show how "triangulation" of these base compositions allows a unique bacterial identification without prior knowledge of which species might be present in a given sample.
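One way to picture "triangulation" is as an intersection of candidate sets: each locus's base composition narrows the list of possible species, and several loci together can leave a single match. The sketch below illustrates that reading only. The exact-match rule, the function names and the toy sequences are all assumptions for this example and do not describe the TIGER implementation.

```python
from collections import Counter


def base_composition(sequence):
    """Return the (A, C, G, T) counts of a nucleotide sequence."""
    counts = Counter(sequence.upper())
    return tuple(counts.get(base, 0) for base in "ACGT")


def triangulate(observed, reference):
    """Return the species whose composition matches at every observed locus.

    observed:  {locus: composition}
    reference: {species: {locus: composition}}
    """
    candidates = set(reference)
    for locus, composition in observed.items():
        matching = {
            species
            for species, loci in reference.items()
            if loci.get(locus) == composition
        }
        candidates &= matching
    return candidates


if __name__ == "__main__":
    # Toy reference sequences; each locus alone is ambiguous.
    reference_sequences = {
        "species_A": {"locus1": "AACG", "locus2": "GGTT"},
        "species_B": {"locus1": "ACGA", "locus2": "GCTA"},
        "species_C": {"locus1": "TTGC", "locus2": "TGTG"},
    }
    reference = {
        species: {locus: base_composition(seq) for locus, seq in loci.items()}
        for species, loci in reference_sequences.items()
    }
    observed = {
        "locus1": base_composition("GACA"),
        "locus2": base_composition("TTGG"),
    }
    print(triangulate(observed, reference))
```

In the toy data, locus1 cannot separate species_A from species_B and locus2 cannot separate species_A from species_C, but the two loci together leave only species_A.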

D-8
Title: Analysis of Conserved Non-coding Sequences in Vertebrate Genomes

Authors: Christina Chen, Washington University, chen@genetics.wustl.edu, Barak Cohen, Washington University, cohen@genetics.wustl.edu
Abstract: Several recent computational studies identified segments of non-coding DNA that are more conserved than coding sequences across many species. We wish to gain insight into the functions of these segments and the evolutionary mechanisms that maintain their high level of sequence identity by studying their orthologues in fish genomes.

D-9
Title: Predicting Non-Coding RNA Genes in Genome Sequences

Authors: Andrew Uzilov, University of Rochester, andrew.uzilov@gmail.com, David Mathews, University of Rochester, david_mathews@urmc.rochester.edu
Abstract: The effectiveness of predicting non-coding RNA genes from two crudely aligned genome sequences by simultaneous optimization of secondary structure formation free energy change and sequence alignment is demonstrated. This approach predicts ncRNA genes with high sensitivity and specificity in tests with known and randomized ncRNA sequences, respectively.

D-10
Title: A Novel Clustering Approach for Protein Family Partitioning

Authors: Tsu-Shu Tseng, Academia Sinica, tsushu@yahoo.com, Tsai-Tien Tseng, University of Illinois at Urbana-Champaign, ttseng@uiuc.edu, Milton Saier, University of California, San Diego, ttseng@uiuc.edu
Abstract: We here present a novel approach for the identification of phylogenetic subfamilies within a protein superfamily. When many families are related, a superfamily is established. The GROUP program will allow large numbers of homologous proteins to be systematically separated into subfamilies based on statistical methods.

D-11
Title: Gap Attraction: An objective quality measure for whole-genome alignments

Authors: Naila Mimouni, Oxford University, naila.mimouni@bnc.ox.ac.uk, Gerton Lunter, Oxford University, lunter@stats.ox.ac.uk, Jotun Hein, Oxford University, hein@stats.ox.ac.uk
Abstract: "Gap Attraction" is a new objective measure of alignment accuracy. It measures the proportion of intergaps - conserved regions between two random neighbouring indels. This measure was computed for alignments of the human and mouse genomes produced by BLASTZ and ClustalW. As expected but never previously verified, BLASTZ performs better than ClustalW.

D-12
Title: Computational geometry approach to predict the effect of the point mutation on the B-Raf kinase activity

Authors: Tariq Alsheddi, George Mason University, talshedd@gmu.edu, Iosif Vaisman, George Mason University, ivaisman@gmu.edu
Abstract: A computational geometry technique employing Delaunay tessellation of protein structure, represented by Cα atoms, to derive a statistical residue contact potential is used to study the effects of single residue mutations on the kinase activity of B-Raf kinase. Profiles of residue scores derived from the four-body statistical potential are constructed for 21 mutants of the B-Raf kinase and subtracted from the profile of the wild-type B-Raf protein. The net contact potential for mutations correlates with measured kinase activity.

D-13
Title: Synucleins and group 3 LEA proteins: nature's sleight-of-hand, exposed

Authors: Shahin Zibaee, Cambridge Institute for Medical Research, University of Cambridge, sz215@cam.ac.uk, Michael Wise, Biomedical and Chemical Sciences, University of Western Australia, mwise@cyllene.uwa.edu.au
Abstract: A novel bioinformatic method is used to detect certain sequence patterns common to two natively unfolded protein families which have otherwise been undetected. Following this up, we have performed biophysical experiments that confirm the similarity extends to conformational traits and potentially function, particularly in the context of membrane bilayers.

D-14
Title: Variation in the level of diversity of synonymous codon usage bias among bacteria

Authors: Haruo Suzuki, Institute for Advanced Biosciences Keio University, haruo@sfc.keio.ac.jp, Rintaro Saito, Institute for Advanced Biosciences Keio University, rsaito@sfc.keio.ac.jp, Masaru Tomita, Institute for Advanced Biosciences Keio University, mt@sfc.keio.ac.jp
Abstract: We introduce a mean distance-based index for quantifying the level of diversity of synonymous codon usage bias among genes within each genome. The index can be applied to any genome, including those for which no prior biological knowledge is available. We have applied this index to complete genome sequences of 120 bacteria.
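
A mean distance-based index of this kind can be sketched as follows. The abstract does not say which representation or distance is used, so this sketch assumes that each gene is summarised by its 64 relative codon frequencies and that the index is the mean pairwise Euclidean distance between genes. Both choices, and the function names, are assumptions made for illustration; a measure of synonymous usage would more likely normalise frequencies within each amino acid.

```python
from collections import Counter
from itertools import combinations, product
from math import sqrt

CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(gene):
    """Relative frequency of each of the 64 codons in an in-frame sequence."""
    usable = len(gene) - len(gene) % 3  # ignore a trailing partial codon
    codons = [gene[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    counts = Counter(codons)
    return [counts[c] / len(codons) for c in CODONS]


def euclidean(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_pairwise_distance(genes):
    """Mean Euclidean distance between codon-frequency vectors of all gene pairs."""
    vectors = [codon_frequencies(g) for g in genes]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(euclidean(u, v) for u, v in pairs) / len(pairs)


# Toy genome: two genes with identical usage and one that differs.
print(mean_pairwise_distance(["AAAGGG", "GGGAAA", "CCCCCC"]))
```

A genome whose genes all use codons in the same proportions gives an index of zero; larger values indicate more heterogeneous usage among genes.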

D-15
Title: PSI-BLAST-ISS, intermediate sequence searching for estimation of sequence alignment reliability

Authors: Mindaugas Margelevicius, Institute of Biotechnology, minmar@ibt.lt, Ceslovas Venclovas, Institute of Biotechnology, venclovas@ibt.lt
Abstract: Sequence alignments have become indispensable in evolutionary, structural and functional protein studies. However, only accurate and reliable alignment regions are informative. We have developed PSI-BLAST-ISS, a tool designed to delineate reliable regions of sequence alignments. It compares favorably with existing similar software both in performance and functional features.

D-16
Title: Hidden Markov Models Hierarchical Classification for ab-initio prediction of Protein Subcellular Localization

Authors: Hugues Richard, Laboratoire Statistique et Génomes, CNRS/INRA, richard@genopole.cnrs.fr, Marie-Hélène Mucchielli, Centre de Génétique Moléculaire, Marie-Helene.Mucchielli@cgm.cnrs-gif.fr
Abstract: We propose a new method to predict the subcellular localization of proteins in eukaryotic organisms. Each sequence is classified using a hierarchical tree, making a decision at each node based on the likelihood with respect to hidden Markov models.

D-17
Title: A Learning Framework for Detecting Remote Non-Coding RNA Homologues

Authors: Keyur Desai, Michigan State University, desaikey@egr.msu.edu, John Deller, Michigan State University, deller@egr.msu.edu, Hayder Radha, Michigan State University, radha@egr.msu.edu
Abstract: A framework for identifying remote non-coding RNA homologues is proposed and shown to perform well in classifying sequences from the Rfam database. The framework combines generative models like stochastic context-free grammars that are capable of modeling inherent statistical signals of ncRNA sequences with discriminative classifiers like support vector machines.

D-18
Title: PDZ domains: Predicting sub-family specificity of protein-protein interaction

Authors: Boris Reva, Computational Biology Center, Memorial Sloan-Kettering Cancer Center, New York, borisr@mskcc.org, Gary Bader, Computational Biology Center, Memorial Sloan-Kettering Cancer Center, New York, bader@cbio.mskcc.org, Chris Sander, Computational Biology Center, Memorial Sloan-Kettering Cancer Center, New York, sanderc@mskcc.org
Abstract: The family of PDZ peptide recognition domains is important in cell signaling and has many representatives in the human genome. We identify PDZ domain residues important for specific peptide recognition ("specificity signatures") by computing an optimal division of a multiple sequence alignment into a set of functionally specific subfamilies.

D-19
Title: An Interactive Web-Based Multiple Sequence Alignment Viewer with Polymorphism Analysis Support

Authors: Payan Canaran, Cold Spring Harbor Laboratory, canaran@cshl.edu, Doreen Ware, Cold Spring Harbor Laboratory, ware@cshl.edu, Lincoln Stein, Cold Spring Harbor Laboratory, steinl@cshl.edu
Abstract: We developed an interactive web-based viewer for displaying pre-computed multiple sequence alignments. Initially developed to support visualization needs of the maize diversity web site Panzea (www.panzea.org), the viewer is designed as a generic stand-alone tool that can be integrated into already existing web sites.

D-20
Title: Identifying potential targets for enhancer action by the ultra conserved elements

Authors: Courtney Onodera, University of California, Santa Cruz, conodera@soe.ucsc.edu, Gill Bejerano, University of California, Santa Cruz, jill@soe.ucsc.edu, Sofie Salama, University of California, Santa Cruz, ssalama@soe.ucsc.edu, W. James Kent, University of California, Santa Cruz, kent@soe.ucsc.edu, David Haussler, University of California, Santa Cruz, haussler@soe.ucsc.edu
Abstract: The ultra conserved elements in the human, mouse, and rat genomes often occur in clusters, and many may function as transcriptional enhancers for key developmental transcription factors. We present a comparative analysis of the genomes which characterizes the clusters as possible enhancers and identifies potential targets for their action.

D-21
Title: New long oligonucleotide platform for whole genome expression in Drosophila melanogaster: sex and genotype dependent gene expression in inbred lines.

Authors: Richard Westerman, Purdue University, westerman@purdue.edu, Damion Junk, Purdue University, junkda@purdue.edu, Anne Genissel, University of California -- Davis, amgenissel@ucdavis.edu, Lisa Bono, Purdue University, lbono@purdue.edu, Marina Telonis-Scott, University of Florida, mtelonis@zoo.ufl.edu, Larry Harshman, University of Nebraska, lharsh@unlserve.unl.edu, Marta Wayne, University of Florida, mlwayne@zoo.ufl.edu, Artyom Kopp, University of California -- Davis, akopp@ucdavis.edu, Sergey Nuzhdin, University of California -- Davis, svnuzhdin@ucdavis.edu, Lauren McIntyre, Purdue University, lmcintyre@purdue.edu
Abstract: For Drosophila we have designed a set of oligonucleotides 60 base pairs in length that cover the known transcriptome, known alternative variants and predicted transcripts. Unique and common probes were designed for alternative transcript variants. The design process is general and can be readily amended as genome annotation improves.

D-22
Title: Motif-based similarity measures for regulatory modules

Authors: Eric Blais, McGill University, eblais@mcb.mcgill.ca, Swaminathan Mahadevan, McGill University, smahad1@cs.mcgill.ca, Gill Bejerano, UC Santa Cruz, jill@soe.ucsc.edu, Bohdana Ratitch, McGill University, bohdana@cs.mcgill.ca, Doina Precup, McGill University, dprecup@cs.mcgill.ca, David Haussler, UC Santa Cruz, haussler@soe.ucsc.edu, Mathieu Blanchette, McGill University, blanchem@mcb.mcgill.ca
Abstract: We present and compare two novel motif-based similarity measures for regulatory modules: a mismatch-kernel method and a transcription factor binding site profile-based method. Both methods are tested on simulated regulatory regions. By applying these methods to regulatory regions of the human genome, we identify clusters of functionally similar modules.

D-23
Title: Cis-regulatory Modules Detection Using Bayesian Network

Authors: Xiaoyu Chen, McGill Centre for Bioinformatics, McGill University, xiaoyu@mcb.mcgill.ca, Mathieu Blanchette, McGill Centre for Bioinformatics, McGill University, blanchem@mcb.mcgill.ca
Abstract: A new algorithm is proposed to predict tissue-specific Cis-regulatory modules. A Bayesian network is constructed to integrate comparative sequence data, gene expression data, and biologically verified module data. An EM algorithm and probability tree learning algorithm are used to train the Bayesian network, which is shown to work well on simulated and real data.

D-24
Title: Determining the Evolutionary Direction of Protein Domain-Fusion Using Genomic Fusion Flux

Authors: Zhenjun Hu, Research Associate, Boston University, zjhu@bu.edu, Jie Wu, Student, jiewu@bu.edu, Shujiro Okuda, Student, okuda@kuicr.kyoto-u.ac.jp, Tianhua Niu, Assistant Prof., niu@bioinfo.stat.harvard.edu, Boris Shakhnovich, Assistant Prof., Boston University, shaxno@gmail.com, Charles DeLisi, Professor, Boston University, delisi@bu.edu
Abstract: A computational method is proposed to extend the study of protein evolution to the genomic level by measuring genome-wide fusion flux (GFF) between any two organisms. The evolutionary order of 13 Archaea species predicted by GFF is confirmed against the known evolutionary orders of the corresponding branches.

E-1
Title: Simple decision rules for classifying human cancers from gene expression profiles

Authors: Aik Choon Tan, Center for Cardiovascular Bioinformatics and Modeling, Whitaker Biomedical Engineering Institute, Johns Hopkins University, actan@bme.jhu.edu, Daniel Q Naiman, Department of Applied Mathematics and Statistics, Johns Hopkins University, daniel.naiman@jhu.edu, Raimond L Winslow, Center for Cardiovascular Bioinformatics and Modeling, Whitaker Biomedical Engineering Institute, Johns Hopkins University, rwinslow@bme.jhu.edu, Donald Geman, Department of Applied Mathematics and Statistics, Johns Hopkins University, geman@jhu.edu
Abstract: We introduce a new classifier - k-TSP (k-Top Scoring Pairs) - for classifying cancers from gene expression profiles. When tested on microarray data, our method performs as well as PAM and SVM, and generates simple and accurate decision rules that only involve a small number of gene-to-gene expression comparisons, thereby facilitating follow-up studies.
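
The kind of decision rule described here can be sketched for two classes. In the top-scoring-pair approach, a gene pair (i, j) is scored by how much the probability of observing x_i < x_j differs between the classes; the k best disjoint pairs then vote. The sketch below is a simplified illustration: it omits the tie-breaking and the selection of k used in practice, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def pair_score(X, y, i, j):
    """Return (|P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|, class favoring x_i < x_j)."""
    p = []
    for label in (0, 1):
        rows = [x for x, c in zip(X, y) if c == label]
        p.append(sum(x[i] < x[j] for x in rows) / len(rows))
    return abs(p[0] - p[1]), (0 if p[0] > p[1] else 1)


def fit_ktsp(X, y, k):
    """Pick the k highest-scoring gene pairs that share no gene."""
    scored = []
    for i, j in combinations(range(len(X[0])), 2):
        score, cls = pair_score(X, y, i, j)
        scored.append((score, i, j, cls))
    scored.sort(key=lambda t: -t[0])  # stable: ties keep enumeration order
    chosen, used = [], set()
    for _, i, j, cls in scored:
        if i in used or j in used:
            continue
        chosen.append((i, j, cls))
        used.update((i, j))
        if len(chosen) == k:
            break
    return chosen


def predict(pairs, x):
    """Majority vote over the pairwise comparisons (use an odd k to avoid ties)."""
    votes_for_1 = sum(cls if x[i] < x[j] else 1 - cls for i, j, cls in pairs)
    return 1 if 2 * votes_for_1 > len(pairs) else 0


# Toy expression matrix: rows are samples, columns are genes.
X = [[1, 2, 5, 4], [1, 3, 6, 2], [2, 4, 7, 1],
     [3, 1, 2, 6], [4, 2, 1, 5], [5, 1, 3, 7]]
y = [0, 0, 0, 1, 1, 1]
pairs = fit_ktsp(X, y, 1)
print(pairs, predict(pairs, [2, 5, 1, 1]))
```

Each selected pair reads as a rule of the form "if gene i is expressed below gene j, vote for this class", which is what makes the resulting classifier easy to inspect.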

E-2
Title: Alternative splice isoforms inferred from potential and cryptic splice sites in human genes

Authors: Chenghai Xue, Institute of Automation, Chinese Academy of Sciences, chenghai.xue@mail.ia.ac.cn, Fei Li, MOE Key Laboratory of Bioinformatics / Department of Automation, Tsinghua University, flee@tsinghua.edu.cn, Yuqiang Chen, Institute of Automation, Chinese Academy of Sciences, yuqiang.chen@mail.ia.ac.cn, Guo-ping Liu, Institute of Automation, Chinese Academy of Sciences, gpliu@glam.ac.uk, Xuegong Zhang, MOE Key Laboratory of Bioinformatics / Department of Automation, Tsinghua University, zhangxg@tsinghua.edu.cn, Yanda Li, MOE Key Laboratory of Bioinformatics / Department of Automation, Tsinghua University, daulyd@mail.tsinghua.edu.cn
Abstract: Identification of all possible alternative isoforms in human genes is an important task. We predict potential and cryptic splice sites in both exons and introns of human genes with a support vector machine. The potential alternative isoforms in human genes are predicted by combining known constitutive and predicted potential splice sites.

E-3
Title: Association Mapping of Quantitative Traits Using Haplotypes

Authors: Jing Li, Case Western Reserve University, jingli@case.edu
Abstract: We recently developed an association mapping method for complex diseases by mining the sharing of haplotype segments in affected individuals that are rarely present in normal individuals. In this paper, we address the problem of localizing quantitative trait loci from unrelated individuals.

E-4
Title: Robust prostate cancer marker genes emerge from direct integration of inter-study microarray data

Authors: Lei Xu, The Whitaker Biomedical Engineering Institute and Center for Cardiovascular Bioinformatics and Modeling, Johns Hopkins University, leixu@jhu.edu, Aik Choon Tan, The Whitaker Biomedical Engineering Institute and Center for Cardiovascular Bioinformatics and Modeling, Johns Hopkins University, actan@bme.jhu.edu, Daniel Naiman, Department of Applied Mathematics and Statistics, Johns Hopkins University, dan@ams.jhu.edu, Donald Geman, Department of Applied Mathematics and Statistics, Johns Hopkins University, geman@jhu.edu, Raimond Winslow, The Whitaker Biomedical Engineering Institute and Center for Cardiovascular Bioinformatics and Modeling, Johns Hopkins University, rwinslow@bme.jhu.edu
Abstract: We propose a novel, simple method to integrate microarray data across multiple studies to identify reliable biomarkers. By applying the method to prostate cancer data, we have successfully identified a pair of highly robust marker genes which can be used to predict prostate cancer with high accuracy.

E-5
Title: Adapting SVM to Predict Translation Initiation Sites in the Human Genome

Authors: Stephen Kwek, University of Texas at San Antonio, kwek@cs.utsa.edu, Rehan Akbani, University of Texas at San Antonio, rakbani@cs.utsa.edu
Abstract: We modified the SVM algorithm to predict Translation Initiation Sites (TIS) in the human genome. It can handle an imbalance ratio of 1:100 for TIS vs. non-TIS ATG sites. Predictors that are trained using the popular Pedersen and Nielsen dataset are ill-suited to handle such a high imbalance ratio.

E-6
Title: G-compass: A New Web Tool for Comparative Genomics

Authors: Yasuyuki Fujii, JBIRC, JBIC, yfujii@jbirc.aist.go.jp, Takeshi Itoh, National Institute of Agrobiological Sciences, taitoh@affrc.go.jp, Ryuichi Sakate, JBIRC, JBIC, rsakate@jbirc.aist.go.jp, Kanako Koyanagi, Hokkaido University, kkoyanag@ist.hokudai.ac.jp, Akihiro Matsuya, Hitachi, Co., Ltd., amatsuya@jbirc.aist.go.jp, Takuya Habara, JBIRC, JBIC, thabara@jbirc.aist.go.jp, Kaori Yamaguchi, JBIRC, JBIC, khabara@jbirc.aist.go.jp, Yayoi Kaneko, JBIRC, JBIC, y-kaneko@m9.dion.ne.jp, Takashi Gojobori, National Institute of Genetics, tgojobor@genes.nig.ac.jp, Tadashi Imanishi, BIRC, AIST, imanishi@jbirc.aist.go.jp
Abstract: We developed a new database of genome alignments, G-compass. Currently, G-compass provides human-mouse genome alignments that cover 17% of the human genome. G-compass is useful for finding conserved regions in the human genome and is freely accessible at http://www.jbirc.aist.go.jp/g-compass/.

E-7
Title: Identification of cellular mixtures from microarray gene expression data

Authors: Andrew Hill, Wyeth Research, ahill@wyeth.com, Yizheng Li, Wyeth Research, yli@wyeth.com, Maryann Whitley, Wyeth Research, mwhitley@wyeth.com
Abstract: Tissue samples are rarely homogeneous. Tools to characterize cell mixtures from microarray gene expression data are needed. We analyze a mixing experiment, where samples contained defined fractions of two cell types. We also examine the more difficult problem of identifying cell mixtures from RNA samples derived from peripheral blood cells.

E-8
Title: The Gluon RNAs: A Model of Global Gene Expression Control and 3-D Chromosome Interactions Mediated by Multiplex RNA-DNA-DNA Interconnections
Authors: Vladimir Kuznetsov, Genome institute of Singapore, kuznetsov@gis.a-star.edu.sg
Abstract: Alternative forms of multiplex conserved RNA sequences from the same genic (3'UTR/5'UTR) region are thought to be able to form homologous triplex helices on the same or different chromosomes. We will call such multiplex DNA-bridging mRNAs the gluon RNAs. The gluon RNA-mediated DNA links/loops can provide a global probabilistic gene-expression control network.

E-9
Title: Design of Multiplexed Oligonucleotide Ligation Assays for High Throughput Single-nucleotide and Insertion/Deletion Polymorphism Genotyping

Authors: Ryan Koehler, Applied Biosystems, koehlert@appliedbiosystems.com, Zheng Zhang, Applied Biosystems, zheng@paracel.com, Nicolas Peyret, Applied Biosystems, peyretnn@appliedbiosystems.com, Joseph Day, Applied Biosystems, dayjp@appliedbiosystems.com, Sabine Short, Applied Biosystems, shortsn@appliedbiosystems.com, Michael Wenz, Applied Biosystems, wenzmh@appliedbiosystems.com, Francisco De La Vega, Applied Biosystems, delavefm@appliedbiosystems.com
Abstract: Insertions and deletions are important mediators of disease and disease susceptibility. We developed a multiplexed assay based on oligonucleotide ligation to determine genotypes of insertion/deletions and SNPs. Improved algorithms ensuring specific probes meet thermodynamic and genome specificity requirements are described. Substantially increased success rates of indel genotyping are demonstrated.

E-10
Title: PANP - a New Method for Gene Detection for Oligonucleotide Expression Arrays

Authors: Peter Warren, Serono Research Institute/Rabb Bioinformatics Graduate Program, Brandeis University, peter.warren@verizon.net, Jadwiga Bienkowska, Serono Research Institute, jadwiga.bienkowska@serono.com, Paolo Martini, Serono Research Institute, paolo.martini@serono.com, Jennifer Jackson, Serono Research Institute, jennifer.jackson@serono.com, Deanne Taylor, Serono Research Institute/Rabb Bioinformatics Graduate Program, Brandeis University, deanne.taylor@serono.com
Abstract: We have developed a statistical method in R, called Presence-Absence calls with Negative Probes (PANP), which outperforms the MAS5.0 PA method across concentrations in several metrics of accuracy and precision, using a variety of pre-processing methods: MAS5.0, RMA and GCRMA.

E-11
Title: VCMap - Integrating Multiple Maps to Increase Value of Unfinished Genomes

Authors: Jeff Nie, Medical College of Wisconsin, jnie@mcw.edu, Anne Kwitek, Medical College of Wisconsin, ablack@mcw.edu, Simon Twigger, Medical College of Wisconsin, simont@mcw.edu, Dawei Li, Medical College of Wisconsin, dli@mcw.edu, Susan Bromberg, Medical College of Wisconsin, sbromber@mcw.edu, Dean Pasko, Medical College of Wisconsin, dpasko@mcw.edu, Mary Shimoyama, Medical College of Wisconsin, shimoyma@mcw.edu, Howard Jacob, Medical College of Wisconsin, jacob@mcw.edu
Abstract: VCMap is a popular multi-species, multiple map integration and visualization tool. It is particularly useful for species for which finished whole genome assemblies are not available. The inclusion of QTL maps is also a unique feature.

E-12
Title: DNA microarray analysis of gene expression during molting in the spruce budworm, Choristoneura fumiferana

Authors: Dayu Zhang, Great Lakes Forestry Centre, Natural Resources Canada and Department of microbiology, University of Guelph, dzhang@nrcan.gc.ca, Tim Ladd, Great Lakes Forestry Centre, Natural Resources Canada, tladd@nrcan.gc.ca, Sichun Zheng, Great Lakes Forestry Centre, Natural Resources Canada, szheng@nrcan.gc.ca, Lan Li, Great Lakes Forestry Centre, Natural Resources Canada, lanli@nrcan.gc.ca, Deborah Buhlers, Great Lakes Forestry Centre, Natural Resources Canada, dbuhlers@nrcan.gc.ca, Peter Krell, Department of microbiology, University of Guelph, pkrell@uoguelph.ca, Basil Arif, Great Lakes Forestry Centre, Natural Resources Canada, barif@nrcan.gc.ca, Qili Feng, Great Lakes Forestry Centre, Natural Resources Canada, qfeng@nrcan.gc.ca
Abstract: A cDNA-based microarray has been constructed and used to analyze gene expression profiles of Choristoneura during larval molting. Genes that show significant difference in the expression level between molting and intermolting have been identified and clustered. These genes are involved in several biological processes.

E-13
Title: Improvement of Microarray Analysis Algorithms with Intelligent Parameter Selection for SVM

Authors: John Phan, Georgia Institute of Technology, gtg407s@mail.gatech.edu, Richard Moffitt, Georgia Institute of Technology, gtg394z@mail.gatech.edu, Andrew Young, Atlanta VA Medical Center, maywang@bme.gatech.edu, John Petros, Emory University, maywang@bme.gatech.edu, May Wang, Georgia Institute of Technology, maywang@bme.gatech.edu
Abstract: Genetic marker identification from microarray data is an important step towards improving clinical diagnosis and prognosis of disease. Several methods have been applied to this problem, such as support vector machines (SVM). However, SVM parameters must be finely tuned to the dataset of interest in order to optimize the algorithm.

E-14
Title: Patterns of Gene Deletion following Genome Duplication in Yeast

Authors: Jake Byrnes, Department of Ecology and Evolution, University of Chicago, byrnes@uchicago.edu, Wen-Hsiung Li, Department of Ecology and Evolution, University of Chicago, wli@uchicago.edu
Abstract: Whole genome duplication (WGD) is followed by massive duplicate deletion that reorganizes gene adjacencies (Wolfe 2001). We compare the deletion patterns and adjacency reorganization following WGD in yeast with simulations. We find that deletion events alternate between paralogous chromosomes more often than expected under a random duplicate deletion model.
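
A comparison of this kind can be sketched as a permutation test. The sketch assumes that the deletion events along a duplicated region are recorded as a string of "A" and "B", naming the paralogous chromosome that lost its copy at each successive locus, and it uses random shuffling of those labels as a stand-in for a random deletion model. The encoding, the null model, and the function names are assumptions made for illustration and are not the authors' simulation.

```python
import random


def count_alternations(losses):
    """Number of adjacent loci whose deletions fall on different paralogous chromosomes."""
    return sum(a != b for a, b in zip(losses, losses[1:]))


def alternation_p_value(losses, n_sim=10000, seed=0):
    """Fraction of shuffled label orders with at least as many alternations as observed."""
    rng = random.Random(seed)
    observed = count_alternations(losses)
    labels = list(losses)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(labels)
        if count_alternations(labels) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)


# Hypothetical deletion record for one duplicated region.
record = "ABABABBABA"
print(count_alternations(record), alternation_p_value(record))
```

A small value would indicate more alternation than the shuffled records typically show, under the assumptions stated above.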

F-1
Title: An algorithm for designing siRNA and oligonucleotides with enhanced target specificity

Authors: Tariq Alsheddi, George Mason University, talshedd@gmu.edu, Ancha Baranova, George Mason University, abaranov@gmu.edu
Abstract: We have developed a new algorithm that allows one to map all unique short-string sequences ("the target") with lengths (N) = 9-15 nt within large sets of sequences. This efficient approach minimizes off-target gene silencing by excluding potential cross-hybridization candidates that the widely used BLAST search may overlook.
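
The notion of a unique short string can be made concrete with a minimal sketch: index every substring of length N across a set of sequences and keep those that occur exactly once. This is a naive dictionary-based illustration, not the authors' algorithm; it ignores reverse complements and near-matches, and the function name and toy sequences are hypothetical.

```python
from collections import defaultdict


def unique_kmers(sequences, n):
    """Map each length-n substring occurring exactly once overall to (sequence id, offset)."""
    positions = defaultdict(list)
    for seq_id, seq in sequences.items():
        for start in range(len(seq) - n + 1):
            positions[seq[start:start + n]].append((seq_id, start))
    return {kmer: hits[0] for kmer, hits in positions.items() if len(hits) == 1}


# Toy example with n = 3; the abstract works with lengths of 9-15 nt.
toy = {"gene1": "ACGTAC", "gene2": "CGTTT"}
targets = unique_kmers(toy, 3)
print(sorted(targets))
```

Here "CGT" is shared by both toy sequences and is therefore excluded, while the remaining substrings each identify a single location.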

F-2
Title: A Pipeline for Computational Screening of Candidate Mammalian Non-coding RNAs

Authors: Yongmei Ji, Rosetta Inpharmatics LLC, a wholly owned subsidiary of Merck & Co., Inc., yongmei_ji@merck.com, Ronghua Chen, Rosetta Inpharmatics LLC, a wholly owned subsidiary of Merck & Co., Inc., ronghua_chen@merck.com, Archie Russell, Rosetta Inpharmatics LLC, a wholly owned subsidiary of Merck & Co., Inc., archie_russell@merck.com, Guoya Li, Rosetta Inpharmatics LLC, a wholly owned subsidiary of Merck & Co., Inc., guoya_li@merck.com, Jason Johnson, Rosetta Inpharmatics LLC, a wholly owned subsidiary of Merck & Co., Inc., jason_johnson@merck.com
Abstract: A significant fraction of the mammalian transcriptome appears to be non-coding sequences. We developed a pipeline for systematic computational screening of mammalian transcript sequences to identify putative non-coding RNA genes. The predicted non-coding transcripts provide a foundation for experimental validation and expression analysis of ncRNA genes.

F-3
Title: Identifying Targets of Small Non-coding RNA Genes in Prokaryotes

Authors: Brian Tjaden, Wellesley College, btjaden@wellesley.edu
Abstract: Small non-coding RNA genes in prokaryotes often act by basepair binding to mRNA targets as a means of post-transcriptional regulation. To date, only a handful of targets have been identified for these genes. We present a novel approach for predicting regulatory targets of non-coding RNAs in prokaryotes.

F-4
Title: Analyzing DNA sequence motifs in a SNAP

Authors: Yoseph Barash, Hebrew University, hoan@cs.huji.ac.il, Aviad Rozenhek, Hebrew University, aviadr@cs.huji.ac.il, Jeremy Moskovich, Hebrew University, playmobil@cs.huji.ac.il, Tommy Kaplan, Hebrew University, tommy@cs.huji.ac.il, Hanah Margalit, Hebrew University, hanah@md.huji.ac.il, Nir Friedman, Hebrew University, nir@cs.huji.ac.il
Abstract: Despite the abundance of publications on cis regulatory motif search, a researcher may still find this task hard. One of the reasons is the lack of a friendly tool integrating all computational tasks involved. In this work we present the SNAP toolset and demonstrate its usefulness for sequence motif analysis.

G-1
Title: T-STAG: resource and web interface for tissue specific transcripts and genes

Authors: Shobhit Gupta, MPI for molecular genetics, gupta@molgen.mpg.de, Martin Vingron, MPI for molecular genetics, vingron@molgen.mpg.de, Stefan Haas, MPI for molecular genetics, haas@molgen.mpg.de
Abstract: T-STAG contains EST-based predicted genes and transcripts specifically/significantly expressed in certain tissues/sub-tissues. This data set is categorized according to different biological (disease/normal) and technical (normalization/subtraction) origin of the respective cDNA libraries. Thus T-STAG allows users to investigate and compare distinct subsets of genes/transcripts.

G-2
Title: Priority ANalysis for Disease Association (PANDA) System

Authors: Takayuki Taniya, Japan Biological Informatics Research Center, ttaniya@jbirc.aist.go.jp, Susumu Tanaka, Tokyo Institute of Psychiatry, stanaka@prit.go.jp, Hideki Hanaoka, The University of Tokyo Biotechnology Research Center Lab. of Plant Biotechnology, uhanaoka@mail.ecc.u-tokyo.ac.jp, Harutoshi Maekawa, C's lab Co.Ltd., hmaekawa@jbirc.aist.go.jp, Chisato Yamasaki, National Institute of Advanced Industrial Science and Technology, cyamasak@jbirc.aist.go.jp, Tadashi Imanishi, National Institute of Advanced Industrial Science and Technology, Imanishi@jbirc.aist.go.jp, Takashi Gojobori, National Institute of Genetics, tgojobor@genes.nig.ac.jp
Abstract: The purpose of our study is to focus on new disease-susceptible genes by using H-InvDB. We used 8 kinds of information (GO, KEGG, etc.) coupled with an algorithm to assign a significance value to each gene in relation to a specific disease, and conducted discriminant analysis to predict those candidates.

G-3
Title: Gviewer: A novel genome-wide viewer for gene, pathway, phenotype, and disease data

Authors: Dean Pasko, Medical College of Wisconsin, dpasko@mcw.edu, Jiali Chen, Medical College of Wisconsin, jlchen@mcw.edu, Lan Zhao, Medical College of Wisconsin, zlan1@hotmail.com, Mary Shimoyama, Medical College of Wisconsin, shimoyma@mcw.edu, Weiye Wang, Medical College of Wisconsin, wcwangw13@yahoo.com, Victoria Petri, Medical College of Wisconsin, vpetri@mcw.edu, Susan Bromberg, Medical College of Wisconsin, SBromberg@mail.brc.mcw.edu, Simon Twigger, Medical College of Wisconsin, simont@mcw.edu, Anne Kwitek, Medical College of Wisconsin, akwitek@mcw.edu, Howard Jacob, Medical College of Wisconsin, jacob@mcw.edu
Abstract: Gviewer provides researchers with an interactive, broad-view graphic of all genomic elements, from single gene function to biological processes, cellular components, phenotypes, diseases, and pathways. The tool also provides navigation functionality, which allows researchers to access detailed reports and sequence data.

G-4
Title: A New Professional Web-based ISCB Student Council Framework for Computational Biology Support http://www.iscbsc.org

Authors: Parthiban Vijaya, International Max Planck Research School, parthi@uni-koeln.de
Abstract: A new professional framework for ISCB student council activities has been created to promote computational biology support for researchers and students. It integrates all activities of the existing student council committees under one roof. In addition, the service is maintained in a distributed fashion, enables simple, fast and efficient communication between students and mentors, and shortens the distance between those who work in closely related areas of computational biology.

G-5
Title: EST database development and bioinformatics analysis of the transcripts of the spruce budworm genes

Authors: Lan Li, Great Lakes Forestry Center, Natural Resources Canada, lanli@nrcan.gc.ca, Tim Ladd, Great Lakes Forestry Center, Natural Resources Canada, tladd@nrcan.gc.ca, Sichun Zheng, Great Lakes Forestry Center, Natural Resources Canada, szheng@nrcan.gc.ca, Deborah Buhlers, Great Lakes Forestry Center, Natural Resources Canada, dbuhlers@nrcan.gc.ca, Peter J. Krell, University of Guelph, pkrell@uoguelph.ca, Basil Arif, Great Lakes Forestry Center, Natural Resources Canada, barif@nrcan.gc.ca, Arthur Retnakaran, Great Lakes Forestry Center, Natural Resources Canada, aretnak@nrcan.gc.ca, Qili Feng, Great Lakes Forestry Center, Natural Resources Canada, qfeng@nrcan.gc.ca
Abstract: A web-based EST resource (http://www.pestgenomics.org/database.htm) to accelerate the molecular and genomic analysis of the spruce budworm (SB), Choristoneura fumiferana, can be searched by BLAST and keyword. A further bioinformatics analysis of the transcripts of the SB genes is based on the annotated data resources.

G-6
Title: Protein Structure Visualization on a Graphics-Accelerated PDA

Authors: Gregory Quinn, San Diego Supercomputer Center, quinn@sdsc.edu
Abstract: We present here a protein structure visualization application for mobile devices that leverages the hardware graphics acceleration and VGA displays being incorporated into the new breed of PDAs and mobile devices coming to market. The application includes a protein structure database to enable persistence of network-derived data.

G-7
Title: Applying Online Analytical Processing (OLAP) to Mine Gene Expression Databases

Authors: Nadim Alkharouf, US Department of Agriculture, ARS, Soybean Genomics and Improvement Laboratory, nadimk@hotmail.com, D. Curtis Jamison, George Mason University, cjamison@gmu.edu, Benjamin Matthews, US Department of Agriculture, ARS, Soybean Genomics and Improvement Laboratory, matthewb@ba.ars.usda.gov
Abstract: An OLAP cube was constructed and used to mine a time series experiment designed to identify genes associated with resistance of soybean to the soybean cyst nematode. A number of candidate resistance genes and pathways were found. Compared to traditional cluster analysis of gene expression data, OLAP was more effective and faster in finding biologically meaningful information.

G-8
Title: Estimating Characteristics of Chemical Reactions Using SPLS

Authors: Jung-Wook Bang, Imperial College, jbang@doc.ic.ac.uk, Derek Crockford, Imperial College, jbang@doc.ic.ac.uk, Pazos Florencio, Imperial College, jbang@doc.ic.ac.uk
Abstract: Inhibition of enzymes in metabolic networks is a complex phenomenon that has important practical implications in pharmacology, since many drugs can inhibit enzymes, leading to unwanted side effects. We propose a system that estimates toxin-induced effects. We can successfully retrieve toxin-induced effects on given metabolites automatically.

G-9
Title: The New OMIA: An Enhanced Platform For Online Mendelian Inheritance In Animals

Authors: Johann Lenffer, Biotechnology Research Institute, Macquarie University, jlenffer@bio.mq.edu.au, Kao Castle, biokao, Sydney, Australia, kao@biokao.com, Michael Poidinger, Johnson & Johnson Research Australia, mpoiding@medau.jnj.com, Frank W. Nicholas, Reprogen, Faculty of Veterinary Science, University of Sydney, frankn@vetsci.usyd.edu.au, Shoba Ranganathan, Biotechnology Research Institute, Macquarie University, shoba@els.mq.edu.au
Abstract: Online Mendelian Inheritance in Animals (OMIA) is the primary source of information on inherited disorders in non-laboratory animals. OMIA provides comprehensive access to animal models of human inherited disorders and information on potential human orthologs of inherited traits in animals. This re-release brings a long-overdue update to a well-established resource.