ISMB ECCB 2009

Accepted Posters

Category 'M' - Machine Learning

Poster M01

Discovering the Rules of Reversible Membrane Binding: A Machine Learning Protocol for identifying C1, C2, and PH Domain Properties.

Morten Källberg- University of Illinois

No additional authors

Short Abstract: We present a machine learning protocol for determining membrane-targeting properties achieving 85-90% accuracy in separating binding and non-binding domains within families. The developed model is represented as an interpretable tree of rules showing good agreement between statistically discovered binding properties and those reported in experimental work.

Long Abstract: Click Here

Poster M02

A new kernel function for clinical data

Anneleen Daemen- KULeuven

Bart De Moor (KULeuven, ESAT-SCD);

Short Abstract: To fully exploit clinical information, appropriate modeling is required. We propose a new kernel function that distinguishes between continuous, ordinal and nominal variables. A Least Squares Support Vector Machine based on this kernel function significantly outperformed the widely used linear kernel function when tested on three data sets.

Long Abstract: Click Here
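As a rough illustration of what a kernel that treats variable types separately can look like, the sketch below averages a per-variable similarity over all variables. The function name `clinical_kernel`, the range-normalized similarity for continuous and ordinal variables, and the exact-match indicator for nominal variables are assumptions made for this example; they are not necessarily the kernel proposed in the poster.

```python
def clinical_kernel(x, z, kinds, ranges):
    """Average per-variable similarity between two patient records x and z.

    kinds[i]  -- "continuous", "ordinal" or "nominal" (hypothetical labels)
    ranges[i] -- max minus min of variable i in the training data
                 (ignored for nominal variables, so it may be None)
    """
    total = 0.0
    for xi, zi, kind, r in zip(x, z, kinds, ranges):
        if kind == "nominal":
            # Nominal values have no order: similarity is 1 only on an exact match.
            total += 1.0 if xi == zi else 0.0
        else:
            # Continuous and ordinal values: similarity falls linearly with distance.
            total += (r - abs(xi - zi)) / r
    return total / len(x)


kinds = ["continuous", "ordinal", "nominal"]
ranges = [40.0, 3.0, None]
patient_a = [50.0, 2, "A"]
patient_b = [60.0, 2, "B"]
similarity = clinical_kernel(patient_a, patient_b, kinds, ranges)
```

Under these assumptions every variable contributes a value between 0 and 1, so no single variable dominates because of its scale.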

Poster M03

Identifying essential genes in metabolic networks of bacteria in silico

Rainer Koenig- IPMB

Kitiporn Plaimas (IPMB, University of Heidelberg, Bioinformatics); Roland Eils (DKFZ Heidelberg, TBI Bioinformatics);

Short Abstract: We have developed a machine learning algorithm that infers essential reactions in a metabolic network from the topology of the network and experimental data from genomic sequences and gene expression of the corresponding coding genes. With this, we support and extend high-throughput genome-wide knockout screens.

Long Abstract: Click Here

Poster M04

Predicting Protein Subcellular Localization Using Abstract Sequential Features

Cornelia Caragea- Iowa State University

Adrian Silvescu (Yahoo! Labs, CA); Vasant Honavar (Iowa State University, Computer Science);

Short Abstract: We present an approach to predicting protein subcellular localization from amino acid sequences that exploits the complementary strengths of feature construction (constructing complex features from existing features) and feature abstraction (grouping similar features to generate more abstract features) or feature selection to adapt the data representation used by the learner.

Long Abstract: Click Here

Poster M05

Kernel Alignment K-NN for the Identification of Human Cancer Samples using the Gene Expression Profiles

Manuel Martin-Merino- Universidad Pontificia de Salamanca

No additional authors

Short Abstract: The kNN classifier has been applied to the identification of cancer samples with encouraging results. We present a new method to learn a linear combination of dissimilarities for the kNN classifier that is robust to overfitting and is solved via semi-definite programming. Our algorithm outperforms other alternatives on several cancer datasets.

Long Abstract: Click Here
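The sketch below shows only the prediction step of a kNN classifier that works on a weighted sum of several dissimilarity measures. The weights are taken as given here; learning them (by semi-definite programming, in the poster) is not shown. The function name `knn_predict` and the toy data are hypothetical.

```python
from collections import Counter


def knn_predict(dissims, weights, labels, k=3):
    """Classify one query sample by majority vote among its k nearest neighbours.

    dissims[m][i] -- m-th dissimilarity between the query and training sample i
    weights[m]    -- non-negative weight of the m-th dissimilarity (assumed given)
    labels[i]     -- class label of training sample i
    """
    n = len(labels)
    # Combine the dissimilarity measures into a single value per training sample.
    combined = [sum(w * d[i] for w, d in zip(weights, dissims)) for i in range(n)]
    # Indices of the k training samples closest to the query.
    nearest = sorted(range(n), key=lambda i: combined[i])[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


dissims = [[0.1, 0.9, 0.2, 0.8],   # e.g. a Euclidean-type dissimilarity
           [0.3, 0.7, 0.1, 0.9]]   # e.g. a correlation-type dissimilarity
labels = ["tumor", "normal", "tumor", "normal"]
prediction = knn_predict(dissims, [0.5, 0.5], labels, k=3)
```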

Poster M06

A Kernel PCA Biplot method applied to gene expression data

Ferran Reverter- Universitat de Barcelona

Esteban Vegas Lozano (Universitat de Barcelona, Dept. Estadística);

Short Abstract: We describe a computational graphical tool to visualize genes and samples. We develop a biplot technique based on kernel PCA. We analyze two genomic datasets. Results suggest that our technique is a useful tool to find genes that have a similar pattern of up/down regulation for the samples.

Long Abstract: Click Here

Poster M07

Automatic classification of P-type ATPases using Structured Logistic Regression

Poul Liboriussen- Aarhus University

Bjørn Panyella Pedersen (Aarhus University, Centre for Membrane Pumps in Cells and Disease (PUMPKIN)); Poul Nissen (Aarhus University, Centre for Membrane Pumps in Cells and Disease (PUMPKIN)); Christian Nørgaard Storm Pedersen (Aarhus University, Bioinformatics Research Center (BiRC));

Short Abstract: P-type ATPases are a very large family of ATP-driven membrane pumps involved in transmembrane transport of charged substrates. We have constructed a classifier that can distinguish between the 11 subfamilies with high accuracy. The classifier is applied to Swiss-Prot/TrEMBL and finds 6,624 P-type ATPases.

Long Abstract: Click Here

Poster M08

Discovering biomarker panels in experiments with pooling or heterogeneous tissues

Dirk Repsilber- Research Institute for the Biology of Farm Animals

Anna Telaar (FBN Dummerstorf, Genetics and Biometry); Gerd Nürnberg (FBN Dummerstorf, Genetics and Biometry);

Short Abstract: A plethora of statistical learning approaches is being applied to find biomarkers and also true multivariate biomarker signatures. We show how pooling and tissue heterogeneity influence the possibility of detection of biomarker signatures and compare statistical learning algorithms with respect to robustness to these stumbling blocks of biomarker discovery.

Long Abstract: Click Here

Poster M09

Combining evidence from ranked gene lists

Raivo Kolde- University of Tartu

Sven Laur (University of Tartu, Institute of Computer Science); Priit Adler (University of Tartu, Institute of Molecular and Cell Biology); Jaak Vilo (University of Tartu, Institute of Computer Science);

Short Abstract: We propose a strategy for combining evidence from ranked lists of genes. In addition to the ranking of genes, the algorithm assigns a significance probability to each gene. The method can be applied in network reconstruction, meta-analysis of microarray studies, etc.

Long Abstract: Click Here
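One simple way to attach a significance value to a gene that appears in several ranked lists is sketched below. It assumes that every list ranks the same genes and that, under the null hypothesis, a gene's normalized rank in each list is independent and roughly uniform on (0, 1]; the probability that the best of m such ranks is at most b is then 1 - (1 - b)^m. The function name `combine_ranked_lists` and this minimum-rank statistic are assumptions for illustration, not necessarily the algorithm presented in the poster.

```python
def combine_ranked_lists(lists):
    """Return (gene, p_value) pairs sorted from most to least significant.

    lists -- ranked gene lists, best gene first; all lists are assumed
             to contain exactly the same genes.
    """
    n = len(lists[0])   # number of genes
    m = len(lists)      # number of ranked lists
    scores = {}
    for gene in lists[0]:
        # Normalized rank in (0, 1]: 1/n for the top gene, 1 for the last.
        norm_ranks = [(lst.index(gene) + 1) / n for lst in lists]
        best = min(norm_ranks)
        # Approximate null probability of a best rank this good or better.
        scores[gene] = 1.0 - (1.0 - best) ** m
    return sorted(scores.items(), key=lambda kv: kv[1])


lists = [["a", "b", "c", "d"],
         ["a", "c", "b", "d"],
         ["b", "a", "d", "c"]]
combined = combine_ranked_lists(lists)
```

The minimum-rank statistic rewards a gene for one very good rank; a statistic based on several order statistics would reward consistency across lists instead.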

Poster M10

Missing Value Imputation for Epistasis Maps

Colm Ryan- University College Dublin

Derek Greene (University College Dublin, School of Computer Science and Informatics); Nevan Krogan (University of California, San Francisco, Quantitative Biology Institute); Gerard Cagney (University College Dublin, Conway Institute of Biomolecular and Biomedical Research); Pádraig Cunningham (University College Dublin, School of Computer Science and Informatics);

Short Abstract: We introduce the problem of missing value imputation for epistatic miniarray profiles (E-MAPs) and show the results of adapting two existing techniques to address the problem. In doing so we highlight some unique aspects of the problem: the pairwise nature of the data and the high percentage of missing values.

Long Abstract: Click Here

Poster M11

KIRMES: Kernel-based Identification of Regulatory Modules in Euchromatic Sequences

Sebastian Schultheiss- Friedrich Miescher Laboratory of the Max Planck Society

Wolfgang Busch (Duke University, Biology Department); Jan Lohmann (University of Heidelberg, Center for Organismal Studies); Oliver Kohlbacher (University of Tuebingen, Wilhelm Schickard Institute for Computer Science); Gunnar Raetsch (Friedrich Miescher Laboratory of the Max Planck Society, Machine Learning in Biology);

Short Abstract: We identify genes regulated by the same transcription factor by analyzing sets of co-expressed genes from microarrays. KIRMES infers all genes regulated by the same mechanism as the ones in the input set. KIRMES makes use of motif sampling and newly developed kernel methods for this task.

Long Abstract: Click Here

Poster M12

Combining Support Vector Machines to predict novel angiogenesis genes

Kaur Alasoo- University of Tartu

Phaedra Agius (Memorial Sloan-Kettering Cancer Center); Jaak Vilo (Quretec Ltd); Hedi Peterson (Quretec Ltd);

Short Abstract: Angiogenesis is the natural process of growing new blood vessels in the human body, which also plays an important role in cancer development. We have identified candidate genes based on 274 known angiogenesis genes using a new machine learning method employing Support Vector Machine (SVM) classification.

Long Abstract: Click Here

Poster M13

Natural Kernel-Induced Bayesian Learning for Microarray Data Analysis

Leo Cheung- Loyola University Medical Center

Xin Zhao (Sanjole Inc., Computer Engineering);

Short Abstract: Incorporating a novel natural kernel building procedure under a general unifying Bayesian framework, we propose a Natural Kernel-Imbedded Gaussian Process (NKIGP) for microarray data analysis. Based on simulated and real data studies, NKIGP performed consistently well in both linear and non-linear cases without the need for parameter tuning.

Long Abstract: Click Here

Poster M14

Using support vector machines for the evaluation of computationally developed lipoxygenase structures

Aditya Jitta- University of Hyderabad

Aparoy P (University of Hyderabad, School of Life Sciences); Reddanna P (University of Hyderabad, School of Life Sciences);

Short Abstract: Lipoxygenases are a structurally related family of non-heme, iron-containing dioxygenases. The geometry and composition of the metal-binding site in 3D models of lipoxygenases are very important. Based on these features, a tool was developed using support vector machines to evaluate computationally developed lipoxygenase structures.

Long Abstract: Click Here

Poster M15

I/NI-calls: a novel unsupervised feature selection criterion

Sepp Hochreiter - Johannes Kepler University Linz

Djork-Arné Clevert (Johannes Kepler University Linz, Institute of Bioinformatics); Willem Talloen (Johnson & Johnson Pharmaceutical Research & Development, Pharmaceutical Research & Development); Hinrich Göhlmann (Johnson & Johnson Pharmaceutical Research & Development, Pharmaceutical Research & Development); Sepp Hochreiter (Johannes Kepler University Linz, Institute of Bioinformatics);

Short Abstract: We propose a novel unsupervised gene selection criterion that is based on a probabilistic latent variable model that takes probe level information -- probe correlations that cannot be explained by noise -- into account to filter out inconsistent probe sets.

Long Abstract: Click Here

Poster M16

An integrative pipeline for automated data analysis and gene function annotation for genome wide high content RNAi screening

Stephen Wong- Center for Biotechnology and Informatics, The Methodist Hospital

Xiaobo Zhou (Center for Biotechnology and Informatics, The Methodist Hospital, The Methodist Hospital Research Institute and Department of Radiology);

Short Abstract: We propose an integrated pipeline of automated data analysis for high-content screening of genome-wide RNA interference on Drosophila cell assays. Millions of cells are efficiently segmented, and previously un-scored phenotypes are identified. This image bioinformatics pipeline is especially helpful in predicting the roles of genes in complex biological processes.

Long Abstract: Click Here

Poster M17

Computational Linguistic Analyses of Unknown Metagenome Sequences

Victor Seguritan- San Diego State University

Anca Segall (San Diego State University, Biology); Rob Edwards (San Diego State University, Computer Science); Forest Rohwer (San Diego State University, Biology);

Short Abstract: A method is needed to assign functions to unknown sequences which does not rely on sequence homology alone. The linguistic elements, syntax and semantics, of several model proteins will be used to assign functions to unknown metagenomes in a manner similar to the concept of understanding human language.

Long Abstract: Click Here

Poster M18

Neural Network Pairwise Interaction Fields for protein model quality assessment

Alberto Jesus Martin- University College

Gianluca Pollastri (Complex and Adaptive Systems Laboratory, University College Dublin, School of Computer Science and Informatics); Alessandro Vullo (Complex and Adaptive Systems Laboratory, University College Dublin, School of Computer Science and Informatics);

Short Abstract: We present a new knowledge-based Model Quality Assessment Program (MQAP) at the residue level which evaluates single protein structure models. We use a tree representation of the C-alpha trace to train a novel Neural Network Pairwise Interaction Field (NN-PIF) to predict the global quality of a model.

Long Abstract: Click Here

Poster M19

BayesCall: A model-based basecalling algorithm

Wei-Chun Kao- UC Berkeley

Kristian Stevens (UC Davis, Computer Science); Yun Song (UC Berkeley, EECS);

Short Abstract: A novel model-based basecalling algorithm BayesCall is introduced for the Illumina sequencing platform. This new approach significantly improves the accuracy over Illumina's basecaller Bustard. For the 76-cycle PhiX174 data from Genome Analyzer II, BayesCall improves Bustard's per-base error rate by about 47%.

Long Abstract: Click Here

Poster M20

A Bayesian Monte Carlo Hidden Markov Model Approach to Transmembrane Protein Structure Prediction

Takashi Kaburagi- Waseda University

Takashi Matsumoto (Waseda University, Electrical Engineering and Bioscience);

Short Abstract: We present the preliminary results of a novel scheme for transmembrane protein structure prediction using a Bayesian hidden Markov model. We applied a Bayesian learning method via the Markov chain Monte Carlo (MCMC) sampling scheme to evaluate the posterior distribution of Hidden Markov Model (HMM) parameters given the training data set.

Long Abstract: Click Here

Poster M21

A method for analyzing gene expression profiles based on the underlying structures

Shigeto Seno- Dept. Bioinfo. Eng., Grad. Sch. Info. Sci. Tech., Osaka Univ.

Yoichi Takenaka (Graduate School of Information Science and Technology, Osaka University, Bioinformatic Engineering); Hideo Matsuda (Graduate School of Information Science and Technology, Osaka University, Bioinformatic Engineering);

Short Abstract: Clustering is a powerful tool for elucidating relationships among genes, and one of the first steps in analysis. However, choosing a suitable method for a given dataset is still difficult. Our approach discovers the underlying structure of a gene expression profile and provides a more intuitive understanding.

Long Abstract: Click Here

Poster M22

Monte Carlo-Based Bayesian Prediction of Gene Regulatory Networks with Zipf Distribution: Mouse Nuclear Receptor Superfamily

Haruka Miyachika- Waseda University

Yusuke Kitamura (Waseda University, Electrical Engineering and Bioscience); Tomomi Kimiwada (National Center of Neurology and Psychiatry, Neurosurgery); Jun Maruyama (Waseda University, Electrical Engineering and Bioscience); Takashi Kaburagi (Waseda University, Electrical Engineering and Bioscience); Takashi Matsumoto (Waseda University, Electrical Engineering and Bioscience); Keiji Wada (National Center of Neurology and Psychiatry, Neurosurgery);

Short Abstract: We present a Monte Carlo-based algorithm to predict gene regulatory network structure within a Bayesian framework. The algorithm assumes that the prior distribution follows Zipf's law, and is implemented using the Exchange Monte Carlo method. We applied the algorithm to a mouse nuclear receptor superfamily.

Long Abstract: Click Here

Poster M23

Improving the prediction of protein-protein interactions by combining different biological sources

Herman van Haagen- LUMC

Peter-Bram 't Hoen (LUMC, Human Genetics); Barend Mons (LUMC, Human Genetics); Martijn Schuemie (Erasmus MC, Biosemantics group);

Short Abstract: Protein-protein interactions (PPIs) can be predicted based on different databases. In this study we investigate if combining those databases increases prediction power. In addition we investigate if the combined system covers more PPIs that can be evaluated. First results are promising both on coverage and prediction improvement.

Long Abstract: Click Here

Poster M24

Two-way Analysis of High-Dimensional Metabolomic Datasets

Ilkka Huopaniemi- Helsinki University of Technology

Tommi Suvitaival (Helsinki University of Technology, Department of Information and Computer Science); Janne Nikkilä (Helsinki University of Technology, Department of Information and Computer Science); Matej Oresic (VTT Technical Research Centre of Finland, Quantitative Biology and Bioinformatics); Samuel Kaski (Helsinki University of Technology, Department of Information and Computer Science);

Short Abstract: We present a Bayesian machine learning method for multivariate two-way ANOVA-type analysis of high-dimensional, small sample-size metabolomic datasets. The method assumes clustered metabolites and presents confidence intervals of main and interaction up/down-regulation effects of the clusters.

Long Abstract: Click Here

Poster M25

Prediction of antifreeze protein from protein sequence

Chin-Sheng Yu- Feng Chia University

No additional authors

Short Abstract: Screening of current databases identifies very few homologs of antifreeze proteins (AFPs) with similar sequence and structure in other species. We present an approach to recognize AFPs from protein sequence. For a nonredundant data set, the overall prediction accuracy reaches 88%.

Long Abstract: Click Here

Poster M26

Bioinformatic analyses of mammalian 5'-UTR sequence properties of mRNAs predict alternative translation initiation sites

Jill Wegrzyn- University of California at San Diego

Thomas Drudge (University of California at San Diego, Skaggs School of Pharmacy and Pharmaceutical Sciences); Faramarz Valafar (San Diego State University, Bioinformatics and Medical Informatics Research Center (BMIRC)); Vivian Hook (University of California at San Diego, Skaggs School of Pharmacy and Pharmaceutical Sciences);

Short Abstract: This study conducted a bioinformatic evaluation of the 5'-UTR of mammalian mRNA sequences. Machine learning techniques were applied for the classification and identification of non-AUG initiation sites in a group of mRNAs that have been experimentally demonstrated to utilize alternative sites for protein translation.

Long Abstract: Click Here

Poster M27

Cognitive State Classification with Magnetoencephalography Data

Andrej Savol- University of Pittsburgh

No additional authors

Short Abstract: We train a Support Vector Machine (SVM) soft-margin classifier on magnetoencephalography (MEG) brain-activation trajectories generated by human subjects viewing 60 common nouns divided into 12 noun groups. Semantic groupings and sensor information content are addressed.

Long Abstract: Click Here

Poster M28

The evaluation of common 1H-NMR metabolomics data preprocessing procedures reveals unanticipated side-effects

Tim De Meyer- Ghent University

Bjorn Van Gasse (Ghent University, Dept. Organic Chemistry); Davy Sinnaeve (Ghent University, Dept. Organic Chemistry); Sofie Bekaert (Ghent University, Dept. Molecular Biotechnology); José Martins (Ghent University, Dept. Organic Chemistry); Wim Van Criekinge (Ghent University, Dept. Molecular Biotechnology);

Short Abstract: 1H-NMR metabolomics provides a high-throughput methodology capable of acquiring high-resolution profiles of low-molecular weight metabolites. However, the complicated data-analysis forms a major drawback, requiring numerous data preprocessing procedures (particularly normalization, reduction and scaling steps). Here, we evaluate the most common procedures and demonstrate several unanticipated side-effects.

Long Abstract: Click Here
