ISMB 2008 ISCB

Accepted Posters
Category 'M' - Machine Learning
Poster M01
Discovering the Rules of Reversible Membrane Binding: A Machine Learning Protocol for identifying C1, C2, and PH Domain Properties.
Morten Källberg- University of Illinois
No additional authors
Short Abstract: We present a machine learning protocol for determining membrane-targeting properties achieving 85-90% accuracy in separating binding and non-binding domains within families. The developed model is represented as an interpretable tree of rules showing good agreement between statistically discovered binding properties and those reported in experimental work.

Poster M02
A new kernel function for clinical data
Anneleen Daemen- KULeuven
Bart De Moor (KULeuven, ESAT-SCD);
Short Abstract: To fully exploit clinical information, appropriate modeling is required. We propose a new kernel function that distinguishes between continuous, ordinal and nominal variables. Evidently, a Least Squares Support Vector Machine based on this kernel function significantly outperformed the widely used linear kernel function when tested on three data sets.

Poster M03
Identifying essential genes in metabolic networks of bacteria in silico
Rainer Koenig- IPMB
Kitiporn Plaimas (IPMB, University of Heidelberg, Bioinformatics); Roland Eils (DKFZ Heidelberg, TBI Bioinformatics);
Short Abstract: We have developed a machine learning algorithm that infers essential reactions in a metabolic network from the topology of the network and from experimental data on the genomic sequences and gene expression of the corresponding coding genes. With this we support and extend experiments of high-throughput genome-wide knockout screens.

Poster M04
Predicting Protein Subcellular Localization Using Abstract Sequential Features
Cornelia Caragea- Iowa State University
Adrian Silvescu (Yahoo! Labs, CA); Vasant Honavar (Iowa State University, Computer Science);
Short Abstract: We present an approach to predicting protein subcellular localization from amino acid sequences that exploits the complementary strengths of feature construction (constructing complex features from existing features) and feature abstraction (grouping similar features to generate more abstract features) or feature selection to adapt the data representation used by the learner.

Poster M05
Kernel Alignment K-NN for the Identification of Human Cancer Samples using the Gene Expression Profiles
Manuel Martin-Merino- Universidad Pontificia de Salamanca
No additional authors
Short Abstract: The kNN classifier has been applied to the identification of cancer samples with encouraging results. We present a new method to learn a linear combination of dissimilarities for the kNN classifier that is robust to overfitting and relies on a semi-definite programming approach. Our algorithm outperforms other alternatives on several cancer datasets.

Poster M06
A Kernel PCA Biplot method applied to gene expression data
Ferran Reverter- Universitat de Barcelona
Esteban Vegas Lozano (Universitat de Barcelona, Dept. Estadística);
Short Abstract: We describe a computational graphical tool to visualize genes and samples. We develop a biplot technique based on kernel PCA. We analyze two genomic datasets. Results suggest that our technique is a useful tool to find genes that have a similar pattern of up/down regulation across the samples.

Poster M07
Automatic classification of P-type ATPases using Structured Logistic Regression
Poul Liboriussen- Aarhus University
Bjørn Panyella Pedersen (Aarhus University, Centre for Membrane Pumps in Cells and Disease (PUMPKIN)); Poul Nissen (Aarhus University, Centre for Membrane Pumps in Cells and Disease (PUMPKIN)); Christian Nørgaard Storm Pedersen (Aarhus University, Bioinformatics Research Center (BiRC));
Short Abstract: P-type ATPases are a very large family of ATP-driven membrane pumps involved in transmembrane transport of charged substrates. We have constructed a classifier that can distinguish between the 11 subfamilies with high accuracy. The classifier is applied to Swiss-Prot/TrEMBL and finds 6,624 P-type ATPases.

Poster M08
Discovering biomarker panels in experiments with pooling or heterogeneous tissues
Dirk Repsilber- Research Institute for the Biology of Farm Animals
Anna Telaar (FBN Dummerstorf, Genetics and Biometry); Gerd Nürnberg (FBN Dummerstorf, Genetics and Biometry);
Short Abstract: A plethora of statistical learning approaches is being applied to find biomarkers and also true multivariate biomarker signatures. We show how pooling and tissue heterogeneity influence the possibility of detection of biomarker signatures and compare statistical learning algorithms with respect to robustness to these stumbling blocks of biomarker discovery.

Poster M09
Combining evidence from ranked gene lists
Raivo Kolde- University of Tartu
Sven Laur (University of Tartu, Institute of Computer Science); Priit Adler (University of Tartu, Institute of Molecular and Cell Biology); Jaak Vilo (University of Tartu, Institute of Computer Science);
Short Abstract: We propose a strategy for combining evidence from ranked lists of genes. In addition to the ranking of genes, the algorithm assigns a significance probability to each gene. The method can be applied in network reconstruction, meta-analysis of microarray studies, etc.

Poster M10
Missing Value Imputation for Epistasis Maps
Colm Ryan- University College Dublin
Derek Greene (University College Dublin, School of Computer Science and Informatics); Nevan Krogan (University of California, San Francisco, Quantitative Biology Institute); Gerard Cagney (University College Dublin, Conway Institute of Biomolecular and Biomedical Research); Pádraig Cunningham (University College Dublin, School of Computer Science and Informatics);
Short Abstract: We introduce the problem of missing value imputation for epistasis miniarray profiles (E-MAPS) and show the results of adapting two existing techniques to address the problem. In doing so we highlight some unique aspects of the problem – the pairwise nature of the data and the high percentage of missing values.

Poster M11
KIRMES: Kernel-based Identification of Regulatory Modules in Euchromatic Sequences
Sebastian Schultheiss- Friedrich Miescher Laboratory of the Max Planck Society
Wolfgang Busch (Duke University, Biology Department); Jan Lohmann (University of Heidelberg, Center for Organismal Studies); Oliver Kohlbacher (University of Tuebingen, Wilhelm Schickard Institute for Computer Science); Gunnar Raetsch (Friedrich Miescher Laboratory of the Max Planck Society, Machine Learning in Biology);
Short Abstract: We identify genes regulated by the same transcription factor by analyzing sets of co-expressed genes from microarrays. KIRMES infers all genes regulated by the same mechanism as the ones in the input set. KIRMES makes use of motif sampling and newly developed kernel methods for this task.

Poster M12
Combining Support Vector Machines to predict novel angiogenesis genes
Kaur Alasoo- University of Tartu
Phaedra Agius (Memorial Sloan-Kettering Cancer Center); Jaak Vilo (Quretec Ltd); Hedi Peterson (Quretec Ltd);
Short Abstract: Angiogenesis is the natural process of growing new blood vessels in the human body, which also plays an important role in cancer development. We have identified candidate genes based on 274 known angiogenesis genes using a new machine learning method employing Support Vector Machine (SVM) classification.

Poster M13
Natural Kernel-Induced Bayesian Learning for Microarray Data Analysis
Leo Cheung- Loyola University Medical Center
Xin Zhao (Sanjole Inc., Computer Engineering);
Short Abstract: Incorporating a novel natural kernel building procedure under a general unifying Bayesian framework, we propose a Natural Kernel-Imbedded Gaussian Process (NKIGP) for microarray data analysis. In simulated and real data studies, NKIGP performed consistently well in both linear and non-linear cases without the need for parameter tuning.

Poster M14
Using support vector machines for the evaluation of computationally developed lipoxygenase structures
Aditya Jitta- University of Hyderabad
Aparoy P (University of Hyderabad, School of Life Sciences); Reddanna P (University of Hyderabad, School of Life Sciences);
Short Abstract: Lipoxygenases are a family of structurally related non-heme, iron-containing dioxygenases. The geometry and composition of the metal-binding site in 3D models of lipoxygenases are very important. Based on these features, a tool was developed using support vector machines to evaluate computationally developed lipoxygenase structures.

Poster M15
I/NI-calls: a novel unsupervised feature selection criterion
Sepp Hochreiter- Johannes Kepler University Linz
Djork-Arné Clevert (Johannes Kepler University Linz, Institute of Bioinformatics); Willem Talloen (Johnson & Johnson Pharmaceutical Research & Development, Pharmaceutical Research & Development); Hinrich Göhlmann (Johnson & Johnson Pharmaceutical Research & Development, Pharmaceutical Research & Development); Sepp Hochreiter (Johannes Kepler University Linz, Institute of Bioinformatics);
Short Abstract: We propose a novel unsupervised gene selection criterion based on a probabilistic latent variable model that takes probe-level information (probe correlations that cannot be explained by noise) into account to filter out inconsistent probe sets.

Poster M16
An integrative pipeline for automated data analysis and gene function annotation for genome wide high content RNAi screening
Stephen Wong- Center for Biotechnology and Informatics, The Methodist Hospital
Xiaobo Zhou (Center for Biotechnology and Informatics, The Methodist Hospital, The Methodist Hospital Research Institute and Department of Radiology);
Short Abstract: We propose an integrated pipeline of automated data analysis for high-content screening of genome-wide RNA interference on Drosophila cell assays. Millions of cells are efficiently segmented, and previously un-scored phenotypes are identified. This image bioinformatics pipeline is especially helpful in predicting the roles of genes in complex biological processes.

Poster M17
Computational Linguistic Analyses of Unknown Metagenome Sequences
Victor Seguritan- San Diego State University
Anca Segall (San Diego State University, Biology); Rob Edwards (San Diego State University, Computer Science); Forest Rohwer (San Diego State University, Biology);
Short Abstract: A method is needed to assign functions to unknown sequences which does not rely on sequence homology alone. The linguistic elements, syntax and semantics, of several model proteins will be used to assign functions to unknown metagenomes in a manner similar to the concept of understanding human language.

Poster M18
Neural Network Pairwise Interaction Fields for protein model quality assessment
Alberto Jesus Martin- University College
Gianluca Pollastri (Complex and Adaptive Systems Laboratory, University College Dublin, School of Computer Science and Informatics); Alessandro Vullo (Complex and Adaptive Systems Laboratory, University College Dublin, School of Computer Science and Informatics);
Short Abstract: We present a new knowledge-based Model Quality Assessment Program (MQAP) at the residue level which evaluates single protein structure models. We use a tree representation of the C-alpha trace to train a novel Neural Network Pairwise Interaction Field (NN-PIF) to predict the global quality of a model.

Poster M19
BayesCall: A model-based basecalling algorithm
Wei-Chun Kao- UC Berkeley
Kristian Stevens (UC Davis, Computer Science); Yun Song (UC Berkeley, EECS);
Short Abstract: A novel model-based basecalling algorithm BayesCall is introduced for the Illumina sequencing platform. This new approach significantly improves the accuracy over Illumina's basecaller Bustard. For the 76-cycle PhiX174 data from Genome Analyzer II, BayesCall improves Bustard's per-base error rate by about 47%.

Poster M20
A Bayesian Monte Carlo Hidden Markov Model Approach to Transmembrane Protein Structure Prediction
Takashi Kaburagi- Waseda University
Takashi Matsumoto (Waseda University, Electrical Engineering and Bioscience);
Short Abstract: We present the preliminary results of a novel scheme for transmembrane protein structure prediction using a Bayesian hidden Markov model. We applied a Bayesian learning method via the Markov chain Monte Carlo (MCMC) sampling scheme to evaluate the posterior distribution of Hidden Markov Model (HMM) parameters given the training data set.

Poster M21
A method for analyzing gene expression profiles based on the underlying structures
Shigeto Seno- Dept. Bioinfo. Eng., Grad. Sch. Info. Sci. Tech., Osaka Univ.
Yoichi Takenaka (Graduate School of Information Science and Technology, Osaka University, Bioinformatic Engineering); Hideo Matsuda (Graduate School of Information Science and Technology, Osaka University, Bioinformatic Engineering);
Short Abstract: Clustering is a powerful tool for elucidating relationships among genes, and one of the first steps in analysis. However, choosing a suitable method for a given dataset is still difficult. Our approach discovers the underlying structure of a gene expression profile and provides a more intuitive understanding.

Poster M22
Monte Carlo-Based Bayesian Prediction of Gene Regulatory Networks with Zipf Distribution: Mouse Nuclear Receptor Superfamily
Haruka Miyachika- Waseda University
Yusuke Kitamura (Waseda University, Electrical Engineering and Bioscience); Tomomi Kimiwada (National Center of Neurology and Psychiatry, Neurosurgery); Jun Maruyama (Waseda University, Electrical Engineering and Bioscience); Takashi Kaburagi (Waseda University, Electrical Engineering and Bioscience); Takashi Matsumoto (Waseda University, Electrical Engineering and Bioscience); Keiji Wada (National Center of Neurology and Psychiatry, Neurosurgery);
Short Abstract: We present a Monte Carlo-based algorithm to predict gene regulatory network structure within a Bayesian framework. The algorithm assumes that the prior distribution follows Zipf's law, and is implemented using the Exchange Monte Carlo method. We applied the algorithm to the mouse nuclear receptor superfamily.

Poster M23
Improving the prediction of protein-protein interactions by combining different biological sources
Herman van Haagen- LUMC
Peter-Bram 't Hoen (LUMC, Human Genetics); Barend Mons (LUMC, Human Genetics); Martijn Schuemie (Erasmus MC, Biosemantics group);
Short Abstract: Protein-protein interactions (PPIs) can be predicted based on different databases. In this study we investigate if combining those databases increases prediction power. In addition we investigate if the combined system covers more PPIs that can be evaluated. First results are promising both on coverage and prediction improvement.

Poster M24
Two-way Analysis of High-Dimensional Metabolomic Datasets
Ilkka Huopaniemi- Helsinki University of Technology
Tommi Suvitaival (Helsinki University of Technology, Department of Information and Computer Science); Janne Nikkilä (Helsinki University of Technology, Department of Information and Computer Science); Matej Oresic (VTT Technical Research Centre of Finland, Quantitative Biology and Bioinformatics); Samuel Kaski (Helsinki University of Technology, Department of Information and Computer Science);
Short Abstract: We present a Bayesian machine learning method for multivariate two-way ANOVA-type analysis of high-dimensional, small sample-size metabolomic datasets. The method assumes clustered metabolites and presents confidence intervals of main and interaction up/down-regulation effects of the clusters.

Poster M25
Prediction of antifreeze protein from protein sequence
Chin-Sheng Yu- Feng Chia University
No additional authors
Short Abstract: Screening of current databases identifies very few homologs of antifreeze proteins (AFPs) with similar protein sequence and structure in other species. We present an approach to recognize AFPs from protein sequence. For a nonredundant data set, the overall prediction accuracy reaches 88%.

Poster M26
Bioinformatic analyses of mammalian 5'-UTR sequence properties of mRNAs predict alternative translation initiation sites
Jill Wegrzyn- University of California at San Diego
Thomas Drudge (University of California at San Diego, Skaggs School of Pharmacy and Pharmaceutical Sciences); Faramarz Valafar (San Diego State University, Bioinformatics and Medical Informatics Research Center (BMIRC)); Vivian Hook (University of California at San Diego, Skaggs School of Pharmacy and Pharmaceutical Sciences);
Short Abstract: This study conducted a bioinformatic evaluation of the 5'-UTR of mammalian mRNA sequences. Machine learning techniques were applied for the classification and identification of non-AUG initiation sites in a group of mRNAs that have been experimentally demonstrated to utilize alternative sites for protein translation.

Poster M27
Cognitive State Classification with Magnetoencephalography Data
Andrej Savol- University of Pittsburgh
No additional authors
Short Abstract: We train a Support Vector Machine (SVM) soft-margin classifier on magnetoencephalography (MEG) brain-activation trajectories generated by human subjects viewing 60 common nouns divided into 12 noun groups. Semantic groupings and sensor information content are addressed.

Poster M28
The evaluation of common 1H-NMR metabolomics data preprocessing procedures reveals unanticipated side-effects
Tim De Meyer- Ghent University
Bjorn Van Gasse (Ghent University, Dept. Organic Chemistry); Davy Sinnaeve (Ghent University, Dept. Organic Chemistry); Sofie Bekaert (Ghent University, Dept. Molecular Biotechnology); José Martins (Ghent University, Dept. Organic Chemistry); Wim Van Criekinge (Ghent University, Dept. Molecular Biotechnology);
Short Abstract: 1H-NMR metabolomics provides a high-throughput methodology capable of acquiring high-resolution profiles of low-molecular-weight metabolites. However, the complicated data analysis is a major drawback, requiring numerous data preprocessing procedures (particularly normalization, reduction and scaling steps). Here, we evaluate the most common procedures and demonstrate several unanticipated side-effects.
