ISMB/ECCB 2011 Posters

19th Annual International Conference on
Intelligent Systems for Molecular Biology and
10th European Conference on Computational Biology

Accepted Posters

Category 'H' - 'Gene Prediction'

Poster H1

A Data Effectiveness Score

Gregory Bloom, Moffitt Cancer Center

Steven Eschrich (Moffitt Cancer Center, Biomedical Informatics); Matthew Schabath (Moffitt Cancer Center, Cancer Prevention and Control); Gang Han (Moffitt Cancer Center, Biostatistics); David Fenstermacher (Moffitt Cancer Center, Biomedical Informatics); Neera Bhansali (Moffitt Cancer Center, Biomedical Informatics); Andrew Hoerter (Moffitt Cancer Center, Biomedical Informatics);

Short Abstract: Typically, the utility of a data element in predicting an outcome is defined entirely by standard statistical techniques such as those used in survival analysis. These analyses produce a p-value describing an element’s relationship with survival. This p-value can be compared with others to determine the best element for predicting a given outcome. In certain fields, such as comparative effectiveness research, this approach gives an incomplete description of utility because it fails to account for the intrinsic quality of the data element itself. A data element’s quality can encompass a wide variety of attributes, including timeliness, completeness, reproducibility, and cost. These attributes must be taken into account to obtain a complete picture of an element’s true utility or effectiveness, since they can greatly affect its usefulness. Consider a highly predictive but extremely expensive test for determining cancer metastasis: its utility would be low because of the cost of obtaining the data, even though the test is very predictive of survival. Here we present a novel approach in which a p-value from a standard statistical test is algorithmically combined with an element’s quality attributes. We term this combined measurement an effectiveness score, or E-score. Both the E-score and the p-value are used to rank the utility of a large set of clinical and molecular elements in the context of a comparative effectiveness study of a large lung cancer cohort from the Moffitt Cancer Center.

Poster H2

Improving the Accuracy of Automated Prokaryotic Gene Calling with the Intergenic Region Scanner

Heather Kent, Public Health Agency of Canada

Philip Mabon (Public Health Agency of Canada, Bioinformatics Core); Gary Van Domselaar (Public Health Agency of Canada, Bioinformatics Core);

Short Abstract: Most popular prokaryotic gene finding algorithms use ab initio approaches, making predictions based on statistical profiling of the genome under analysis. The accuracy of these gene calling programs may be negatively affected by confounding factors such as horizontally acquired sequences, pseudogenes, and artificial frameshifts introduced by next-generation sequencing errors. Post-processing pipelines such as GenePRIMP or GenVar are designed to detect missed genes and sequencing anomalies but require a manual curation step for any annotation improvements. The Intergenic Region Scanner (IRS) module described here is designed to automatically detect and annotate sequencing anomalies found in the region prediction pipeline for our in-house version of the GenDB 2.2 Prokaryotic Genome Annotation System. The scanner detects potential missed gene calls and differentiates potential sequencing errors from true gene sequence variants. The IRS evaluates a set of observations for each intergenic region and for an extended region of 300 bp around each annotated gene call. A heuristic quality score is generated as a measure of confidence for each potential annotation anomaly, and anomalies that reach the adjustable threshold score are automatically corrected and annotated. Missed gene calls, alternative start sites, frameshifts due to homopolymer sequencing errors, and potential pseudogenes or split genes missed by the original gene finding algorithm may all be automatically annotated. The pipeline provides a higher level of automated genome annotation for microbial genes derived from high-throughput sequencing technologies, where whole genome sequence finishing or manual curation is unlikely due to labour and budget constraints.

Poster H3

Computing the probability of RNA hairpin and multiloop formation

Peter Clote, Boston College

William A. Lorenz (Denison University, Mathematics and Computer Science); Yang Ding (University of Pennsylvania, Biology);

Short Abstract: The continued development of noncoding RNA gene finders remains an important area of bioinformatics, since the human genome is pervasively transcribed into RNA, most of whose structure and biological function is completely unknown. In this paper, we describe two novel, thermodynamics-based algorithms, RNAhairpin and RNAmloop, which compute parametric probabilities for an RNA sequence to form hairpins and multiloops. These probabilities are shown to provide discriminating novel features for a support vector machine (SVM) binary classifier for Rfam RNA families. For instance, by using hairpin formation probabilities, there is greater than 95% accuracy in distinguishing 6S RNA (Rfam class RF00013) from U2 spliceosomal RNA (Rfam class RF00004). Since hairpin and multiloop formation probabilities cannot be computed by any other software, we believe that our software provides novel features that will improve the accuracy of noncoding RNA gene finders. A technical description of our algorithms is as follows. Given an RNA sequence of length n, RNAhairpin computes, simultaneously for each value of 0 ≤ h ≤ K, the minimum free energy structure MFE_h and partition function Z_h over all secondary structures of the input RNA sequence having exactly h hairpins. Similarly, RNAmloop computes, simultaneously for each value of 0 ≤ m ≤ K [resp. 0 ≤ d ≤ K], the minimum free energy structure MFE_m [resp. MFE_d] and partition function Z_m [resp. Z_d] over all secondary structures of the input RNA sequence having exactly m multiloops [resp. maximum order d multiloops]. Availability: http://bioinformatics.bc.edu/clotelab/
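As a rough illustration of how a per-count partition function can be turned into a formation probability, the sketch below applies the standard Boltzmann relation: the probability of exactly h hairpins is the partition function restricted to h hairpins divided by the total over all counts. This is not the RNAhairpin implementation. The function name `hairpin_probabilities` and the input values are hypothetical, and the sketch assumes the supplied list covers every hairpin count the sequence can form.

```python
def hairpin_probabilities(z_by_count):
    """Convert restricted partition functions into Boltzmann probabilities.

    z_by_count[h] is assumed to hold the partition function over all
    secondary structures with exactly h hairpins (h = 0, 1, ..., K).
    The list is assumed to cover every possible hairpin count, so the
    entries sum to the full partition function.
    """
    total = sum(z_by_count)
    if total <= 0:
        raise ValueError("partition functions must sum to a positive value")
    # P(exactly h hairpins) = Z_h / Z, with Z the sum over all h.
    return [z / total for z in z_by_count]


if __name__ == "__main__":
    # Made-up values for illustration only, not output of any real tool.
    example = [1.0, 3.0, 4.0]
    for h, p in enumerate(hairpin_probabilities(example)):
        print(f"P({h} hairpins) = {p:.3f}")
```

The resulting vector of probabilities is the kind of fixed-length feature that could be passed to a classifier such as an SVM.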

Poster H4

A fast stochastic score based rank fusion method: application on miRNA target prediction

Debarka Sengupta, Indian Statistical Institute

Indranil Aich (Indian Statistical Institute, Machine Intelligence Unit); Ujjwal Maulik (Jadavpur University, Computer Science and Engineering); Sanghamitra Bandyopadhyay (Indian Statistical Institute, Machine Intelligence Unit);

Short Abstract: This poster is based on Proceedings Submission 89. Rank fusion, or rank aggregation, is in layman’s terms the task of finding a median or consensus ranking of a set of objects, given several distinct proposals for their ordering. After extensive use in web mining, rank fusion has in the past few years been applied effectively to problems in bioinformatics. The application areas range from meta-analysis of microarray data to microRNA target prediction. It is common knowledge that scores are more informative than ranks, and that combining ranked lists based solely on ranks leads to information loss. Unlike other application areas, bioinformatics data sets are sensitive to the scores as well as to the ranks. Fast heuristics that employ the scores of the elements in the respective orderings are therefore a real need in biological data mining. Unfortunately, such methods are rare and naïve in principle. Some existing score-based aggregation methods use an evolutionary computing approach to search the solution space. Such methods get exponentially slower as the input list size increases. In this article, we propose a Markov chain that effectively incorporates the scores furnished in the given ranked lists to predict the consensus ranking. The method is used to combine orderings produced by different microRNA target prediction algorithms, which predict the mRNAs that can be targeted by a specific microRNA. The method is compared with two existing score-based fusion techniques. A simulation is presented to observe the performance of the current method.
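To make the idea of a score-aware Markov chain for rank fusion concrete, here is a minimal sketch. It is not the chain proposed in the poster, whose construction the abstract does not give. It is a generic variant in which the walk moves from an item to a higher-scoring item with probability proportional to the normalised score gap, and items are ranked by their stationary probability. The function name `fuse_rankings`, the min-max normalisation, the `damping` teleport term, and the example scores are all assumptions made for illustration.

```python
def fuse_rankings(score_lists, damping=0.05, iterations=500):
    """Fuse several scored lists into one consensus ranking (best first).

    score_lists: list of dicts mapping item -> score (higher is better).
    Every dict is assumed to score the same set of items.
    """
    items = sorted(score_lists[0])
    for scores in score_lists:
        if set(scores) != set(items):
            raise ValueError("every list must score the same items")
    n, num_lists = len(items), len(score_lists)

    # Min-max normalise each list so score gaps are comparable across lists.
    normalised = []
    for scores in score_lists:
        low, high = min(scores.values()), max(scores.values())
        span = (high - low) or 1.0
        normalised.append({item: (scores[item] - low) / span for item in items})

    # Transition matrix: move from src to dst in proportion to how much
    # better dst scores than src, averaged over lists; otherwise stay put.
    matrix = [[0.0] * n for _ in range(n)]
    for i, src in enumerate(items):
        for j, dst in enumerate(items):
            if i != j:
                gain = sum(max(0.0, norm[dst] - norm[src]) for norm in normalised)
                matrix[i][j] = gain / (num_lists * n)
        matrix[i][i] = 1.0 - sum(matrix[i])

    # Power iteration with a small uniform teleport so the chain is ergodic.
    dist = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [0.0] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += dist[i] * ((1.0 - damping) * matrix[i][j] + damping / n)
        dist = nxt

    ranked = sorted(zip(items, dist), key=lambda pair: -pair[1])
    return [item for item, _ in ranked]


if __name__ == "__main__":
    # Hypothetical target scores from two prediction tools.
    tool_a = {"geneA": 0.9, "geneB": 0.5, "geneC": 0.1}
    tool_b = {"geneA": 0.8, "geneB": 0.6, "geneC": 0.2}
    print(fuse_rankings([tool_a, tool_b]))
```

Because the transition weights depend on score gaps rather than rank positions alone, two items that are adjacent in rank but far apart in score are treated differently from two that are nearly tied, which is the information a purely rank-based fusion would discard.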

Poster H5

WebScipio: A web tool for gene structure prediction

Klas Hatje, Max Planck Institute for Biophysical Chemistry

Oliver Keller (University of Göttingen, Institute of Computer Science); Björn Hammesfahr (Max Planck Institute for Biophysical Chemistry, NMR based Structural Biology); Holger Pillmann (Max Planck Institute for Biophysical Chemistry, NMR based Structural Biology); Stephan Waack (University of Göttingen, Institute of Computer Science); Martin Kollmar (Max Planck Institute for Biophysical Chemistry, NMR based Structural Biology);

Short Abstract: Obtaining the gene structure for a given protein-encoding gene is an important step in many analyses. Because genome sequencing keeps accelerating, the gap between sequencing and genome annotation is growing. Finding the gene that corresponds to a given protein sequence with conventional tools is error-prone and cannot be completed without manual inspection. We developed Scipio, a tool to determine the precise gene structure of a given protein sequence. To make this tool accessible and easy to use, we also developed WebScipio, a web interface to the Scipio software. The resulting gene structure is presented in various human-readable formats, such as a schematic representation and a detailed alignment of the query and the target sequence that highlights any discrepancies.

Scipio is able to precisely map a protein query onto a genome, even when there are many sequencing errors or when incomplete genome assemblies lead to hits that stretch across multiple target sequences. Recently, Scipio and WebScipio were extended to cope with difficult cases such as very short exons and to predict homologous genes in closely related organisms. Scipio is able to correctly predict almost all genes in cross-species searches even if the species diverged more than 100 Myr ago and the sequence identity is below 80 %. In addition, we implemented an algorithm to search for mutually exclusive spliced exons. WebScipio provides easy access to genome assemblies of about 600 eukaryotic species. Scipio and WebScipio are freely accessible at http://www.webscipio.org.

Attention Poster Authors: Posters should be at most 1.30 m (130 cm) high x 0.90 m (90 cm) wide. Fasteners (Velcro / double-sided tape) will be provided on site; please DO NOT bring tape, tacks or pins.

Posters Display Schedule:

Odd Numbered posters:

Set-up timeframe: Sunday, July 17, 7:30 a.m. - 10:00 a.m.
Author poster presentations: Monday, July 18, 12:40 p.m. - 2:30 p.m.
Removal timeframe: Monday, July 18, 2:30 p.m. - 3:30 p.m.*

Even Numbered posters:

Set-up timeframe: Monday, July 18, 3:30 p.m. - 4:30 p.m.
Author poster presentations: Tuesday, July 19, 12:40 p.m. - 2:30 p.m.
Removal timeframe: Tuesday, July 19, 2:30 p.m. - 4:00 p.m.*

* Posters that are not removed by the designated time may be taken down by the organizers and discarded. Please be sure to remove your poster within the stated timeframe.

Delegate Posters Viewing Schedule

Odd Numbered posters:
On display Sunday, July 17, 10:00 a.m. through Monday, July 18, 2:30 p.m.
Author presentations will take place Monday, July 18: 12:40 p.m.-2:30 p.m.

Even Numbered posters:
On display Monday, July 18, 4:30 p.m. through Tuesday, July 19, 2:30 p.m.
Author presentations will take place Tuesday, July 19: 12:40 p.m.-2:30 p.m.

Want to print a poster in Vienna? Try these options:

Repacopy - next to the congress venue [MAP]

Also at Karlsplatz, in the Ring Center, Kärntner Str. 42 [MAP]

If you need your poster on a thicker material, you may also use a plotter service next to Karlsplatz: http://schiessling.at/portfolio/
