ISMB ECCB 2009

Accepted Posters

Category 'U' - Sequence analysis

Poster U01

TATA-variant identification, characterization and functional classification in plant genomes

Virginie BERNARD- URGV

Véronique BRUNAUD (URGV, bioinformatics); Alain LECHARNY (URGV, bioinformatics);

Short Abstract: Taking advantage of the TATA-box topological constraints, we identified TATA-variants that share the same constraints and are conserved in Arabidopsis thaliana and Oryza sativa. This work led to a characterization of TATA-variants that distinguishes some motifs according to the specific function, structure and expression of their related genes.

Poster U02

Significance of hidden Markov model results

Lee Newberg- Wadsworth Center, New York State Department of Health

No additional authors

Short Abstract: For hidden Markov model / dynamic programming algorithm scans of large databases of sequence data, the ability to quickly estimate p-values at the 1e-12 level or smaller is necessary. We present a general approach that has quickly estimated p-values as low as 1e-4000.

Poster U03

SNPlexViewer - a solution towards a cost-effective traceability system

Eyal Seroussi- The Agricultural Research Organization (ARO), Volcani Center

Yanir Seroussi (Monash University, Clayton School of Information Technology); Andrey Shirak (The Agricultural Research Organization (ARO), Volcani Center, Institute of Animal Science); Baruch Karniol (The Agricultural Research Organization (ARO), Volcani Center, Institute of Animal Science);

Short Abstract: DNA-based traceability uses the animal's own DNA code for identity control. For this purpose we multiplexed 25 SNPs, and to further decrease SNaPshot genotyping expenses we introduced software that facilitates the analysis of trace files without size standards. SNPlexViewer improves genotyping performance by aligning two trace chromatograms while embedding the reference size standards within a normalized target trace file.

Poster U04

Investigation of a simple heuristic improving the speed of statistical alignments

Jesper Nielsen- Aarhus University

Rune Lyngsø (University of Oxford, Department of Statistics); Christian Pedersen (Aarhus University, Bioinformatics Research Center); Jotun Hein (University of Oxford, Department of Statistics);

Short Abstract: We investigate a simple heuristic for speeding up statistical alignments. The idea is to use the results from pairwise alignments to estimate which multiple alignments are likely before actually computing them. We investigate several ways to do this and achieve a significant speed-up.

Poster U05

Refinement of structure-based sequence alignments by Seed Extension

Chin-Hsien (Emily) Tai- National Cancer Institute, NIH

Changhoon Kim (National Cancer Institute, NIH, Center for Cancer Research); Byungkook Lee (National Cancer Institute, NIH, Center for Cancer Research);

Short Abstract: Refinement with Seed Extension (RSE) is a new procedure for refining a structure-based sequence alignment using a Seed Extension algorithm. With negligible increases in computation time, it improved the average accuracy of sequence alignments from all nine popular structure comparison/alignment programs, when tested against NCBI’s CDD alignments.

Poster U06

Software tool for bulk annotation of genomic loci

Mali Salmon- EMBL-EBI

No additional authors

Short Abstract: Here we describe a collection of software tools developed for the efficient annotation of genomic loci. The programs automatically identify key features of interest, such as the location of experimental peaks within genes, their proximity to up- or downstream transcription start sites, and the presence of binding site motifs.

Poster U07

Multiple Motif Scanning to Identify Methyltransferases

Tanya Petrossian- University of California, Los Angeles

Steve Clarke (UCLA, Department of Chemistry and Biochemistry and the Molecular Biology Institute);

Short Abstract: This study seeks to refine the methyltransferase database by using the novel “Multiple Motif Scanning” program. HMM profiles and secondary structures were utilized to identify AdoMet-binding motifs. Statistical examination of these sequences allowed for motif refinement. Additionally, clustering analysis revealed probable substrates for the putative methyltransferases.

Poster U08

Sequence context-specific profiles for homology searching

Andreas Biegert- Gene Center, LMU Munich

Johannes Soeding (Gene Center, LMU Munich, Computational Biology);

Short Abstract: In standard sequence searches, amino acids are compared one by one. We derive context-specific amino acid similarities from short windows centered on each query sequence residue. By employing our context-specific similarities in combination with NCBI BLAST, CS-BLAST achieves two-fold increased sensitivity at the same specificity and speed.

Poster U09

Optimized pipeline for the analysis of microRNA sequences obtained from next-generation sequencing technologies

Marcel Grunert- Max Planck Institute for Molecular Genetics

Markus Schueler (Max Planck Institute for Molecular Genetics, Vertebrate Genomics / Computational Molecular Biology); Ilona Dunkel (Max Planck Institute for Molecular Genetics, Vertebrate Genomics); Silke Sperling (Max Planck Institute for Molecular Genetics, Vertebrate Genomics);

Short Abstract: We present an optimized data analysis pipeline for the processing and analysis of small RNA sequences obtained from Solexa next-generation sequencing data. The pipeline includes quality control, statistical reporting, whole genome mapping, several filtering steps, profiling of small RNAs by database annotations, and finally, the prediction of novel microRNAs.

Poster U10

WebLab: a data-centric, knowledge-sharing bioinformatic platform

Ge Gao- Peking University

Xiaoqiao Liu (Peking University, Center for Bioinformatics, School of Life Sciences); Jianmin Wu (Peking University, Center for Bioinformatics, School of Life Sciences); Jun Wang (Peking University, Center for Bioinformatics, School of Life Sciences); Xiaochun Liu (Peking University, Center for Bioinformatics, School of Life Sciences); Shuqi Zhao (Peking University, Center for Bioinformatics, School of Life Sciences); Zhe Li (Peking University, Center for Bioinformatics, School of Life Sciences); Lei Kong (Peking University, Center for Bioinformatics, School of Life Sciences); Xiaocheng Gu (Peking University, Center for Bioinformatics, School of Life Sciences); Jingchu Luo (Peking University, Center for Bioinformatics, School of Life Sciences);

Short Abstract: To support biological research, we have developed WebLab, a data-centric knowledge-sharing bioinformatic platform. Besides a wide range of analysis tools, WebLab provides powerful data management functions for both experimental data and scientific literature. A flexible sharing mechanism and group strategy are also provided to facilitate collaborative teamwork.

Poster U11

ENTROPIC PROFILER – efficient whole genome analysis using information theory and statistical concepts

Susana Vinga- INESC-ID

Francisco Fernandes (INESC-ID, KDBIO); Ana T Freitas (INESC-ID/IST, KDBIO); Jonas S Almeida (MD Anderson Cancer Center, Biostat Appl Math);

Short Abstract: Entropic Profiles (EP) are local information plots that indicate overall conservation of motifs in genomes. They are based on Information Theory concepts, in particular the Renyi entropy of biological sequences. The present tool implementation, based on new data structures and algorithmic simplifications, allows whole genomes to be processed in a few minutes.
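
Illustration: the abstract names the Renyi entropy but does not give the Entropic Profile formula itself, so the following is only a minimal sketch of the underlying quantity, H_alpha(p) = log2(sum_i p_i^alpha) / (1 - alpha), applied to the k-mer frequencies of a sequence. The function names `kmer_distribution` and `renyi_entropy` are hypothetical and are not taken from the ENTROPIC PROFILER tool.

```python
import math
from collections import Counter


def kmer_distribution(seq, k):
    """Relative frequencies of all overlapping k-mers in seq (illustrative helper)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha, in bits, of a discrete distribution.

    For alpha close to 1 the Shannon entropy (the limiting case) is returned.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if math.isclose(alpha, 1.0):
        return -sum(p * math.log2(p) for p in probs if p > 0)
    return math.log2(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


if __name__ == "__main__":
    # A repetitive sequence has a lower k-mer entropy than a varied one.
    for seq in ("ACGTACGTTGCA", "AAAAAAAAAAAA"):
        print(seq, renyi_entropy(kmer_distribution(seq, 2), alpha=2))
```

A low value for a window indicates that a few k-mers dominate it; how the actual tool turns this into a per-position profile is not stated in the abstract.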

Poster U12

ADAPTdb/ADAPT - A Framework for the Analysis of ARISA Data Sets

Robert Schmieder- San Diego State University

Matthew Haynes (San Diego State University, Biology); Elizabeth Dinsdale (San Diego State University, Biology); Forest Rohwer (San Diego State University, Biology); Robert Edwards (San Diego State University, Computer Science);

Short Abstract: ADAPTdb/ADAPT presents a web-based system for the automatic analysis of ARISA data sets. The database ADAPTdb stores and maintains ITS regions along with information about their source organisms. ADAPT uses ADAPTdb to taxonomically characterize ARISA data sets. Additionally, ADAPT performs pathogenic and autotrophic/heterotrophic comparisons of organisms among different ARISA samples.

Poster U13

Mining unique-m substrings from genomes

Kai Ye- European Bioinformatics Institute

Qilan Li (Leiden/Amsterdam Centre for Drug Research, Medicinal Chemistry); Ad IJzerman (Leiden/Amsterdam Centre for Drug Research, Medicinal Chemistry); Zhenyu Jia (University of California, Irvine, Department of Pathology and Laboratory Medicine); Paul Flicek (European Bioinformatics Institute, PANDA); Rolf Apweiler (European Bioinformatics Institute, PANDA);

Short Abstract: Information about unique substrings of genomes is fundamental but not sufficient for many genetic investigations. We propose an efficient (time and space) pattern growth approach to systematically mine all unique-m substrings, which have exactly one perfect match in the genome while all approximate matches must have more than m mismatches.
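
Illustration: the definition of a unique-m substring given above can be checked directly with a brute-force scan. This is a naive quadratic sketch meant only to make the definition concrete; it is not the pattern growth approach of the poster, it considers a single strand and a fixed substring length `k`, and the function names are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def unique_m_substrings(genome, k, m):
    """Return (position, substring) pairs for length-k substrings that are unique-m.

    A window qualifies when every other window of the same length differs
    from it by more than m mismatches (so it also has no second exact copy).
    """
    n = len(genome) - k + 1
    result = []
    for i in range(n):
        window = genome[i:i + k]
        if all(hamming(window, genome[j:j + k]) > m for j in range(n) if j != i):
            result.append((i, window))
    return result


if __name__ == "__main__":
    genome = "AAAACGT"
    print(unique_m_substrings(genome, k=3, m=0))  # plain unique substrings
    print(unique_m_substrings(genome, k=3, m=1))  # stricter: no 1-mismatch neighbours
```

Raising `m` can only shrink the result, since every unique-m substring is also unique for any smaller mismatch tolerance.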

Poster U14

LOCAS - a new lowest coverage assembler to support resequencing with ultra-short reads

Juliane Klein- University of Tuebingen

Korbinian Schneeberger (Max Planck Institute for Developmental Biology, Department of Molecular Biology); Stephan Ossowski (Max Planck Institute for Developmental Biology, Department of Molecular Biology); Detlef Weigel (Max Planck Institute for Developmental Biology, Department of Molecular Biology); Daniel H. Huson (University of Tuebingen, Faculty of Computer Science);

Short Abstract: We present LOCAS, a new assembly tool for short read sequence data. In contrast to existing short read assemblers, which assume high coverage of reads, LOCAS is aimed at assembling low-coverage datasets. LOCAS is particularly suited for resequencing projects. We are using it in an Arabidopsis resequencing project (1001 genomes).

Poster U15

Evaluation of Association Measures for Motif Discovery

Pedro Ferreira- Centre for Genomic Regulation

Roderic Guigó (Centre for Genomic Regulation, Genome Bioinformatics Lab);

Short Abstract: Combinatorial motif discovery algorithms rely on association measures to assess the strength of co-occurrence between simple motifs. We surveyed 14 association measures previously applied in bioinformatics, data mining and language processing and performed an empirical evaluation on artificially generated datasets in order to better understand their similarities and differences.

Poster U16

Command-line-based integration of online bioinformatics resources

Kazuki Oshita- Institute for Advanced Biosciences, Keio University

Masaru Tomita (Institute for Advanced Biosciences, Keio University, Environment and Information Studies); Kazuharu Arakawa (Institute for Advanced Biosciences, Keio University, Graduate School of Media and Governance);

Short Abstract: Here we present a software package that maps online bioinformatics resources as UNIX command-line tools that can be pipelined using EMBOSS Ajax Command Definition ontologies. The software package currently contains more than 50 tools, and is freely available from http://www.g-language.org/.

Poster U17

Biological sequence motif discovery using feature selection in Conditional Random Field

Thanh Hai Dang- University of Antwerp

Alain Verschoren (University of Antwerp, Department of Mathematics and Computer Science); Kris Laukens (University of Antwerp, Department of Mathematics and Computer Science);

Short Abstract: Motif discovery plays an important role in molecular biology. Most computational methods developed so far are limited to gapless motifs and assume independence between the positions within sequences. We hereby introduce a motif discovery method using feature selection in a Conditional Random Field (CRF), which overcomes the above-mentioned limitations.

Poster U18

Computational motif discovery using extreme-valued tuples from mutual information profiles

Sara Garcia- Signal Processing Laboratory, IEETA

Armando Pinho (Signal Processing Laboratory, IEETA, University of Aveiro); Holger Kantz (Max Planck Institute for the Physics of Complex Systems, Nonlinear Time Series Analysis);

Short Abstract: We propose a new methodology for computational motif discovery based on extreme-valued tuples. It uses information theory to assess optimal tuple information measures based on the formalism of Shannon's entropy, and extreme value statistics to provide a framework for threshold-based selection criteria.

Poster U19

Application of VAMSAS enabled tools for the investigation of protein evolution.

James Procter- University of Dundee

Iain Milne (Scottish Crop Research Institute, Bioinformatics); Frank Wright (Biomathematics and Statistics, Scotland, Genetics); Pierre Marguerite (European Bioinformatics Institute, MSD); Andrew Waterhouse (Riken, Genome Sciences Centre); Dominik Lindner (Scottish Crop Research Institute, Bioinformatics); David Martin (University of Dundee, School of Life Sciences Research); Tom Oldfield (European Bioinformatics Institute, MSD); David Marshall (Scottish Crop Research Institute, Bioinformatics); Geoff Barton (University of Dundee, School of Life Sciences Research);

Short Abstract: Protein evolutionary analysis often involves the use of many programs. We demonstrate how it can be performed effectively using applications that have been modified to dynamically exchange data via the 'Visualization and Analysis of Molecular Sequences, Alignments, and Structures' (VAMSAS) framework.

Poster U20

Kalign2: high-performance multiple alignment of protein and nucleotide sequences allowing external features

Oliver Frings- Stockholm University

Timo Lassmann (Stockholm University, Stockholm Bioinformatics Center); Erik Sonnhammer (Stockholm University, Stockholm Bioinformatics Center);

Short Abstract: To make Kalign a versatile tool for large-scale alignment studies, we have dramatically improved its computational properties, while maintaining its high accuracy. Kalign 2 now supports the alignment of nucleotide sequences, and a newly introduced extension allows sequence annotation to be included in the alignment process to improve alignment accuracy.

Poster U21

A novel ab initio method finding microRNA clusters

Anthony Mathelier- CNRS-UPMC

Alessandra Carbone (CNRS-UPMC, Informatique);

Short Abstract: MicroRNAs are a class of endogenes whose expression profiles reflect the origin and differentiation state of human cancers and tumours. We propose a novel ab initio approach that searches for clustered paralogous microRNAs in highly dense palindromic regions. Discrimination of miRNA precursors is based on only 5 (physical and combinatorial) conditions.

Poster U22

G-language Genome Analysis Environment Version 2: Integrated workbench for computational genome sequence analysis

Kazuharu Arakawa- Institute for Advanced Biosciences, Keio University

Masaru Tomita (Institute for Advanced Biosciences, Keio University, Department of Environment and Information Studies);

Short Abstract: G-language Genome Analysis Environment is a software package written in Perl for genome sequence analysis, compatible with BioPerl and focusing especially on bacterial genomes. Here we present the second version of the software, implemented with an interactive shell and more than 200 analysis programs. The software is freely available at http://www.g-language.org/.

Poster U23

The CluSTr database in 2009

Craig McAnulla- EMBL - European Bioinformatics Institute

John Maslen (EMBL - European Bioinformatics Institute, InterPro team); Antony Quinn (EMBL - European Bioinformatics Institute, InterPro team); Sarah Hunter (EMBL - European Bioinformatics Institute, InterPro team);

Short Abstract: The CluSTr database offers an automatic classification of proteins from UniProtKB and other databases into groups of related proteins. The clustering is based on analysis of all pairwise similarities between protein sequences. New developments in CluSTr will be presented, including increased coverage, new protein datasets, and extended website functionality.

Poster U24

Novel miRNA identification and target gene prediction in Glycine max

Trupti Joshi- University of Missouri-Columbia

No additional authors

Short Abstract: We identified over 50 novel miRNAs from Illumina SBS sequencing of seven tissues in soybean, and validated some computationally predicted putative target genes. We also developed a soybean genome browser and incorporated the small RNA libraries, along with Solexa transcriptome sequencing data. The genome browser can be accessed at http://genomebrowser.missouri.edu/cgi-bin/hgGateway.

Poster U25

Extraction of transcription factor binding sites from ChIP-Seq data through de novo TFBS motif identification. Application for EWS-Fli1 oncogenic transcription factor.

Valentina Boeva- Institut Curie

Noëlle Guillon (Institut Curie, Genetics and Biology of Cancers); Franck Tirode (Institut Curie, Genetics and Biology of Cancers); Olivier Delattre (Institut Curie, Genetics and Biology of Cancers); Emmanuel Barillot (Institut Curie, Bioinformatics, biostatistics, epidemiology and computational systems biology of cancer);

Short Abstract: We propose a new algorithm for ChIP-Seq data analysis. It enables binding site extraction without the setting of an explicit threshold on the DNA fragment coverage. On EWS-Fli1 data, the algorithm showed significantly increased peak selection sensitivity with a very minor increase in the expected number of false positive hits.

Poster U26

Revealing the Density-based Clustering Structure of the SwissProt database

Gabor Ivan- PhD Student

Vince Grolmusz (Eotvos Lorand University, Department of Computer Science);

Short Abstract: We classified 389046 sequences occurring in SwissProt using the OPTICS algorithm. We proposed a colouring scheme that is based on taxonomy information and helps to analyze the composition of clusters. We validated our results with the Pfam database using an OPTICS-specific quality measure and concluded that we obtained clusters of high quality.

Poster U27

Identification of double coding regions in papillomaviruses based on nucleotide frequencies

Sten Ilmjärv- University of Tartu

Aare Abroi (Estonian Biocentre, .); Jaak Vilo (Quretec Ltd, .); Hedi Peterson (Quretec Ltd, .);

Short Abstract: Double coding regions in mammalian genomes have been widely studied due to frequent alternative splicing. We have identified double coding regions in papillomaviruses using amino acid sequence alignment based DNA conservation scoring. By comparing theoretical and real nucleotide frequencies we identified overlapping coding sequences for various papilloma types.

Poster U28

Assessing Differences among Next-Generation Sequencing Software for Genomic Resequencing Alignment and Detection of Variation

James Cavalcoli- University of Michigan

Edgar Otto (University of Michigan, Pediatric Nephrology); James MacDonald (University of Michigan, Human Genetics); Friedhelm Hildebrandt (University of Michigan, Pediatrics); Gilbert Omenn (University of Michigan, Internal Medicine);

Short Abstract: While resequencing a portion of Human Chr19 to identify disease-causing variants, we assessed the capacity and variability of a number of next-generation sequence analysis software tools for their ability to align, assemble, and detect genomic variations (polymorphisms and indels) compared to the reference genome (hg18 build 36.3).

Poster U29

An approach to subfamily assignment for large protein families

Yaoqing Shen- Universite de Montreal

Gertraud Burger (Universite de Montreal, Biochemistry); Franz Lang (Universite de Montreal, Biochemistry);

Short Abstract: We report a comprehensive bioinformatics analysis of the acyl-CoA dehydrogenase (ACAD) family. We identified over 800 ACAD homologs from 250 species, recognized the subfamilies they belong to, compiled their taxonomic profiles, and traced back the evolution of the ACAD family.

Poster U30

Characterization of transcriptome splicing structure using high-throughput RNA-seq

Jinze Liu- University of Kentucky

Kai Wang (University of Kentucky, Computer Science); Stephen Coleman (University of Kentucky, Veterinary Science); James Macleod (University of Kentucky, Veterinary Science); Jan Prins (University of North Carolina at Chapel Hill, Computer Science);

Short Abstract: MAPSPAN robustly identifies splices in the transcriptome sampled via RNA-seq short reads. A novel unsupervised algorithm maps spliced reads onto the reference genome. Compared with existing approaches, MAPSPAN demonstrates higher sensitivity and selectivity in identifying splices and their coverage on known datasets.

Poster U31

Do average viral genome sizes covary with those of microbes? A novel method applied to 150 metagenomes

Florent Angly- San Diego State University

Dana Willner (San Diego State University, Biology); Robert Schmieder (San Diego State University, Computer Science); Rebecca Vega-Thurber (Florida International University, Biology Department); Rob Edwards (San Diego State University, Computer Science); Forest Rohwer (San Diego State University, Biology);

Short Abstract: Viral genomes vary in length by 1000X and are subject to different environmental pressures than microbes. We developed a method to estimate viral average genome size and applied it to 150 metagenomes to produce estimates for different biomes and to test whether it covaries with microbial average genome size.

Poster U32

Increasing Short Read Mapping Speed by Masking of Residues Sequence Reads

Stefan Henz- Max Planck Institute for Developmental Biology

Fabio de Bona (Friedrich Miescher Laboratory, Machine Learning in Biology); Stefan R. Henz (Max Planck Institute for Developmental Biology, Molecular Biology); Korbinian Schneeberger (Max Planck Institute for Developmental Biology, Molecular Biology); Stephan Ossowski (Max Planck Institute for Developmental Biology, Molecular Biology); Detlef Weigel (Max Planck Institute for Developmental Biology, Molecular Biology); Gunnar Rätsch (Friedrich Miescher Laboratory, Machine Learning in Biology);

Short Abstract: Next generation sequencing technologies produce massive amounts of short sequence reads with varying quality across their positions, which slows down the alignment as many possible mismatches have to be considered. We employ a machine-learning-based algorithm, RTrim, that segments reads into mappable and unmappable regions. By appropriately masking low-quality positions we can map these reads quicker since fewer mismatches are required.

Poster U33

Phylogeny in vertebrates of PEDF

Shivam Sidana- JMIT

Niket Ladha (JMIT, chemical);

Short Abstract: The PEDF gene first appears in vertebrates and our studies suggest that the regulation and biological actions of this gene are preserved across vertebrates. This analysis of the PEDF gene across phyla provides new information that will aid further characterization of common functional motifs of this serpin in biological processes.

Poster U34

Algebraic approach to DNA sequence homology assessment

Andrzej Brodzik- The MITRE Corporation

No additional authors

Short Abstract: We investigate difference sets and related combinatorial objects as models for novel DNA sequence homology markers. We construct representations of DNA sequences in the difference set space, and compute their alignment. This procedure permits identification of homologous DNA sequences in a small fraction of the time required by standard methods.

Poster U35

DistanceScan and Nash: Two novel tools for promoter analysis

Ekaterina Shelest- HKI, Hans Knoell Institute

Eugen Fazius (HKI, Hans Knoell Institute, Bioinformatics and systems biology); Vladimir Shelest (HKI, Hans Knoell Institute, Bioinformatics and systems biology); Reinhard Guthke (HKI, Hans Knoell Institute, Bioinformatics and systems biology);

Short Abstract: Nash is a motif-discovery tool based on a novel approach to the prediction of transcription factor binding sites (TFBS), which is an alternative to the widely used PWMs and HMMs. DistanceScan utilizes the method of distance distributions of TFBS pairs. It allows the selection of functional combinations of motifs at non-random distances.

Poster U36

Promoters and The Transcription Factors - a Simple Relation via The miRNA

Chanchal Mitra- University of Hyderabad

Padmavathi Putta (University of Hyderabad, Biochemistry); Luciano Milanesi (Institute of Biomedical Technology, Bioinformatics);

Short Abstract: We have identified a relatively small set of GC-rich 6-nucleotide and 7-nucleotide sequences around the TSS in human promoter sequences. These sequences are distributed on both sides of the TSS and are likely to be involved in recognition and binding of various factors. They are relatively uncommon elsewhere.

Poster U37

ncSOLID: an R package for non-coding RNA digital sequencing

Raffaele Calogero- University of Torino

Cristina Della Beffa (University of Torino, Dipartimento di Scienze Cliniche e Biologiche); Francesca Cordero (University of Torino, Dipartimento di Scienze Cliniche e Biologiche);

Short Abstract: ncSOLID is a package for quantitative secondary analysis of non-coding transcriptome sequencing data generated with the SOLID next-gen sequencing platform. The philosophy of the package is to organize RNA-seq data in a structure that allows the statistical detection of differential expression for ncRNAs, e.g. microRNAs, within the Bioconductor framework.

Poster U38

SeqAn - An efficient C++ library for sequence analysis

David Weese- Free University of Berlin

Tobias Rausch (International Max Planck Research School for Computational Biology and Scientific Computing, Molecular Genetics); Marcel Schulz (International Max Planck Research School for Computational Biology and Scientific Computing, Molecular Genetics); Anne-Katrin Emde (Free University of Berlin, Computer Science); Andreas Döring (Free University of Berlin, Computer Science); Knut Reinert (Free University of Berlin, Computer Science);

Short Abstract: SeqAn is an open source C++ library of efficient algorithms and data structures for the analysis of biological sequences. Using a template-based library design, SeqAn aims at providing (1) algorithms that are generic, fast and extensible and (2) data structures that allow the rapid prototyping of novel sequence analysis methods.

Poster U39

Efficient computation of good neighbor seeds

LUCIAN ILIE- University of Western Ontario

SILVANA ILIE (Ryerson University, Mathematics);

Short Abstract: The current state of the art of homology search involves the use of (multiple) spaced seeds. Particularly important are the neighbor seeds which combine high sensitivity with reduced space. We give the only polynomial-time algorithm that computes better neighbor seeds than previous ones while being several orders of magnitude faster.
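
Illustration: for readers unfamiliar with spaced seeds, the sketch below shows what a seed hit is. A seed is written here as a string of `1` (positions that must match) and `*` (positions that are ignored); this notation and the function names are assumptions for illustration. The code only demonstrates hit detection by exhaustive comparison and does not implement the neighbor seed computation described in the poster.

```python
def seed_matches(seed, a, b):
    """True if a and b agree at every position where the seed has a '1'."""
    return all(x == y for s, x, y in zip(seed, a, b) if s == "1")


def seed_hits(seed, query, target):
    """All (i, j) offsets where the seed matches query[i:] against target[j:]."""
    span = len(seed)
    hits = []
    for i in range(len(query) - span + 1):
        for j in range(len(target) - span + 1):
            if seed_matches(seed, query[i:i + span], target[j:j + span]):
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    # The spaced seed tolerates the mismatch in the middle position,
    # while the contiguous seed of the same span does not.
    print(seed_hits("1*1", "ACG", "ATG"))
    print(seed_hits("111", "ACG", "ATG"))
```

Practical homology search tools index the seed positions with hashing rather than comparing all offset pairs; the quadratic loop is kept only for clarity.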

Poster U40

Genome-wide computational analysis of eukaryotic core promoters

Holger Hartmann- Gene Center Munich

Claudia Gugenmus (Gene Center Munich, AG Soeding); Johannes Soeding (Gene Center Munich, AG Soeding);

Short Abstract: We have developed a sensitive method for core promoter analysis and detected all currently known but also several previously unknown motifs in yeast, fly and human. For yeast our results show that the core promoter is aligned to the +1 nucleosome rather than to the TSS.

Poster U41

A new bioinformatics analysis tools framework at EMBL-EBI

Mickaël Goujon- European Bioinformatics Institute

Hamish McWilliam (European Bioinformatics Institute, External Services); Franck Valentin (European Bioinformatics Institute, External Services); Weizhong Li (European Bioinformatics Institute, External Services); Robert Langlois (European Bioinformatics Institute, External Services); Rodrigo Lopez (European Bioinformatics Institute, External Services);

Short Abstract: The popular framework to run analytical tools at the European Bioinformatics Institute has been redesigned to improve the user experience. The existing web interface and web services API have been reviewed and simplified to accommodate a larger audience and provide new and unique features that will greatly benefit the end-user.

Poster U42

Development of SOLiD SAGE and tag counting and identification software

Xiequn Xu- Life Technologies

Patrick Gilles (Life Technologies, R&D ASA); Jennifer Kilzer (Life Technologies, R&D ASA); Kevin Clancy (Life Technologies, R&D ASA); Adam Harris (Life Technologies, R&D ASA); Rob Bennett (Life Technologies, R&D ASA);

Short Abstract: We developed the SOLiD™ SAGE kit by modifying serial analysis of gene expression (SAGE) to produce longer tags and adapting it to the SOLiD™ sequencing platform. Easy-to-use software with a graphical user interface has also been developed for post-sequencing data analysis in labs with moderate computational resources.

Poster U43

EMBOSS: European Molecular Biology Open Software Suite

Peter Rice- European Bioinformatics Institute

Alan Bleasby (European Bioinformatics Institute, Rice Group); Jon Ison (European Bioinformatics Institute, Rice Group); Mahmut Uludag (European Bioinformatics Institute, Rice Group);

Short Abstract: EMBOSS is a mature package of software tools developed for the molecular biology community. It includes a comprehensive set of applications and C libraries for molecular sequence analysis and other tasks and integrates popular third-party software packages under consistent interfaces.

Poster U44

Base-pairing profile local alignment kernels for functional RNA analyses

Kengo Sato- Japan Biological Informatics Consortium (JBIC)

Yutaka Saito (Keio University, Department of Biosciences and Informatics); Yasubumi Sakakibara (Keio University, Department of Biosciences and Informatics);

Short Abstract: We developed base-pairing profile local alignment (BPLA) kernels for discrimination and detection of functional RNA sequences using SVMs, and confirmed the effectiveness of our method not only by computational experiments but also by expression analysis via qRT-PCR.

Poster U45

Comparison of assembly strategies for high throughput de novo sequencing of bacterial genomes

Frank Panitz- University of Aarhus

Pernille Andersen (Faculty of Agricultural Sciences, Aarhus University, Department of Genetics and Biotechnology); Jakob Hedegaard (Faculty of Agricultural Sciences, Aarhus University, Department of Genetics and Biotechnology); Christian Bendixen (Faculty of Agricultural Sciences, Aarhus University, Department of Genetics and Biotechnology); Frank Panitz (Faculty of Agricultural Sciences, Aarhus University, Department of Genetics and Biotechnology);

Short Abstract: Different strategies were applied to generate optimal assemblies for two bacterial genome sequences based on de novo sequencing using high-throughput 454 and Solexa paired-end reads. The quality of the hybrid assembly was assessed by the longest average contig size and also supported by gene prediction and comparative analysis to related genomes.

Long Abstract: Click Here

Poster U46

Determining the Reading Frame in Short DNA Fragments

Hochul Lee- San Diego State University

Peter Salamon (San Diego State University, Mathematics); Rob Edwards (San Diego State University, Center for Microbial Sciences); Forest Rohwer (San Diego State University, Biology); Ben Felts (San Diego State University, Computational Science Research Center); Sajia Akhter (San Diego State University, Computational Science Research Center);

Short Abstract: We describe an implementation and preliminary tests of an intelligent algorithm to select the protein-encoding reading frame in short fragments of DNA without relying on extrinsic information. The system will speed up current computational analyses and allow many new analytical methods to be applied to metagenomic datasets.

Long Abstract: Click Here

Poster U47

The GNUMAP Algorithm: Probabilistic Mapping of Oligonucleotides from Next-Generation Sequencing

Nathan Clement- Brigham Young University

Mark Clement (Brigham Young University, Computer Science); Quinn Snell (Brigham Young University, Computer Science); Evan Johnson (Brigham Young University, Statistics);

Short Abstract: GNUMAP addresses the analysis problems presented by an increase in the quantity of sequence data from next-generation sequencing technologies. The probabilistic nature of the mapping algorithm implemented in GNUMAP provides an accurate and efficient method for mapping large numbers of short sequences to a genome.

Long Abstract: Click Here

Poster U48

A novel predictor of mucin-type O-glycosylation sites

Yong-Zi Chen- China Agricultural University

No additional authors

Short Abstract: We attempted to improve the prediction of O-glycosylation sites in mammalian proteins by seeking a new encoding scheme, named CKSAAP encoding. By reflecting the characteristics of the sequences surrounding mucin-type O-glycosylation sites, and with the assistance of a Support Vector Machine (SVM), the results showed that this method was more powerful than other methods.

Long Abstract: Click Here
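
The abstract for Poster U48 does not define the CKSAAP encoding. The sketch below assumes it stands for "composition of k-spaced amino acid pairs": for each gap size k, count every residue pair separated by k positions in a sequence window and normalise by the number of such pairs. The function name `cksaap`, the `k_max` default and the example window are hypothetical and are not taken from the poster; the resulting vector would then be passed to an SVM.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order.
PAIRS = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]


def cksaap(window, k_max=2):
    """Encode a sequence window as k-spaced pair frequencies.

    Assumption: a k-spaced pair is (window[i], window[i + k + 1]).
    Returns a list of 400 * (k_max + 1) floats.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = max(len(window) - k - 1, 0)
        for i in range(n_pairs):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip non-standard residues
                counts[pair] += 1
        # Avoid division by zero for windows shorter than k + 2.
        total = max(n_pairs, 1)
        features.extend(counts[p] / total for p in PAIRS)
    return features


if __name__ == "__main__":
    vector = cksaap("ACA", k_max=1)
    print(len(vector))
```

With `k_max=1` the vector has 800 entries: 400 for adjacent pairs and 400 for pairs separated by one residue.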

Poster U49

Raccess: A tool for genome-scale computation of structural accessibility of RNA transcripts.

Hisanori Kiryu- Computational Biology Research Center, AIST

Toutai Mituyama (Computational Biology Research Center, AIST, RNA Informatics Team); Kiyoshi Asai (University of Tokyo, Department of Computational Biology);

Short Abstract: We have developed a tool called Raccess for computing the accessibility of potential transcripts based on the Turner energy model of secondary structures. We have applied our tool to the entire human genome, and have analyzed the structural constraints imposed on the messenger RNAs and ancestral repeats.

Long Abstract: Click Here

Poster U50

Conserved Sequences of West Nile Viral Proteins as candidate targets for vaccine design

TinWee Tan- National University of Singapore

QiYing Koo (National University of Singapore, Biochemistry); M. Asif Khan (National University of Singapore, Biochemistry); Shweta Ramdas (National University of Singapore, Biochemistry); Keun-Ok Jung (Johns Hopkins University School of Medicine, Pharmacology and Molecular Sciences); Jerome Salmon (Johns Hopkins University School of Medicine, Pharmacology and Molecular Sciences); Olivo Miotto (University of Oxford, MRC Centre for Genomics and Global Health); Vladimir Brusic (Dana-Farber Cancer Institute, Cancer Vaccine Center); J Thomas August (Johns Hopkins University School of Medicine, Pharmacology and Molecular Sciences);

Short Abstract: The focus of this study is to identify and characterize WNV protein regions that have exhibited strong conservation throughout the recorded history of the virus, and that are potential targets of T-cell immune responses, using various bioinformatics-based methods and correlation with available experimental data.

Long Abstract: Click Here

Poster U51

Evolution of antigenic variant gene families within Plasmodium species

Diego Diez- Kyoto University

Nelson Hayes (Kyoto University, Kanehisa Laboratory); Susumu Goto (Kyoto University, Kanehisa Laboratory);

Short Abstract: We retrieved sequences involved in antigenic variation from different Apicomplexa and performed Pfam domain analysis. We found a gene family, which has undergone differential expansion in five Plasmodium species. We describe sequence and phylogenetic analyses on these families, revealing clues about the evolution of antigenic multi-gene families in other pathogens.

Long Abstract: Click Here

Poster U52

CARMA: Correction and Reference Morphing Algorithm

Thomas Otto- Wellcome Trust Sanger Institute

Mandy Sanders (Wellcome Trust Sanger Institute, Pathogen Genomics); Matt Berriman (Wellcome Trust Sanger Institute, Pathogen Genomics); Chris Newbold (John Radcliffe Hospital, Institute of Molecular Medicine);

Short Abstract: Second generation sequencing technology enables deep, low cost resequencing across multiple strains and species. We have developed an algorithm that iteratively maps short reads to a reference sequence. At each cycle, CARMA attempts to correct errors, or morph the reference into a new strain, and then evaluates its success.

Long Abstract: Click Here

Poster U53

CentroidFold: Predictions of RNA Secondary Structure for Estimating Accurate Base-pairs

Michiaki Hamada- Mizuho Information & Research Institute, Inc

Kengo Sato (Japan Biological Informatics Consortium, JBIC); Hisanori Kiryu (National Institute of Advanced Industrial Science and Technology (AIST), Computational Biology Research Center); Toutai Mituyama (National Institute of Advanced Industrial Science and Technology (AIST), Computational Biology Research Center); Kiyoshi Asai (University of Tokyo, Graduate School of Frontier Sciences);

Short Abstract: We developed software called CentroidFold for secondary structure prediction of RNA sequences, which includes the centroid estimator used in Sfold as a special case and is theoretically superior to the MEA estimator used in CONTRAfold. A web server and stand-alone software are freely available at http://www.ncrna.org/centroidfold/

Long Abstract: Click Here

Poster U54

On the quality of established datasets for benchmarking sequence database search and low-complexity handling tools: the ASTRAL compendium test case

Ioannis Kirmitzoglou- University of Cyprus

Vasilis Promponas (University of Cyprus, Department of Biological Sciences);

Short Abstract: Benchmarking of sequence database search tools serves to establish protocols for routine or more elaborate searches. Low complexity regions (LCRs) complicate this procedure, requiring special handling. We provide new insights on validating the performance of relevant methods, taking into account LCRs, based on the widely used ASTRAL compendium datasets.

Long Abstract: Click Here

Poster U55

The SSAHA2 software pipeline for the mapping of DNA sequencing reads and genotype calling

Hannes Ponstingl- Wellcome Trust Sanger Institute

Yong Gu (Wellcome Trust Sanger Institute, Sequencing Informatics); Zemin Ning (Wellcome Trust Sanger Institute, Sequencing Informatics);

Short Abstract: The SSAHA2 software pipeline efficiently maps DNA sequencing reads onto a genomic reference sequence. A genotype call of the consensus sequence can be produced taking into account a heuristic score of the mapping quality. Reads from most types of sequencing platforms are supported including paired-end sequencing reads.

Long Abstract: Click Here

Poster U56

EMLIB: a C++ library to manage transcripts and genomic variations

Matteo Cereda- Scientific Institute IRCCS E.Medea

Manuela Sironi (Scientific Institute IRCCS E.Medea, Bioinformatics Laboratory); Uberto Pozzoli (Scientific Institute IRCCS E.Medea, Bioinformatics Laboratory);

Short Abstract: EMLIB is a C++ library containing a novel hierarchy of classes useful to define transcripts, to manage sequence variations and to calculate position-specific quantitative features in a “variation dependent” way. EMLIB provides an intuitive and powerful environment to gain insights about the effect of genomic variations.

Long Abstract: Click Here

Poster U57

BioHDF: Open binary file formats for large-scale data management - Project Update

Mark Welsh- Geospiza, Inc.

Todd Smith (Geospiza, Inc., CEO); N. Eric Olson (Geospiza, Inc., Product Development); Mike Folk (The HDF Group, -);

Short Abstract: BioHDF extends a mature Open Source technology for the storage of scientific data, Hierarchical Data Format, with features specific to Next Generation Sequencing. Initial prototyping of BioHDF has demonstrated clear benefits to storing sequences and their reference alignments in this structured binary format, including file compression and fast data retrieval.

Long Abstract: Click Here

Poster U58

Sequence analysis scale-up and acceleration using Grid and Cloud Computing yield efficient analyses of HIV-1 variants and other viruses

TinWee Tan- National University of Singapore

Yongli Hu (National University of Singapore, Biochemistry); Shen Jean Lim (National University of Singapore, Biochemistry); M Asif Khan (National University of Singapore, Biochemistry); Mark De Silva (National University of Singapore, Biochemistry); Kuan Siong Lim (National University of Singapore, Biochemistry); Martti Tammi (National University of Singapore, Biochemistry); J Thomas August (Johns Hopkins University School of Medicine, Pharmacology and Molecular Sciences);

Short Abstract: Sequence inundation in the current paradigm affects the speed and scalability of immunoinformatics-driven sequence analysis of infectious agents such as HIV-1. To overcome these restrictions, we have customized and benchmarked bioinformatics analyses on Grid and Cloud computing and achieved enhanced efficacy and scalability.

Long Abstract: Click Here

Poster U59

Comparative analysis of local compositional complexity in plant-encoded proteins

Vasilis Promponas- University of Cyprus

Eleni Mytilineou (University of Cyprus, Department of Biological Sciences); Ioannis Kirmitzoglou (University of Cyprus, Department of Biological Sciences);

Short Abstract: Different approaches have identified numerous low complexity regions (LCRs) in protein sequences. However, few systematic studies exist for elucidating their possible biological significance. We take advantage of the completion of two model plant organism genomes to investigate their functional roles and the mechanisms responsible for LCR appearance, maintenance or modification.

Long Abstract: Click Here

Poster U60

Detecting biases in Next Generation Sequence data

Rudiger Brauning- AgResearch

Anar Khan (AgResearch, Bioinformatics, Mathematics and Statistics); Ken Dodds (AgResearch, Bioinformatics, Mathematics and Statistics); Jo-Ann Stanton (Anatomy and Structural Biology, University of Otago); Chris Mason (Anatomy and Structural Biology, University of Otago);

Short Abstract: A recent study (1) looked at systematic bias in amplicon sequencing by NGS platforms. We look at WGS data generated for the Watson genome project using the 454 platform. After filtering for identical reads we specifically analyse bias at the beginning of each sequence read. (1) Harismendy et al., Genome Biology 2009,

Long Abstract: Click Here
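
The two steps named in the abstract for Poster U60, filtering identical reads and then examining the start of each read, can be illustrated with a minimal sketch. This is not the authors' method: the function name `start_base_frequencies`, the choice of five positions and the toy reads are all assumptions made for illustration. Bias would show up as base frequencies at the first positions that differ clearly from those expected for the genome.

```python
from collections import Counter


def start_base_frequencies(reads, n_positions=5):
    """Per-position base frequencies at the start of unique reads.

    Identical reads are collapsed first, so that duplicates do not
    dominate the composition estimate.
    """
    unique_reads = set(reads)
    frequencies = []
    for pos in range(n_positions):
        # Only reads long enough to cover this position contribute.
        counts = Counter(r[pos] for r in unique_reads if len(r) > pos)
        total = sum(counts.values())
        if total == 0:
            frequencies.append({})
            continue
        # Ambiguous bases such as N count towards the total
        # but are not reported.
        frequencies.append({b: counts[b] / total for b in "ACGT"})
    return frequencies


if __name__ == "__main__":
    toy_reads = ["ACGT", "ACGT", "TCGA", "AGGT"]
    for pos, freq in enumerate(start_base_frequencies(toy_reads, 2)):
        print(pos, freq)
```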

Poster U61

VARiD: Variation Detection in Color-Space and Letter-Space

Adrian Dalca- University of Toronto

Michael Brudno (University of Toronto, Computer Science);

Short Abstract: We present VARiD - a Hidden Markov Model for SNP and Indel identification with AB-SOLiD color-space and regular letter-space reads. VARiD combines both types of data in a single framework which allows for homozygous and heterozygous calls. On both simulated and real datasets VARiD demonstrates very high specificity and sensitivity.

Long Abstract: Click Here
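
For readers unfamiliar with the color-space reads mentioned in the abstract for Poster U61, the sketch below shows the standard SOLiD di-base encoding, in which each color is determined by a pair of adjacent bases. It illustrates only the encoding, not VARiD's Hidden Markov Model; the function names are hypothetical.

```python
# Two-bit codes under which the SOLiD color of a base pair
# is the XOR of the two base codes.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def to_color_space(seq):
    """Translate a letter-space sequence into a list of colors (0-3)."""
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]


def to_letter_space(first_base, colors):
    """Decode colors back to letters, given the known first base."""
    bases = [first_base]
    for color in colors:
        bases.append(BASES[CODE[bases[-1]] ^ color])
    return "".join(bases)


if __name__ == "__main__":
    reference = "ATGGC"
    variant = "ATCGC"  # single-base change at the third position
    print(to_color_space(reference))
    print(to_color_space(variant))
```

In this example the single-base substitution changes two adjacent colors, whereas a single sequencing error in color-space changes only one. This property is what allows color-space data to help separate true SNPs from read errors.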

Poster U62

Statistically Significant Ranking of NGS Differential Peaks

Bryan Beresford-Smith- NICTA

Adam Kowalczyk (NICTA, VRL); Thomas Conway (NICTA, VRL); Izhak Haviv (Baker IDI Heart and Diabetes Institute, The Blood and DNA Profiling Facility); Richard Tothill (Baker IDI Heart and Diabetes Institute, The Blood and DNA Profiling Facility);

Short Abstract: Several statistics are presented for replacing ad hoc heuristics such as fold-ratio for the identification of differential peaks in NGS data. The statistical framework leads naturally to a power law for peak significance versus number of reads. The tests have been applied to ChIP-Seq data sets to demonstrate their usefulness.

Long Abstract: Click Here

Poster U63

Prediction of Protein Disordered and Ordered Region

Meijing Li- Chungbuk National University

Yoon Kyeong Lee (Chungbuk National University, Signal Transduction and Systems Biology Laboratory); Jin Hyoung Park (Chungbuk National University, Database/Bioinformatics Laboratory); Heon Gyu Lee (Electronics and Telecommunications Research Institute, Postal & Logistics Research Dep.); Hak Yong Kim (Chungbuk National University, Signal Transduction and Systems Biology Laboratory); Keun Ho Ryu (Chungbuk National University, Database/Bioinformatics Laboratory);

Short Abstract: We propose an emerging sequence-based prediction method for identifying disordered and ordered regions of proteins from protein sequence. In the experiment, disordered sequences from DisProt and ordered sequences from PDB were used as training data, and the test data were taken from CASP7. The experimental results are better than those of published prediction methods.

Long Abstract: Click Here

Poster U64

Positive selection contributes to the emergence of new HIV-1 lineages while high substitution rates determine viral pathogenesis in epidemically linked patients

Elcio Leal- Federal University of Sao Paulo

No additional authors

Short Abstract: The contribution of HIV-1 diversity to AIDS was evaluated in a group of epidemically linked individuals comprising one blood donor and two blood recipients. The same HIV-1 source, transmitted during blood transfusion, indicated positive selection as a key factor in the emergence of lineages, while substitution rates determine the disease outcome.

Long Abstract: Click Here

Poster U65

Benchmarking promoter prediction software

Thomas Abeel- VIB-UGent

Yvan Saeys (VIB-UGent, Plant Systems Biology); Yves Van de Peer (VIB-UGent, Plant Systems Biology);

Short Abstract: Recently many new promoter prediction programs (PPPs) have emerged, but a common benchmarking strategy is lacking. We propose a multi-faceted protocol as a gold standard for PPP evaluation. We benchmarked 17 PPPs and further investigated the best four. The importance of PPPs will only increase, as more genomes are sequenced.

Long Abstract: Click Here
