Accepted Posters

Category 'U' - Sequence analysis
Poster U01
Anvaya: A Workflow environment for High Throughput Comparative Genomics
Ruma Banerjee- Centre for Development of Advanced Computing
Alok Bhandari (C-DAC, Bioinformatics); Avik Datta (C-DAC, Bioinformatics); Avinash Bayaskar (C-DAC, Bioinformatics); Bhakti Limaye (C-DAC, Bioinformatics); Harshal Inamdar (C-DAC, Bioinformatics); Pankaj Vats (C-DAC, Bioinformatics); Rajanikanth Tupakula (C-DAC, Bioinformatics); Ramakrishnan E.P. (C-DAC, Bioinformatics); Rashmi Mahajan (C-DAC, Bioinformatics); Renu Gadhari (C-DAC, Bioinformatics); Sandeep Malviya (C-DAC, Bioinformatics); Sankalp Jain (C-DAC, Bioinformatics); Sonal Dahale (C-DAC, Bioinformatics); Sunitha Manjari K. (C-DAC, Bioinformatics); Vivek Gavane (C-DAC, Bioinformatics); Rajendra Joshi (C-DAC, Bioinformatics);
Short Abstract: Anvaya is a stand-alone client-server workflow environment that consists of Bioinformatics tools and databases loosely coupled together to execute a set of analysis tools in series or in parallel. It provides pre-defined workflows for genome annotation and comparative genomics like EST assembly, phylogenetic reconstruction and microarray analysis.

Poster U02
Prediction of KOPS motifs involved in segregation of bacterial chromosomes in Lactococci / Streptococci
Fabrice Touzain- INRA
Sophie Nolivos (CNRS, LMGM, Toulouse); François Cornet (CNRS, LMGM, Toulouse); Pascal Le Bourgeois (CNRS, LMGM, Toulouse); Sophie Schbath (INRA, MIG, Jouy-en-Josas); Meriem El Karoui (INRA, Micalis, Jouy-en-Josas);
Short Abstract: KOPS is a motif involved in the chromosome segregation process. From known motifs in three bacteria, we inferred prediction rules based on over-representation and orientation skew criteria. The resulting program gave us predicted KOPS in Lactococcus lactis, Streptococcus pneumoniae and Streptococcus agalactiae. The L. lactis motif was experimentally validated.

Poster U03
RNA Secondary Structure Prediction of Multiple Sequences
Zhenjiang Xu- University of Rochester
Zhenjiang Xu (University of Rochester, Biochemistry and Biophysics); David Mathews (University of Rochester, Biochemistry and Biophysics);
Short Abstract: We developed a new algorithm, Multilign, to predict RNA secondary structure of multiple sequences. Benchmark results on tRNA, 5S rRNA, SRP RNA, RNase P and 16S rRNA show that the prediction accuracy of Multilign is better than or comparable to that of other algorithms predicting secondary structures of multiple sequences.

Poster U04
A systematic approach to predicting organism-specific subcellular localization suggests species-specific sorting patterns
Patrick Xuechun Zhao- The Samuel Roberts Noble Foundation
Rakesh Kaundal (The Samuel Roberts Noble Foundation, Plant Biology Division);
Short Abstract: The present study addresses an important fundamental question regarding how well the subcellular localization predictors perform when grouping all eukaryotes together, versus making predictions for narrower phylogenetic lineages. Combining machine learning and homology-based approaches, we demonstrate the advantages of developing 'organism-specific' methods over 'general' ones to predict protein subcellular localization.

Poster U05
Analysis Tools for Targeted Metagenomics
Qiong Wang- Michigan State University
Jordan Fish (Michigan State University, Center for Microbial Ecology); Benli Chai (Michigan State University, Center for Microbial Ecology); James Tiedje (Michigan State University, Center for Microbial Ecology); James Cole (Michigan State University, Center for Microbial Ecology); Yanni Sun (Michigan State University, Department of Computer Science and Engineering);
Short Abstract: "Targeted Metagenomics" is directed at deep-sequencing of environmentally important protein-coding genes. New tools are needed to process these data. RDP FrameBot detects and corrects frameshift artifacts in targeted amplicon reads, while RDP mcClust implements the memory-constrained hierarchical clustering algorithm for datasets that are too large to cluster in memory.

Poster U06
An SNN-GA Approach for the Prediction of Transcription Factor Binding Sites
Heike Sichtig- University of Florida
Alberto Riva (University of Florida, Molecular Genetics & Microbiology, University of Florida Genetics Institute);
Short Abstract: This work describes the application of a novel machine learning method, based on the combination of Spiking Neural Networks and Genetic Algorithms, to the problem of identifying Transcription Factor Binding Sites in DNA sequences. We present the details of our method and an evaluation of its predictive performance.

Poster U07
Computational Analysis of Genome-wide Synthetic Lethal Screen with a tyrosine kinase inhibitor in Leukemia cell using BiNGS!SL-seq
Jihye Kim- University of Colorado Denver School of Medicine
Hyunmin Kim (University of Colorado Denver School of Medicine, Department of Biochemistry and Molecular Genetics); Dexiang Gao (University of Colorado Denver, Department of Biostatistics and Informatics, Colorado School of Public Health); Mark Gregory (University of Colorado Denver School of Medicine, Department of Biochemistry and Molecular Genetics); Heather Selby (University of Colorado Denver School of Medicine, Division of Medical Oncology, Department of Medicine); Tiejun Tong (University of Colorado Boulder, Department of Applied Mathematics); Tzu Phang (University of Colorado Denver School of Medicine, Division of Pulmonary Sciences and Critical Care Medicine, Department of Medicine); James DeGregori (University of Colorado Denver School of Medicine, Department of Biochemistry and Molecular Genetics); Aik Choon Tan (University of Colorado Denver School of Medicine, Division of Medical Oncology, Department of Medicine);
Short Abstract: We performed a genome-wide RNA interference deep sequencing screen to identify synthetic lethal interactions with a tyrosine kinase inhibitor in leukemia. We analyzed and interpreted the shRNA reads using BiNGS!SL-seq and identified promising genes whose suppression is synthetic lethal with the drug. Selected genes were experimentally validated.

Poster U08
A Case Study for Genome-wide Smad4 Binding Sites Using BiNGS!ChIP-seq
Hyunmin Kim- Colorado School of Medicine
Jihye Kim (University of Colorado School of Medicine, Division of Medical Oncology, Department of Medicine); Tzu L. Phang (University of Colorado School of Medicine, Division of Pulmonary Sciences and Critical Care Medicine, Department of Medicine); Dexiang Gao (University of Colorado School of Medicine, Department of Biostatistics and Informatics, Colorado School of Public Health); Tiejun Tong (University of Colorado Boulder, Department of Applied Mathematics); Heather Selby (University of Colorado School of Medicine, Division of Medical Oncology, Department of Medicine); Qinghong Zhang (University of Colorado School of Medicine, Department of Dermatology); Xiao-Jing Wang (University of Colorado School of Medicine, Department of Pathology); David Bentley (University of Colorado School of Medicine, Department of Biochemistry and Molecular Genetics); Aik Choon Tan (University of Colorado School of Medicine, Division of Medical Oncology, Department of Medicine);
Short Abstract: We developed BiNGS!ChIP-seq, which provides various methods for analyzing and interpreting ChIP-seq data tightly coupled with biological inputs. Using this system, we characterized the genome-wide binding sites of Smad4 in normal keratinocytes, and found binding sites for this protein in several genes that play key roles in promoting carcinogenesis.

Poster U09
Probeset design for target-capture sequencing
Fang Fang- University of Southern California
Andrew Smith (University of Southern California, Computational Biology);
Short Abstract: Target capture provides a cost- and time-efficient way to select genomic samples for next-generation sequencing. However, until now there has been no rigorous study of how to design hybridization probesets. Here we present a system for probeset design based on statistical analyses relating probe properties to capture performance.

Poster U10
BS-Scorer: scoring the fifth base and improved mapping bisulphite deep sequencing reads for cytosine methylation analysis
Huy Dinh- Gregor Mendel Institute of Molecular Plant Biology, Vienna
Fritz Sedlazeck (Max F. Perutz Laboratories, Vienna, Center for Integrative Bioinformatics, Vienna); Ortrun Mittelsten Scheid (Gregor Mendel Institute of Molecular Plant Biology, Vienna, Mittelsten Scheid Lab); Arndt von Haeseler (Max F. Perutz Laboratories, Vienna, Center for Integrative Bioinformatics, Vienna);
Short Abstract: Bisulphite deep sequencing is widely used for genome-wide DNA methylation analysis, but annotating the reads of highly converted regions is difficult. Our approach incorporates the cytosine and thymine frequencies in the reference genome in the alignment-scoring matrix. Integrating this matrix in a Smith-Waterman algorithm improves the annotation efficiency significantly.

Poster U11
Quantitative Deep Sequencing and Detecting HIV-1 Drug Resistance
John Archer- University of Manchester
Andrew Rambaut (University of Edinburgh, Institute of Evolutionary Biology); David Robertson (University of Manchester, Faculty of Life Sciences);
Short Abstract: Mapping of viral high-throughput pyrosequencing data is rarely adequately optimized. In ultra-deep genotypic studies this has the potential to introduce bias into the population structure. We present a framework for mapping this data and apply it to the detection of resistance to CCR5 antagonists within HIV-1 viral populations.

Poster U12
Inferring de novo genomic homology with sets of minimal absent words
Sara Garcia- University of Aveiro
Armando Pinho (University of Aveiro, Signal Processing Laboratory, IEETA/DETI); João Rodrigues (University of Aveiro, Signal Processing Laboratory, IEETA/DETI); Carlos Bastos (University of Aveiro, Signal Processing Laboratory, IEETA/DETI); Paulo Ferreira (University of Aveiro, Signal Processing Laboratory, IEETA/DETI);
Short Abstract: We present a new methodology for finding minimal absent words in genomic sequences. Minimal absent words are absent words with the property of being present if their left- or rightmost character is removed. We use sets of minimal absent words for inferring de novo genomic homology.

Poster U13
Threading Alignments Based on Sampling and Sequence Dependent Scoring Function
Zhiquan He- University of Missouri-Columbia
No additional authors
Short Abstract: We have designed and implemented a new protocol for protein threading that integrates multiple sources of information from the query sequence and template structure, samples the alignment parameters to generate a pool of alignments, then filters and ranks the alignments for final output.

Poster U14
SMALT - a new mapper for DNA sequencing reads
Hannes Ponstingl- Wellcome Trust Sanger Institute
No additional authors
Short Abstract: A new computer program is presented that efficiently maps DNA sequencing reads onto genomic reference sequences with very high sensitivity and low error rates. Reads from most types of sequencing platforms can be mapped including paired-end reads. A range of output formats, e.g. SAM and PSL, are supported.

Poster U15
Alternative Isoforms Identification and Quantification from RNA-Seq Data
Xiaobo Zhou- The Methodist Hospital Research Institute
Zheng Xia (The Methodist Hospital Research Institute, Radiology);
Short Abstract: We present a maximum a posteriori (MAP) model to identify alternative isoform structures and to quantify isoform expression simultaneously from RNA-Seq data. Results on simulated and real RNA-Seq data demonstrated the feasibility of our method for both tasks.

Poster U16
Choosing the right coverage depth and read length for an RNA-seq experiment
Madelaine Gogol- Stowers Institute for Medical Research
Ron Yu (University of Kansas Medical Center, Department of Anatomy and Cell Biology); Hua Li (Stowers Institute for Medical Research, Bioinformatics);
Short Abstract: When designing an RNA-seq experiment, one must choose a read length as well as decide how deep to sequence their samples (number of lanes). Here, we study how gene expression (mean, variance) and subsequent differential expression analysis are affected by the coverage depth and read length using mouse RNA-seq data.

Poster U17
Novel Epitopes of the Ebola Virus for Rational Vaccine Design: Therapeutic Development and Protein Conservation.
Sophia Banton- Florida Atlantic University
Mirjana Pavlovic (Florida Atlantic University, Electrical Engineering); Zvi Roth (Florida Atlantic University, Electrical Engineering);
Short Abstract: Towards rational vaccine design, novel B- and T- cell Ebola virus epitopes were identified using various computational algorithms. These include a 3D epitope (EAIVNAQPK...MHNQDG) which was also extracted from a structure of the Ebola glycoprotein bound to human antibody KZ52. One epitope (DYHKILTAG) surprisingly contains a Eukaryotic protein fingerprint.

Poster U18
Structural Variation Analysis with Strobe Reads
Anna Ritz- Brown University
Ali Bashir (Pacific Biosciences, Bioinformatics); Benjamin Raphael (Brown University, Computer Science);
Short Abstract: A new sequencing technology called strobe sequencing from Pacific Biosciences generates multiple subreads from a DNA fragment, generalizing paired reads produced by other technologies. We introduce a structural variant detection algorithm using strobe sequencing and show that strobe reads provide better sensitivity and specificity than paired reads in simulation.

Poster U19
An exact approach to mapping next-generation sequencing reads containing structural variation
Anne-Katrin Emde- Freie Universität Berlin
David Weese (Freie Universität Berlin, Department of Computer Science); Marcel H. Schulz (Max-Planck-Institute for Molecular Genetics, Computational Molecular Biology Group); Stefan Haas (Max-Planck-Institute for Molecular Genetics, Computational Molecular Biology Group); Knut Reinert (Freie Universität Berlin, Department of Computer Science);
Short Abstract: NGS has many applications, among them genome resequencing. Sequenced reads containing structural variation compared to a reference genome may be hard to map with conventional mapping strategies. Here we present tools for mapping reads containing structural variants and for subsequent variant detection, implemented in the highly efficient SeqAn C++ library.

Poster U20
Exploring Environmental Niches Via Short Read Metagenomic Sequencing
Vinicio Reynoso- Loyola University Chicago
Catherine Putonti (Loyola University Chicago, Biology, Computer Science);
Short Abstract: Next-generation sequencers provide a feasible means of assessing entire environmental niches. Herein we present a pipeline for analyzing short reads. This pipeline incorporates both BLAST-like searches, in order to identify species for which sequences are available, and taxonomic binning of novel reads based upon k-mer compositions.

Poster U21
A method for detecting small scale human tandem repeat length polymorphism using Illumina paired-end sequencing data
Weldon Whitener- Wellcome Trust Sanger Institute
No additional authors
Short Abstract: Microsatellites are common motifs in the human genome; however, little is known about the patterns of polymorphism within them. To rectify this, we have developed a method to use Illumina sequencing data to identify microsatellites that differ in length between an individual's genome and the human reference genome.

Poster U22
Identification of Putative Novel Protein Coding Genes from Metagenomic Samples
David Messina- Stockholm University
Erik Sonnhammer (Stockholm University, Stockholm Bioinformatics Centre); Björn Andersson (Karolinska Institute, Cell and Molecular Biology);
Short Abstract: Despite the steady rise in sequence information, a persistent, significant fraction of that sequence, called ORFans, does not match any known sequence. We have developed a novel method using Ka/Ks ratios to identify putative protein-coding sequences from the ORFan population of metagenomic sequence data.

Poster U23
Analyses of reagents developed for Oligomerized Pool ENgineering (OPEN) reveal insights into interactions between zinc finger proteins and DNA
Fengli Fu- Iowa State University
Keith Joung (Harvard Medical School, Pathology); Daniel Voytas (University of Minnesota, Department of Genetics, Cell Biology & Development and Center for Genome Engineering);
Short Abstract: This poster presents an analysis of the reagents developed for Oligomerized Pool ENgineering (OPEN). The analysis revealed rules governing the interaction between zinc finger proteins and DNA, and experimental tests showed that these rules are useful for engineering zinc finger proteins.

Poster U24
The GNUMAP project: probabilistic mapping of next generation sequencing reads with applications
William Johnson- Brigham Young University
Mark Clement (Brigham Young University, Computer Science); Quinn Snell (Brigham Young University, Computer Science); Nathan Clement (Brigham Young University, Computer Science); Spencer Clement (Brigham Young University, Computer Science);
Short Abstract: We present an unbiased probabilistic approach to mapping sequencing reads to a reference genome. We show that our approach typically maps more reads and is more accurate than other existing approaches. In addition, we present recent developments to our algorithm to accommodate SNP calling, RNA editing, and bisulfite conversion.

Poster U25
Modeling the sequencing background in ChIP-Seq experiments
Abhishek Mitra- University of Texas Medical Branch at Galveston
Andrzej Kudlicki (University of Texas Medical Branch at Galveston, Biochemistry and Molecular Biology); Maga Rowicka (University of Texas Medical Branch at Galveston, Biochemistry and Molecular Biology);
Short Abstract: Next-generation sequencing (NGS) is rapidly gaining popularity, especially for identifying protein-DNA interactions (ChIP-Seq). The properties of sequencing background (control) are not well understood. We propose an approach to modeling these backgrounds that improves the quality of the ChIP-Seq data analysis and elucidates how such backgrounds arise.

Poster U26
Identifying fusion genes from short read transcriptome data
Pora Kim- Ewha Womans University
Namshin Kim (Korea Research Institute of Bioscience and Biotechnology, Korean Bioinformation Center); Sanghyuk Lee (Korea Research Institute of Bioscience and Biotechnology, Korean Bioinformation Center);
Short Abstract: Fusion genes are important players in cancer development. Previously, we developed an algorithm for identifying fusion gene candidates by analyzing transcriptome data such as mRNA and EST sequences. In this work, the algorithm was extended to cover single and paired-end short reads.

Poster U27
An Accuracy Evaluation of Read Alignment Algorithms
André Kahles- Friedrich Miescher Laboratory of the Max Planck Society
Jonas Behr (Friedrich Miescher Laboratory of the Max Planck Society, Friedrich Miescher Laboratory); Regina Bohnert (Friedrich Miescher Laboratory of the Max Planck Society, Friedrich Miescher Laboratory); Gunnar Rätsch (Friedrich Miescher Laboratory of the Max Planck Society, Friedrich Miescher Laboratory);
Short Abstract: Although many algorithms for aligning spliced reads have been developed and widely used over the past years, no comprehensive comparison is available yet. Based on artificial data and alignments generated during the RGASP competition, we evaluate and compare performance, accuracy, and error distribution of the most common spliced read alignment strategies.

Poster U28
PALMapper: Fast and Accurate Spliced Alignments of Sequence Reads
Geraldine Jean- Max Planck Society
Gunnar Raetsch (Max Planck Society, Friedrich Miescher Laboratory); Andre Kahles (Max Planck Society, Friedrich Miescher Laboratory); Soeren Sonnenburg (Berlin Institute of Technology, Machine Learning Group); Fabio De Bona (Max Planck Society, Friedrich Miescher Laboratory); Korbinian Schneeberger (Max Planck Institute for Developmental Biology, Department of Molecular Biology); Jörg Hagmann (Max Planck Institute for Developmental Biology, Department of Molecular Biology); Detlef Weigel (Max Planck Institute for Developmental Biology, Department of Molecular Biology);
Short Abstract: RNA-seq produces huge amounts of sequence reads that are short, error-prone and possibly spliced. We have developed PALMapper, an RNA-seq read mapper that combines GenomeMapper and a faster version of QPALMA to accurately and efficiently align reads, while taking advantage of each read's quality information and computational splice site predictions.

Poster U29
Protein interactions and ligand binding: from protein subfamilies to functional specificity
Antonio Rausell- Spanish National Cancer Research Centre (CNIO)
David Juan (Spanish National Cancer Research Centre (CNIO), Structural Biology and BioComputing Programme); Florencio Pazos (National Centre for Biotechnology (CNB-CSIC), Computational Systems Biology Group); Alfonso Valencia (Spanish National Cancer Research Centre (CNIO), Structural Biology and BioComputing Programme);
Short Abstract: The divergence accumulated during the evolution of protein families translates into their internal organization as subfamilies, and it is reflected in the characteristic patterns of differentially conserved residues. These specifically conserved positions in protein subfamilies are known as “specificity determining positions” (SDPs). Previous studies have limited their analysis to the study of the relationship between these positions and ligand-binding specificity. We have systematically extended this observation to include the role of differential protein interactions in the segregation of protein subfamilies and explored in detail the structural distribution of SDPs at protein interfaces. Our results show the extensive influence of protein interactions in the evolution of protein families and the widespread association of SDPs with interfaces. The combined analysis of SDPs in interfaces and ligand-binding sites provides a more complete picture of the organization of protein families, constituting the necessary framework for a large-scale analysis of the evolution of protein function.

Poster U30
Algorithm for Phylogenetic Tree Building and Taxonomic Classification using Curated Phylogenetic Tree
David Knox- University of Colorado
Robin Dowell (University of Colorado, MCDB);
Short Abstract: Analysis of microbial communities requires phylogenetic and taxonomic analysis to understand diversity, but processing millions of sequence reads is time consuming using current analysis methods. ParsInsert uses parsimonious insertion into a curated tree of known sequences to efficiently produce both a complete phylogenetic tree and taxonomic classification of all sequences.

Poster U31
Detecting Breakpoints of Large Deletions and Medium Sized Insertions from Pair-end Short Reads in 1000 Genomes Project and Cancer genome project
Kai Ye- Leiden University Medical Center
No additional authors
Short Abstract: Recently we developed a novel procedure based on the Pindel algorithm to process multiple samples, in order to investigate disease-related variants and to conduct genetic surveys of large populations. We add tags to the reads to indicate their sources. Then we run Pindel using the entire pool of reads as the input. We modified our Pindel program to report the sample sources of the supporting reads for each identified event. With such information we are able to discern which samples carry which indels. We demonstrate our new development with low-coverage sequence data of human chr1 from 170 individuals in the pilot study of the 1000 Genomes Project and high-coverage sequence data of paired normal/tumor genomes.

Poster U32
Sequence homology assessment in the difference set space
Andrzej Brodzik- The MITRE Corporation
No additional authors
Short Abstract: Recently, a new mathematical approach for the analysis of DNA sequences was proposed. In this approach the DNA sequence is represented by its sub-sampled version derived from the distribution of cyclic difference sets. A difference set is a concept borrowed from combinatorial design theory. This representation is convenient for two reasons. First, cyclic difference sets capture the random character of the DNA sequence. In effect, sequence homology markers based on difference sets can be directly linked with sequence regions of high combinatorial complexity that are subject to fewer sequencing errors and with regions that have long evolutionary history. Second, the analysis of DNA sequences can be replaced by the analysis of much shorter difference set sequences, thereby significantly reducing the computational cost of many computational biology tasks, including sequence homology evaluation, sequence complexity assessment, and sequence variation detection. In this presentation we will describe the mathematical underpinning of the approach.

Poster U33
A fast mapping tool for DNA sequence reads from the SOLiD™ System: part of Bioscope™
Zheng Zhang- Life Technologies
Jing Zhai (Life Technologies, GS); Danwei Guo (Life Technologies, GS); Yuandan Lou (Life Technologies, GS); Asim Siddiqui (Life Technologies, GS);
Short Abstract: We present novel mapping software for next generation sequencing data. It addresses the problem of variable error rates at different regions of the reads, providing greatly improved sensitivity as well as fast running time.

Poster U34
Jalview: Now and Next.
James Procter- University of Dundee
Peter Troshin (University of Dundee, College of Life Sciences); David Martin (University of Dundee, College of Life Sciences); Geoff Barton (University of Dundee, College of Life Sciences);
Short Abstract: Jalview (www.jalview.org) is a widely used open-source program for sequence alignment visualization. It is designed to allow both casual and expert users to easily visualize, edit, analyse and annotate multiple sequence alignments, and in this poster, we highlight the latest developments in the Jalview system.

Poster U35
Sequence read information improves accuracy of SNP detection
Allison Regier- University of Notre Dame
Upeka Samarakoon (University of Notre Dame, Biological Sciences); Asako Tan (University of Notre Dame, Biological Sciences); Michael Ferdig (University of Notre Dame, Biological Sciences); Scott Emrich (University of Notre Dame, Computer Science and Engineering);
Short Abstract: The assembly step is critical for subsequent use of genome data. While genome assemblies can be useful for sequence analysis, they may introduce errors and/or hide ambiguities. By using sequence read information directly rather than accessing it through the genome assembly, we increased the reliability of genome-wide SNP detection.

Poster U36
Neighbourhood effect and amino acid preference in extremophiles
Chinmay Dwibedi- VIT University
Gurunathan Jayaraman (VIT University, Bioinformatics);
Short Abstract: We hypothesize that under certain environmental conditions there exists a preference for a particular amino acid, or a particular group of amino acids, within a species. We also hypothesize a neighbourhood effect, a phenomenon in which the occurrence of a particular amino acid in a sequence is biased by its nearest amino acids.

Poster U37
FastQC – A Quality Control tool for High Throughput Sequencing Data
Simon Andrews- The Babraham Institute
No additional authors
Short Abstract: FastQC is an application which generates a quality report for high throughput sequence data sets. It can run either as an interactive GUI application or be integrated into an analysis pipeline. It provides a comprehensive set of results to help visualise different aspects of a sequence library.

Poster U38
Bismark, a New Tool for Mapping and Methylation Analysis of Bisulfite-Seq Data
Felix Krueger- The Babraham Institute
Marek Piatek (The Babraham Institute, Bioinformatics); Simon Andrews (The Babraham Institute, Bioinformatics);
Short Abstract: Next-generation sequencing of bisulfite treated DNA (BS-seq) is a powerful tool to detect cytosine methylation on a genome-wide scale. We have developed a BS-seq analysis tool, termed Bismark, which performs time-efficient mapping of bisulfite data to a reference genome as well as methylation calling in a single step.

Poster U39
A congruent objective function for key steps in the phylogenomic method
Marcin Cieslik- University of Virginia
Cameron Mura (University of Virginia, Chemistry);
Short Abstract: Phylogenomics is a complex methodology for protein function prediction in the context of evolutionary data. We derive a rigorous objective function for tree estimation, family classification and the inference of specific residues. The congruent model allows for a simple and algorithmically efficient implementation of phylogenomic protocols.

Poster U40
Analysis of AB SOLiD™ System Targeted Resequencing Reads
Jonathan Manning- Life Technologies
No additional authors
Short Abstract: Sequencing reads from enriched, multiplexed samples were analyzed to separately quantify the performance of target selection, library construction, multiplexing, enrichment, and sequencing. The reads mapping uniquely on target were further analyzed to identify variants within the selected regions of interest.

Poster U41
A method for detecting structural variants from massive paired end genome sequences by mapping signatures
Daisuke Ueta- Osaka University
Shigeto Seno (Osaka University, Graduate school of information science and technology); Yoichi Takenaka (Osaka University, Graduate school of information science and technology); Hideo Matsuda (Osaka University, Graduate school of information science and technology);
Short Abstract: We propose a method to detect structural variants using paired-end data generated by a high-throughput sequencer. Our method ranks detected results using criteria derived from the length of pair spans and the number of contributing pairs. We confirmed its effectiveness by detecting structural variants in bacterial genomes.

Poster U42
Structural filters for RNA homology search
Diana Kolbe- Janelia Farm Research Campus
No additional authors
Short Abstract: Covariance models (CMs) are powerful tools for identifying and aligning structural RNA genes. Due to high computational requirements, CM search is often preceded by BLAST or HMM filters, which don't use secondary structure. I present a fast profile-based structural filter, which finds hits sequence-based methods miss.

Poster U43
SSU-ALIGN: a tool for structural alignment of SSU rRNA sequences
Eric Nawrocki- HHMI Janelia Farm Research Campus
No additional authors
Short Abstract: SSU-ALIGN is a freely available, open source software package for creating large-scale secondary-structure-based multiple sequence alignments of small subunit ribosomal RNA (SSU) sequences using covariance models (CMs). SSU-ALIGN accurately classifies and structurally aligns archaeal, bacterial or eukaryotic SSU sequences at a rate of roughly one full-length sequence per second.
Long Abstract:Click Here

Poster U45
A novel profile Hidden Markov Model to predict microRNAs and their targets simultaneously
Shu Yang- The University of Hong Kong
Kalpana Agrawal (The University of Hong Kong, Department of Biochemistry); Tak Wah Lam (The University of Hong Kong, Department of Computer Sciences); Pak Chung Sham (The University of Hong Kong, Department of Psychiatry); Kathryn S. E. Cheah (The University of Hong Kong, Department of Biochemistry); Junwen Wang (The University of Hong Kong, Department of Biochemistry);
Short Abstract: To further understand the miRNA-target interaction, we developed a novel HMM that can predict miRNAs and their targets simultaneously. Unlike current strategies, which use features from either targets or miRNAs alone, our model integrates the features of both and thus captures the evolutionary relationships of both. It outperformed RNAhybrid in preliminary tests.
Long Abstract:Click Here

Poster U46
MISO: an open-source LIMS for small-to-large scale sequencing centres
Robert Davey- The Genome Analysis Centre
Xingdong Bian (The Genome Analysis Centre, Sequence Informatics); Richard Holland (Eagle Genomics, Operations and Delivery); Mario Caccamo (The Genome Analysis Centre, Head of Bioinformatics);
Short Abstract: Proprietary laboratory information management systems (LIMS) are often "black box" in their approach and not cost-effective for many sequencing environments. We are developing MISO, a completely open-source LIMS built with freely available tools, for small-to-large scale sequencing centres for which metadata storage is vital yet which cannot consider expensive solutions.
Long Abstract:Click Here

Poster U47
A Robust High-Throughput Second Generation Sequencing Analysis Pipeline
Sanjeev Bhaskar- Wellcome Trust Sanger Institute
No additional authors
Short Abstract: Second generation sequencing has triggered a major effort to develop tools to manage and analyse large scale sequence data. To keep up with this rapidly changing area, we have built a modularised, status-driven analysis pipeline that allows rapid integration of new analysis software. We discuss the tools and approaches used.
Long Abstract:Click Here

Poster U48
Threshold Average Precision (TAP-k): A Retrieval Efficacy Measure for Bioinformatics
Hyrum Carroll- National Institutes of Health (NIH)
Maricel G. Kann (University of Maryland, Baltimore County, Biology Department); Sergey L. Sheetlin (National Institutes of Health, National Center for Biotechnology Information); John L. Spouge (National Institutes of Health, National Center for Biotechnology Information);
Short Abstract: Threshold Average Precision (TAP-k) is a retrieval measure that faithfully accounts for biased data sets by query-averaging and reflects standard usage in bioinformatics by incorporating E-value thresholds. Current methods like pooled ROC scores are susceptible to being arbitrarily skewed by a single query and do not represent actual usage.
Long Abstract:Click Here

Poster U49
EMBOSS: European Molecular Biology Open Software Suite
Peter Rice- European Bioinformatics Institute
Alan Bleasby (European Bioinformatics Institute, Services Division); Jon Ison (European Bioinformatics Institute, Services Division); Mahmut Uludag (European Bioinformatics Institute, Services Division);
Short Abstract: EMBOSS is a mature package of software tools developed for the molecular biology community. It includes a comprehensive set of applications for sequence analysis and other tasks and integrates popular third-party software packages under a consistent interface. EMBOSS includes extensive C programming libraries.
Long Abstract:Click Here

Poster U50
Ultra-deep sequencing of HIV quasi-species to detect minority variants
Oliver Hofmann- Harvard School of Public Health
Alyssa Porter (Harvard School of Public Health, Bioinformatics Core); Jonathan Li (Harvard Medical School, Section of Retroviral Therapeutics); Rui Wang (Harvard School of Public Health, Biostatistics); John Spritzler (Harvard School of Public Health, Biostatistics); Michael Hughes (Harvard School of Public Health, Center for Biostatistics in Aids Research); Daniel Kuritzkes (Harvard Medical School, Section of Retroviral Therapeutics); Winston Hide (Harvard School of Public Health, Biostatistics);
Short Abstract: A patient infected by HIV will carry a large population of related viral strains described as a quasi-species. We describe the analysis of an HIV quasi-species using second-generation sequencing technology at 200,000-fold coverage to reliably distinguish minority variants at a lower boundary of 1% clonal variation from common sequencing errors.
Long Abstract:Click Here

Poster U51
Identification of disease associated variants using the Carpe Novus tool
Elizabeth Worthey- The Medical College of Wisconsin
Jeff Depons (MCW, HMGC); George Kowalski (MCW, HMGC); Greg McQuestion (MCW, HMGC); Wes Rood (MCW, HMGC); Alexander Stoddard (MCW, HMGC); Brad Taylor (MCW, HMGC); David Dimmock (MCW, HMGC); Howard Jacob (MCW, HMGC);
Short Abstract: Carpe Novus is our system for storing, annotating and prioritising sequence variants. This system supports identification of functionally important (and therefore potentially disease associated) variants from the large numbers identified by whole genome or exome sequencing. A wealth of functional and molecular data is generated and queryable for each variant.
Long Abstract:Click Here

Poster U52
Increasing the Genome Analyzer’s output using IBIS
Martin Kircher- Max Planck Institute for evolutionary Anthropology
Udo Stenzel (Max Planck Institute for evolutionary Anthropology, Evolutionary Genetics/Bioinformatics); Janet Kelso (Max Planck Institute for evolutionary Anthropology, Evolutionary Genetics/Bioinformatics);
Short Abstract: Fast and accurate base calling can be performed with IBIS on all Illumina GA versions using a statistical learner without the need to specifically model the sequencing process. We demonstrate how an indexed control DNA library can be spiked into each lane, allowing for lane-specific quality measurement and providing sufficient training data for IBIS.
Long Abstract:Click Here

Poster U53
Detection of Fusion Genes in mRNA-Seq Data
Sven Bilke- National Cancer Institute
Bob Walker (NCI, Cancer Genetics Branch); Marbin Pineda (NCI, Cancer Genetics Branch); Princy Francis (NCI, Cancer Genetics Branch); Ogan Abaan (NCI, Cancer Genetics Branch); Meltzer Paul (NCI, Cancer Genetics Branch);
Short Abstract: Fusion oncoproteins are known to be initiating events in several cancers. Identification of recurrent fusions is therefore highly important in cancer genomics. Here we present an algorithm to identify and prioritize putative fusion transcripts within mRNA-Seq data.
Long Abstract:Click Here

Poster U54
A novel probabilistic approach based framework for sequence assembly
Limin Fu- University of California, San Diego
Sitao Wu (University of California, San Diego, Center for Research in Biological Systems); Weizhong Li (University of California, San Diego, Center for Research in Biological Systems);
Short Abstract: A novel framework is proposed for genome sequence assembly based on a probabilistic approach, which can be used for de novo assembly, reference-guided assembly, or consensus assembly from the results of multiple assembly programs. Preliminary tests demonstrate good assembly quality for de novo assembly, even for reads with low coverage depth.
Long Abstract:Click Here

Poster U55
Decoding Sequence Classification Models for Acquiring New Biological Insights
Ulrich Bodenhofer- Johannes Kepler University
Andreas Kothmeier (Johannes Kepler University, Institute of Bioinformatics); Ingrid Abfalter (Johannes Kepler University, Institute of Bioinformatics); Carsten Mahrenholz (Charité Medical School, Institute of Medical Immunology); Sepp Hochreiter (Johannes Kepler University, Institute of Bioinformatics);
Short Abstract: We propose a new way of analyzing biological sequences based on SVMs by means of prediction profiles that allow assessing the importance and class tendency of individual sequence positions. The reduction to a few typical sequences is accomplished by affinity propagation. We present coiled coil classification as a case study.
Long Abstract:Click Here

Poster U56
Developing an exact method to find similar pairs with small edit-distance
Kana Shimizu- Computational Biology Research Center (CBRC)
Koji Tsuda (Computational Biology Research Center (CBRC))
Short Abstract: We introduce an exact method to enumerate all similar pairs for an arbitrary edit-distance threshold. Our method is based on multiple sorting, which can drastically reduce the number of pairwise comparisons in exchange for additional sorting operations. Experimental results show the efficiency of the proposed method compared with a naïve method and a suffix-array-based method.
Long Abstract:Click Here
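
For orientation, the naïve baseline that the abstract compares against can be sketched as below. This is a minimal illustration and not the authors' multiple-sorting method: it checks every pair with a standard Levenshtein dynamic program, and the function names `edit_distance` and `similar_pairs` are hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between strings a and b, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion from a
                            curr[j - 1] + 1,            # insertion into a
                            prev[j - 1] + (ca != cb)))  # match or substitution
        prev = curr
    return prev[-1]


def similar_pairs(seqs, threshold):
    """Naive all-pairs enumeration: index pairs within the edit-distance threshold."""
    pairs = []
    for (i, s), (j, t) in combinations(enumerate(seqs), 2):
        # A length difference larger than the threshold already rules the pair out.
        if abs(len(s) - len(t)) <= threshold and edit_distance(s, t) <= threshold:
            pairs.append((i, j))
    return pairs
```

The quadratic number of pairwise comparisons in `similar_pairs` is the cost that the poster's exact method aims to reduce.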

Poster U57
High-throughput DNA sequencing – concepts and limitations
Janet Kelso- Max-Planck Institute for Evolutionary Anthropology
Martin Kircher (Max Planck Institute for evolutionary Anthropology, Evolutionary Genetics / Bioinformatics);
Short Abstract: Recent advances in DNA sequencing make it possible to generate huge amounts of sequence data very rapidly and at substantially lower cost. We review sequencing technologies currently available and show how vast increases in throughput are associated with both new and old types of errors in the resulting sequence data.
Long Abstract:Click Here

Poster U58
Mining for cancer causing mutations in whole genome sequence data
Andrew Menzies- Wellcome Trust Sanger Institute
Philip Stephens (Wellcome Trust Sanger Institute, Cancer Genome Project); David Beare (Wellcome Trust Sanger Institute, Cancer Genome Project); Adam Butler (Wellcome Trust Sanger Institute, Cancer Genome Project); Simon Forbes (Wellcome Trust Sanger Institute, Cancer Genome Project); Mingming Jia (Wellcome Trust Sanger Institute, Cancer Genome Project); David Jones (Wellcome Trust Sanger Institute, Cancer Genome Project); Catherine Leroy (Wellcome Trust Sanger Institute, Cancer Genome Project); John Marshall (Wellcome Trust Sanger Institute, Cancer Genome Project); Keiran Raine (Wellcome Trust Sanger Institute, Cancer Genome Project); Lucy Stebbings (Wellcome Trust Sanger Institute, Cancer Genome Project); Jon Teague (Wellcome Trust Sanger Institute, Cancer Genome Project); Andrew Futreal (Wellcome Trust Sanger Institute, Cancer Genome Project); Michael Stratton (Wellcome Trust Sanger Institute, Cancer Genome Project); Peter Campbell (Wellcome Trust Sanger Institute, Cancer Genome Project);
Short Abstract: The Cancer Genome Project at the Wellcome Trust Sanger Institute, UK is using high throughput sequencing technologies to systematically screen exome and genome samples for cancer causing mutations. We employ a variety of approaches to detect substitutions, insertions/deletions, chromosomal rearrangements and copy number variations from massively parallel sequencing data.
Long Abstract:Click Here

Poster U59
Makeflow for Bioinformatics
Andrew Thrasher- University of Notre Dame
Irena Lanc (University of Notre Dame, Department of Computer Science and Engineering); Scott Emrich (University of Notre Dame, Department of Computer Science and Engineering); Douglas Thain (University of Notre Dame, Department of Computer Science and Engineering);
Short Abstract: We utilize an implementation of the Directed Acyclic Graph (DAG) abstraction to perform sequence alignment in parallel using campus grids. This has enabled us to achieve 900x speedup on 1000 nodes in our largest test. We utilize this system with the SHRiMP and SSAHA alignment packages.
Long Abstract:Click Here
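
As a worked reading of the reported figure, a 900x speedup on 1000 nodes corresponds to a parallel efficiency of 0.9. The sketch below only restates that arithmetic using the standard definition of efficiency as speedup divided by worker count; the function name `parallel_efficiency` is hypothetical and is not part of Makeflow.

```python
def parallel_efficiency(speedup, workers):
    """Parallel efficiency: achieved speedup as a fraction of the ideal (one per worker)."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


# Figures quoted in the abstract: 900x speedup on 1000 nodes.
efficiency = parallel_efficiency(900, 1000)
```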

Poster U60
Galaxy NGS functionality from sample tracking to SNP calling: An interactive poster
Ramkrishna Chakrabarty- Pennsylvania State University
James Taylor (Emory University, Departments of Biology and Mathematics & Computer Science); Anton Nekrutenko (Pennsylvania State University, Center for Comparative Genomics and Bioinformatics); Greg Von Kuster (Pennsylvania State University, Center for Comparative Genomics and Bioinformatics); Mark Chee (Prognosys Biosciences, Inc, ); The Galaxy Team (http://galaxyproject.org, );
Short Abstract: None On File
Long Abstract:Click Here

Poster U61
Identify unknown population structure based on DNA sequence and geography
Melanie Lou- McMaster University
Geoffrey Brian Golding (McMaster University, Biology);
Short Abstract: Based on several population genetic parameters, DNA sequence and geographical data, the goal is to set up a statistically robust top-down clustering method where we can determine the identity and relationship of unknown specimens that look similar and are hard to distinguish by non-experts.
Long Abstract:Click Here

Poster U62
Finding sequence-specific DNA methylation sites using base resolution methylome data
WENYU CHUNG- University of Texas at Dallas
W.-Y. Chung (The University of Texas at Dallas, Department of Molecular Cell Biology); A.D. Smith (University of Southern California, Molecular and Computational Biology); W Liao (The University of Texas at Dallas, Department of Molecular Cell Biology); W. Tang (Tsinghua University, TNLIST and Dept. of Automation); M.Q. Zhang (University of Texas at Dallas, Department of Molecular Cell Biology);
Short Abstract: Allelic skewing of DNA methylation has been demonstrated to be a genome-wide phenomenon. We used base-pair resolution data in human to directly identify sequence-specific DNA methylation regions. Our approach is neither limited to the known single nucleotide polymorphisms in the database nor biased by regions digested by restriction enzymes.
Funding: This work is supported by NIH grant U01 ES107166.
Long Abstract:Click Here

Poster U63
Hidden Chromosome Symmetry, Gigantic Palindromes and Nonlinear Dynamics Model
Sergei Larionov- Moscow State University
Maria Poptsova (Cornell University, Weill Cornell Medical College); Sergei Rybalko (Universite de Franche-Comte, Institut FEMTO-ST); Alexander Loskutov (Moscow State University, Physics Faculty); Eugeny Ryadchenko (Moscow State University, Physics Faculty); Ilya Zakharov (Vavilov Institute of General Genetics, Laboratory of Comparative Genetics of Animals);
Short Abstract: Maps of 2D DNA walk of 671 examined chromosomes show composition complexity change from symmetrical half-turn in bacteria to pseudo-random trajectories in archaea, fungi and humans. In silico transformation of gene order and strand position returns most of the analyzed chromosomes to a symmetrical bacterial-like state with one transition point.
Long Abstract:Click Here
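
The 2D DNA walk mentioned in the abstract can be sketched as follows. The step assignment used here (A left, T right, G up, C down) is an assumption chosen for illustration, since the abstract does not state which mapping the authors use; the names `STEPS` and `dna_walk` are likewise hypothetical.

```python
# Assumed step per nucleotide: A/T move along x, G/C move along y.
STEPS = {"A": (-1, 0), "T": (1, 0), "G": (0, 1), "C": (0, -1)}


def dna_walk(seq):
    """Return the list of (x, y) positions visited by a 2D walk over seq."""
    x, y = 0, 0
    path = [(x, y)]
    for base in seq.upper():
        # Ambiguous symbols such as N leave the position unchanged.
        dx, dy = STEPS.get(base, (0, 0))
        x, y = x + dx, y + dy
        path.append((x, y))
    return path
```

Under this mapping the walk's net displacement tracks the strand's T-minus-A excess along x and its G-minus-C excess along y, so a shift in strand composition appears as a turn in the trajectory.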

Poster U64
CloVR: Automated Sequence Analysis using Virtual Machines and Cloud Computing
Malcolm Matalka- University of Maryland
Samuel Angiuoli (University of Maryland, Institute for Genome Sciences); Owen White (University of Maryland, Institute for Genome Sciences); W. Florian Fricke (University of Maryland, Institute for Genome Sciences);
Short Abstract: CloVR is a virtual appliance that integrates bioinformatics tools into robust, user-friendly, and automated pipelines that run on Cloud computing platforms. We show results for BLAST searches (CloVR-Search), single microbial whole-genome shotgun (WGS) assembly and annotation (CloVR-Microbe), and metagenomic WGS gene prediction and BLAST comparison (CloVR-Metagenomics).
Long Abstract:Click Here

Poster U65
Accurate Mapping of RNA-seq Reads for Splice Junction Discovery
Kai Wang- University of Kentucky
Darshan Singh (University of North Carolina, Department of Computer Science); Zheng Zeng (University of Kentucky, Department of Computer Science); Stephen Coleman (University of Kentucky, Gluck Equine Research Center, Department of Veterinary Science); Yan Huang (University of Kentucky, Department of Computer Science); Gleb L. Savich (University of North Carolina, Department of Genetics and UNC Lineberger Comprehensive Cancer Center); Xiaping He (University of North Carolina, Department of Genetics and UNC Lineberger Comprehensive Cancer Center); Piotr Mieczkowski (University of North Carolina, Department of Genetics and UNC Lineberger Comprehensive Cancer Center); Sara A. Grimm (University of North Carolina, Department of Genetics and UNC Lineberger Comprehensive Cancer Center); Charles M. Perou (University of North Carolina, Department of Genetics and UNC Lineberger Comprehensive Cancer Center); James N. MacLeod (University of Kentucky, Gluck Equine Research Center, Department of Veterinary Science); Derek Y. Chiang (University of North Carolina, Department of Genetics and UNC Lineberger Comprehensive Cancer Center); Jan F. Prins (University of North Carolina, Department of Computer Science); Jinze Liu (University of Kentucky, Department of Computer Science);
Short Abstract: MapSplice is an unsupervised algorithm that maps RNA-seq reads to a reference genome for unbiased splice junction discovery. It focuses on high specificity and sensitivity in the detection of splices, as well as high CPU and memory efficiency. Both a synthetic data experiment and validation with quantitative PCR confirmed the accuracy of the detected splice junctions.
Long Abstract:Click Here

Poster U66
HMM to HMM Alignment in the UCSC SAM System
Richard Hughey- University of California Santa Cruz
Alexander Atkins (University of California Santa Cruz, Biomolecular Engineering);
Short Abstract: Direct alignment of hidden Markov models (HMMs) has enabled significant improvement in remote homology alignment and protein structure prediction. We present a new fully probabilistic alignment algorithm that draws all parameters from the HMMs themselves, and present results on the method as implemented in UCSC's SAM HMM system.
Long Abstract:Click Here

Poster U67
Mapping Reads with Insertion and Deletion Error using BLASR
Mark Chaisson- Pacific Biosciences
No additional authors
Short Abstract: We present a method for rapidly aligning reads produced on the Pacific Biosciences Single-Molecule Real-Time sequencing platform. The dominant error modes in single molecule sequencing methods are insertion and deletion. Our method is fast and sensitive across a wide range of insertion and deletion error rates.
Long Abstract:Click Here

Poster U68
Evolutionary classification and understanding the catalytic diversity of the Tautomerase Superfamily
Abhiman Saraswathi- National Institutes of Health
Aravind L (National Institutes of Health, National Center for Biotechnology Information);
Short Abstract: The tautomerase superfamily is an enzyme family with promiscuous functions and the smallest known subunit. Through extensive sequence searches and structural analysis, we have elucidated the evolution of this superfamily and predicted several key residues responsible for the functional differences between families using sequence conservation analysis methods based on entropy.
Long Abstract:Click Here

Poster U70
nhmmer: DNA-to-DNA database search with HMMER3
Travis Wheeler- HHMI Janelia Farm Research Campus
No additional authors
Short Abstract: We present nhmmer, a tool built on the HMMER3 framework, to be used for searching sequence databases for homologs of DNA sequences. The tool nhmmer is nearly as fast as BLAST, and early tests show it to be more accurate and able to identify more remote homologs.
Long Abstract:Click Here

Poster U71
Visualizing protein family nanoanatomy using ProfileGrids
Alberto Roca- University of California, Irvine
Aaron Abajian (University of California, Irvine, Molecular Biology & Biochemistry); David Vigerust (Vanderbilt University Medical Center, Investigative Pathology);
Short Abstract: ProfileGrids are a new visualization method to analyze large multiple sequence alignments. For the influenza hemagglutinin protein family, we define a uniform nomenclature of the bioinformatic elements. The patterns of glycosylation sites are related to recent swine flu sequences. We describe updated JProfileGrid software that includes features for protein databases.
Long Abstract:Click Here

Poster U72
Centroid series: fundamental programs of sequence analysis for non-coding RNAs
Michiaki Hamada- Mizuho Information & Research Institute, Inc
Kengo Sato (University of Tokyo, Graduate School of Frontier Sciences); Hisanori Kiryu (University of Tokyo, Graduate School of Frontier Sciences); Toutai Mituyama (National Institute of Advanced Industrial Science and Technology (AIST), Computational Biology Research Center); Kiyoshi Asai (University of Tokyo, Graduate School of Frontier Sciences);
Short Abstract: We present four fundamental programs of sequence analysis for non-coding RNAs. Those programs are carefully designed based on the viewpoint of maximum expected accuracy (MEA). The functionality of the programs has been confirmed by several computational experiments. The programs are freely available at http://www.ncrna.org/.
Long Abstract:Click Here

Poster U73
Comparison of SNP calling methods in 454 Sequencing data
Wayne Clarke- Agriculture and Agri-Food Canada
Christina Eynck (Agriculture and Agri-Food Canada, Saskatoon Research Centre); Isobel Parkin (Agriculture and Agri-Food Canada, Saskatoon Research Centre); Kishore Gali (NRC Plant Biotechnology Institute (NRC-PBI), DNA Technologies Laboratory); Christine Sidebottom (NRC Plant Biotechnology Institute (NRC-PBI), DNA Technologies Laboratory); Andrew Sharpe (NRC Plant Biotechnology Institute (NRC-PBI), DNA Technologies Laboratory);
Short Abstract: Next-generation sequencing technologies, such as those from Roche and Illumina, are being utilized for high-throughput SNP discovery with increasing frequency. Consequently, software to analyze such data and call high quality polymorphisms is being sought by researchers. This poster details the evaluation of different SNP calling methods using 454 sequence data.
Long Abstract:Click Here

Poster U74
Multiscale segmentation of genomic signals
Theo Knijnenburg- Institute for Systems Biology
Stephen Ramsey (Institute for Systems Biology, ); Ilya Shmulevich (Institute for Systems Biology, );
Short Abstract: In any complex system information is present at multiple levels. We propose a multiscale segmentation algorithm, which enables the multi-level analysis of genomic signals. Genomic signals, which are functions of genomic location, e.g. ChIP-seq data or conservation data, are divided up into segments at multiple scales and scored for enrichment.
Long Abstract:Click Here

Poster U75
Characterization of genomic structural variations in a neuroblastoma cell line using mate-pair sequencing
Valentina Boeva- Institut Curie
No additional authors
Short Abstract: Neuroblastoma is the most common solid tumor in childhood. There is not much known about mutations that lead to neuroblastoma oncogenesis. Here we applied the Illumina mate-pair sequencing technique to sequence two neuroblastoma cell lines. We identified structural rearrangements in these cell lines and found genes producing broken or chimeric transcripts. We combined the two lists of genes with breakpoints and analyzed abnormalities present in both cell lines. Genes we discovered to be rearranged in both cell lines can have a therapeutic potential. The list includes genes playing a role in cell cycle control and tumor suppression and a gene encoding a protein important for DNA double-strand break repair.
Long Abstract:Click Here

Poster U76
Drastic Speed Gain for RNA Folding Achieved with Modified Dynamic Programming Algorithm
Philipp Bucher- Federal Institute of Technology (EPFL), Swiss Institute of Bioinformatics
Slavica Dimitrieva (Federal Institute of Technology (EPFL), Swiss Institute of Bioinformatics, Swiss Institute for Experimental Cancer Research (ISREC)); Philipp Bucher (Federal Institute of Technology (EPFL), Swiss Institute of Bioinformatics , Swiss Institute for Experimental Cancer Research (ISREC));
Short Abstract: For the past few years, computing power has not kept up with the rapid progress of new sequencing technologies. As a result, computational analysis of sequence data has become the major bottleneck in high-throughput genomics, creating a new demand for faster algorithms. We present an example of an algorithmic improvement which makes RNA secondary structure prediction much faster. Exact RNA-folding algorithms compute the minimum free energy (MFE) structure for a given sequence in cubic time and quadratic memory. The unfavorable time complexity makes folding of the largest known RNA molecules (e.g. Titin mRNA with over 100 kb) prohibitively slow. Here we describe a small modification of the standard RNA folding algorithm, which allows for conditional execution of the innermost loop. The modification is compatible with currently used energy functions and has no effect on the results. Applying this modification to the RNAfold code from the Vienna package, we observed a 20- to 40-fold speed advantage for both real and synthetic random sequences. For instance, the SARS genomic RNA was folded in less than 6 minutes on a desktop computer, as compared to nearly 3 hours required by the original version. Our algorithmic improvement extends the application range of MFE folding to larger RNA secondary structures and significantly reduces the costs of whole-genome scans for folding potential. The principle of conditional inner-loop execution is readily applicable to variations of the classical RNA folding algorithm such as covariance-based secondary structure prediction methods. The C code used in this work is publicly available via SourceForge at http://sibrnafold.sourceforge.net/
Long Abstract:Click Here
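
To make the cubic-time, quadratic-memory baseline concrete, the sketch below implements a Nussinov-style base-pair maximization. It is a simplified stand-in, not the MFE recursion of RNAfold and not the poster's modification: it scores pairs by count rather than by free energy, and the names `PAIRS`, `max_base_pairs`, and `min_loop` are assumptions made for illustration. The innermost loop over `k` is the kind of loop whose execution the poster makes conditional; here it always runs.

```python
# Allowed pairs: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, with at least min_loop unpaired bases per hairpin."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for seq[i..j]: quadratic memory.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = max(best[i + 1][j], best[i][j - 1])  # leave i or j unpaired
            if (seq[i], seq[j]) in PAIRS:
                score = max(score, best[i + 1][j - 1] + 1)  # pair i with j
            # Innermost loop: split into two independent substructures.
            # Together with the two outer loops this gives cubic time.
            for k in range(i + 1, j):
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score
    return best[0][n - 1]
```

For example, `max_base_pairs("GGGAAACCC")` finds the three G-C pairs that close a three-base hairpin loop.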

Poster U77
HIV-1 Codon Optimization: Implications for Vaccine Development
Anne Bet- University of Alabama at Birmingham
Anne Bet (University of Alabama at Birmingham, Microbiology);
Short Abstract: To overcome the enormous diversity of HIV-1, many vaccines being tested in clinical trials have been designed to generate broad T-cell responses to immunogenic proteins. To ensure efficient translation of viral proteins, the primary reading frames can be codon-optimized. While this approach increases gene expression in the primary reading frame, it is still not known how optimization impacts expression of the alternative reading frames (ARF). Our lab recently showed that peptides derived from ARFs (cryptic epitopes, CE) are frequently recognized during HIV-1 infection. We therefore sought to determine the impact of codon optimization on CE by comparing transmitted founder virus (TFV) sequences from 12 acutely Clade B-infected individuals in the U.S. with 4 vaccine sequences from Virogenetics (ALVAC-HIV), GeoVax (MVA/HIV62), Merck (MRKAd5), and the Vaccine Research Center (VRCrAd5). ALVAC-HIV and MVA/HIV62 contain non-optimized Gag proteins; both MRKAd5 and VRCrAd5 are codon-optimized. Using pairwise alignments and phylogenetic analyses, Gag protein sequences from the vaccines were compared with a consensus sequence generated from the TFV sequences. The ARFs of the non-optimized and optimized sequences had on average 90% and 35% similarity, respectively, with the TFV consensus sequence. The non-optimized sequences had genetic distances smaller than the distances between TFV sequences whereas the codon-optimized sequences had distances 2.1 to 4.8 (+/-0.5) times greater. Because cytotoxic T-cells cannot recognize epitopes that differ by more than 33%, these data suggest that translation of ARFs in codon-optimized Gag sequences produces aberrant CE that may negatively impact the potential breadth of T-cell responses.
Long Abstract:Click Here

Poster U78
MG-RAST to the Clouds
Jared Wilkening- Argonne National Laboratory
Narayan Desai (Argonne National Laboratory, Mathematics and Computer Science Division); Andreas Wilke (University of Chicago, Computation Institute); Mark D'Souza (University of Chicago, Computation Institute); Folker Meyer (Argonne National Laboratory, Mathematics and Computer Science Division);
Short Abstract: The Metagenomics RAST Server (MG-RAST) is an analysis pipeline that has been used to analyze over 210 Gbp of metagenomic sequence data. This poster describes the decisions we faced in adapting MG-RAST to the cloud computational paradigm and the architecture that has enabled us to meet our users' needs.
Long Abstract:Click Here
