ISMB 2008 ISCB

16th Annual
International Conference
Intelligent Systems
for Molecular Biology


Metro Toronto Convention Centre (South Building)
Toronto, Canada

Accepted Posters
Category 'P' - Sequence Analysis
Poster P01
BLAST options, reciprocal best hits, and reciprocal shortest distances
Gabriel Moreno-Hagelsieb- Wilfrid Laurier University
Kristen Latimer (Wilfrid Laurier University, Biology);
Short Abstract: None On File
Long Abstract: Click Here

Poster P02
Tracing evolutionary pressure
Kai Ye- Leiden/Amsterdam Center for Drug Research
Gert Vriend (Radboud University Nijmegen Medical Centre, CMBI); Ad IJzerman (Leiden/Amsterdam Center for Drug Research, Medicinal Chemistry);
Short Abstract: None On File
Long Abstract: Click Here

Poster P03
The quaternionic model of the DNA code, with applications to pathogenic strain forensics
Andrzej Brodzik- The MITRE Corporation
No additional authors
Short Abstract: We review the mathematical foundations of the quaternionic periodicity transform for tandem repeat detection and of the quaternionic model of the DNA code. We then describe improvements made in the recent release of the code, and illustrate the utility of the tool in a homology study of two closely related Bacillus anthracis strains.
Long Abstract: Click Here

Poster P04
Accurate multiple sequence and structural alignment of distantly related proteins
Jimin Pei- Howard Hughes Medical Institute, UT Southwestern Medical Center
Bong-Hyun Kim (UT Southwestern Medical Center, Biochemistry); Nick Grishin (UT Southwestern Medical Center, HHMI and Biochemistry);
Short Abstract: None On File
Long Abstract: Click Here

Poster P05
Improved algorithms for local alignment
LUCIAN ILIE- University of Western Ontario
SILVANA ILIE (University of Toronto, Computer Science);
Short Abstract: None On File
Long Abstract: Click Here

Poster P06
Biclustering as a method for multiple sequence alignment
Shu Wang- Samsung Telecommunications America
Shu Wang (Samsung Telecommunications America, Wireless Terminal Lab); Robin Gutell (University of Texas At Austin, Section of Integrative Biology);
Short Abstract: None On File
Long Abstract: Click Here

Poster P07
Novel substitution matrices for sequence searches of a biased genome
Umadevi Paila- Centre for DNA Fingerprinting and Diagnostics
Rohini Kondam (Centre for DNA Fingerprinting and Diagnostics, Computational and Functional Genomics Group & Sun Centre of Excellence in Medical Bioinformatics); Akash Ranjan (Centre for DNA Fingerprinting and Diagnostics, Computational and Functional Genomics Group & Sun Centre of Excellence in Medical Bioinformatics);
Short Abstract: A bias in the nucleotide composition is reflected in the amino acid composition of proteins. We computed novel substitution matrices from a unique protein dataset for effective sequence searches of a biased genome. We achieved better quality alignments for proteins that otherwise aligned poorly to orthologs with the standard matrices.
Long Abstract: Click Here

Poster P08
Alignment and De novo Assembly of Transcriptome Reads from Solexa Sequencing
Zemin Ning- The Wellcome Trust Sanger Institute
Yong Gu (Sanger Institute, Informatics); Ben Blackburne (Sanger Institute, Informatics); Hannes Ponstingl (Sanger Institute, Informatics);
Short Abstract: We have developed ssaha_pileup to detect SNPs/indels by aligning microreads to a reference. We report results from three lanes of transcriptome data from Solexa sequencing of Plasmodium falciparum and Caenorhabditis elegans. We also generated de novo assemblies using FuzzyPath and compare the contig alignment with that obtained by direct read mapping.
Long Abstract: Click Here

Poster P09
Fine-scale mapping of copy number alterations with next generation sequencing
Derek Chiang- Broad Institute of MIT and Harvard
Gad Getz (Broad Institute of MIT and Harvard, Cancer Program); David Jaffe (Broad Institute of MIT and Harvard, Genome Sequencing and Analysis Program); Xiaojun Zhao (Novartis, NIBR); Carsten Russ (Broad Institute of MIT and Harvard, Genome Sequencing and Analysis Program); Matthew Meyerson (Dana-Farber Cancer Institute, Medical Oncology); Eric Lander (Broad Institute of MIT and Harvard, Genome Sequencing and Analysis Program);
Short Abstract: Next generation sequencing technologies yield both counts and sequences. We present a segmentation algorithm to infer copy number alterations from genomic shotgun sequencing. We present results of benchmarking this algorithm on three tumor-normal cell line pairs.
Long Abstract: Click Here

Poster P10
Infernal 1.0: RNA sequence analysis using covariance models
Eric Nawrocki- Howard Hughes Medical Institute Janelia Farm Research Campus
Sean Eddy (Howard Hughes Medical Institute, Janelia Farm Research Campus);
Short Abstract: The Infernal software package used by the Rfam database implements profile stochastic context-free grammars (SCFGs) called covariance models (CMs) for RNA homology search and alignment. The new version 1.0 of Infernal introduces E-values and a query-dependent HMM filtering procedure for accelerating homology search.
Long Abstract: Click Here

Poster P11
RNA alignments with incomplete sequence
Diana Kolbe- Janelia Farm Research Campus
No additional authors
Short Abstract: Alignment methods for structural RNAs use structure-based models, allowing deletion or replacement of a helix, but not allowing only one side to be missing. However, sequence data may be fragmentary, as in metagenomic sequencing. I present a new method to align incomplete sequences to models of structural RNAs.
Long Abstract: Click Here

Poster P12
MirTif: a post-processing filter for predicted microRNA target genes
Kuo-Bin Li- Center for Systems and Synthetic Biology
Yu-Ping Wang (National Yang-Ming University, Center for Systems and Synthetic Biology);
Short Abstract: MirTif is a machine learning-based post-processing tool designed to reduce the false positive predictions produced by microRNA target gene prediction software. It was built upon experimentally validated microRNA-target interactions.
Long Abstract: Click Here

Poster P13
A Bioinformatics Pipeline for the Metagenomics Search of Human Pathogens
Jared Flatow- Northwestern University
Simon Lin (Northwestern University, Robert H. Lurie Cancer Center); Warren Kibbe (Northwestern University, Robert H. Lurie Cancer Center); Anne Rowley (Northwestern University, Feinberg School of Medicine);
Short Abstract: We use about 400 sequences from a pilot microbiome survey of human infant lung to construct a classification pipeline. The pipeline is flexible, extensible, and efficient. The analysis of metagenomics data is a stepping stone for future hypothesis-testing that will eventually yield comprehensive conclusions on the pathogenesis of the human infections.
Long Abstract: Click Here

Poster P14
Classifying kinetoplastid parasite protein phosphatases using an ontology and sequence analysis methods
Rachel Brenchley- University of Manchester
Robert Stevens (University of Manchester, School of Computer Science); Lydia Tabernero (University of Manchester, Faculty of Life Sciences);
Short Abstract: Combining ontological descriptions with bioinformatics sequence analysis has provided an effective classification system for protein phosphatases. Investigating the genomes of kinetoplastid parasites has highlighted interesting, novel phosphatases. Developing an online classification tool, as part of the PhosphaBase database, will aid researchers by targeting phosphatases of interest from newly sequenced genomes.
Long Abstract: Click Here

Poster P15
Genome-wide Identification of polymorphic Endogenous Retroviral Elements in Mice
Ying Zhang- University of British Columbia
Irina Maksakova (UBC, Genetics); Liane Gagnier (BC Cancer Research Centre, Terry Fox Laboratory); Louie Van de Lagemaat (Wellcome Trust Genome Campus, Wellcome Trust Sanger Institute); Dixie Mager (UBC, Medical Genetics);
Short Abstract: Using genomic DNA available for four mouse inbred strains and pairwise sequence alignment, we detected high levels of polymorphism of endogenous retroviral elements in mice, with ~700 of them located in gene introns. Our finding suggests that these mobile elements play a significant role in genetic drift of mouse lines.
Long Abstract: Click Here

Poster P16
BayesMD: FLEXIBLE BAYESIAN MODELLING FOR MOTIF DISCOVERY
Man-Hung Eric Tang- Bioinformatics Centre, University of Copenhagen
Ole Winther (Bioinformatics Centre, University of Copenhagen, Department of Biology); Anders Krogh (Bioinformatics Centre, University of Copenhagen, Department of Biology);
Short Abstract: We present a Bayesian motif discovery model that builds-in biological knowledge in a principled modular fashion. Our versatile framework enables problem specific motif inference using all information on statistical properties of binding sites, background models and positional preferences such as information on evolutionary conservation and low complexity.
Long Abstract: Click Here

Poster P17
A fast structural multiple alignment of RNA sequences for genome scale analysis
Yasuo Tabei- University Of Tokyo
Hisanori Kiryu (Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST)); Taishin Kin (Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST)); Kiyoshi Asai (Graduate School of Frontier Science, University of Tokyo, Computational Biology);
Short Abstract: We propose a fast algorithm for multiple structural alignments of RNA sequences that is an extension of our pairwise structural alignment method (SCARNA). The accuracies of the implemented software are at least as favorable as those of state-of-the-art algorithms that are computationally much more expensive in time and memory.
Long Abstract: Click Here

Poster P18
Metagenomic Analysis with Galaxy: Windshield Genomics and Beyond
Samir Wadhawan- The Pennsylvania State University
Sergei Kosakovsky-Pond (University of California, Antiviral Research Center); Wen-Yu Chung (The Pennsylvania State University, Center for Comparative Genomics and Bioinformatics); Guru Ananda (The Pennsylvania State University, Center for Comparative Genomics and Bioinformatics); Dan Blankenberg (The Pennsylvania State University, Center for Comparative Genomics and Bioinformatics); Nate Coraor (The Pennsylvania State University, Center for Comparative Genomics and Bioinformatics); Greg Von Kuster (The Pennsylvania State University, Center for Comparative Genomics and Bioinformatics); James Taylor (New York University, Courant Institute of Mathematical Sciences); Anton Nekrutenko (The Pennsylvania State University, Center for Comparative Genomics and Bioinformatics);
Short Abstract: Currently there are no integrated sets of tools for analyzing metagenomic data obtained from next generation sequencers such as 454 FLX sequencing. Here we present the first freely available resource, Galaxy (http://g2.bx.psu.edu), for short-read metagenomic analyses that requires nothing more than a web browser.
Long Abstract: Click Here

Poster P19
Probing Metagenomics by Rapid Analysis of Mega-datasets
Weizhong Li- UCSD
No additional authors
Short Abstract: We present an efficient pipeline, RAMMCAP, for quickly analyzing extremely large metagenomic datasets. It includes ultra-fast sequence clustering, annotation, metagenome comparison, and a unique visualization interface. We studied the two largest metagenomic collections, the “Global Ocean Sampling” and “Nine Biomes”, together with 22 million sequences. Our findings are available from http://tools.camera.calit2.net/camera/rammcap.
Long Abstract: Click Here

Poster P20
Unveiling Potential Variants: Phylogenetic Analysis of Eight African Species of Artemisia
Ijeoma Dike- Covenant University
A Adebayo (Covenant University, Department of Biological Sciences); S Fatumo (Covenant University, Department of Computer and Information Sciences); G Olasehinde (Covenant University, Department of Biological Sciences); I Ewejobi (Covenant University, Department of Computer and Information Sciences); C Ekenna (Covenant University, Department of Computer and Information Sciences); E Adebiyi (Covenant University, Department of Computer and Information Sciences);
Short Abstract: Artemisia encompasses over 300 variants, though African species are few and scarcely studied. Internal transcribed spacer (ITS) sequences of nuclear ribosomal DNA were used to construct a phylogeny. The proposed hypotheses on their origin were tested to aid in the identification of potential medicinal variants, using Artemisia annua as a yardstick.
Long Abstract: Click Here

Poster P21
Data mining pipelines for next-generation sequencing platforms
Lalit Ponnala- Cornell University
Qi Sun (Cornell University, Life Sciences Core Laboratories Center);
Short Abstract: As a bioinformatics facility, we have been implementing data mining pipelines for next-generation sequencing projects including ChIP-seq, allele-specific gene expression profiling, SNP identification, and de novo EST and genome sequence assembly. Comparisons of multiple sequence alignment tools will be presented, as well as some preliminary results from these pipelines.
Long Abstract: Click Here

Poster P22
Molecular epidemiology of dengue virus type 3 in Brazil, 2003 - 2008
Luiza Castro- FMRP/USP
Daniel M. M. Jorge (FMRP/USP, Department of Genetics); Benedito A L Fonseca (FMRP/USP, Centre of Virology);
Short Abstract: Dengue is the most important arbovirus of public health. To evaluate DENV-3 genotypes in Brazil we analyzed the NS1 region of DENV-3 from São Paulo State from 2003 through 2008. Brazilian sequences are part of GIII and are the more derived group with more potential to segregate and evolve.
Long Abstract: Click Here

Poster P23
SVD based tool to annotate protein function and validate proteins database structure
Mariana Simoes- Federal University of Minas Gerais (UFMG)
Gabriel Vedder (Federal University of Minas Gerais, Bioinformatics); João Nitzsche (Federal University of Minas Gerais, Bioinformatics); Marcos Santos (Federal University of Minas Gerais, Computer Science);
Short Abstract: We present an alternative way to retrieve information from biological databases. We propose a tool that represents protein sequences as vectors according to the occurrence of triples of amino acids. The results show that SVD is a very effective tool for comparing a query sequence to protein databases.
Long Abstract: Click Here

Poster P24
A High Speed Sequence Search Method for Next Generation Sequencers
Sanchit Misra- Northwestern University
Alok Choudhary (Northwestern University, Electrical Engineering and Computer Science); Ramanathan Narayanan (Northwestern University, Electrical Engineering and Computer Science); Simon Lin (Northwestern University, Northwestern Medical School);
Short Abstract: High throughput and low cost NGSs have made the problem of "searching for near-exact and global matches of a query sequence Q in a genomic database G" very significant. We present an efficient algorithm, which dynamically reduces the search space to achieve an order of magnitude speedup over state-of-the-art techniques.
Long Abstract: Click Here

Poster P25
Functional Residue Characterization with Protein Similarity Networks
Leonard Apeltsin- UCSF
Thomas Ferrin (UCSF, Bioinformatics); Patricia Babbitt (UCSF, Bioinformatics); John Morris (UCSF, Bioinformatics);
Short Abstract: We have developed an algorithm that takes in a sequence dataset and produces a filtered similarity network with a discretely defined topology. By propagating residue conservation information across that network, we are able to predict which residues are likely to reside in the active site of the protein structure.
Long Abstract: Click Here

Poster P26
EMBOSS
Peter Rice- European Bioinformatics Institute
Alan Bleasby (European Bioinformatics Institute, EBI); Jon Ison (European Bioinformatics Institute, EBI); Mahmut Uludag (European Bioinformatics Institute, EBI);
Short Abstract: The European Molecular Biology Open Software Suite (EMBOSS) is a mature package of software tools and libraries developed for the molecular biology community. It includes a comprehensive set of applications for molecular sequence analysis and other tasks and integrates popular third-party software packages under a consistent interface.
Long Abstract: Click Here

Poster P27
The BLOSUM matrix model: Does greedy clustering underestimate substitution probabilities?
Kathryn Duffy- Dalhousie University
Norbert Zeh (Dalhousie University, Faculty of Computer Science); Robert Beiko (Dalhousie University, Faculty of Computer Science);
Short Abstract: We show that the method utilized to create the BLOSUM matrices results in a loss of data due to excessive clustering of distantly related sequences. We explore the impact this has on the resulting matrices, their performance, and whether the substitution probabilities of distantly related amino acid pairs are underestimated.
Long Abstract: Click Here

Poster P28
The OryzaSNP Web Resource for SNPs Identified by Array-based Resequencing in Rice
Kevin Childs- Michigan State University
Regina Bohnert (Max Planck Institute, Friedrich Miescher Laboratory of the Max Planck Society); Georg Zeller (Max Planck Institute, Friedrich Miescher Laboratory of the Max Planck Society); Gunnar Rätsch (Max Planck Institute, Friedrich Miescher Laboratory of the Max Planck Society); Richard Clark (Max Planck Institute, Department of Molecular Biology); Detlef Weigel (Max Planck Institute, Department of Molecular Biology); Keyan Zhao (Cornell University, Dept. of Biological Stat. and Comp. Biol.); Badri Padhukasahasram (Cornell University, Dept. of Biological Stat. and Comp. Biol.); Carlos Bustamante (Cornell University, Dept. of Biological Stat. and Comp. Biol.); Victor Ulat (International Rice Research Institute, Genetics); Richard Bruskiewich (International Rice Research Institute, Genetics); Kenneth McNally (International Rice Research Institute, Genetics); Hei Leung (International Rice Research Institute, Genetics); David Mackill (International Rice Research Institute, Genetics); Thomas Bureau (McGill University, Department of Biology); Douglas Hoen (McGill University, Department of Biology); Renee Stokowski (Perlegen Sciences, Genetics); Dennis Ballinger (Perlegen Sciences, Genetics); Kelly Frazer (Perlegen Sciences, Genetics); David Cox (Perlegen Sciences, Genetics); Rebecca Davidson (Colorado State University, Bioagr. Sciences and Pest Management); Jan Leach (Colorado State University, Bioagr. Sciences and Pest Management); C. Robin Buell (Michigan State University, Plant Biology Department);
Short Abstract: The OryzaSNP Consortium has identified SNPs in rice by surveying 100 Mb of unique and low copy genomic sequence from each of twenty diverse rice cultivars. The Perlegen array-based resequencing method was used to identify SNPs. These data are available via a genome browser and web-based search forms.
Long Abstract: Click Here

Poster P29
SolexaTools: An Open Source Sequence Analysis Framework
Brian O'Connor- UCLA
Zugen Chen (UCLA, Human Genetics); Barry Merriman (UCLA, Human Genetics); Dmitriy Skvortsov (UCLA, Human Genetics); Nils Homer (UCLA, Human Genetics); Michael Clark (UCLA, Human Genetics); Ascia Eskin (UCLA, Human Genetics); Jordan Mendler (UCLA, Human Genetics); Stanley Nelson (UCLA, Human Genetics);
Short Abstract: The SolexaTools project was started to meet the computation needs of scientists using the Solexa massively parallel sequencer. This open source project provides two key features for the community. First, a laboratory information management system for tracking experiments and, second, a pipeline framework for the analysis of data.
Long Abstract: Click Here

Poster P30
Mining G-protein Coupled Receptors and Cytochrome b561 from Crop Genomes
Stephen Opiyo- University of Nebraska, Lincoln
No additional authors
Short Abstract: Very few G-protein coupled receptors (GPCRs) and Cytochrome b561 (Cyt-b561) are found in crop genomes. We used partial least squares (PLS-T_ACC) to mine GPCRs and Cyt-b561 proteins from crop genomes. PLS-T_ACC performed better than profile hidden Markov models and PSI-BLAST in mining GPCRs and Cyt-b561 proteins from crop genomes.
Long Abstract: Click Here

Poster P31
SHRiMP: The Short Read Mapping Package
Stephen Rumble- University of Toronto
Michael Brudno (Professor, Computer Science);
Short Abstract: SHRiMP, the Short Read Mapping Package, is a method for mapping very short reads produced by next generation sequencing technologies to a reference genome. Our method includes a spaced kmer filtering technique, a vectorized Smith-Waterman algorithm, separate full color-space and letter-space alignment approaches, and computation of false discovery statistics.
Long Abstract: Click Here

Poster P32
A Directed Acyclic Graph as a Representation of All Near Optimal Sequence Alignments
Vladimir Yanovsky- University of Toronto
Michael Brudno (University of Toronto, Computer Science);
Short Abstract: We propose a representation of all near optimal alignments as a Directed Acyclic Graph (DAG). This allows us to avoid the problem faced by alignment algorithms - the need to choose among alignments having similar scores. We extend the standard sequence alignment paradigms to alignment between DAGs.
Long Abstract: Click Here

Poster P33
On Evaluating the Performance of Compression Based Techniques for Sequence Comparison
Ramez Mina- University of Nebraska at Omaha
Hesham Ali (University of Nebraska at Omaha, College of Information Science and Technology); Dhundy Bastola (University of Nebraska at Omaha, College of Information Science and Technology);
Short Abstract: Sequence alignment has been the method of choice for many researchers in comparing sequences. Apparently it fails to produce accurate results in many cases, particularly when sequences contain repeats or mobile subsequences. We present a study to evaluate the performance of compression based algorithms in detecting similarity among biological sequences.
Long Abstract: Click Here

Poster P34
A probability distribution for homology search in DNA with multiple spaced seed hits
Denise Mak- Boston University
Gary Benson (Boston University, Bioinformatics Program);
Short Abstract: We present a novel method to calculate the probability distribution for the number of matching positions detected between homologous sequences using multiple spaced seed hits. The intended usage is repeat detection in genomic DNA. The computation is probability parameter independent and uses a model of confirmable homologous alignments.
Long Abstract: Click Here

Poster P35
An Improved Probabilistic Protein-DNA Recognition Code for the C2H2 Zinc Finger Transcription Factor Family
Ryan Christensen- Washington University in St Louis
Gary Stormo (Washington University in St Louis, Genetics);
Short Abstract: We present an improved protein-DNA recognition code for the C2H2 zinc finger family of transcription factors (TFs). Our probabilistic model can be used to predict the DNA binding specificity of a query TF, in the form of a position specific weight matrix, and vice versa. Alternative models are explored.
Long Abstract: Click Here

Poster P36
Tracking lead compounds in African species of Commiphora
A Adebayo- Covenant University
O Obembe (Covenant University, Department of Biological Sciences); I Dike (Covenant University, Department of Biological Sciences); A Ajayi (Covenant University, Department of Biological Sciences); O Ogunlana (Covenant University, Department of Biological Sciences); E Adebiyi (Covenant University, Department of Computer and Information Sciences);
Short Abstract: The medicinal importance of Commiphora africana is documented whereas little is known about the medicinal applications of other African species. We have generated a phylogenetic tree for eight African species, using data from chloroplast marker rps16 intron. Sequences of C. africana-related species can be analysed for compounds of medicinal value.
Long Abstract: Click Here

Poster P38
POIMs: Positional Oligomer Importance Matrices — Understanding Support Vector Machine Based Signal Detectors
Alexander Zien- Max Planck Society
Sören Sonnenburg (Fraunhofer FIRST, IDA); Petra Philips (Max Planck Society, Friedrich Miescher Lab); Gunnar Rätsch (Max Planck Society, Friedrich Miescher Lab);
Short Abstract: Frequently the most accurate signal detectors are support vector machines (SVMs) with k-mer features. Positional Oligomer Importance Matrices (POIMs) capture and visualize relevant sequence patterns by computing expected contributions of all k-mers to the total SVM score. We demonstrate the usefulness of POIMs for splice site detectors and other examples.
Long Abstract: Click Here

Poster P39
Identification of amino acids important for urinary tract infection in FimH using positive selection
Swaine Chen- Washington University School of Medicine
Chia Hung (Washington University School of Medicine, Molecular Microbiology); Julie Bouckaert (Vrije Universiteit Brussel, Molecular and Cellular Interactions); Scott Hultgren (Washington University School of Medicine, Molecular Microbiology);
Short Abstract: Urinary tract infections are common E. coli infections. We found that FimH, an E. coli virulence factor, is evolving under positive selection. We demonstrate that individual amino acids under positive selection affect FimH function and bacterial fitness. This work validates positive selection as a method for probing bacterial pathogenesis.
Long Abstract: Click Here

Poster P40
HiProbe 2.0: Hierarchical oligonucleotide primer and probe design tool
Won-Hyong Chung- Kyungpook National University
Seong-Bae Park (Kyungpook National University, Department of Computational Engineering);
Short Abstract: HiProbe 2.0 is a visual program that designs cluster-specific oligonucleotide primers and probes from a highly conserved sequence set based on its hierarchical clustering information. Our work is useful for automated design of oligonucleotides when neither universal probes nor sequence-specific probes can be designed.
Long Abstract: Click Here

Poster P41
A new way of seeing DNA
Maik Friedel- Fritz-Lipmann Institute
Jürgen Sühnel (group leader, FLI Biocomputing); Thomas Wilhelm (group leader, IFR systems biology group);
Short Abstract: We have developed a genome browser that encodes the sequence by geometrical or other physico-chemical dinucleotide properties. The values of these properties are plotted as a dinucleotide-based sequence graph. This enables a user to recognize sequence patterns that are not easily seen in the usual character string representation.
Long Abstract: Click Here

Poster P42
In silico microarray probe design for diagnosis of multiple pathogens
Ravi Vijaya Satya- Biotechnology HPC Software Applications Institute
Nela Zavaljevski (Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Biotechnology HPC Software Applications Institute); Kamal Kumar (Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Biotechnology HPC Software Applications Institute); Elizabeth Bode (U.S. Army Medical Research Institute of Infectious Diseases, Diagnostic Systems Division); Susana Padilla (U.S. Army Medical Research Institute of Infectious Diseases, Diagnostic Systems Division); Leonard Wasieloski (U.S. Army Medical Research Institute of Infectious Diseases, Diagnostic Systems Division); Jeanne Geyer (U.S. Army Medical Research Institute of Infectious Diseases, Diagnostic Systems Division); Jaques Reifman (Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Biotechnology HPC Software Applications Institute);
Short Abstract: We present a high-throughput software tool for the design of microarray probes for pathogen diagnostics. The tool designs probes for identification of a single pathogen or a group of related pathogens. Hybridization results with probes designed for eight Burkholderia genomes demonstrate the effectiveness of the designed probes.
Long Abstract: Click Here

Poster P43
SeqMonk: Visualisation and Analysis for Next Gen Sequencing
Simon Andrews- The Babraham Institute
No additional authors
Short Abstract: SeqMonk provides a flexible set of tools for quantitation, visualisation and analysis of mapped sequencing data. It can be used for the statistical analysis of next generation sequencing data.
Long Abstract: Click Here

Poster P44
DISPARE: a DIScriminative PAttern REfinement algorithm using PWMs
Isabelle da Piedade- PhD Student University of Copenhagen
Man-Hung Eric Tang (PhD student, Bioinformatics Center); Anders Krogh (Professor, Bioinformatics Center IMBF);
Short Abstract: We describe a novel algorithm: DIScriminative PAttern REfinement algorithm (DISPARE). It is an iterative weight matrix optimization method that aims to distinguish more efficiently the true TFBS sites from a negative control set. Our method aims to derive better PWMs from ChIP-chip data.
Long Abstract: Click Here

Poster P45
Proteome-wide Prediction of Acetylation Substrates
Amrita Basu- Rockefeller University
Kristie Rose (University of Virginia, Chemistry); Sandra Hake (Adolf Butenandt Institute, Molecular Biology); Chen Yue (UT Southwestern Medical Center, Chemistry); Beatrix Ueberheide (Rockefeller University, Mass Spectrometry and Gaseous Ion Chemistry); Yingming Zhao (UT Southwestern Medical Center, Biochemistry); Donald Hunt (University of Virginia, Chemistry); David Allis (Rockefeller University, Chromatin Biology); Eran Segal (Weizmann Institute of Science, And Applied Mathematics);
Short Abstract: Acetylation is a well-studied post-translational modification that has been associated with a broad spectrum of biological processes, notably gene regulation. We present a computational approach combined with experimental validation to predict protein acetylation proteome-wide based on the sequence characteristics of lysine acetylation within histone proteins.
Long Abstract: Click Here

Poster P46
BLAST options and orthologs as reciprocal best hits
Gabriel Moreno-Hagelsieb- Wilfrid Laurier University
Kristen Latimer (Wilfrid Laurier University, Biology);
Short Abstract: Testing two options of BLAST allowed us to increase the number and quality of orthologs detected as reciprocal best hits (RBH). We also found that reciprocal shortest distances (RSD, based on evolutionary rate assumptions) yield lower quality results and only a slight increase in number, making RSD not worth the extra computation.
Long Abstract: Click Here

Poster P47
Asymmetry of mature miRNA selection: insight from physicochemical approach
Seungyoon Nam- Seoul National University
Won-joon Son (Seoul National University, School of Chemistry); Sanghyuk Lee (Ewha Womans University, Division of Life and Pharmaceutical Sciences); Seokmin Shin (Seoul National University, School of Chemistry);
Short Abstract: The asymmetry of mature miRNA selection in miRNA precursors is quite complex. We inspect aspects of the asymmetry in terms of physicochemical properties.
Long Abstract: Click Here

Poster P48
Gene Scrambling Analysis of Ciliates
Jing Liu- University of Saskatchewan
Ian McQuillan (University of Saskatchewan, Computer Science);
Short Abstract: Stichotrichous ciliates possess "scrambled" genes. We build two algorithms. One aligns scrambled and unscrambled genes and achieves improved sensitivity relative to previous algorithms. The second tries to test the hypothesis that scrambled genes could develop into an unscrambled version by mainly using structural information, which we show is possible algorithmically.
Long Abstract: Click Here

Poster P49
Sequence Alignment Using Compatible Constraint Set
Lingling Jin- University of Saskatchewan
Ian McQuillan (University of Saskatchewan, Computer Science);
Short Abstract: We allow biologists to augment alignments with their objectives, which can be embodied as constraints on alignments. Several “compatible” constraints are called a CCS. We determine an algorithm finding a new alignment that achieves the highest score and satisfies the CCS, by using information from the original alignment to obtain speed improvements, without losing accuracy.
Long Abstract: Click Here

Poster P50
Evaluation of software for sequence assembly in the context of genomes containing varied numbers of plasmids
Tejumoluwa Abegunde- University of Saskatchewan
Anthony Kusalik (Supervisor, Computer Science);
Short Abstract: Sequence Assembly is a key area in Bioinformatics. A common problem in sequence assembly is dealing with repeats. Another is dealing with an unknown number of plasmids since these increase the number of repeats. This work evaluates some existing sequence assemblers to determine one that best deals with sequences containing varied numbers of plasmids.
Long Abstract: Click Here

Poster P51
E1DS: catalytic site prediction based on 1-dimensional signatures of concurrent conservation
Ting-Ying Chien- National Taiwan University
Darby Tien-Hao Chang (National Cheng Kung University, Electrical Engineering); Chien-Yu Chen (National Taiwan University, Bio-Industrial Mechatronics Engineering); Yi-Zhong Weng (National Taiwan University, Computer Science and Information Engineering); Chen-Ming Hsu (Yuan Ze University, Computer Science and Engineering);
Short Abstract: E1DS is designed for annotating enzyme sequences based on a repository of 1D signatures. This work has recently been accepted by the web server issue of Nucleic Acids Research and will be published in July, 2008. In this paper, E1DS is shown to have good performance in identifying catalytic residues.
Long Abstract: Click Here

Poster P52
High throughput sequence alignment with suffix arrays and q-grams
Fan Meng- University of Michigan
Justin Wilson (University of Michigan, Psychiatry Department/MBNI); Manhong Dai (University of Michigan, Psychiatry Department/MBNI); Stanley Watson (University of Michigan, Psychiatry Department/MBNI);
Short Abstract: We propose a sequence alignment algorithm designed for aligning high throughput sequencing results under the assumptions of no insertions or deletions and variable length sequences. We unite the “sort and search” approach of enhanced suffix arrays with q-grams and extend perfect q-gram matches to a specified mismatch tolerance.
Long Abstract: Click Here
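
The idea of extending exact q-gram matches to a mismatch tolerance, with no insertions or deletions, can be sketched as below. This is a simplified illustration only: it swaps the enhanced suffix array of the abstract for a plain hash index of q-grams, and the names `build_qgram_index` and `align_read` are hypothetical.

```python
from collections import defaultdict


def build_qgram_index(reference, q):
    """Index every q-gram of the reference by its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - q + 1):
        index[reference[i:i + q]].append(i)
    return index


def align_read(read, reference, index, q, max_mismatches):
    """Return sorted (start, mismatches) placements of the read.

    Each non-overlapping q-gram of the read is used as an exact seed;
    every seed hit is extended to the full read length and accepted if
    the Hamming distance is within max_mismatches (no indels allowed).
    """
    hits = set()
    for offset in range(0, len(read) - q + 1, q):
        for pos in index.get(read[offset:offset + q], ()):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits.add((start, mismatches))
    return sorted(hits)


# Toy data (made up): the read is reference[8:16] with one substitution.
reference = "ACGTACGTTAGCCGATAGGCTA"
read = "TAGCCGTT"
index = build_qgram_index(reference, 4)
placements = align_read(read, reference, index, 4, max_mismatches=1)
```

With non-overlapping seeds, a read is only guaranteed to contain an error-free seed when it has more seeds than allowed mismatches, so q and the tolerance have to be chosen together.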

Poster P53
Biological Sequence Simulation For Complex Evolutionary Hypotheses
Cory Strope- University of Nebraska
Kevin Abel (University of Nebraska, Computer Science and Engineering); Stephen Scott (University of Nebraska, Computer Science and Engineering); Etsuko Moriyama (University of Nebraska, School of Biological Sciences and Center for Plant Science Innovation);
Short Abstract: We introduce an upgrade of our sequence simulator, indel-seq-gen, with improved capabilities to simulate motifs, lineage-specific evolution, coding (with exons and introns) and non-coding DNA, and a more powerful method of indel evolution that includes the relative time and location of each indel. This allows us to test previously unassessable evolutionary hypotheses.
Long Abstract: Click Here

Poster P54
Fast Learning of Variable Order Markov Chains
Marcel Schulz- Max Planck Institute For Molecular Genetics
David Weese (Free University Berlin, Algorithmic Bioinformatics); Hugues Richard (Max Planck Institute for Molecular Genetics, Computational Molecular Biology); Tobias Rausch (Free University Berlin, Algorithmic Bioinformatics); Andreas Doering (Free University Berlin, Algorithmic Bioinformatics); Knut Reinert (Free University Berlin, Algorithmic Bioinformatics); Martin Vingron (Max Planck Institute for Molecular Genetics, Computational Molecular Biology);
Short Abstract: Variable order Markov chains (VOMCs) have been applied to a variety of problems in bioinformatics, from gene annotation to protein domain detection. We report on our new linear-time learning algorithms, which are orders of magnitude faster than previous algorithms. In addition, we show how to learn more sophisticated models for proteins.
Long Abstract: Click Here

Poster P55
How Different Are Transcription-Factor Binding Sites from Their Background Sequences?
Keishin Nishida- Institute of Medical Science, University of Tokyo
Kenta Nakai (Institute of Medical Science, University of Tokyo, Human Genome Center);
Short Abstract: To assess the inherent character of known transcription-factor binding sites, 122 JASPAR count matrices were classified based on the overlapping of score distributions between their artificial binding sites and background sequences. For each group, various attributes, such as structural features and species’ origin, were tested for their statistical significance.
Long Abstract: Click Here

Poster P56
Methylated DNA Sequence Alignment
M. Elizabeth Locke- University of Western Ontario
Mark Daley (University of Western Ontario, Computer Science);
Short Abstract: DNA methylation is important in genetic regulation, and is typically analysed with statistical clustering. We propose a pairwise sequence alignment algorithm which incorporates methylation profile data directly. This algorithm can compare methylation in even diverse sequences and is the first step toward efficient, automated analysis of DNA methylation data.
Long Abstract: Click Here

Poster P57
GMF: a fast and simple motif finder integrating genome-wide evidence of binding with overrepresented sequence patterns
Stoyan Georgiev- Duke University
Karthik Jayasurya (Duke University, Institute for Genome Sciences and Policy); Sayan Mukherjee (Duke University, Institute for Genome Sciences and Policy); Uwe Ohler (Duke University, Institute for Genome Sciences and Policy);
Short Abstract: We propose a novel, genome-wide scale enumerative strategy for identifying gene cis-regulatory elements. In addition to a set of non-coding regulatory regions, our approach makes use of condition-specific gene scores, such as p-values resulting from ChIP-chip experiments, to identify motifs based on strong aggregate evidence of gene co-regulation.
Long Abstract: Click Here

Poster P58
The PROMALS3D tool for accurate multiple sequence and structure alignments
Jimin Pei- University of Texas Southwestern Medical Center
Bong-Hyun Kim (University of Texas Southwestern Medical Center, Department of Biochemistry); Nick Grishin (University of Texas Southwestern Medical Center, Howard Hughes Medical Institute);
Short Abstract: Multiple sequence alignment is a valuable tool in computational analysis of biological sequences and structures. We present PROMALS3D, an advanced method that constructs multiple sequence and/or structure alignments by integrating information from database homologs, secondary structure predictions and available 3D structures. The PROMALS3D web server is available at: http://prodata.swmed.edu/promals3d/.
Long Abstract: Click Here

Poster P59
De Novo Assembly of Short Reads with High Error Rates
Mark Chaisson- University of California, San Diego
Dumitru Brinza (University of California, San Diego, Computer Science); Pavel Pevzner (University of California, San Diego, Computer Science);
Short Abstract: A common practice in de novo fragment assembly is to trim the tails of the reads where the base calling error rate exceeds 1-2%. In EULER-USR we substitute trimming with error correction on a de Bruijn graph allowing utilization of the full length of extended high-throughput short reads.
Long Abstract: Click Here
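
A de Bruijn graph built from read k-mers, with low-multiplicity k-mers treated as likely sequencing errors, can be sketched as follows. This is a generic illustration of the data structure and not the EULER-USR error-correction procedure; the function names and the `min_count` threshold are assumptions made for the example.

```python
from collections import Counter, defaultdict


def kmer_counts(reads, k):
    """Count how many times each k-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def de_bruijn_graph(reads, k, min_count=2):
    """Build a de Bruijn graph from k-mers seen at least min_count times.

    Nodes are (k-1)-mers; each retained k-mer adds an edge from its
    prefix to its suffix. Rare k-mers are dropped as probable errors.
    """
    graph = defaultdict(set)
    for kmer, count in kmer_counts(reads, k).items():
        if count >= min_count:
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Toy reads (made up): the third read carries a substitution near its tail.
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
solid = de_bruijn_graph(reads, k=4, min_count=2)
everything = de_bruijn_graph(reads, k=4, min_count=1)
```

In the toy data the erroneous read adds a branch at node `CGT` when every k-mer is kept, while the thresholded graph stays a single path. Correcting the read rather than discarding its tail, as the abstract describes, would additionally require mapping the weak k-mers back onto that path.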

Poster P60
SLIPPER: An Iterative Mapping Pipeline for Short DNA Sequences
Lee Edsall- Ludwig Institute for Cancer Research
Terry Gaasterland (University of California, San Diego, Laboratory of Computational Genomics); Bing Ren (Ludwig Institute for Cancer Research, Laboratory of Gene Regulation);
Short Abstract: Combining Illumina’s Genome Analyzer Pipeline with SLIPPER, our iterative mapping pipeline, increases the proportion of mapped sequences significantly over using the Illumina software alone. A proof of principle analysis of eight lanes mapped an additional 5.33% of reads to the human genome.
Long Abstract: Click Here

Poster P61
Transcriptome Analysis using Solexa Sequencing Technology
Sammy Assefa- Wellcome Trust Sanger Institute
Dr. Thomas Keane (Wellcome Trust Sanger Institute, Pathogen Genomics);
Short Abstract: We have analyzed the transcriptome of a prokaryotic (Salmonella Typhi) and eukaryotic (Plasmodium falciparum) human pathogen. We outline the development of a computational pipeline to facilitate mapping and visualization of the transcriptome data. Our results show the accuracy of the approach compared to existing microarray data.
Long Abstract: Click Here

Poster P62
Distinctive Features in Aquaporin Sequences from Five Vertebrate Species
Cynthia Jeffries- Jackson State University
Raphael Isokpehi (Jackson State University, Biology); Hari Cohly (Jackson State University, Biology); Rajendram Rajnarayanan (Tougaloo College, Chemistry); Hugh Nicholas (Pittsburgh Supercomputing Center, NRBSC);
Short Abstract: We used global multiple sequence alignment, MEME pattern identification, Neighbor-Joining consensus bootstrap analysis and Principal Components Analysis to examine the relationships among 71 aquaporin proteins from five vertebrate species: zebrafish, chicken, cow, mouse and human. These analyses revealed early splits that robustly divide the sequences into four major groups.
Long Abstract: Click Here

Poster P63
Maximum Likelihood Energy Estimation from Binding Sites
Yue Zhao- Washington University in St. Louis
Gary Stormo (Washington University in St. Louis, Genetics);
Short Abstract: We present a new method of estimating specific binding energy from a list of binding sites. Our method uses maximum likelihood estimates of factor concentration to parameterize a biophysical model, resulting in a higher level of performance compared to existing methods.
Long Abstract: Click Here


