Posters

Poster numbers will be assigned May 30th.
If you cannot find your poster below, you have probably not yet confirmed that you will be attending ISMB/ECCB 2015. To confirm your poster, find the poster acceptance email, which contains a confirmation link. Click on the link and follow the instructions.

If you need further assistance please contact submissions@iscb.org and provide your poster title or submission ID.

Category H - 'Metagenomics'
H01 - Improving quality of error discovery in frequency estimation for parallel sequencing datasets
Milko Krachunov, Bulgaria
Dimitar Vassilev, Bioinformatics group, AgroBioInstitute, Bulgaria
Maria Nisheva, Faculty of Mathematics and Informatics, Sofia University, Bulgaria
Ognyan Kulev, Faculty of Mathematics and Informatics, Sofia University, Bulgaria
Valeriya Simeonova, Faculty of Mathematics and Informatics, Sofia University, Bulgaria
Vladimir Dimitrov, Faculty of Mathematics and Informatics, Sofia University, Bulgaria
Short Abstract: Motivation: In mixed datasets with multiple species or subgenomes, such as those present in metagenomic sequencing or studies of polyploid genomes, the tasks of error detection and correction are made difficult by the significant variation in the data. Due to the presence of reads of homologous sequences, the correct bases, including SNPs, can have unusually low rates of occurrence.
Result: Our work is focused on the development of novel approaches for error identification. The starting point is a base analytic approach which attempts to account for variation by introducing similarity weights. The main tools for improvement are two machine learning models with over 99% accuracy that can be combined with any other error detection approach as an additional filter to reduce the false positives among predicted errors. Compared with the use of frequencies alone, their combination halves the number of corrections by reducing false positives.
H02 - A new multi-core focused strategy to increase BLAST efficiency.
Ravi Ramos, Universidade Federal do Rio de Janeiro, Brazil
Allan Azevedo-Martins, Universidade Federal do Rio de Janeiro, Brazil
Turan Urmenyi, Universidade Federal do Rio de Janeiro, Brazil
Rosane Silva, Universidade Federal do Rio de Janeiro, Brazil
Short Abstract: Background. Improving computer hardware allows bioinformatics results to be obtained faster. However, software needs upgrades to take full advantage of new multi-core processors. In the field of alignment algorithms, several alternatives to BLAST have been developed to address parallelization, which is not efficient in BLAST. This work aims to improve BLAST parallelization and add other features without affecting the alignment results.
Results. We developed a new strategy to run BLAST tools, adding full multi-core support. The strategy was implemented in C, then compiled and tested on several Linux distributions. It can also be adapted to other operating systems or programming languages. As it does not change BLAST's source code, the alignment results are identical. On an 8-thread computer we achieved a 5-fold speedup, and on a 48-thread computer a 52-fold speedup, compared with regular BLASTN on metagenomic data (using the maximum '-num_threads' value on each computer). The strategy directed 100% of CPU usage towards the alignment, although it consumed more memory. Regular BLASTN alternated between using one thread and all threads, with an average of 2, depending on its parameters. We also introduced an estimate of the time remaining, BLAST pausing, network distribution to run a single BLAST on several servers/computers at once, and a simple terminal interface.
Conclusions. We improved BLAST efficiency and hardware management by implementing a strategy that runs significantly faster than regular BLAST, pauses BLAST to reorganize input files in the queue, and estimates the time remaining, thereby allowing other tasks to be scheduled.
H03 - An efficient method for taxonomic binning performance optimization
Magali Jaillard, bioMerieux, France
Maud Tournoud, bioMerieux, France
Faustine Meynier, bioMerieux, France
Jean-Baptiste Veyrieras, bioMerieux, France
Short Abstract: Taxonomic binning accuracy is pivotal for comprehensive metagenome analyses. Alignment-based methods for taxonomic binning proceed in two steps: read mapping against a reference database (RDB) and taxon identification using the best hits. Binning performance thus depends on mapper parameters, mapping quality score threshold for taxon identification and completeness of the RDB.
In order to improve taxonomic binning, we built models to predict several performance indicators from mapper parameters and the mapping quality score threshold. We designed an experimental plan (i.e., a set of scenarios with different parameter values) to efficiently explore the parameter space in a minimal number of experiments. For each scenario, prediction models were fitted on simulated metagenomes using BWA-backtrack, BWA-MEM, TMAP, and Bowtie2.
Observed and predicted performance indicator values were very close, thus validating the prediction models for each performance indicator. We observed that i) BWA-backtrack was less sensitive to parameter value modification than other mappers, ii) TMAP and BWA-MEM showed better overall performance and were less affected by missing genomes in the RDB, and iii) the minimum mapping quality score for taxonomic identification had a large impact on the prediction accuracy.
Finally, we evaluated a configuration with optimal predicted performance on a spiked HMP dataset, together with a poor configuration and the default configuration. We found that the optimal scenario reduced the number of incorrect taxonomic assignments without reducing the number of mapped reads.
In conclusion, we present a validated strategy to optimize taxonomic binning performance at a minimal computational cost.
H04 - Top-down approach for taxonomic identification and p-value generation for multiple organisms in metagenomic NGS datasets
Vitor C. Piro, Robert Koch-Institut - NG4 Bioinformatik, Germany
Martin Lindner, Robert Koch-Institut - NG4 Bioinformatik, Germany
Bernhard Y. Renard, Robert Koch-Institut - NG4 Bioinformatik, Germany
Short Abstract: The rapid increase in complete genome sequences available in public databases has allowed better predictions of the microbial content of sequenced environmental and clinical samples. The identification of species and their quantification are common tasks in metagenomics and pathogen detection studies. The most recent techniques are built on mapping the sequenced reads against a reference database (e.g., whole genomes, marker genes, proteins) and performing further analysis. Although these methods have proved useful in many scenarios, they generally do not support their predictions with significance measures, such as p-values. These values could be very useful for inferring the presence of a specific organism when statistical significance is required. We propose a new reference-based method to identify taxonomic groups and generate p-values in a given NGS dataset. In contrast to most current methods, we do not use the bottom-up lowest common ancestor algorithm. Instead, we develop a new approach that uses a top-down analysis of a taxonomic tree. The algorithm starts at the root node and successively identifies child nodes with significant alignments until it reaches the leaf nodes or until no more significant alignments are found. The significance is based on p-values estimated by permutation tests, comparing the matches of mapped reads between children of the same ancestor. Since several tests are performed, Meinshausen's hierarchical testing procedure is used for multiple testing correction. We showed in experiments that the proposed method works for single and multiple organisms and can identify low taxonomic groups with high precision.
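The top-down traversal described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tree layout (`children`), the per-read match scores (`leaf_scores`), the test statistic (difference of mean scores between a child and its siblings) and all function names are assumptions made for illustration, and the hierarchical multiple-testing correction mentioned in the abstract is left out.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(target, others, n_perm=999, rng=None):
    """One-sided permutation test: is mean(target) larger than mean(others)?"""
    rng = rng or random.Random(0)
    observed = _mean(target) - _mean(others)
    pooled = list(target) + list(others)
    k = len(target)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if _mean(pooled[:k]) - _mean(pooled[k:]) >= observed:
            hits += 1
    # Add-one estimate so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def scores_under(node, children, leaf_scores):
    """Collect the match scores of all leaves below (or at) a node."""
    kids = children.get(node, [])
    if not kids:
        return list(leaf_scores.get(node, []))
    collected = []
    for kid in kids:
        collected.extend(scores_under(kid, children, leaf_scores))
    return collected


def top_down_identify(root, children, leaf_scores, alpha=0.05, seed=0):
    """Descend from the root, keeping only children that beat their siblings."""
    rng = random.Random(seed)
    reported = {}
    frontier = [root]
    while frontier:
        node = frontier.pop()
        kids = children.get(node, [])
        per_child = {kid: scores_under(kid, children, leaf_scores) for kid in kids}
        for kid in kids:
            siblings = [s for other in kids if other != kid for s in per_child[other]]
            if not per_child[kid] or not siblings:
                continue  # nothing to compare against in this simplified sketch
            p = permutation_p_value(per_child[kid], siblings, rng=rng)
            if p <= alpha:
                reported[kid] = p
                frontier.append(kid)
    return reported


# Hypothetical toy taxonomy: reads match leaf "a1" much better than the rest.
children = {"root": ["A", "B"], "A": ["a1", "a2"]}
leaf_scores = {
    "a1": [0.9 + 0.001 * i for i in range(20)],
    "a2": [0.2 + 0.001 * i for i in range(20)],
    "B": [0.2 + 0.001 * i for i in range(20)],
}
result = top_down_identify("root", children, leaf_scores)
print(sorted(result))
```

In this toy example the traversal keeps group "A" at the first level and then leaf "a1" below it, while "B" and "a2" are not reported. A node with a single child cannot be tested in this simplified version, because there are no siblings to compare against.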
H05 - SPINGO: a rapid species-classifier for metagenomic amplicon sequences
Feargal Ryan, University College Cork, Ireland
Guy Allard, University College Cork, Ireland
Ian Jeffery, University College Cork, Ireland
Marcus Claesson, University College Cork, Ireland
Short Abstract: Taxonomic classification is a cornerstone of the study of microbial communities. Currently, most existing methods are either too slow, restricted to specific communities, highly sensitive to taxonomic inconsistencies or limited to genus-level classification. As crucial microbiota information hinges on high resolution, it is imperative to increase taxonomic resolution to species level. In response to this need we have developed SPINGO, a flexible, stand-alone software tool dedicated to high-resolution assignment (species and genus) of sequences of any variable 16S rDNA gene region from any environment. SPINGO outperforms other methods in terms of classification accuracy for species-level assignments, and is faster than other popular tools for this task. When considering accuracy, flexibility and speed, SPINGO proves its value as the method of choice for high-resolution taxonomic assignment. This combination is becoming increasingly important for rapid and accurate processing of amplicon data generated by the most high-throughput next-generation sequencing technologies.
H06 - Tools for fast comparative metagenome analysis
Kathrin Asshauer, University of Goettingen, Germany
Heiner Klingenberg, Georg-August University of Göttingen, Germany
Robin Martinjak, Georg-August University of Göttingen, Germany
Thomas Lingner, Georg-August University of Göttingen, Germany
Peter Meinicke, Georg-August University of Göttingen, Germany
Short Abstract: Metagenomics has become a standard approach to analyze microbial communities from environmental and clinical samples. The vast amount of sequencing reads demands new bioinformatics tools which can efficiently deal with metagenomic datasets on a large scale. We developed UProC, Taxy-Pro, the Mixture-of-Pathways (MoP) model, and CoMet-Universe to quickly determine and compare the functional and phylogenetic composition of metagenomic samples.

Taxy-Pro implements a mixture model for inferring the taxonomic composition. The mixture model is based on protein domain frequencies and makes it possible to assess the taxonomic coverage of a metagenome by known organisms in terms of a model quality index. Taxy-Pro is currently among the most computationally efficient taxonomic profiling approaches and is freely available at www.gobics.de/TaxyPro. The MoP model extends the taxonomic mixture model to a statistical modeling of the metabolic potential of metagenomes. To avoid computationally intensive homology searches, we link the taxonomic profile of the metagenome to a set of pre-computed metabolic reference profiles.

The CoMet-Universe web server (http://comet2.gobics.de/) provides a comprehensive suite for taxonomic, functional and metabolic profiling of metagenomes. The basis for all analyses is a computationally efficient identification of protein domains that allows large amounts of unassembled short reads to be processed orders of magnitude faster than with a conventional BLAST-based approach. Beyond the analysis of uploaded metagenome data, CoMet-Universe lets the user compare a particular metagenome with more than a thousand precomputed profiles. For offline computation, the ultra-fast protein classification (UProC) engine of our web server is available as an open source C library (http://uproc.gobics.de/).
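The idea of a mixture model over protein domain frequencies, as described for Taxy-Pro in this abstract, can be illustrated with a small sketch. This is not the Taxy-Pro implementation: the expectation-maximization (EM) fitting procedure, the function name `estimate_mixture_weights` and the toy reference profiles are assumptions chosen only to show how organism weights could be estimated from domain counts.

```python
def estimate_mixture_weights(counts, profiles, n_iter=200):
    """Estimate mixture weights of reference profiles from domain counts via EM.

    counts[d]      : number of hits to protein domain d in the metagenome
    profiles[j][d] : relative frequency of domain d in reference organism j
    """
    n_refs = len(profiles)
    weights = [1.0 / n_refs] * n_refs
    for _ in range(n_iter):
        updated = [0.0] * n_refs
        for d, n_d in enumerate(counts):
            mix = sum(w * prof[d] for w, prof in zip(weights, profiles))
            if n_d == 0 or mix == 0.0:
                continue  # domain unobserved, or not explained by any reference
            for j in range(n_refs):
                # Expected share of the hits to domain d that come from organism j.
                updated[j] += n_d * weights[j] * profiles[j][d] / mix
        total = sum(updated)
        if total == 0.0:
            break
        weights = [value / total for value in updated]
    return weights


# Hypothetical example: two reference organisms, three domains.
profiles = [
    [0.5, 0.5, 0.0],  # organism 0
    [0.0, 0.5, 0.5],  # organism 1
]
counts = [375, 500, 125]  # consistent with a 75% / 25% mixture
weights = estimate_mixture_weights(counts, profiles)
print([round(w, 3) for w in weights])
```

Because only domain counts and precomputed reference profiles are needed, a fit of this kind avoids per-read homology searches against full genomes, which is the efficiency argument made in the abstract.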
H07 - Investigating the composition of the rumen metavirome.
Thomas Hitch, Aberystwyth University
Short Abstract: In recent years, rumen research has largely focused on the role of bacterial and archaeal species in rumen function. As a result, the rumen virome (the population of viruses present) has been mostly neglected. A greater understanding of this important population will help us understand its role in horizontal gene transfer (HGT) and its effect on different populations of bacteria and archaea in the rumen.

We gathered over 1.5 Tb of metagenomic sequence data from multiple metagenomic studies and devised computational approaches to identify and quantify their viral component. We found that over 300 million reads from our dataset could be attributed to viruses, and we used these reads to create a metavirome assembly.

This assembly was used to assess the abundance and diversity of the metavirome within different regions of the rumen, across different diets consumed by the host, and in association with different host phenotypes. The largest group was the double-stranded DNA viruses (represented in particular by the bacteriophages); however, a small number of single-stranded RNA viruses were also detected. This information could help us further understand the role of viruses in the rumen and devise strategies for controlling specific populations of bacteria and archaea in the rumen.
H08 - Impact of exercise and dietary protein on the human gut microbiota
Orla O'Sullivan, Teagasc, Ireland
Owen Cronin, University College Cork, Ireland
Wiley Barton, Teagasc, Ireland
Peter Skuse, Teagasc, Ireland
Michael Molloy, University College Cork, Ireland
Paul Cotter, Teagasc, Ireland
Fergus Shanahan, Alimentary Pharmabiotic Centre, Ireland
Short Abstract: The human intestinal tract is home to a considerable population of microorganisms. These microbial colonisers confer numerous health benefits on the host, contributing to nutrition and metabolism, conditioning of the immune system, protection against pathogens and social behaviour. Of late, the importance of microbial diversity as a biomarker for health has come to the fore with reduced microbial diversity being associated with autism, gastrointestinal diseases and obesity-associated inflammatory characteristics. Numerous factors can impact the relationship between the host and gut microbes with diet, age, environment and host genetics playing pivotal roles. Accepted as being of particular importance, the interplay between diet and the gut microbiota continues to be elucidated; it is already apparent that dietary patterns among elderly individuals impact on microbial composition, diversity and health. Despite the significant impacts of exercise, its relationship with the gut microbiota has, until recently, not been investigated. Recently we demonstrated that professional athletes have increased microbial diversity compared with controls. This increased diversity correlated with exercise levels and/or protein intake, raising questions as to whether exercise, and/or dietary protein could positively modulate gut microbial diversity. To address this, we are in the midst of an intervention study involving non-professional athletes, who undergo a predetermined exercise regime and/or increase their protein intake over a period of 6 weeks, with a view to increasing their microbial diversity. Furthermore, through shotgun metagenomics and metabolomics, we are investigating any alterations in functions which can be attributed to altered exercise and/or protein levels in these subjects.
H09 - Microbial analysis of intestinal biopsies investigating the association of anatomical site and inflammatory status in patients with inflammatory bowel disease
Jessica Forbes, Public Health Agency of Canada - NML and University of Manitoba, Canada
Wenhua Tang, University of Manitoba, Canada
Shadi Sepehri, University of Manitoba, Canada
Ehsan Khafipour, University of Manitoba, Canada
Denis Krause, University of Manitoba, Canada
Gary Van Domselaar, Public Health Agency of Canada - NML and University of Manitoba, Canada
Charles Bernstein, University of Manitoba and the IBD Clinical and Research Centre, Canada
Short Abstract: Crohn’s disease (CD) and ulcerative colitis (UC) are clinically distinct prototypes of inflammatory bowel disease. They are multifaceted intestinal disorders with uncertain etiologies thought to be influenced by a dysbiosis (microbial imbalance) of the gastrointestinal tract.

We resected biopsies from different mucosal sites (ileum, cecum, colon, rectum) at colonoscopy and histologically defined them as inflamed or noninflamed. We performed 16S rRNA sequencing to analyze population structures. Quality control and OTU classification of reads were performed using mothur with statistical analyses executed in the R package, phyloseq.

The structure of microbial communities varied among CD and UC patients and healthy controls. Genera including Fusobacterium (p=0.05) and Marinobacter (p=0.04) were more abundant in inflamed CD versus inflamed UC. The abundances of Clostridium (p=0.008) and Haemophilus (p=0.0007) were elevated in noninflamed CD versus noninflamed UC. Within each disease, different segments of the intestine showed no significant variation in microbial communities. However, for CD, the abundance of Pseudomonas (p=0.003) varied between anatomical sites and was highest in the colon and rectum. Comparisons of mucosal sites between disease groups gave more conclusive results. Bacteroides (p=0.03) was highest in the ileum of UC. In the colon, Pseudomonas (p=0.0002), Sporacetigenium (p=0.05) and Actinomyces (p=0.01) were more prevalent in CD. Pseudomonas (p=0.0007) and Bacteroides (p=0.0007) in the rectum were most abundant in healthy controls and UC, respectively.

Distinct microbial communities were observed between disease groups, dependent on mucosal site and on whether inflamed or noninflamed tissue was sampled. The relative lack of intra-individual variation among mucosal sites is consistent with previous studies.
H10 - Metagenomic dissimilarity metric: k-mer method compared with reference-based methods
Veronika Dubinkina, MIPT, RIPCM, Russia
Alexander Tyakht, RIPCM, Russia
Dmitry Alexeev, MIPT, RIPCM, Russia
Short Abstract: During the last decade, next-generation sequencing technologies have developed explosively, and a large amount of short-read metagenomic data has accumulated. Due to improvements in technology and the reduction of sequencing mismatches, a large class of methods for metagenomic analysis has appeared. Their algorithms work with k-mers (oligonucleotides of length k) taken directly from metagenomic reads. In comparison with reference-based methods, the main advantages of the k-mer approach are a compressed representation of sequences and the inclusion of the entire data array in the analysis. Among these methods, the simplest and most effective for exploratory analysis of large data sets is the comparison of sequences by calculating pairwise distances between them on the basis of k-mer spectra.
In this study we investigated how different characteristics of the studied data influence the difference between matrices of pairwise dissimilarities obtained from k-mer spectra and from traditionally used reference-based methods (mapping to catalogs of genes and genomes, MetaPhlAn, UniFrac). The database consisted of 280 metagenomic samples of the human gut, sequenced in large-scale metagenomic projects; simulated metagenomes were also used. Comparison of dissimilarity matrices showed that k-mers have the best conformity with the functional structure of metagenomes. The analysis revealed specific differences between the cohorts from the two studies (USA and China). We analyzed the contribution of taxonomic composition, sequence quality scores and other factors to these differences. For the purposes of feature selection, we developed several filtering algorithms based on entropy and variance; the methods were applied to real data, demonstrating their efficiency.
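The pairwise comparison of samples by k-mer spectra described in this abstract can be sketched as follows. The abstract does not state which dissimilarity measure or value of k was used, so the Bray-Curtis measure, k=3 and the function names below are assumptions for illustration only; a real analysis would typically use longer k-mers and may merge reverse complements.

```python
from collections import Counter


def kmer_spectrum(reads, k=3):
    """Return the normalized k-mer frequency spectrum of a collection of reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip k-mers with ambiguous bases
                counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: c / total for kmer, c in counts.items()}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two spectra (0 = identical, 1 = disjoint)."""
    keys = set(p) | set(q)
    numerator = sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)
    denominator = sum(p.get(x, 0.0) + q.get(x, 0.0) for x in keys)
    return numerator / denominator if denominator else 0.0


def dissimilarity_matrix(samples, k=3):
    """Pairwise dissimilarities between samples, each given as a list of reads."""
    spectra = [kmer_spectrum(reads, k) for reads in samples]
    return [[bray_curtis(a, b) for b in spectra] for a in spectra]


# Hypothetical toy samples: the first two are identical, the third shares no 3-mers.
samples = [["ACGTACGT"], ["ACGTACGT"], ["TTTTTTTT"]]
matrix = dissimilarity_matrix(samples)
for row in matrix:
    print([round(value, 3) for value in row])
```

No reference database is involved at any step, which reflects the advantage claimed for the k-mer approach: every read contributes to the spectrum, whether or not it maps to a known gene or genome.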
H11 - The MetAnnotate framework for HMM-based functional profiling and comparison of metagenomes
Briallen Lobb, University of Waterloo, Canada
Daniel Kurtz, University of Waterloo, Canada
Gabriel Moreno-Hagelsieb, Wilfrid Laurier University, Canada
Josh Neufeld, University of Waterloo, Canada
Andrew Doxey, University of Waterloo, Canada
Pavel Petrenko, University of Waterloo, Canada
Short Abstract: Finding genes of interest in metagenomes is often complicated by the short read lengths, taxonomic novelty and diversity of metagenomic sequences. We introduce a framework called MetAnnotate that uses profile-based and profile-profile comparison methods for sensitive detection and annotation of divergent and even ORFan sequences within metagenomes.

MetAnnotate detects homologs of existing domain families using profile HMMs, and assigns taxonomy through a best hit or phylogenetic approach. By focusing on specific proteins or pathways of interest, MetAnnotate can perform function-specific taxonomic profiling across hundreds or thousands of metagenome samples. Although many predicted genes in a metagenome can be annotated in this way, often a significant proportion has no detectable homology to database content. These ORFan sequences can make up over 50% of predicted genes in metagenomes, severely limiting functional studies. A second component of our pipeline therefore investigates metagenomic ORFans through a profile-profile comparison approach.

As a proof of principle, we applied HMM-based annotation of metagenomes to characterize the microbial producers of vitamin B12 in 430 aquatic metagenomes. Second, we analyzed the ORFan population of the Great Prairie Grand Challenge, Global Ocean Sampling and Human Gut Microbiomes, and have annotated 21,940 ORFan families (15% of all ORFans) with a false discovery rate of 1.4%.

Ultimately, the MetAnnotate pipeline is an effective strategy for sensitive functional profiling of metagenomes, and is capable of uncovering highly divergent homologs that contribute important environment-specific functions.