Posters
Poster numbers will be assigned May 30th.
If you cannot find your poster below, you have probably not yet confirmed that you will be attending ISMB/ECCB 2015.
To confirm your poster, find the poster acceptance email, which contains a confirmation link.
Click on it and follow the instructions.
If you need further assistance please contact submissions@iscb.org and provide your poster title or submission ID.
Category I - 'Open Science and Citizen Science'
I01 - Equivalent Input Produces Different Output in the UniFrac Significance Test
Short Abstract: UniFrac is a well-known tool for measuring phylogenetic distance between biological communities and assessing statistically significant differences between communities. We identify a discrepancy in the UniFrac methodology that causes isomorphically identical inputs to produce different outputs in tests of statistical significance.
We show that results from the UniFrac tool can vary greatly for the same input depending on the arbitrary choice of input format. Practitioners should be aware of this issue and use the tool with caution to ensure consistency in their analyses.
I02 - Semi-supervised Multi-view Gaussian Processes for Microbial Growth Prediction
Short Abstract: We propose a semi-supervised multi-view Gaussian process (GP) model for microbial growth prediction. Our semi-supervised GP model is formulated using a co-regularization approach: we construct GPs for different views such that the training error of each hypothesis on the labeled data is small and, at the same time, the hypotheses give similar predictions for the unlabeled data. Our model is naturally suited to taking into account multiple data representations and learning complex non-linear interactions. We apply the proposed model to describing and predicting growth, succession, and proliferation of microbial species in the spoilage process. In our empirical evaluation on a recently collected biological dataset, the proposed approach notably outperforms several regression techniques and leads to a better understanding of the role of various bacterial species and their influence on the spoilage process.
I03 - Mixture models for taxonomic and metabolic profiling of metagenomes
Short Abstract: Metagenomics, as a culture-independent approach, enables the exploration of complex microbial communities by massive sequencing of community-specific DNA. Large-scale projects like the Human Microbiome Project or the Earth Microbiome Project emphasize the increasing importance of metagenomics for biomedical and ecosystem research. Due to the vast amount of data, fast algorithms and tools have to be developed to meet the requirements of taxonomic and metabolic profiling of metagenomes.
We have developed Taxy-Pro for inferring the taxonomic composition over the whole range of biological entities – including all domains of life and viruses. Taxy-Pro implements a novel mixture model based on protein domain frequencies. Our results indicate that Taxy-Pro estimates may provide important advantages when analyzing data with a high fraction of archaeal or viral DNA. Taxy-Pro is freely available at http://www.gobics.de/TaxyPro/ as a Matlab/Octave toolbox or through the CoMet web server (http://comet.gobics.de/).
Further, we have extended the taxonomic mixture model to a statistically adequate modeling of the metabolic potential of metagenomes. Here, organism-specific metabolic profiles are first estimated by applying the mixture model based on KEGG orthologs and pathways. Then the metabolic profile of a metagenome is computed by combining the taxonomic profile with the organism-specific metabolic profiles. This provides a concise summary of the functional capacity of a metagenomic sample enabling the identification of relevant metabolic differences between distinct microbial communities.
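The last step described above, combining a taxonomic profile with organism-specific metabolic profiles, can be illustrated with a small sketch. The abstract does not state how the two are combined; the abundance-weighted sum below is an assumption, and the function name, organism labels, and pathway values are hypothetical and not part of Taxy-Pro.

```python
def combine_profiles(taxonomic_profile, organism_profiles):
    """Abundance-weighted sum of per-organism pathway profiles.

    Assumption: the metagenome profile is a weighted average of the
    organism-specific profiles, with taxonomic abundances as weights.
    """
    total = sum(taxonomic_profile.values())
    combined = {}
    for organism, weight in taxonomic_profile.items():
        # Organisms without a known metabolic profile contribute nothing.
        for pathway, value in organism_profiles.get(organism, {}).items():
            combined[pathway] = combined.get(pathway, 0.0) + (weight / total) * value
    return combined


# Hypothetical input: relative abundances and per-organism pathway scores.
taxonomic_profile = {"org_A": 0.75, "org_B": 0.25}
organism_profiles = {
    "org_A": {"glycolysis": 1.0, "methanogenesis": 0.0},
    "org_B": {"glycolysis": 0.5, "methanogenesis": 1.0},
}
metagenome_profile = combine_profiles(taxonomic_profile, organism_profiles)
```

In this toy input, the pathway carried mainly by the dominant organism dominates the combined profile, which is the kind of summary that makes differences between communities visible.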
I04 - Metatranscriptomics of colonic bacteria in inflammatory bowel diseases
Short Abstract: Crohn’s disease and ulcerative colitis are inflammatory bowel diseases (IBD) characterized by chronic and relapsing inflammation of the gastro-intestinal tract. They cause lifelong suffering, as well as a considerable drain on health care resources. Although their etiology is still unclear, there is a growing body of evidence for a significant microbial factor. In this study we focus on the global gene expression of these communities through mRNA sequencing. We collected colonic biopsies from inflamed and non-inflamed colonic mucosa in 19 IBD patients and, using RNA-Seq with unprecedented depth, compared microbial metatranscriptomes in inflamed and non-inflamed colonic mucosa. This was done using 600 Gb of Illumina HiSeq RNA-Seq data (15 Gb/sample). Human reads were filtered out using TopHat2 and the remaining reads were mapped against a reference database constructed from all sequenced bacterial and viral genomes from the gastrointestinal subset of the Human Microbiome Project. We subsequently used DESeq to analyze the count data. Preliminary analysis using real-time PCR revealed transcriptional differences for two pathogenic Bacteroides species, where homologs of genes involved in tissue destruction were up-regulated in inflamed mucosa of UC patients. We also observed higher expression rates of E. coli in inflamed mucosa, as has previously been observed in CD patients. Meta-transcriptome analysis confirmed these results and added a multitude of other gene candidates that were significantly up- or down-regulated in inflamed tissue. Thus, our analysis revealed transcriptional differences for known microbial pathogens. High-throughput RNA-Seq analysis added extra value to these findings and is a source of continuing analysis with great potential for further interesting findings.
I05 - Comparing de novo assemblers for metagenomic data
Short Abstract: The main goal of metagenomics is to characterize the structure and dynamics of communities of non-clonal microorganisms. One step in metagenomic analysis is reconstruction of genomes by assembling sequence reads. Unlike a traditional sequencing project, which aims to determine the complete genome sequence of a single organism, metagenomic analyses require thousands of (partial) genomes from a microbial community to be sequenced and assembled simultaneously. Over the past few years, different methods have been developed or revised specifically for the de novo assembly of next generation sequencing data; however, there are only a few tools that specifically focus on metagenomic data. Given the considerable difficulties involved in assembling such data, including inadequate and partial sampling of some genomes, different organism compositions and evolutionary relationships, and the presence of repetitive fragments, reconstructing the full metagenome is a very demanding task.
Here we provide an evaluation of current de novo short read assembly tools on metagenomic data. We test a number of state-of-the-art assemblers that were designed specifically for metagenomic data, as well as some that were not. The accuracy, performance, and computational requirements of these assemblers were evaluated using three datasets of simulated sequence reads, each having a different community complexity (low, medium, or high), as well as real reads obtained from the sequencing of environmental samples using Ion Torrent technology. Our evaluation of assemblers suggested that although no single assembler performed best on all of our criteria, MIRA slightly outperformed the other programs.
I06 - Towards a single click solution for processing ultra-deep sequencing microbiomic data.
Short Abstract: Ultra-deep sequencing technologies now allow the detailed characterization of bacterial communities. Sequencing amplicons comprising variable regions of the 16S rRNA gene is a useful strategy to identify the members of human-associated or environmental communities. Deep sequencing using the Illumina platform generates large data-sets of hundreds of millions of reads, requiring the development of complex computational tools to filter/quality-trim, cluster and classify the sequenced amplicons. Identification of the phylogeny of a given sequence type is mainly performed using services such as the RDP, Silva or Greengenes websites. However, with such short sequences as produced by the Illumina platform, human verification is often needed before a taxon can be correctly assigned. In order to simplify data processing and annotation we developed a pipeline for short 16S rRNA reads (80 to 120 bp), which greatly reduces both computation time and need for manual curation. The pipeline starts with Illumina multiplexed fastq files and a table of multiplexing primers, and generates an annotated table of the most abundant phylotypes (PT). The method uses an iterative blast search to identify and verify the nearest neighbor that matches each of the query PTs. This method can make a fast and accurate classification of the community members present across a large number of samples. A test case analysis of the community of 300 samples of the human nares is presented. While not completely removing the need for human verification, the pipeline greatly reduces the time and effort required to classify and annotate 16S rRNA-based microbiomics.
I07 - Improved metagenome analysis using MEGAN5
Short Abstract: Metagenome analysis is conceptually and computationally challenging. We have developed a number of new methods for improving taxonomic and functional analysis of metagenome data. While some improvements are algorithmic, such as an improvement of the LCA algorithm for taxonomic binning, others aim at using additional resources for analysis, such as the COG/EGGNOG classifications of function. We have developed a framework for performing attribute-guided, function-oriented analysis of multiple samples, involving ecological indices and PCoA analysis. This poster gives an overview of the new methods.
I08 - Adaptive strategies of Staphylococcus aureus in the human anterior nares revealed by an RNA-seq approach
Short Abstract: The human anterior nares are a transition zone from the skin to the nasal cavity and the principal habitat for both commensals and opportunistic pathogens, where approximately 20-30% of the human population is persistently colonised by Staphylococcus aureus. This S. aureus colonisation of the nares is asymptomatic; however, nasal carriage has a crucial function as a source of invasive infections. Although differences in colonisation may be due to host factors such as host immunity, age and gender, and/or environmental factors, the interplay between these factors and the interactions among community members have yet to be thoroughly investigated.
As a first step to elucidate the functional attributes of important nasal community members, the transcriptional (RNA-seq) profiles of two clinically relevant strains were evaluated and included the Barnim epidemic methicillin-resistant S. aureus strain (SPA type T032, a typical coloniser of the anterior nares) and USA 300 (a community-associated clonal lineage). Both strains were grown in nutrient rich media as well as an artificial nasal medium simulating “nasal” conditions, revealing niche specific responses. In vitro assays were compared with metatranscriptomic (RNA-seq) profiles of pooled nasal swab samples from healthy individuals previously identified as carriers of S. aureus. This revealed insights into the strategies used by S. aureus for colonisation and adaptation in the human nares.
I09 - Confident Taxonomic Binning of Metagenomes
Short Abstract: Metagenome research uses random shotgun sequencing of microbial community DNA to study the genetic sequences of its members without cultivation. The process of structuring the sequence data by taxa is called Taxonomic Binning. It provides an initial view on the organismal complexity and clade-specific functions, and is therefore important for the understanding of microbial interactions. Binning results can also be used to refine genome assembly and to target interesting species within a community.
The processing of large and complex sequencing data poses a challenge to binning algorithms. The major drawbacks of current techniques include computational infeasibility, the need for prior specification of organismal composition, short sequence length, low sample coverage, a high number of false predictions and systematic bias towards overrepresented taxa or dominant community members. taxator-tk implements an algorithm that is based on nucleotide local alignments and largely avoids heuristic assumptions or parameters, similar to phylogenetic placement techniques. Our method is conservative with respect to missing or contradicting data and minimizes bias towards certain taxa. This leads to few false taxon assignments and a high precision, providing a reliable basis for further analysis. taxator-tk focuses on speed and simplicity. It uses fast alignment and parallel computing and avoids the need for specific pre-formatted reference data or run-time parameters.
I10 - EBI’s Metagenomics Pipeline: An automated pipeline for the analysis of metagenomic data
Short Abstract: The EBI Metagenomics service provides an interface for submission, analysis and archiving of metagenomic data and aims to provide insights into the functional and metabolic potential of a sample.
Here we present the Metagenomics Pipeline, a bioinformatics tool for large scale Next Generation Sequence data analysis. The pipeline takes input sequences in a range of widely supported formats including 454 and Illumina. Here we give an overview of the pipeline content (including quality control, feature prediction and functional analysis steps) and the technical architecture used. The pipeline makes use of the Taverna Workflow Management System and is able to run in a distributed environment (using LSF). This results in a flexible, robust and easily maintainable pipeline.
I11 - Developing an integrated bioinformatics platform to support real-time monitoring of agricultural environments
Short Abstract: Amplicon-based metagenomics targeting DNA barcode regions permits large-scale monitoring of environmental samples. While community tools and resources are constantly improving, routinely processing, managing and interpreting environmental metagenomics data poses challenges. In the agricultural context, Fungi and Nematodes are common pathogens, which are not often targets of existing tools and platforms. Further, interpreting this information at the levels of species, strain or pathotype is not possible using many existing tools.
To support our surveillance activities in the agricultural environment, a platform consisting of open source tools, custom pipelines and databases has been established. At present, the system supports analysis of more than 60M Fungal, Nematode and Bacterial sequences generated using the GS FLX platform. In the near future, results from the MiSeq platform will be integrated, and comparison with shotgun metagenomics data is being explored.
A web-interface provides user tools to manage sample metadata and reference databases, and provides methods for interactively interrogating the identified sequences. Three different identification pipelines are currently employed: first, a lowest-common ancestor analysis of BLAST searches to reference databases (à la MEGAN), second, short exact signature oligonucleotides are used to mine and compare target groups (dubbed “oligo fishing”) and third, the QIIME RDP classifier pipeline (which recently added beta support for Fungal sequences). Integrating the output from these three methods provides users with greater confidence in lower taxonomy rank assignments and increases the reliability of such a monitoring regime.
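The first of the three pipelines, a lowest-common-ancestor (LCA) analysis of BLAST hits, can be sketched as follows. This is a minimal illustration, assuming each hit has already been resolved to a root-to-leaf lineage; it is not the platform's or MEGAN's implementation, and the function name and the simplified lineages are illustrative only.

```python
def lowest_common_ancestor(lineages):
    """Return the longest shared prefix of root-to-leaf lineages.

    Each lineage is a list of taxon names ordered from the root down.
    The read is assigned to the deepest taxon shared by all hits.
    """
    if not lineages:
        return []
    lca = []
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            lca.append(ranks[0])
        else:
            break
    return lca


# Simplified, illustrative lineages for two hits of one read.
hits = [
    ["Fungi", "Ascomycota", "Sordariomycetes", "Fusarium", "Fusarium oxysporum"],
    ["Fungi", "Ascomycota", "Sordariomycetes", "Fusarium", "Fusarium graminearum"],
]
assignment = lowest_common_ancestor(hits)
print(assignment[-1])  # deepest taxon supported by every hit
```

Because the hits disagree at species level, the read is assigned at genus level. This conservative behaviour is one reason combining LCA with other methods can help for lower-rank assignments.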
I12 - Clinical Pathoscope: An alignment and filtering pipeline for rapid pathogen identification in unassembled, next-generation sequencing data
Short Abstract: The rapid and accurate identification of pathogens in human tissue samples is a necessity as disease-causing pathogens increasingly develop resistance to broad spectrum antibiotics and remain one of the greatest public health burdens worldwide. With the increased affordability of high-throughput sequencing, it is now possible to investigate the microbiome of a given sample with high sensitivity. However, clinical samples contain a mixture of genomic sequences from various sources, which complicates the identification of pathogens. Here we present Clinical Pathoscope, a pipeline to rapidly and accurately remove host contamination, isolate viral reads, and deliver a diagnosis. To optimize the Clinical Pathoscope pipeline, data were simulated from human, bacterial, and viral genomes to create biologically realistic clinical samples which represented a diverse variety of host-pathogen landscapes. These data were then used to evaluate the accuracy, usability, and speed of multiple alignment algorithms and filtration methods. The optimal alignment algorithm and filtration method were implemented in the Clinical Pathoscope pipeline to isolate viral reads. These reads were then mapped against a robust viral database and assigned to their appropriate genomes of origin. We demonstrate our approach using sequenced nasopharyngeal aspirate samples from children with upper respiratory tract infections. Unlike other methods, Clinical Pathoscope can rapidly identify multiple pathogens from mixed samples and distinguish between very closely related species with very little coverage of the genome and without the need for genome assembly.
I13 - Relating the metatranscriptome and metagenome of the human gut
Short Abstract: Typical microbial residents and ecologies of the human microbiome have now been well-studied. However, the microbiota's >8 million genes and their transcriptional regulation remain largely uncharacterized. We conducted one of the first human microbiome studies in a well-phenotyped prospective cohort incorporating taxonomic, metagenomic, and metatranscriptomic profiling at multiple body sites. The results establish the feasibility of metatranscriptomic investigations in subject-collected samples from the Health Professionals Follow-up Study. Replicate stool and saliva samples were collected from 8 subjects, and three different RNA preservation methods were assessed (frozen, ethanol, and RNAlater). Within-subject microbial species, gene, and transcript abundances were highly concordant across sampling methods; only transcripts, and only a small fraction of them (<5%), displayed significant between-method variation. Their functions were consistent with reprogramming in response to storage media environment (carbon source and osmolarity). Next, we investigated relationships between the oral and gut microbial communities, identifying a subset of abundant oral microbes that routinely survive transit to the gut. Comparison of the gut metagenome and metatranscriptome revealed three distinct functional clusters: (i) the ~50% of microbial genes whose RNA and DNA levels are strongly correlated; (ii) genes detected only at the DNA level, including inactive biosynthesis and stress-response factors; and (iii) genes detected only at the RNA level, including functions specific to the gut’s archaeal inhabitants, e.g. methanogenesis. Globally, we observe that RNA-level functional profiles are significantly more individualized than DNA-level profiles across subjects but less variable than microbial composition, indicative of subject-specific whole-community regulation occurring at the transcriptional level.
I14 - Efficient Search for Similar Microbial Communities Based on a Novel Indexing Scheme and Similarity Score
Short Abstract: Scientists have long sought to compare different microbial communities (also referred to as “metagenomic samples” here) effectively at large scale: given a set of unknown samples, find similar metagenomic samples in a large repository, and examine how similar these samples are. With the metagenomic samples accumulated to date, it is possible to build a database of metagenomic samples of interest. Any metagenomic sample could then be searched against this database to find the most similar metagenomic sample(s). However, it is not yet clear how to efficiently search for metagenomic samples against a large metagenomic database.
In this work, we propose a novel method, Meta-Storms, that can systematically and efficiently organize and search metagenomic data. It includes the following components: (i) creating a database of metagenomic samples based on their taxonomical annotations, (ii) efficient indexing of samples in the database based on a hierarchical taxonomy indexing strategy, (iii) searching for a metagenomic sample against the database by a fast scoring function based on quantitative phylogeny, and (iv) managing the database by index export, index import, data insertion, data deletion and database merging. We have collected more than 7000 metagenomic datasets from the public domain and in-house facilities up to the end of 2012, and tested the Meta-Storms method on these datasets. Our experimental results show that Meta-Storms is capable of database creation and effective searching for a large number of metagenomic samples, and that it can achieve accuracies similar to those of the current popular significance-testing based methods.
I15 - Automating and improving taxonomic assignment with a high-resolution microbial phylogeny
Short Abstract: Thousands of microbial genomes have now been sequenced at an accelerating rate. Automatically recovering their phylogenetic relationships and assigning putative taxonomy is crucial for comparative genomics, molecular evolution, and microbial community characterization. Information from these genomes can be leveraged to automate and scale phylogenetic reconstruction and the placement of new genomes at a resolution not previously attainable. We thus present an improved, validated high-resolution microbial tree of life using a novel automated methodology and an associated method of taxonomic assignment. We identified >400 proteins strongly conserved throughout large subsets of 3,737 sequenced microbial genomes. The most informative subsequences of these were selected to maximize diversity and phylogenetic resolution for alignment and tree building. The new phylogeny was more taxonomically consistent than existing approaches (>92% species-level precision) and showed that including hundreds of proteins continues to improve topological accuracy. The method further resulted in measures of the phylogenetic diversity captured by all currently sequenced microbial clades at the phylum through species levels. The developed pipeline also automatically classifies newly sequenced genomes to the most specific confident taxonomic level. This provided improved taxonomic assignments for 130 previously poorly classified genomes, detected 157 with likely erroneous taxonomic labels, and confidently corrected 46 of these. Examples are provided of accurate classifications from subspecies to deep branching candidate divisions (OP1, OP5, OP11, and TM7). Quantitative evaluation confirmed the absence of false positives in these methods and suggested their utility for taxonomic curation of existing genomic databases and for metadata quality control of newly submitted genomes.
I16 - Metagenomic inference and biomarker discovery for the gut microbiome in inflammatory bowel disease
Short Abstract: The inflammatory bowel diseases have been consistently linked to dysbiosis in the gut microbiota. This microbial dysfunction has not been fully characterized, however, due to the lack of methods assessing community functional activity and statistically associating it with disease. In this study, "virtual" metagenomes were inferred using 16S rRNA gene sequencing of 231 biopsies and stool samples. This incorporated analysis of 1,119 microbial genomes and was validated by shotgun metagenomics. A multivariate approach linking microbiome shifts to disease, treatment, or environment recovered dysbioses in ~2% of microbial clades, including depletion of Clades IV and XIVa Clostridia and enrichment of Enterobacteriaceae. However, microbial functional activity was more consistently disrupted in disease, with 12% of pathways associated with IBD. These included decreases in short-chain fatty acid production, oxidative stress, and shifts from amino acid biosynthesis towards transport. These results provide initial methods for assessing biomolecular functions corresponding to changes in microbial community ecology.
I17 - Alignment-free discrimination of human skin metagenomes
Short Abstract: The human skin contains a large number of bacteria that can be easily transferred to surfaces. Given that the bacterial composition of skin is personalized and stable across time, the information obtained from skin microbiomes could be used for forensic purposes.
The analysis of metagenomic samples generated by Next-Generation Sequencing (NGS) usually involves mapping reads to known genes or pathways and comparing the obtained profiles. However, this is a time-consuming and computationally intensive task. Another drawback of alignment-based methods is that they are restricted by the availability and completeness of existing databases. We developed an alignment-free approach that is based on k-mer frequencies to assess the level of similarity between metagenomic samples. This method does not need a reference genome and thus avoids biases common to alignment-based approaches. The method was tested on simulated metagenomics data and publicly available 16S rRNA NGS data.
Our approach was able to distinguish between skin, gut, and oral metagenomic samples. Analysis of skin (right palm) metagenomes shows the stability of bacterial composition within an individual. Our method clusters metagenomes by individual better than UniFrac, the tool most commonly used for 16S rRNA data analysis.
Together these results indicate that the k-mer analysis can be used to reveal relationships between NGS samples without any alignment. The method can be applied to any sequence data. It is much less laborious than other approaches and potentially could be used for a broad number of tasks.
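The idea of comparing samples through k-mer frequencies, without any alignment, can be illustrated with a short sketch. The abstract does not specify the value of k or the similarity measure; the Euclidean distance between normalized k-mer frequency profiles used here is an assumption, and the function names and toy reads are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(reads, k):
    """Normalized k-mer frequency profile of a set of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: count / total for kmer, count in counts.items()}


def euclidean_distance(p, q):
    """Distance between two profiles; missing k-mers count as zero."""
    kmers = set(p) | set(q)
    return sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in kmers))


# Toy reads standing in for three samples (assumed data).
sample_a = kmer_frequencies(["ACGTACGT", "ACGTTT"], k=3)
sample_b = kmer_frequencies(["ACGTACGT", "ACGTTT"], k=3)
sample_c = kmer_frequencies(["GGGGCCCC"], k=3)

print(euclidean_distance(sample_a, sample_b))  # identical samples
print(euclidean_distance(sample_a, sample_c))  # dissimilar samples
```

Pairwise distances of this kind can be fed into any standard clustering method, and no reference database is consulted at any point.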
I18 - Comparative metagenomic analysis of electrogenic microbial communities
Short Abstract: Microorganisms called exoelectrogens can transfer electrons exogenously and therefore have been used as electron suppliers in microbial fuel cells (MFCs).
MFC technologies represent an innovative approach for generating bioelectricity from a variety of biodegradable matter, usually organic waste. Some microbes, mostly bacteria, growing on the anodic surface in an MFC degrade organic compounds and release electrons. Captured by the anode, the electrons then travel to the cathode via a connecting wire, completing the circuit and producing electrical power. Thus, an MFC system allows bacteria-driven waste removal with simultaneous electricity generation.
For the most effective MFC performance it is important to determine the most successful combination of microorganisms that a) can exogenously transfer electrons; b) can degrade a wide range of contaminants.
We performed shotgun sequencing and metagenome analyses of electrochemically active anodic biofilms from several effectively working MFCs. Taxonomic profiling revealed the complex structure of anodophilic microbial communities, detecting both known and so far unknown electricigens, which implies that the diversity of microbes capable of exoelectrogenic activity is just beginning to be discovered.
Functional metagenome annotation of the analyzed communities helped us learn more about the fundamentals of microbial electrogenesis and predict the metabolic potential of mixed and individual strains for efficient waste treatment.
I19 - Exploring variation aware contig graphs for (comparative) metagenomics using MARYGOLD
Short Abstract: While many tools are available to study variation and its impact in single genomes, there is a lack of algorithms for finding such variation in metagenomes. This hampers the interpretation of metagenomics sequencing datasets, which are increasingly acquired in research on the (human) microbiome, in environmental studies and in the study of processes in the production of foods and beverages. Existing algorithms often depend on the use of reference genomes, which poses a problem when a metagenome of a priori unknown strain composition is studied. In this paper, we develop a method to perform reference-free detection and visual exploration of genomic variation, both within a single metagenome and between metagenomes.
We present the MaryGold algorithm and its implementation, which efficiently detects bubble structures in contig graphs using graph decomposition. These bubbles represent variable genomic regions in closely related strains in metagenomic samples. The variation found is presented in a condensed Circos-based visualization which allows for easy exploration and interpretation.
We validated the algorithm on two simulated datasets containing three and seven Escherichia coli genomes, respectively, and showed that finding allelic variation in these genomes improves assemblies. Additionally, we applied MaryGold to two public real metagenomic datasets. It enabled us to find within-sample genomic variation in the metagenomes of a Kimchi fermentation process and of the microbiome of a premature infant. Moreover, we used MaryGold for between-sample variation detection and exploration, by comparing sequencing data sampled at different time points for both of these datasets.