Posters - Schedules

Monday, July 24, between 18:00 CEST and 19:00 CEST
Tuesday, July 25, between 18:00 CEST and 19:00 CEST
Session A Poster Set-up and Dismantle
Session A Posters set up:
Monday, July 24, between 08:00 CEST and 08:45 CEST
Session A Posters dismantle:
Monday, July 24, at 19:00 CEST
Session B Poster Set-up and Dismantle
Session B Posters set up:
Tuesday, July 25, between 08:00 CEST and 08:45 CEST
Session B Posters dismantle:
Tuesday, July 25, at 19:00 CEST
Wednesday, July 26, between 18:00 CEST and 19:00 CEST
Session C Poster Set-up and Dismantle
Session C Posters set up:
Wednesday, July 26, between 08:00 CEST and 08:45 CEST
Session C Posters dismantle:
Wednesday, July 26, at 19:00 CEST
Virtual
B-179: Inflammation in Children with CKD Linked to Gut Dysbiosis and Metabolite Imbalance
Track: MICROBIOME
  • Ulrike Löber, Max Delbrück Center for Molecular Medicine, Germany
  • Johannes Holle, Charité-Universitätsmedizin Berlin, Germany
  • Sofia Forslund, Experimental and Clinical Research Center, a cooperation of Charité Berlin and Max Delbrück Center, Germany
  • Hendrik Bartolomaeus, Charité-Universitätsmedizin Berlin, Germany
  • Nicola Wilck, Charité-Universitätsmedizin Berlin, Germany


CKD is characterized by a sustained proinflammatory response of the immune system, promoting hypertension and cardiovascular disease. The underlying mechanisms are incompletely understood but may be linked to gut dysbiosis. Dysbiosis has been described in adults with CKD; however, comorbidities limit CKD-specific conclusions.
We analyzed the fecal microbiome (16S), metabolites, and immune phenotypes in 48 children (with normal kidney function, CKD stage G3-G4, G5 treated by hemodialysis, or kidney transplantation) with a mean±SD age of 10.6±3.8 years. 16S amplicon sequences were filtered, quality controlled and taxonomically assigned using the LotuS pipeline. To assess the effect of clinical group on microbiome and metabolome features, each feature was tested in turn. A co-abundance network of host, microbiome and metabolome features was calculated from the dataset as a whole by assessing pairwise Spearman correlations and adjusted for multiple testing using Benjamini-Hochberg FDR correction as implemented in the psych R package (v 1.9.12).
We observed compositional and functional alterations of the microbiome, including diminished production of short-chain fatty acids.
Gut barrier dysfunction and microbial metabolite imbalance apparently mediate the proinflammatory immune phenotype, thereby driving the susceptibility to cardiovascular disease. The data highlight the importance of the microbiota-immune axis in CKD, irrespective of confounding comorbidities.
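
As an illustration of the Benjamini-Hochberg FDR correction named in this abstract, the following is a minimal Python sketch of the adjustment applied to a list of p-values. It is not the authors' code (the abstract states that the psych R package was used); the function name `benjamini_hochberg` and the example p-values are assumptions made for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Illustrative sketch only: each p-value is scaled by m / rank, and a
    running minimum taken from the largest p-value downwards keeps the
    adjusted values monotone.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values from four pairwise correlation tests.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In a co-abundance network of the kind described, an edge would typically be kept only if its adjusted p-value falls below a chosen FDR threshold; the threshold used by the authors is not stated in the abstract.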

B-180: Fast and robust metagenomic sequence comparison through sparse chaining with skani
Track: MICROBIOME
  • Jim Shaw, University of Toronto, Canada
  • Yun William Yu, University of Toronto, Canada


Sequence comparison algorithms for metagenome-assembled genomes (MAGs) often have difficulties dealing with data that is high-volume or low-quality. We present skani (https://github.com/bluenote-1577/skani), a method for calculating average nucleotide identity (ANI) using sparse approximate alignments. skani is more accurate than FastANI for comparing incomplete, fragmented MAGs while also being > 20 times faster. For searching a database of > 65,000 prokaryotic genomes, skani takes only seconds per query and 6 GB of memory. skani is a versatile tool that unlocks higher-resolution insights for larger, noisier metagenomic data sets.

B-181: A METAGENOMIC SURVEY OF SOIL MICROBIAL COMMUNITIES ASSOCIATED WITH THE SUNGBO EREDO MONUMENT, SOUTH WESTERN, NIGERIA
Track: MICROBIOME
  • Adesola Ajayi, Augustine University, Nigeria
  • Adetolu Amusan, Augustine University, Nigeria
  • Marion Adebiyi, Landmark University, Nigeria
  • Grace Olasehinde, Covenant University, Nigeria


Available studies suggest that less than one percent of microorganisms present on earth are culturable. This indicates that our current view of the structural and functional diversity of the microbial world is severely limited. This study explored unculturable organisms from soil samples associated with the Sungbo Eredo monument. The Sungbo Eredo monument is an ancient monumental public work with a system of defensive walls and ditches surrounding the Ijebu Kingdom in southwestern Nigeria. Previous reports indicate that it is approximately 180 km in length. A large section of the monument cuts through the campus of Augustine University, Ilara Epe, Nigeria, forming two-sided vertical walls with a deep ridge in between. Our preliminary studies revealed species of Bacillus, Staphylococcus, Cellobiococcus and Micrococcus, and fungal isolates belonging to the genera Saccharomyces, Aspergillus and Botrytis. For the metagenomic survey, DNA was isolated from the soil samples. Paired-end library sequencing (Illumina NextSeq 500) was used; the reads were assembled and coding DNA sequences (CDS) were identified in combination with the NCBI BLAST reference sequences. This paper provides the raw data and assembly (reads and contigs), followed by an initial functional and taxonomic analysis, as a baseline for further studies on the Sungbo Eredo monument.

B-182: Compendium of specialized metabolite biosynthetic diversity encoded in bacterial genomes
Track: MICROBIOME
  • Athina Gavriilidou, Eberhard Karls University of Tübingen, Germany
  • Satria A. Kautsar, Joint Genome Institute, United States
  • Nestor Zaburannyi, Helmholtz Institute for Pharmaceutical Research Saarland, Germany
  • Daniel Krug, Helmholtz Institute for Pharmaceutical Research Saarland, Germany
  • Rolf Müller, Helmholtz Institute for Pharmaceutical Research Saarland, Germany
  • Marnix H. Medema, Helmholtz Institute for Pharmaceutical Research Saarland, Germany
  • Nadine Ziemert, Eberhard Karls University of Tübingen, Germany


The emergence of antibiotic-resistant pathogens is on the rise, making our need for new antimicrobial compounds dire. Bacterial secondary metabolites have been the focus of discovery efforts for decades now. Even after the introduction of bioinformatics methods such as targeted genome mining for sequences related to biosynthesis, the identification rate of new structures has been decelerating. It is disputable if natural sources of new compounds are still abundant and whether discovery efforts should be focused on established antibiotic producers or rather on understudied taxa. In our project, we applied scalable algorithms to answer these questions on a kingdom scale. We analysed ~170,000 bacterial genomes and more than 45,000 Metagenome Assembled Genomes (MAGs). We have concluded that only 3% of the genomic potential for natural products is currently accessible to us. We studied the association between phylogenetic distance and diversification of the producer’s genetic potential for useful compounds. Our calculations connect the emergence of most biosynthetic diversity in evolutionary history close to the taxonomic rank of genus. Finally, our comparative analyses establish Streptomyces as by far the most biosynthetically diverse taxon and feature multiple gifted but less researched bacteria.

B-183: Microbiome profiling from Fecal Immunochemical Test reveals microbial signatures with potential for Colorectal Cancer screening
Track: MICROBIOME
  • Olfat Khannous-Lleiffe, Barcelona Supercomputing Center (BSC-CNS) and Institute for Research in Biomedicine (IRB), Spain
  • Jesse R. Willis, Barcelona Supercomputing Center (BSC-CNS) and Institute for Research in Biomedicine (IRB), Spain
  • Ester Saus, Barcelona Supercomputing Center (BSC-CNS) and Institute for Research in Biomedicine (IRB), Spain
  • Víctor Moreno, Catalan Institute of Oncology, Spain
  • Sergi Castellví-Bel, Gastroenterology Department, Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Spain
  • Toni Gabaldón, Barcelona Supercomputing Center (BSC-CNS) and Institute for Research in Biomedicine (IRB), Spain


Colorectal cancer (CRC) is the third most common cancer and the second leading cause of cancer deaths worldwide. Early diagnosis of CRC, which saves lives and enables better outcomes, is generally implemented through a two-step population screening approach based on the use of Fecal Immunochemical Test (FIT) followed by colonoscopy if the test is positive. However, the FIT step has a high false positive rate, and there is a need for new predictive biomarkers to better prioritize cases for colonoscopy. Here we used 16S rRNA metabarcoding from FIT positive samples to uncover microbial taxa, taxon co-occurrence and metabolic features significantly associated with different colonoscopy outcomes, underscoring a predictive potential and revealing changes along the path from healthy tissue to carcinoma. Finally, we used machine learning to develop a two-phase classifier which reduces the current false positive rate while maximizing the inclusion of CRC and clinically relevant samples.

B-184: Benchmarking of shotgun metagenomics analysis results obtained with state-of-the-art optimized pipelines on simulated gut microbiome samples
Track: MICROBIOME
  • Marianne Borderes, MaaT Pharma, France
  • Cyrielle Gasc, MaaT Pharma, France
  • Emmanuel Prestat, MaaT Pharma, France
  • Carole Schwintner, MaaT Pharma, France


Shotgun metagenomics analysis results depend strongly on the bioinformatics software, databases and parameters used. To compare the performance of our gut microbiome data analysis and to identify the main drivers impacting the results, we compared the taxonomic and functional analysis results obtained with our gutPrint® proprietary pipeline, MgRunner, with those of state-of-the-art, optimized and routinely used analysis pipelines. Two simulated datasets, one for each type of analysis, were shared with the owners of three distinct pipelines, and standard results were collected. Several complementary and commonly used evaluation metrics were computed. Overall, we did not observe any single pipeline performing best on all metrics and analyses, and benchmarking the functional analysis results was challenging. MgRunner held a good overall position compared to the evaluated pipelines and showed the best F1 score values, demonstrating the best trade-off between precision and recall. As expected, the approach and database used had a strong impact on results and performance, but so did the taxa abundance filter threshold. Ultimately, this study shows the importance of evaluating complete analysis pipelines with optimized parameters using reference taxonomic and functional standard datasets, to properly assess the quality of the gut microbiome profiles obtained and the limits of analysis pipelines.
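
To make the precision/recall trade-off behind the F1 score concrete, here is a minimal Python sketch that scores a predicted set of taxa against the known composition of a simulated sample. It is a presence/absence illustration under assumed inputs, not the benchmark's actual evaluation code; the function name `precision_recall_f1` and the taxon names are hypothetical.

```python
def precision_recall_f1(predicted_taxa, expected_taxa):
    """Score detected taxa against the known (simulated) composition.

    Presence/absence only: abundances are ignored in this sketch.
    """
    predicted = set(predicted_taxa)
    expected = set(expected_taxa)
    true_positives = len(predicted & expected)
    # Precision: fraction of reported taxa that are really in the sample.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of taxa in the sample that were reported.
    recall = true_positives / len(expected) if expected else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical pipeline output versus the simulated ground truth.
    reported = ["taxon_A", "taxon_B", "taxon_C", "taxon_D"]
    truth = ["taxon_A", "taxon_B", "taxon_E"]
    print(precision_recall_f1(reported, truth))
```

Raising an abundance filter threshold generally removes low-abundance false positives (higher precision) at the cost of dropping some true low-abundance taxa (lower recall), which is one way such a threshold can shift the F1 score.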

B-185: A curated data resource of 214K metagenomes for characterization of the global antimicrobial resistome
Track: MICROBIOME
  • Hannah-Marie Martiny, Technical University of Denmark, Denmark
  • Patrick Munk, Technical University of Denmark, Denmark
  • Christian Brinch, Technical University of Denmark, Denmark
  • Frank Aarestrup, Technical University of Denmark, Denmark
  • Thomas N. Petersen, Technical University of Denmark, Denmark


The growing threat of antimicrobial resistance (AMR) calls for new epidemiological surveillance methods, as well as a deeper understanding of how antimicrobial resistance genes (ARGs) have been transmitted around the world. The large pool of sequencing data available in public repositories provides an excellent resource for monitoring the temporal and spatial dissemination of AMR in different ecological settings. However, only a limited number of research groups globally have the computational resources to analyze such data. We retrieved 442 Tbp of sequencing reads from 214,095 metagenomic samples from the European Nucleotide Archive (ENA) and aligned them using a uniform approach against ARGs and 16S/18S rRNA genes. Here, we present the results of this extensive computational analysis and share the counts of reads aligned. Over 6.76 × 10^8 read fragments were assigned to ARGs and 3.21 × 10^9 to rRNA genes, where we observed distinct differences in both the abundance of ARGs and the link between microbiome and resistome compositions across various sampling types. This collection is another step towards establishing global surveillance of AMR and can serve as a resource for further research into the environmental spread and dynamic changes of ARGs.

B-186: PhaBOX: A web server for identifying and characterizing bacteriophage contigs in metagenomic data
Track: MICROBIOME
  • Jiayu Shang, City University of Hong Kong, Hong Kong
  • Cheng Peng, City University of Hong Kong, Hong Kong
  • Herui Liao, City University of Hong Kong, Hong Kong
  • Xubo Tang, City University of Hong Kong, Hong Kong
  • Yanni Sun, City University of Hong Kong, Hong Kong


Bacteriophages (phages) play key roles in regulating the composition and function of the microbiome by infecting their host bacteria. Phages are the most abundant organisms on Earth, and novel phages awaiting discovery constitute a large portion of "viral dark matter". Metagenomic sequencing has become a popular means for phage discovery. However, the lack of integrated and easy-to-use software prevents people from taking full advantage of sequencing data for phage discovery and analysis.

In this work, we developed PhaBOX, a web server that provides comprehensive and convenient phage analysis from metagenomic data. PhaBOX integrates our previously published tools: PhaMer, PhaTYP, PhaGCN, and CHERRY, for phage identification, lifestyle prediction, taxonomy classification, and host prediction, respectively. All these tools combine the strengths of alignment-based strategies and deep learning models to learn different sequence-based features, including protein organizations, sequence homology, and protein-protein associations. These tools outperform the available programs in our rigorous tests on highly diverged phages, short contigs, mock metagenomic data, and real metagenomic data. PhaBOX supports visualization of the essential features for making predictions. With the user-friendly graphical interface, users with or without informatics training can easily use the web server to analyze phages in metagenomic data.

B-187: Identifying plasmid contigs from metagenomic data using Transformer
Track: MICROBIOME
  • Xubo Tang, City University of Hong Kong, Hong Kong
  • Jiayu Shang, City University of Hong Kong, Hong Kong
  • Yongxin Ji, City University of Hong Kong, Hong Kong
  • Yanni Sun, City University of Hong Kong, Hong Kong


Plasmids are mobile genetic elements that carry important accessory genes. Cataloging plasmids is crucial for elucidating their roles in promoting horizontal gene transfer between bacteria. Metagenomic sequencing is the main source for discovering new plasmids. However, it is difficult to detect plasmid contigs, which are often short and have heterogeneous origins. Available tools for plasmid contig detection have limitations. Specifically, alignment-based tools tend to miss diverged plasmids, while learning-based tools often have lower precision.
In this work, we develop a plasmid detection tool, PLASMe, that exploits both alignment and learning-based strategies. Plasmids closely related to the reference plasmids can be easily identified using alignment, while diverged plasmids can be predicted using order-specific Transformer models. By encoding plasmids as a language defined on the protein cluster-based token set, the Transformer can learn the importance of proteins and their correlations through the positional token embedding and attention mechanism. We compared PLASMe and other tools on complete plasmids, plasmid contigs, and contigs assembled from CAMI2 simulated data. PLASMe achieved the highest F1-score. We also tested it on real metagenomic and plasmidome data. The examination of some commonly used marker genes shows that PLASMe exhibits more reliable performance than other tools.

B-188: A systems immunology study to assess the impact of early-life antibiotic exposure and the gut microbiota on infant vaccine immune responses.
Track: MICROBIOME
  • Feargal Ryan, South Australian Health and Medical Research Institute & Flinders University, Australia


Integrating multi-omics data is critical to understanding the role that the gut microbiome plays in regulating host immunity development. In 2017, we established the systems immunology Antibiotics and Immune Responses (AIR) study to understand how early life antibiotic exposure may be impacting responses to scheduled vaccines. The AIR study assessed the impact of neonatal antibiotic exposure on infant vaccine immune responses at 7 and 15 months in a cohort of 255 vaginally-born, healthy, term infants. Neonates directly exposed to antibiotics had significantly lower antibody titres against multiple different vaccine antigens, most notably to polysaccharides in the PCV13 pneumococcal vaccine.
To understand the host-microbe interactions underpinning how antibiotic exposure impacted vaccine responses, we undertook a longitudinal multi-omics analysis integrating shotgun metagenomics and bacterial load measurements (n=409 samples) with whole blood transcriptomics (RNAseq, n=329 samples), multi-parameter immunophenotyping by flow cytometry (n=156) and vaccine antibody responses (n=499). Through this analysis we identified a direct correlation between Bifidobacterium spp. and vaccine responses, which appeared to be mediated by changes in B cell transcriptional signatures pre-vaccination. The AIR study provides a compelling example of the utility of multi-omics microbiota research for understanding how host-microbe interactions shape health.

B-189: Pan-cancer analysis reveals tumor microbiome associations with host molecular aberrations
Track: MICROBIOME
  • Chenchen Ma, Department of Human Cell Biology and Genetics, School of Medicine, Southern University of Science and Technology, China
  • Changxing Su, Department of Human Cell Biology and Genetics, School of Medicine, Southern University of Science and Technology, China
  • Jiaxuan Li, Department of Human Cell Biology and Genetics, School of Medicine, Southern University of Science and Technology, China
  • Shimin Shuai, Department of Human Cell Biology and Genetics, School of Medicine, Southern University of Science and Technology, China


Host-microbiome interaction is known to play a pivotal role in the cancer ecosystem, yet the associations have not been systematically investigated at the pan-cancer and multi-omics levels. Here, we evaluated nearly 10,000 samples across 32 cancer types collected from The Cancer Genome Atlas (TCGA), to investigate the association between the tumor microbiome (taxa, n=1,630) and tumor microenvironment composition (cell types, n=20), epigenome (CpG island methylation, n=30,716), transcriptome (gene expression, n=10,216) and proteome (protein expression, n=193). We identified 836,738 candidate associations between the tumor microbiome and host molecular aberrations across multiple cancers. Besides cancer-specific associations, we also revealed recurrent pan-cancer associations between microbes (Lachnoclostridium, Flammeovirga, Terrabacter and Campylobacter) and immune cells, as well as between microbes (Collimonas and Sutterella) and fibroblasts, which were further validated by cell type estimations derived from pathological images and methylation data. We also identified several potential microbe and gene/protein expression associations mediated by DNA methylation using the sequential mediation analysis. Furthermore, our survival analysis demonstrated that tumor microbes may affect the patient's overall survival and progression-free survival. Finally, a user-friendly web portal, Multi-Omics and Microbiome Associations in Cancer (MOMAC) was constructed for users to explore potential host-microbe interactions in cancer.

B-190: Predicting Incident Heart Failure from the Microbiome: The FINRISK DREAM challenge
Track: MICROBIOME
  • Pande Putu Erawijantari, Department of Computing, Faculty of Technology, University of Turku, Finland
  • Ece Kartal, Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
  • Rajesh Shigdel, Department of Clinical Science, University of Bergen, Bergen, Norway
  • Mike Inouye, Department of Public Health and Primary Care, Cambridge University, Cambridge, United Kingdom
  • Pekka Jousilahti, Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
  • Mohit Jain, Department of Medicine; Department of Pharmacology, UCSD, CA, United States
  • Rob Knight, Center for Microbiome Innovation; Department of Pediatrics; Department of Computer Science & Engineering, UCSD, CA, United States
  • Veikko Saloma, Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
  • Teemu Niiranen, Division of Medicine, Turku University Hospital, Department of Internal Medicine, University of Turku, Finland
  • Aki S. Havulinna, Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Finland
  • Julio Saez-Rodriguez, Institute for Computational Biomedicine, Heidelberg University, Germany; Informatics for Life, Germany
  • Rebecca T Levinson, Institute for Computational Biomedicine, Heidelberg University, Germany; Informatics for Life, Germany
  • Leo Lahti, Department of Computing, Faculty of Technology, University of Turku, Turku, Finland


Heart failure (HF) is a complex clinical syndrome characterized by the heart's inability to meet the body's blood supply needs, which affects approximately 26 million adults globally. Current diagnosis relies heavily on symptoms and clinical history, underscoring the importance of early identification of individuals at risk. In September 2022 we launched the FINRISK Microbiome DREAM challenge (synapse.org/finrisk), which aimed to investigate the potential of gut microbiome compositions (n=5749 taxonomic features) in predicting HF risk in a large population of 7231 Finnish adults (FINRISK 2002, n = 493 incident HF cases). To protect the privacy of individuals, we provided synthetic data that closely mimics the real data. Challenge participants' models were evaluated using Harrell's C and the Hosmer-Lemeshow test, with robust ranking ensured through bootstrap sampling of predicted and true scores. After the evaluations we selected as winners 2 teams whose performances were comparable (Harrell's C statistic: 0.8394 and 0.8351; Hosmer-Lemeshow test: 0.0033 and 0.012, for the two teams respectively). Both teams employed regression methods with different approaches for defining feature importance. The challenge offered a platform for advancing our understanding of the microbiome's role in HF and paves the way for future research to improve HF risk prediction and patient outcomes.
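
For readers unfamiliar with Harrell's C, the following minimal Python sketch shows how the statistic can be computed from follow-up times, event indicators and predicted risk scores. It is a simplified illustration, not the challenge's scoring code: tied follow-up times are ignored, and the function name `harrell_c` and the toy data are assumptions.

```python
def harrell_c(times, events, risk_scores):
    """Simplified Harrell's concordance index.

    times       -- follow-up time for each individual
    events      -- 1 if the event (e.g. incident HF) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk

    A pair (i, j) is comparable when i had an observed event strictly
    before j's follow-up time. The pair is concordant when the model
    gave i the higher risk; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored individual cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: four individuals, the third one censored.
    print(harrell_c([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of individuals by risk, which gives context for the winning scores of about 0.84 reported above.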

B-191: Redefining the definition of microbiome health
Track: MICROBIOME
  • Kinga Zielińska, Malopolska Centre of Biotechnology, Jagiellonian University, Poland
  • Dagmara Błaszczyk, Malopolska Centre of Biotechnology, Jagiellonian University, Poland
  • Krzysztof Mnich, University of Białystok, Computational Centre, Bialystok, Konstantego Ciołkowskiego 1M, Poland
  • Witold Wydmański, Malopolska Centre of Biotechnology, Jagiellonian University, Poland
  • Valentyn Bezshapkin, Malopolska Centre of Biotechnology, Jagiellonian University, Poland
  • Tomasz Kościółek, Malopolska Centre of Biotechnology, Jagiellonian University, Poland
  • Witold Rudnicki, University of Białystok, Computational Centre, Bialystok, Konstantego Ciołkowskiego 1M, Poland
  • Paweł P Łabaj, Malopolska Centre of Biotechnology, Jagiellonian University, Poland


The ability to evaluate one's health status based on a gut sample, independently of clinical diagnosis, has been a focus of numerous human microbiome studies. Current approaches are restricted to measuring alterations in the "core microbiome", a set of taxa most frequent among healthy individuals, as potential signs of dysbiosis. Our analyses of the Human Microbiome Project 2 (HMP2) samples, however, identified functional aspects of the gut microbiome as more conserved among healthy individuals. This aligns well with previous hypotheses that a functional profile can be achieved by various combinations of taxa and their interactions; therefore, a shift towards defining health status based on functional profiles is required. We used our newest deep learning-based pipeline to expand the functional annotation of the HMP2 samples and applied the MultiDimensional Feature Selection (MDFS) algorithm to select top features depending on their relative interactions. This approach leads to a better understanding of the functional profiles of the microbiome by identifying the relative contributions and interactions of the key species that contribute to them. Our goal is to better understand the microbial synergies that influence the functional aspects of the microbiome and, ultimately, to redefine what is currently known about human gut microbiome health.

B-192: ganon2: high performance algorithms to perform up-to-date and scalable metagenomics analysis
Track: MICROBIOME
  • Vitor C. Piro, Freie Universität Berlin, Germany
  • Knut Reinert, Freie Universität Berlin, Germany


The fast growth of public sequence repositories greatly contributes to the success of metagenomics applications. However, they are growing at a much faster pace than the resources to use them. This challenges current methods, which struggle to take full advantage of the massive and fast data generation. We propose a generational leap in performance and usability with ganon2, a sequence classification method for metagenomics. ganon2 indexes large datasets with a small memory footprint while maintaining fast, sensitive and precise classification results. This is possible with the use of the HIBF, minimizers and further improvements. Based on NCBI-RefSeq representative genomes (arc+bac+fun+vir) from 03-2023, ganon2 indices are up to 90% smaller and faster to build than kraken2+bracken2, requiring only 20GB of space and 18 minutes on a standard server. Using CAMI1+2 challenge sets, ganon2 achieved up to 50% higher F1-score and lower L1-norm in taxonomic profiling with similar time performance compared to the fastest state-of-the-art methods. ganon2 packs an extensive list of features and will be released with a comprehensive and continuous benchmark, comparing several tools, datasets, databases, parameters and scenarios. ganon2 will enable the usage of larger reference sets in daily microbiome analysis, improving resolution of results. Most features are already available: https://github.com/pirovc/ganon
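
The minimizers mentioned above reduce the number of k-mers that must be indexed. The following minimal Python sketch illustrates the general idea using lexicographic ordering on a single strand; it does not reproduce ganon2's implementation, which may use hashing and canonical k-mers, and the function name `minimizers` and the parameter values are assumptions for illustration.

```python
def minimizers(sequence, k, w):
    """Return the sorted (position, k-mer) minimizers of a sequence.

    For every window of w consecutive k-mers, the lexicographically
    smallest k-mer is selected (leftmost on ties). Adjacent windows
    often share their minimizer, so far fewer k-mers are kept than
    the sequence contains.
    """
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        # Offset of the smallest k-mer inside this window.
        offset = min(range(w), key=lambda o: window[o])
        selected.add((start + offset, window[offset]))
    return sorted(selected)


if __name__ == "__main__":
    # Six 3-mers in total; the windows of three 3-mers select only three.
    print(minimizers("ACGTACGT", k=3, w=3))
```

Because only the selected k-mers need to be stored, an index built on minimizers can be substantially smaller than one built on all k-mers, at the cost of some sensitivity that depends on the choice of k and w.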

B-193: Augmenting 16S pairwise similarity data using a novel syntenic measure in bacteria
Track: MICROBIOME
  • Vivek Ramanan, Brown University, United States
  • Indra Neil Sarkar, Brown University, United States


Relationships between bacterial taxa have traditionally been defined using 16S rRNA nucleotide similarity. As sequencing technology improves, additional pairwise information on genome sequences can also provide valuable information on genomic relationships. Mapping orthologous gene locations between a pair of genomes, known as synteny, is traditionally implemented in the discovery of new species and has not been systematically applied to bacterial genomes. Using a dataset of 379 unique bacterial genomes from GenBank, we developed and tested a new measure of synteny similarity between a pair of genomes, which we scale onto 16S rRNA distance using covariance matrices. Based on the functions of orthologous genes used (i.e. core genes, antibiotic resistance genes, virulence genes), we observe different topological arrangements of bacterial relationship networks. Applying (1) complete linkage hierarchical clustering and (2) KNN graph structures to syntenic-scaled 16S data resulted in differential clustering at the genus level while preserving general groupings. This indicates that including syntenic relationships into pairwise genome similarity measures can provide more granular relationships for within-genera taxa, particularly in functional contexts related to pathogenicity and antibiotic resistance.

B-194: Beneath the ashes: Soil microbial diversity after a burn
Track: MICROBIOME
  • Siddharth Uppal, University of Wisconsin-Madison, United States
  • Jason Kwan, University of Wisconsin-Madison, United States
  • Thea Whitman, University of Wisconsin-Madison, United States
  • Jamie Woolet, Colorado State University and University of Wisconsin-Madison, United States
  • Muthusubramanian Venkateshwaran, University of Wisconsin - Platteville, United States
  • Christopher Baxter, University of Wisconsin - Platteville, United States
  • Yari B. Johnson, US Army Corps of Engineers and University of Wisconsin - Platteville, United States


Soil’s microbial biodiversity seems to contradict the principle of competitive exclusion, in which one species will eventually dominate others that compete for the same limited resources. The factors contributing to soil’s biodiversity remain in question. We aim to address this by studying changes in soil biodiversity over time following a disturbance. A fire significantly reduces the microbial diversity in soil, which gradually increases in complexity. We analyzed soil samples collected after prescribed burns of restored prairie at four time points to examine the soil microbial community establishment over time.

Analysis of the 16S rRNA data revealed that microbial communities in burned plots differed significantly from the communities in the unburned plots a week and a month after the burn, but the scale of this difference had decreased five months after the burn. Shotgun metagenomic analysis revealed that the relative abundance of spore-forming Firmicutes increased more than that of any other taxon after the burn. Surprisingly, Patescibacteria, which have reduced genomes, were among the genomes that increased most a week after the burn, possibly hinting that the Black Queen hypothesis is at play. Furthermore, our findings suggest that faster growth rates might be contributing to the increased relative abundance of some genomes a week after the burn.

B-195: stRainy: assembly-based metagenomic strain phasing using long reads
Track: MICROBIOME
  • Ekaterina Kazantseva, ITMO University, St. Petersburg, Russia, Russia
  • Ataberk Donmez, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA, United States
  • Mihai Pop, Department of Computer Science, University of Maryland, College Park, MD, USA, United States
  • Mikhail Kolmogorov, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA, United States


Bacterial species in microbial communities are often represented by mixtures of strains. Variation in strain genomes may have important phenotypic effects; however, strain-level deconvolution of microbial communities remains challenging. Short-read approaches can be used to detect small-scale variation between strains, but fail to phase these variants into contiguous haplotypes. Recent advances in long-read metagenomics have resulted in complete de novo assemblies of various bacterial species. However, current assembly approaches often suppress strain-level variation and instead produce a species-level consensus representation. Strain variants are often unevenly distributed, and regions of high and low heterozygosity may interleave in the assembly graph, resulting in tangles. To address this, we developed an algorithm for metagenomic phasing and assembly called stRainy. Our approach takes a sequence graph as input, identifies graph regions that represent collapsed strains, phases them and represents the results in an expanded and simplified assembly graph. We benchmark stRainy using simulated data and mock metagenomic communities with both PacBio HiFi and Oxford Nanopore reads and show that it achieves strain-level deconvolution with high completeness and low error rates compared to other strain assembly and phasing approaches.

B-196: Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis
Track: MICROBIOME
  • Anupam Gautam, University of Tübingen/Max Planck Institute for Biology Tübingen, Germany
  • Daniel H Huson, University of Tübingen, Germany


Presentation Overview:

In microbiome analysis, one main approach is to align metagenomic sequencing reads against a protein reference database, such as NCBI-nr, and then to perform taxonomic and functional binning based on the alignments. This approach is embodied, for example, in the standard DIAMOND+MEGAN analysis pipeline, which first aligns reads against NCBI-nr using DIAMOND and then performs taxonomic and functional binning using MEGAN. Here, we propose the use of the AnnoTree protein database, rather than NCBI-nr, in such alignment-based analyses to determine the prokaryotic content of metagenomic samples. We demonstrate a 2-fold speedup over the usage of the prokaryotic part of NCBI-nr and increased assignment rates, in particular assigning twice as many reads to KEGG. In addition to binning to the NCBI taxonomy, MEGAN now also bins to the GTDB taxonomy.

IMPORTANCE The NCBI-nr database is not explicitly designed for the purpose of microbiome analysis, and its increasing size makes it unwieldy and computationally expensive for this purpose. The AnnoTree protein database is only one-quarter the size of the full NCBI-nr database and is explicitly designed for metagenomic analysis, so it should be supported by alignment-based pipelines.

B-197: Taxonomic and functional diversity of rumen and fecal microbiomes in Nelore cattle linked to diet and production phenotypes
Track: MICROBIOME
  • Liliane Costa Conteville, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil
  • Juliana Virginio Silva, Universidade Federal de São Carlos, Centro de Ciências Biológicas e da Saúde, São Carlos, SP, Brazil
  • Bruno Gabriel Nascimento Andrade, Munster Technological University, Cork, Ireland
  • Adhemar Zerlotini, Embrapa Informática Agropecuária, Campinas, SP, Brazil
  • Gerson Barreto Mourão, Centro de Genômica Funcional, Universidade de São Paulo, Piracicaba, SP, Brazil
  • Luiz Lehmann Coutinho, Centro de Genômica Funcional, Universidade de São Paulo, Piracicaba, SP, Brazil
  • Julio Cesar Pascale Palhares, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil
  • Alexandre Berndt, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil
  • Sergio Raposo Medeiros, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil
  • Luciana Correia de Almeida Regitano, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil


Presentation Overview:

We investigated how ruminal and fecal microbiome composition and functionality relate to diet and production traits in Nelore bulls (Bos indicus) using shotgun metagenomics. Two groups of 26 animals received different dietary treatments: a conventional diet (corn silage, corn, soybean, rumen-protected fat, and urea) or a by-products diet (citrus pulp, corn germ, corn germ oil meal, and peanut shell). The ruminal microbiome showed higher microbial diversity than the fecal microbiome, and significant differences were observed in the abundance of functions related to the metabolism of carbohydrates and proteins. Diet affected the microbial diversity (mainly archaeal), microbial composition and functional profile of both ruminal and fecal microbiomes. In total, 89 microbial genera and functions related to the metabolism of the diets’ components were differentially abundant between the dietary groups. Considering the animals' phenotypes, we identified associations of microbial diversity and genus abundances with increases and decreases in phenotype indexes related to feed efficiency and methane emission. We also identified functions associated with these taxa that may affect these phenotypes. These results may contribute to the development and implementation of strategies to reduce the environmental impact of livestock and improve animal production.

B-198: Assembly and characterization of 916 metagenome-assembled genomes from the gastrointestinal microbiome of Nelore bulls
Track: MICROBIOME
  • Liliane Costa Conteville, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil
  • Leticia Maria Aquino, Universidade Federal de São Carlos, São Carlos, SP, Brazil
  • Juliana Virginio Silva, Universidade Federal de São Carlos, São Carlos, SP, Brazil
  • Bruno Gabriel Andrade, Munster Technological University, Cork, Ireland
  • Adhemar Zerlotini, Embrapa Informática Agropecuária, Campinas, SP, Brazil
  • Gerson Barreto Mourão, Centro de Genômica Funcional, Universidade de São Paulo, Piracicaba, SP, Brazil
  • Luiz Lehmann Coutinho, Centro de Genômica Funcional, Universidade de São Paulo, Piracicaba, SP, Brazil
  • Julio Cesar Pascale Palhares, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil
  • Luciana Correia de Almeida Regitano, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil


Presentation Overview:

Ruminants digest complex dietary components due to the complex microbial communities they harbor in their gastrointestinal tract. However, a large percentage of these microbes are unknown or poorly characterized due to inherent difficulties in cultivation procedures. In this study, based on metagenomic data, we generated 916 bacterial and archaeal genomes assembled from the rumen and feces of 52 Nelore bulls (Bos indicus). Data was assembled, binned, dereplicated and taxonomically assigned using MEGAHIT, MetaBAT2, dRep and GTDB-Tk, respectively. Most bacterial genomes belong to Firmicutes and Bacteroidota and all archaeal genomes belong to the genus Methanobrevibacter. Prokka annotation identified ~2 million coding regions. We searched these regions for CAZymes and antibiotic resistance genes using CAZy and MEGARes databases. Most CAZymes encoded glycoside hydrolases and glycosyltransferases whose families are associated with the metabolism of components present in the two diets that were administered to the animals. The resistance genes identified have the potential to confer resistance to antibiotics used in both veterinary and human medicine to combat infectious pathogens. Thus, our dataset substantially improves the coverage of available microbial genomes from ruminants and represents a valuable resource for the investigation of the genetic content and metabolic potential of these organisms.

B-199: Dietary impacts on rumen metatranscriptome of Nelore cattle
Track: MICROBIOME
  • Juliana Virginio da Silva, Universidade Federal de São Carlos, São Carlos, SP, Brazil
  • Liliane Costa Conteville, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil
  • Bruno Gabriel Nascimento Andrade, Munster Technological University, Cork, Ireland
  • Adhemar Zerlotini, Embrapa Informática Agropecuária, Campinas, SP, Brazil
  • Gerson Barreto Mourão, Universidade de São Paulo, Piracicaba, SP, Brazil
  • Luiz Lehmann Coutinho, Universidade de São Paulo, Piracicaba, SP, Brazil
  • Julio Cesar Pascale Palhares, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil
  • Luciana Correia de Almeida Regitano, Embrapa Pecuária Sudeste, São Carlos, SP, Brazil


Presentation Overview:

Metagenomics and metatranscriptomics have been applied to investigate alterations in the composition and functionality of the ruminal microbiome induced by diet. However, few studies integrate these omics to compare the impact of different diets in beef cattle. By integrating metatranscriptome data and 447 previously constructed metagenome-assembled genomes (MAGs), we analyzed differences in the expression profile of the ruminal microbiome of two groups of 26 Nelore bulls that received different dietary treatments: conventional or by-products. Coding regions of the MAGs were predicted using Prokka and functionally annotated using KofamKOALA. Kallisto was used to map metatranscriptome reads against the MAGs. The data were normalized and analyzed using R. About 960,000 coding regions were obtained, and around 884,000 had hits against the metatranscriptome. A total of 14,897 genes with known functions were differentially expressed between the diets (8,456 for by-products and 6,531 for conventional) according to the Wilcoxon paired test. Briefly, our work showed that different diets can cause variations in the functionality of the ruminal microbiome, mainly by affecting the expression levels of genes with functions related to carbohydrate, energy and lipid metabolism, and that these differences can be observed in several bacterial phyla. Future work will associate these results with feed efficiency and methane emissions.

B-200: CRISPR-resolved virus-host interactions in a municipal landfill include non-specific viruses, hyper-targeted viral populations, and interviral conflicts
Track: MICROBIOME
  • Nikhil George, University of Waterloo, Canada
  • Laura Hug, University of Waterloo, Canada


Presentation Overview:

Viruses of microbes are the most abundant biological entities on the planet and impact microbial community structure and ecosystem services. Viruses are underrepresented in reference databases, notably for those identified from engineered systems. We examined host-virus interactions via host CRISPR spacer to viral protospacer mapping in a municipal landfill across two years. ~4% of our sequenced metagenomes were of viral origin. A total of 458 unique virus-host connections captured hyper-targeted viral populations and host CRISPR array adaptation over time. Hyper-targeted viral elements showed a relative depletion in Chi sites which we predict influences host CRISPR-Cas systems’ disproportionate recruitment of spacers from these elements. Four bacteriophages were predicted to infect across multiple phyla, suggesting that some bacteriophages are far less host-specific than is currently understood. We detected 161 viral elements that encode CRISPR arrays, including one with 187 spacers, the longest virally-encoded CRISPR array described to date. Virally-encoded CRISPR arrays targeted other viral elements in interviral conflicts. CRISPR-encoding proviruses within host chromosomes were latent examples of CRISPR-immunity-based superinfection exclusion. Our networks highlight virus-host interactions that are rare or have not previously been described, suggesting that landfills, as heterogeneous contaminated sites with unique selective pressures, are key locations for atypical virus-host dynamics.
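The spacer-to-protospacer mapping described above can be illustrated with a minimal sketch. It assumes exact matching of a host spacer (or its reverse complement) against viral contigs; the abstract does not state the matching criteria or tools used, and the sequences and names below are invented toy data.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def map_spacers(spacers, viral_contigs):
    """Link hosts to viruses when a host spacer occurs in a viral contig.

    spacers: dict mapping host name -> list of CRISPR spacer sequences
    viral_contigs: dict mapping virus name -> contig sequence
    Returns a set of (host, virus) pairs. Exact matching on either strand
    is an assumption made for this sketch.
    """
    links = set()
    for host, host_spacers in spacers.items():
        for spacer in host_spacers:
            forward = spacer.upper()
            reverse = reverse_complement(forward)
            for virus, contig in viral_contigs.items():
                contig = contig.upper()
                if forward in contig or reverse in contig:
                    links.add((host, virus))
    return links


# Hypothetical toy data, not taken from the study.
spacers = {"host_A": ["ACGTTGCA", "GGGTTTAA"], "host_B": ["TTTTCCCC"]}
viral_contigs = {"virus_1": "CCACGTTGCATT", "virus_2": "AAGGGGAAAATT"}

links = map_spacers(spacers, viral_contigs)
print(sorted(links))
```

In this toy example host_B is linked to virus_2 only through the reverse complement of its spacer, which is why both strands are checked.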

B-201: A Multi-Omics Approach for Inflammatory Bowel Disease Biomarkers Detection
Track: MICROBIOME
  • Mehdi Auberson, Philip Morris International R&D, Switzerland
  • Giuseppe Lo Sasso, Philip Morris International R&D, Switzerland
  • Nikolai Ivanov, Philip Morris International R&D, Switzerland
  • Björn Titz, Philip Morris International R&D, Switzerland
  • Nicolas Sierro, Philip Morris International R&D, Switzerland
  • Emmanuel Guedj, Philip Morris International R&D, Switzerland
  • Remi Dulize, Philip Morris International R&D, Switzerland
  • David Bornand, Philip Morris International R&D, Switzerland
  • Dariusz Peric, Philip Morris International R&D, Switzerland
  • Lusine Khachatryan, Philip Morris International R&D, Switzerland
  • Oksana Lavrynenko, Philip Morris International R&D, Switzerland
  • Catherine Nury, Philip Morris International R&D, Switzerland
  • Sophie Dijon, Philip Morris International R&D, Switzerland
  • Aziz Fennouri, Philip Morris International R&D, Switzerland
  • Matthieu Porchet, Philip Morris International R&D, Switzerland
  • James Battey, Philip Morris International R&D, Switzerland
  • Csaba Laszlo, Philip Morris International R&D, Switzerland
  • Yang Xiang, Philip Morris International R&D, Switzerland


Presentation Overview:

Inflammatory bowel disease (IBD) refers to a group of chronic idiopathic immune-mediated diseases driven by many factors: genetic susceptibility, eating habits, and the intestinal microbiome. Disease development consists of many interconnected biological processes. A systematic and comprehensive analysis of the molecular layers (proteins, lipids, and small molecules) and their interactions is therefore required for an overview of the disease’s origin and progression. The standard Single-Omics approaches offer incomplete perspectives on a biological system, making it impossible to understand the scope of pathogenic processes. Combining the data from many -omics fields (Multi-Omics Approach) yields a comprehensive and detailed analysis that gives clearer insights into biological mechanisms and facilitates the development of well-targeted and efficient therapies.
We implemented two Multi-Omics data integration strategies searching for molecular biomarkers and compared the findings with results from different Single-Omics approaches. We integrated host proteomics, lipidomics, metabolomics, metagenomics, and metatranscriptomics data originating from patients with different types of IBD and healthy controls. Our results indicate that incorporating Multi-Omics is beneficial for IBD sample classification compared to each of the Single-Omics approaches. Identified Multi-Omics disease-related signatures were used to establish a correlation Multi-Omics network, thus highlighting the biological processes possibly connected to disease status.

B-202: Does imprinting play a role in the onset of early childhood obesity?
Track: MICROBIOME
  • Muhammad Arshad, New York University Abu Dhabi, United Arab Emirates
  • Yvonne Valles, New York University Abu Dhabi, United Arab Emirates
  • Nizar Drou, New York University Abu Dhabi, United Arab Emirates
  • Kristin Gunsalus, New York University Abu Dhabi, United Arab Emirates
  • Raghib Ali, New York University Abu Dhabi, United Arab Emirates
  • Abdishakur Abdulle, New York University Abu Dhabi, United Arab Emirates


Presentation Overview:

Maternal imprinting is a powerful mechanism by which the mother’s genotype determines the phenotype of her offspring independently of the offspring’s own genotype. Whether imprinting extends beyond the mother’s genetic makeup to include the mother’s hologenome is still not known, particularly in obesity, where the gut microbiome has been shown to differ significantly between obese and non-obese (“normoweight”) individuals. To address this gap, we examined whether the mother’s pre-pregnancy weight status “imprints” the fetus, priming the risk of early childhood obesity onset. Using shotgun sequencing, we analyzed stool samples (n=206) taken at five time points (at birth, 2, 4, 6 and 12 months) throughout the first year of life from infants born to obese (Case, n=23) and normoweight mothers (Control, n=23). We found that the mothers’ pre-pregnancy BMI status did not significantly influence the infants’ gut microbiome diversity and that, although the Firmicutes/Bacteroidetes ratio, a recognised obesity biomarker, was significantly higher in infants born to obese mothers (Figure 1), the trend was mainly driven by the introduction of solid foods. Our preliminary results suggest that offspring are not imprinted with the mother’s hologenome, and that maternal imprinting mechanisms in the context of obesity need to be investigated further.

B-203: Three longitudinal regimes of the human gut microbiome
Track: MICROBIOME
  • Zuzanna Karwowska, Jagiellonian University MCB, Poland
  • Paweł Szczerbiak, Jagiellonian University MCB, Poland
  • Tomasz Kościółek, Jagiellonian University MCB, Poland


Presentation Overview:

Despite the majority of microbiome studies being cross-sectional, it is widely acknowledged that the microbiome is a dynamic ecosystem.
Here, we analyse how the gut microbiome changes over time as a community, how different bacterial species behave over time, and whether there are clusters of bacteria that exhibit similar fluctuations.
We show that a healthy human gut microbiome is stationary, seasonal, and non-random. Moreover, we demonstrate that it is self-explanatory to some extent, and its behavior can be predicted.
The analysis of individual bacterial species uncovered the existence of three distinct longitudinal regimes in the healthy human gut microbiome. These regimes consist of 1) stationary and highly prevalent bacteria that exhibit resistance to environmental changes; 2) volatile bacteria that exhibit dynamic reactions to external stimuli, causing their presence to fluctuate over time; and 3) white noise. Clustering analysis revealed the presence of taxonomically diverse bacterial groups that exhibit similar fluctuations over time.
In conclusion, our study highlights the importance of longitudinal data and provides new insights into the dynamics of the healthy human gut microbiome. We offer clear guidelines for clinicians and statisticians who conduct longitudinal studies and develop models to predict the behavior of the gut microbiome over time.

B-204: Towards improved biome understanding and classification - exploiting intra-community synergies
Track: MICROBIOME
  • Dagmara Błaszczyk, Jagiellonian University, Malopolska Centre of Biotechnology, Krakow, Gronostajowa 7a, Poland
  • Witold Wydmański, Jagiellonian University, Poland
  • Krzysztof Mnich, University of Białystok, Poland
  • Kinga Zielinska, Malopolska Centre of Biotechnology, Jagiellonian University, Poland
  • Valentyn Bezshapkin, Małopolska Centre of Biotechnology, Poland
  • Michał Kowalski, Jagiellonian University, Poland
  • Alina Frolova, The Institute of Molecular Biology and Genetics of NASU, Ukraine
  • Renata Zbieć-Piekarska, Central Forensic Laboratory of the Police, Poland
  • Wojciech Branicki, Jagiellonian University, Malopolska Centre of Biotechnology, Krakow, Gronostajowa 7a, Poland
  • Witold Rudnicki, Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Poland
  • Paweł P. Łabaj, Małopolska Centre of Biotechnology of Jagiellonian University, Poland


Presentation Overview:

Microbiota research is increasingly focused on exposome factors and their relation to metadata. This allows microorganisms specific to ecological niches to be found. However, most studies examine the microorganisms found in a sample as separate features, which does not give the whole picture of interactions between microorganisms in ecological niches. Our project aims to discover the synergies between microorganisms and examine their impact on the classification of samples into one of the established Polish microclimate clusters.
In our study, we use 240 soil samples collected from different locations in Poland. The sampling locations were selected based on climate characteristics supported by over 20 years of historical weather data and represent three different Polish microclimate clusters.
We use MDFS (MultiDimensional Feature Selection) (Mnich & Rudnicki, 2020), which is based on mutual information, to reveal synergies between microorganisms in the corresponding climate niches. This further allows us to investigate how exploiting microbial synergies affects the classification of samples into specific climate clusters.
The first results confirm the existence of microclimate-related and location-specific microbial communities. By discovering synergies, we are able to reduce the number of features while maintaining a similar level of classification performance.
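To make the notion of an information-theoretic synergy concrete, the sketch below computes how much more two discrete features say about a class label jointly than they do separately. It is a toy illustration only, not the MDFS package or its API; the function names and the presence/absence data are assumptions introduced here.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for paired discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def synergy(x1, x2, y):
    """Joint information about y minus the sum of individual contributions."""
    joint = mutual_information(list(zip(x1, x2)), y)
    return joint - mutual_information(x1, y) - mutual_information(x2, y)


# Hypothetical presence/absence of two taxa across four samples; the
# cluster label depends on the pair (XOR), not on either taxon alone.
taxon_a = [0, 0, 1, 1]
taxon_b = [0, 1, 0, 1]
cluster = [a ^ b for a, b in zip(taxon_a, taxon_b)]

print(mutual_information(taxon_a, cluster))  # information from taxon_a alone
print(synergy(taxon_a, taxon_b, cluster))    # extra information from the pair
```

Here neither taxon is informative on its own, yet the pair determines the label completely, which is the kind of relationship that testing features one at a time would miss.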

B-205: Metagenomic Binning using Graph Neural Networks
Track: MICROBIOME
  • Hansheng Xue, School of Computing, Australian National University, Australia
  • Vijini Mallawaarachchi, Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Australia
  • Yu Lin, School of Computing, Australian National University, Australia
  • Lexing Xie, School of Computing, Australian National University, Australia
  • Vaibhav Rajan, School of Computing, National University of Singapore, Singapore


Presentation Overview:

Metagenomics studies genomic material derived from the mixed microbial communities found in diverse environments and has significant implications for both human health and environmental sustainability. Metagenomic binning refers to the clustering of genomic subsequences obtained from high-throughput DNA sequencing into distinct bins, each corresponding to a constituent organism in the community. In contrast to earlier methods that use the composition and abundance of sequences, certain graph-based binning tools have been proposed that leverage homophily information from the assembly graph. However, the binning problem is exacerbated by fragment-level assembly graphs, heterophilous constraints from single-copy marker genes, an unknown number of bins, and a skewed bin size distribution. In this paper, we formulate metagenomic binning as a combination of graph learning and constraint satisfaction problems and design a reference-free binning tool, NeuroBin, which involves (i) a graph neural network model to learn the fragment-level assembly graph while respecting constraints, (ii) a constrained-contigs matching algorithm to generate initial bins with an accurate count, (iii) a neural network-based combinatorial optimization model to minimize the constraints violated by binning, and (iv) a local refinement strategy to adjust binning results. Extensive experiments conducted on simulated datasets demonstrate that NeuroBin surpasses the state-of-the-art binning methods based on the assembly graph.
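The heterophilous constraint from single-copy marker genes can be sketched as follows: two contigs carrying the same single-copy marker gene should not share a bin, so a binning can be scored by counting such pairs. This is an illustrative assumption about how violations might be counted, not NeuroBin's implementation; the contig names and marker genes are invented.

```python
from itertools import combinations


def count_violations(bins, marker_genes):
    """Count contig pairs in the same bin that share a single-copy marker gene.

    bins: dict mapping contig name -> bin identifier
    marker_genes: dict mapping contig name -> set of marker genes found on it
    """
    violations = 0
    for c1, c2 in combinations(sorted(bins), 2):
        same_bin = bins[c1] == bins[c2]
        shared = marker_genes.get(c1, set()) & marker_genes.get(c2, set())
        if same_bin and shared:
            violations += 1
    return violations


# Hypothetical toy input.
marker_genes = {"c1": {"rpoB"}, "c2": {"rpoB", "recA"}, "c3": {"recA"}}
initial = {"c1": 0, "c2": 0, "c3": 1, "c4": 1}
refined = {"c1": 0, "c2": 2, "c3": 1, "c4": 1}

print(count_violations(initial, marker_genes))
print(count_violations(refined, marker_genes))
```

Moving c2 into its own bin removes the conflict with c1 without creating a new one with c3, which is the kind of adjustment an optimization or refinement step would search for.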

B-206: Unraveling Diet-Related Heterogeneous Microbial Interactions with NEGMoE: A Nutrition-Aware Graphical Mixture of Experts Model
Track: MICROBIOME
  • Xiangnan Xu, Humboldt-Universität zu Berlin, Germany
  • Marco Simnacher, Humboldt-Universität zu Berlin, Germany
  • Michal Lubomski, Royal North Shore Hospital, Australia
  • Ryan Davis, The University of Sydney, Australia
  • Carolyn Sue, The University of Sydney, Australia
  • Jean Yang, The University of Sydney, Australia
  • Samuel Muller, Macquarie University, Australia
  • Sonja Greven, Humboldt-Universität zu Berlin, Germany


Presentation Overview:

The gut microbiome is known to play a crucial role in human health and is influenced by various factors, particularly diet. However, the relationship between diet and the gut microbiome is complex and heterogeneous, as different diets provide different sources of energy, which can affect not only the abundance of microbial taxa but also the relationships among them. A better understanding of these complex diet-microbiome interactions holds promise for the development of personalized nutrition.

To uncover such diet-related heterogeneous microbial interactions, we propose a novel method, the Nutrition-Ecotype Graphical Mixture of Experts (NEGMoE), which models microbial co-abundance networks and accounts for diet-specific cohort variability via a mixture-of-experts model with a graphical lasso penalty.

We applied NEGMoE to real-world microbiome Parkinson’s disease (PD) datasets. Two subcohorts with different energy and fiber intakes were identified. We observed differential correlation structures among the microbiome within these two diet subcohorts. The correlation between the short-chain fatty acid-producing genera Faecalibacterium and Blautia increased in the high-fiber group, while the correlation between Bifidobacterium and Ruminococcus decreased in this group. These taxa have been shown to be important in PD. Our results further demonstrate that dietary fiber can influence interactions among them.

B-207: Axolotl: A Scalable Apache Spark-based Library for High-throughput Genomic Data Analysis
Track: MICROBIOME
  • Satria Kautsar, DOE Joint Genome Institute, Lawrence Berkeley National Lab, US, United States
  • Harrison Ho, School of Natural Sciences, University of California at Merced, US, United States
  • Zhong Wang, DOE Joint Genome Institute, Lawrence Berkeley National Lab, US, United States


Presentation Overview:

Next-generation sequencing has substantially increased genomic data volume and complexity, often exceeding terabytes in size. Traditional bioinformatic tools, designed for single computer operations, struggle to cope with these datasets. Despite the emergence of parallel frameworks like Apache Spark, Dask, Polars, and Ray, their application to genomic problems remains limited.

We introduce Axolotl, a scalable library built on Apache Spark, specifically designed for large-scale genomic data analysis. Axolotl creates genomics-specific function modules, enabling biologists to utilize Python for distributed computing environments. Users can harness Spark's built-in SQL and Machine Learning libraries for scalable bioinformatics analysis. We present two distinct use cases: a global-scale examination of over 1.5 million biosynthetic gene clusters (BGCs) and a distributed batch computation of polygenic risk scores (PRS). Axolotl efficiently processes these datasets in parallel using 32+ 16-core compute nodes, virtually combining the power of a 512-core, 4TB RAM machine, with entire analysis pipelines implemented in fewer than 50 lines of code.

Our findings highlight Axolotl's potential to revolutionize how researchers tackle large genomic data sets, enabling swift, scalable, and accurate analyses across a broad spectrum of omics applications.
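For readers unfamiliar with the second use case, a polygenic risk score is a weighted sum of allele dosages, and a batch computation applies it to groups of samples independently. The sketch below shows this in plain Python; it does not use Axolotl or Spark, and the variant identifiers, effect sizes and helper names are hypothetical.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Sum of effect size x allele dosage over variants with a known effect."""
    return sum(effect_sizes[v] * d for v, d in dosages.items() if v in effect_sizes)


def score_batch(batch, effect_sizes):
    """Score every sample in one batch; batches are independent of each other."""
    return {sample: polygenic_risk_score(d, effect_sizes) for sample, d in batch.items()}


def split_into_batches(cohort, size):
    """Yield successive dicts holding at most `size` samples each."""
    items = list(cohort.items())
    for start in range(0, len(items), size):
        yield dict(items[start:start + size])


# Hypothetical effect sizes and genotype dosages (0, 1 or 2 copies).
effect_sizes = {"rs1": 0.5, "rs2": -0.25, "rs3": 1.0}
cohort = {
    "sample_1": {"rs1": 2, "rs2": 1, "rs3": 0},
    "sample_2": {"rs1": 0, "rs2": 2, "rs3": 1},
    "sample_3": {"rs1": 1, "rs4": 2},  # rs4 has no effect size and is skipped
}

scores = {}
for batch in split_into_batches(cohort, size=2):
    scores.update(score_batch(batch, effect_sizes))
print(scores)
```

Because each batch is scored without reference to the others, the loop body is the part a distributed framework could run on separate workers.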

B-208: Interpretable two-stage predictive modeling of compositional sequencing data
Track: MICROBIOME
  • Minh Viet Tran, Ludwig Maximilian University Munich, Munich Center for Machine Learning (MCML), Helmholtz Munich, Germany
  • Christian L. Müller, Ludwig Maximilian University Munich, Helmholtz Munich, Munich Center for Machine Learning (MCML), Flatiron Institute, Germany


Presentation Overview:

Robust, interpretable, and reproducible prediction of environmental or host characteristics from associated sparse high-dimensional compositional high-throughput amplicon sequencing, metagenomic sequencing, or single-cell RNASeq cell type data remains a considerable statistical challenge in microbiome and single cell research.

Here, we propose a two-stage classification and regression workflow that (i) incorporates a priori known hierarchical grouping of the microbial or cell type features and (ii) explicitly accounts for available non-compositional covariates. The first stage models compositional predictors via a generalized log-contrast formulation that allows a task-specific tree-guided aggregation of the microbial taxa or human cell types. The second stage further simplifies the first-stage model by applying sparsity-inducing penalties to log-ratio pairs of aggregated variables, improving both predictability and interpretability of the models.

We show that, on real-world benchmarks, our proposed framework achieves comparable performance with state-of-the-art methods on various tasks, while simultaneously delivering sparse interpretable models. Our analysis on two Irritable Bowel Syndrome (IBS) datasets also revealed a fine-grained log ratio at the Phylum to Order level that can serve as a novel microbial biomarker for IBS.

Overall, this workflow contributes to the growing toolbox for analyzing microbial and single-cell count data.
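As background on log-contrast formulations in general, a log-contrast is a linear combination of log-abundances whose coefficients sum to zero, which makes it insensitive to sequencing depth. The sketch below checks that property on toy numbers; it is not the authors' workflow, and the counts and coefficients are assumptions.

```python
from math import log


def log_contrast(composition, coefficients):
    """Linear combination of log-abundances with zero-sum coefficients."""
    if abs(sum(coefficients)) > 1e-12:
        raise ValueError("log-contrast coefficients must sum to zero")
    return sum(b * log(x) for b, x in zip(coefficients, composition))


# Hypothetical counts for three taxa and a zero-sum coefficient vector.
counts = [10, 30, 60]
coefficients = [1.0, -0.5, -0.5]

raw = log_contrast(counts, coefficients)
relative = log_contrast([c / sum(counts) for c in counts], coefficients)

# The two values agree up to floating-point error: rescaling the
# composition does not change a zero-sum log-contrast.
print(raw, relative)
```

A log-ratio of two aggregated variables is the special case with coefficients +1 and -1.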

B-209: Deciphering the secondary metabolism of the human gut microbiome
Track: MICROBIOME
  • Martin Larralde, Structural and Computational Biology Unit, European Molecular Biology Laboratory, Germany
  • Laura Carroll, Department of Clinical Microbiology, SciLifeLab, Umeå University, Sweden
  • Jonas Simon Fleck, Institute of Human Biology (IHB), Roche Innovation Center Basel, Switzerland
  • Ruby Ponnudurai, Structural and Computational Biology Unit, European Molecular Biology Laboratory, Germany
  • Alessio Milanese, Department of Biology, ETH Zürich, Switzerland
  • Elisa Cappio Barazzone, Department of Health Sciences and Technology, ETH Zürich, Switzerland
  • Georg Zeller, Structural and Computational Biology Unit, European Molecular Biology Laboratory, Germany


Presentation Overview:

Biosynthetic Gene Clusters (BGCs) are regions of co-localized genes encoding the biosynthetic machinery capable of synthesizing a particular secondary metabolite. BGCs found in the gut microbiome are known to produce antibiotics, but can also encode the production of compounds relevant for host health, such as colibactin, a genotoxin contributing to colorectal cancer.

We present three scalable and accurate methods for the computational discovery and analysis of BGCs: GECCO, an accurate and scalable genome mining method for identifying BGCs in (meta-)genomes; HTGCF, a high-throughput clustering pipeline for grouping BGCs into Gene Cluster Families (GCFs), based on sequence and protein-content similarity; and CONCH, a data-driven machine-learning approach for predicting the chemical structure of a BGC compound.

We applied these methods to analyze over 300,000 genomes (including metagenomic assemblies) from human gut bacteria, identifying many BGCs (>400,000 BGCs, 64.5%) undiscovered by existing methods (e.g. antiSMASH). Clustering these BGCs into GCFs revealed that the majority (>65% of GCFs) was confined to a single species. After manually screening BGCs predicted in Bacteroidota, we selected gene clusters with rare biosynthetic features previously unseen in the human gut for further experimental validation.

B-210: Comprehensive benchmarking of differential abundance methods in microbiome data
Track: MICROBIOME
  • Marco Cappellato, University of Padova, Italy
  • Giacomo Baruzzo, University of Padova, Italy
  • Barbara Di Camillo, University of Padova, Italy


Presentation Overview:

Efficient and cost-effective high-throughput DNA sequencing techniques have enhanced the study of complex microbial systems, leading to important conclusions in different fields. Differential abundance (DA) analysis identifies a microbial signature by examining differences in taxa abundances between classes of samples. While many bioinformatics methods have been specifically developed for microbiome data, currently there is no consensus about the best approach.
In this work we performed an extensive benchmarking of 12 widely used DA methods (ALDEx2, eBay, ANCOM-BC, corncob, MaAsLin2, metagenomeSeq, edgeR, DESeq2, ...) across scenarios and covariates not yet investigated by previous studies, such as the combined effect of sample size, percentage of DA taxa, sequencing depth, fold change, variability of taxa, low abundance DA taxa, normalisation and different ecological niches.
We simulated count data with DA features (both in terms of absolute and relative abundances) using the metaSPARSim simulator, taking care to reproduce real data characteristics (e.g. compositionality, sparsity and the taxa intensity-variability relationship).
This paper provides researchers with useful recommendations to properly conduct DA analysis on their own datasets. Moreover, the proposed assessment framework is released in the metaBenchDA R package and through a Docker container, thus representing a robust and reproducible tool for future benchmarking studies.
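Because simulated data come with known DA taxa, the calls of a method can be scored against that ground truth. The abstract does not state which performance measures were used; the sketch below assumes precision and recall, is written in Python rather than taken from the metaBenchDA R package, and uses invented taxon names.

```python
def precision_recall(called, truth):
    """Precision and recall of called DA taxa against simulated ground truth."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: four taxa simulated as DA, three called by a method.
truth = {"taxon_1", "taxon_2", "taxon_3", "taxon_4"}
called = {"taxon_1", "taxon_2", "taxon_5"}

precision, recall = precision_recall(called, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```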