Posters - Schedules

Monday, July 24, between 18:00 CEST and 19:00 CEST
Session A Poster Set-up and Dismantle
Session A Posters set up:
Monday, July 24, between 08:00 CEST and 08:45 CEST
Session A Posters dismantle:
Monday, July 24, at 19:00 CEST
Tuesday, July 25, between 18:00 CEST and 19:00 CEST
Session B Poster Set-up and Dismantle
Session B Posters set up:
Tuesday, July 25, between 08:00 CEST and 08:45 CEST
Session B Posters dismantle:
Tuesday, July 25, at 19:00 CEST
Wednesday, July 26, between 18:00 CEST and 19:00 CEST
Session C Poster Set-up and Dismantle
Session C Posters set up:
Wednesday, July 26, between 08:00 CEST and 08:45 CEST
Session C Posters dismantle:
Wednesday, July 26, at 19:00 CEST
Virtual
Accelerated nanopore basecalling with SLOW5 data format
Track: BOSC
  • Hiruna Samarakoon, Genomics Pillar, Garvan Institute of Medical Research, Sydney, NSW, Australia
  • James M. Ferguson, Genomics Pillar, Garvan Institute of Medical Research, Sydney, NSW, Australia
  • Hasindu Gamaarachchi, Genomics Pillar, Garvan Institute of Medical Research, Sydney, NSW, Australia
  • Ira W. Deveson, Genomics Pillar, Garvan Institute of Medical Research, Sydney, NSW, Australia


Presentation Overview:

Nanopore sequencing is emerging as a key pillar in the genomic technology landscape, but the computational constraints limiting its scalability remain to be overcome. The translation of raw current signal data into DNA or RNA sequence reads, known as ‘basecalling’, is a major bottleneck in any nanopore sequencing workflow. Here, we exploit the advantages of the recently developed signal data format ‘SLOW5’ to streamline and accelerate nanopore basecalling in high-performance computing (HPC) and cloud environments. SLOW5 permits highly efficient sequential data access, eliminating a significant analysis bottleneck. To take advantage of this, we introduce Buttery-eel, an open-source wrapper for Oxford Nanopore’s Guppy basecaller (Guppy) that enables SLOW5 data access, resulting in performance improvements that are essential for scalable, affordable basecalling. Basecalling a realistic human whole-genome sequencing dataset (at ~30X coverage) in FAST5 format with 4 GPUs took a minimum of 13.3 hours (cloud-system) and a maximum of 41.6 hours (distributed-file-system) with Guppy (high accuracy model). In BLOW5 format, basecalled with Buttery-eel, overall runtimes were reduced to ~5 hours on every system, corresponding to 2.7-fold (cloud-system), 3.4-fold (parallel-file-system) and 9.1-fold (distributed-file-system) improvements, respectively.
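As a quick consistency check on the figures quoted in this abstract, the short sketch below divides each stated FAST5 baseline runtime by the stated fold improvement to recover the implied Buttery-eel runtime. The function name is hypothetical and is not part of Buttery-eel or Guppy. The parallel-file-system case is left out because its baseline runtime is not given in the abstract.

```python
def implied_runtime_hours(baseline_hours, fold_improvement):
    """Runtime after a speed-up, given the baseline runtime and the fold improvement."""
    if fold_improvement <= 0:
        raise ValueError("fold improvement must be positive")
    return baseline_hours / fold_improvement


# Figures quoted in the abstract: FAST5 baseline with Guppy, fold improvement with Buttery-eel.
cloud = implied_runtime_hours(13.3, 2.7)
distributed = implied_runtime_hours(41.6, 9.1)

print(f"cloud: {cloud:.1f} h, distributed: {distributed:.1f} h")
```

Both values come out a little under 5 hours, in line with the "~5 hours on every system" reported above.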

BREC 2.0: An even shinier app for recombination rate estimates analysis along genomes
Track: BOSC
  • Yasmine Mansour, University Ferhat Abbas Setif 1, Setif, Algeria
  • Annie Chateau, Laboratory of Computer Science, Robotics and Microelectronics of Montpellier (LIRMM), Montpellier, France
  • Anna-Sophie Fiston-Lavier, Institute of Evolution Science of Montpellier (ISEM), Montpellier, France


Presentation Overview:

Introduction

Here, we propose an updated version of BREC, our previously published R package, which includes a Shiny app. BREC (Boundaries and RECombination rate estimates) is an automated computational tool, based on the Marey map method, that identifies heterochromatin boundaries along chromosomes and estimates local recombination rates.
BREC is non-genome-specific, running even on non-model genomes as long as genetic and physical maps are available. BREC is data-driven. Thus, it includes a two-phased data pre-processing module.
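
To illustrate the idea behind a Marey map, the sketch below treats each marker as a point (physical position in Mb, genetic position in cM) and takes the slope between consecutive markers as a crude local recombination rate in cM/Mb. This is only a finite-difference illustration under that assumption. It is not BREC's implementation, and the function name and toy data are hypothetical.

```python
def local_recombination_rates(physical_mb, genetic_cm):
    """Return (midpoint in Mb, rate in cM/Mb) for each pair of consecutive markers."""
    if len(physical_mb) != len(genetic_cm):
        raise ValueError("physical and genetic maps must have the same length")
    # Order markers along the chromosome by physical position.
    markers = sorted(zip(physical_mb, genetic_cm))
    rates = []
    for (p0, g0), (p1, g1) in zip(markers, markers[1:]):
        if p1 == p0:
            continue  # skip markers that share a physical position
        midpoint = (p0 + p1) / 2
        slope = (g1 - g0) / (p1 - p0)  # slope of the Marey map between the two markers
        rates.append((midpoint, slope))
    return rates


# Toy data, invented for illustration only.
rates = local_recombination_rates([0.0, 1.0, 3.0], [0.0, 2.0, 3.0])
print(rates)
```

In practice, raw slopes between neighbouring markers are noisy, so a real tool would normally smooth or fit the map before differentiating.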

Contribution

We chose to test a first deployment of the BREC Shiny app on the self-service platform https://www.shinyapps.io/. This will give users an install-free alternative with direct online access, improving the user experience and avoiding most of the technical issues related to portability and scalability. This deployment is a work in progress.

Conclusion

We believe our tool has the potential to help bring data science into the service of genome biology and dynamics. We introduce BREC as an open source, free software solution, yielding a fast, easy-to-use, and broadly accessible resource.
BREC 2.0 is a work in progress. More details are available in my thesis manuscript accessible here: https://www.theses.fr/en/2021MONTS116.

FAIR Data Stewardship in Computational Workflows
Track: BOSC
  • Katarzyna Kamieniecka, University of Bradford, United Kingdom
  • Khaled Jumah, University of Bradford, United Kingdom
  • Krzysztof Poterlowicz, University of Bradford, United Kingdom


Presentation Overview:

FAIR data stewardship has laid the foundation for sharing and publishing digital assets, especially data. Applied to computational workflows, the FAIR principles emphasize machine accessibility and require that all digital assets be findable, accessible, interoperable and reusable. Tools and workflows encode how a scientific process occurs and how data are generated. It is therefore important that a workflow platform both supports the creation of FAIR data and itself adheres to FAIR principles. The data stewardship initiative focuses on providing training in FAIR data management.

Here we introduce two types of workflow framework, each with its own approach to development and composition, that ensure data integrity and put reproducibility at the fore: the coarse-grained Galaxy project [https://galaxyproject.org/], which focuses on chaining locally hosted or distributed tools, and the fine-grained Nextflow framework [https://www.nextflow.io/], which focuses on optimising computational resources. To encourage computational reproducibility in research, we will describe our experience of being involved in the Elixir UK Data Stewardship initiative and of putting our findings into use in Galaxy/Nextflow data analyses and training materials development.

FLAN: Federated Learning-based Ancestry Prediction Pipeline
Track: BOSC
  • Aleksandr Medvedev, GENXT, Skolkovo Institute of Science and Technology, United Kingdom
  • Dmitry Kolobkov, GENXT, Vavilov Institute of General Genetics, United Kingdom
  • Satyarth Mishra Sharma, Skolkovo Institute of Science and Technology, Russia
  • Mikhail Lebedev, GENXT, United Kingdom
  • Egor Kosaretskiy, GENXT, United Kingdom
  • Ruslan Vakhitov, GENXT, United Kingdom
  • Pavel Nikonorov, GENXT, United Kingdom


Presentation Overview:

Ancestry prediction has both medical and recreational applications, supporting personalised healthcare decisions and allowing individuals to explore their genetic heritage. Ancestry prediction services use machine learning models trained on reference panels consisting of individuals with verified ancestry. Since creating such panels is expensive, ancestry prediction models are often trained on a limited number of ancestries and underperform on clients of unseen ancestries. Thus, data collaboration between data banks holding diverse reference data sets from different parts of the world may be beneficial. While pooling sensitive genetic data together to train a collaborative model is not possible, federated learning may provide a viable solution. Here, we propose a federated ancestry-from-genotype prediction module. This module consists of a data preparation pipeline, a machine learning model operating on local data, and a model federation interface that trains a model on data split into isolated silos. This structure makes the module highly flexible: different nodes can use different data formats and feature spaces, train different machine learning models, and choose the federation strategy best suited to system requirements and resources. It is available at https://github.com/genxnetwork/flan
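
To make the federation idea concrete, here is a minimal sketch of one common strategy, federated averaging: each node trains locally and shares only its model weights, and the coordinator combines them in proportion to each node's sample count. The abstract does not say which strategy FLAN uses, so this choice, the function name and the toy numbers are all assumptions for illustration and do not reflect FLAN's code.

```python
def federated_average(node_updates):
    """Combine per-node weight vectors, weighting each node by its sample count.

    node_updates: list of (num_samples, weights) pairs, where weights is a list of floats.
    """
    total_samples = sum(n for n, _ in node_updates)
    if total_samples == 0:
        raise ValueError("at least one node must contribute samples")
    dim = len(node_updates[0][1])
    averaged = [0.0] * dim
    for num_samples, weights in node_updates:
        if len(weights) != dim:
            raise ValueError("all nodes must share the same weight dimension")
        for i, value in enumerate(weights):
            averaged[i] += value * num_samples / total_samples
    return averaged


# Two hypothetical nodes: raw genotypes never leave a node, only weights are shared.
global_weights = federated_average([(1, [0.0, 4.0]), (3, [4.0, 0.0])])
print(global_weights)
```

The node with three times as many samples pulls the combined weights three times as strongly, which is the defining property of this averaging scheme.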

GRAPE: Genomic Relatedness Detection Pipeline
Track: BOSC
  • Aleksandr Medvedev, GENXT, United Kingdom
  • Mikhail Lebedev, GENXT, United Kingdom
  • Andrew Ponomarev, GENXT, Ukraine
  • Mikhail Kosaretskiy, Atlas Biomed, United Kingdom
  • Dmitriy Osipenko, Atlas Biomed, United Kingdom
  • Alexander Tischenko, GENXT, United Kingdom
  • Egor Kosaretskiy, GENXT, United Kingdom
  • Hui Wang, GENXT, United States
  • Dmitry Kolobkov, GENXT, United Kingdom
  • Vitalina Chamberlain-Evans, GENXT, University of Cambridge, United Kingdom
  • Ruslan Vakhitov, GENXT, United Kingdom
  • Pavel Nikonorov, GENXT, United Kingdom


Presentation Overview:

Classifying relatedness between individuals has scientific and commercial applications. For instance, unrecognized population structure can lead to false positive results in genome-wide association studies. Accurate relationship classification is also required for genetic linkage analysis to identify disease-associated loci. DNA relative matching services are a leading driver of the direct-to-consumer genetic testing market. GRAPE is an open-source Genomic RelAtedness detection PipelinE. It combines comprehensive QC, preprocessing, IBD (Identity-by-Descent) segment inference and relatedness inference in one end-to-end solution. In the new GRAPE version, we added and tested the RaPID IBD inference tool for phased datasets. Its accuracy is on par with the other tools, and it can process a dataset with 500K samples in 6 hours on 4 CPU cores. We significantly sped up the preprocessing step and added batch support. We also improved the post-processing of IBD segments and relatedness inference so that it can handle tens of millions of IBD segments without using more than 16GB of RAM. Other minor changes include a simulation workflow that creates datasets with up to 500K samples with a single command, and a workflow that removes all relatives from the input file. GRAPE is available from: https://github.com/genxnetwork/grape

PhyloGenes: A web-based tool for plant gene function inference using phylogenetics
Track: BOSC
  • Swapnil Sawant, Phoenix Bioinformatics, United States
  • Tanya Berardini, Phoenix Bioinformatics, United States
  • Trilok Prithvi, Phoenix Bioinformatics, United States
  • Peifen Zhang, Computercraft Corporation (for NCBI), United States


Presentation Overview:

PhyloGenes (phylogenes.org) is a web-based bioinformatics tool that leverages advanced search and indexing technology based on Apache Solr to index and analyze genes and phylogenetic trees of over 8,000 gene families across 50 organisms, including 40 plant species. It integrates experimental and phylogenetically inferred gene function annotations, publications, and sequence alignments from PantherDB, Gene Ontology, and UniProt, and displays them using interactive and efficient visualization tools based on D3.js. By presenting information in a way that visually reflects phylogenetic relationships, PhyloGenes facilitates more effective inference of gene function. By making annotation evidence, sources and other metadata clearly visible and traceable, PhyloGenes will improve the accuracy of inferred gene functions. PhyloGenes enables users to address research questions such as identifying orthologs, predicting unknown gene function, and discovering novel gene families in plants. PhyloGenes contributes to the open source bioinformatics ecosystem by providing a visually intuitive, robust, and transparent solution for gene function inference that could be adapted to other sets of organisms.