In-person Tutorials (All times SAST)

Virtual Tutorials (All times SAST)

Tutorial IP1: Genotype Imputation and Data Analysis for African Populations: A Practical Tutorial Using AfriGen-D Resources

Room: Atlantic 1
Date: April 17, 2025
Time: 13:00-17:00

Organizers
Mamana Mbiyavanga, University of Cape Town
Lyndon Zass, University of Cape Town
Sumir Panji, University of Cape Town
Nicola Mulder, University of Cape Town

Max Participants: 30

Description
The African Genomics Data Hub (AfriGen-D) provides essential resources for analyzing African genetic data, addressing unique challenges posed by the continent's exceptional genetic diversity. This hands-on tutorial focuses on genotype imputation and downstream analysis using AfriGen-D resources.

Through practical exercises, participants will master data quality control specific to African genetic data, execute imputation and basic GWAS workflows, and learn to interpret results using the AfriGen-D Imputation Service, African Genomics Medicine Portal (AGMP), and African Genomics Variation Database (AGVD).

Enhance your African genomics research capabilities with this practical tutorial. Using AfriGen-D resources, learn to prepare data, perform genotype imputation, conduct basic GWAS analysis, and interpret results with tools optimized for African genetic diversity.
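
The quality-control step that precedes imputation can be illustrated with a small sketch. The code below is not part of the AfriGen-D tooling; it only shows the general idea of filtering variants by call rate and minor allele frequency (MAF). The variant names, the genotype encoding (alternate-allele counts 0/1/2, `None` for missing), and the thresholds are assumptions chosen for illustration.

```python
def variant_qc(genotypes):
    """Return (call_rate, maf) for one variant.

    genotypes: per-sample alternate-allele counts (0, 1 or 2), None if missing.
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    if not called:
        return call_rate, 0.0
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid sample
    maf = min(alt_freq, 1 - alt_freq)
    return call_rate, maf


def passes_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Illustrative thresholds only; real pipelines choose their own."""
    call_rate, maf = variant_qc(genotypes)
    return call_rate >= min_call_rate and maf >= min_maf


# Hypothetical toy data: three variants genotyped in ten samples.
variants = {
    "rs_example_1": [0, 1, 2, 1, 0, 0, 1, 0, 0, 1],        # common, fully called
    "rs_example_2": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],        # monomorphic
    "rs_example_3": [0, 1, None, None, 1, 0, 2, 0, 1, 0],  # 80% call rate
}

kept = [name for name, g in variants.items() if passes_qc(g)]
print(kept)
```

Only the first variant survives both filters in this toy data; the monomorphic variant fails the MAF filter and the third fails the call-rate filter.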

Learning Objectives

Materials

Schedule

13:00-13:30 Introduction to AfriGen-D resources and data discovery
13:30-14:30 Data preparation and quality control (hands-on)
14:30-14:45 Break
14:45-15:45 Imputation workflow and monitoring (hands-on)
15:45-16:30 Post-imputation quality assessment and basic GWAS
16:30-17:00 Variant annotation and interpretation using AGMP/AGVD


Tutorial IP2: Simulation-Based Inference for Computational Biology: Integrating AI, Bayesian Modeling, and HPC

Room: Pacific 2
Date: April 17, 2025
Time: 09:00-13:00

Organizers
Alina Bazarova, Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
Jose Ignacio Robledo, Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
Stefan Kesselheim, Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany

Max Participants: 25

Description
This tutorial introduces Simulation-Based Inference (SBI), a framework that combines Bayesian modeling, AI techniques, and high-performance computing (HPC) to address key challenges in computational biology, such as performing reliable inference with limited data by using AI-based approximate Bayesian computation. It also tackles the problem of intractable likelihood functions, which makes Bayesian inference usable for biological systems with multiple sources of stochasticity. The tutorial further demonstrates how to leverage HPC environments to drastically reduce inference runtimes, making it highly relevant for large-scale biological problems.

The tutorial bridges theoretical foundations with hands-on applications in computational biology. Participants will learn to implement SBI frameworks using diverse biological models, such as molecular dynamics simulations, agent-based tumor growth models, count data modeling, and Lotka-Volterra systems. Practical exercises in Jupyter notebooks guide attendees through SBI workflows, from simple coin-flipping examples to more complex biological simulations, ensuring accessibility for participants with varied backgrounds. The inclusion of cutting-edge methods such as Sequential Neural Posterior Estimation, together with the emphasis on parallelization and HPC scalability, aligns closely with the scientific community's focus on innovation in computational biology.

A previous iteration of the tutorial at the Helmholtz AI Conference 2024 received excellent reviews and led to interdisciplinary discussions, highlighting its broad applicability and impact. For this conference, the content has been further refined with additional examples relevant to the community, ensuring it meets the needs of bioinformatics researchers.
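
To make the idea of likelihood-free inference concrete, here is a minimal sketch of the coin-flipping case using classical rejection approximate Bayesian computation. It is not the neural method (Sequential Neural Posterior Estimation) covered in the tutorial, and it is not taken from the tutorial materials; the observed data (7 heads in 10 flips), the uniform prior, and all function names are assumptions for illustration.

```python
import random


def simulate_heads(theta, n_flips, rng):
    """Simulator: number of heads in n_flips tosses of a coin with bias theta."""
    return sum(rng.random() < theta for _ in range(n_flips))


def rejection_abc(observed_heads, n_flips, n_draws, rng):
    """Keep prior draws whose simulated data exactly match the observation."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # draw from a Uniform(0, 1) prior
        if simulate_heads(theta, n_flips, rng) == observed_heads:
            accepted.append(theta)
    return accepted


rng = random.Random(0)  # fixed seed so the run is reproducible
posterior_samples = rejection_abc(observed_heads=7, n_flips=10, n_draws=20000, rng=rng)
posterior_mean = sum(posterior_samples) / len(posterior_samples)
print(len(posterior_samples), round(posterior_mean, 3))
```

For this toy problem the exact posterior is Beta(8, 4), with mean 8/12, so the sample mean should land close to 0.667. The likelihood is never evaluated; only the simulator is called, which is the property that SBI methods exploit for models where the likelihood is intractable.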

Learning Objectives

Intended Audience and Level
This tutorial is designed for researchers working in computational biology and bioinformatics, modeling natural processes, and applying AI or Bayesian inference techniques. It is well-suited for:

The tutorial is intermediate in content level. While no in-depth knowledge of statistical or deep learning methods is required, participants should have basic familiarity with these concepts and experience in using Python. Experience with HPC systems is beneficial but not mandatory. Attendees are required to have a laptop for accessing the HPC system. Individual access accounts will be provided prior to or at the tutorial.


Tutorial IP4: Introductions to constraint-based modeling using cobrapy

Room: Pacific 2
Date: April 17, 2025
Time: 13:00-17:00

Organizers
Alia Benkahla, Institut Pasteur de Tunis
Feryel Guennich, Institut Pasteur de Tunis
Oussema Souiai, Institut Pasteur de Tunis
Emna Harigua-Souiai, Institut Pasteur de Tunis

Max Participants: 25

Description
COBRApy is a user-friendly, open-source Python package that makes constraint-based modeling accessible and convenient to learn. This hands-on workshop, which includes exercises and problem-solving, introduces participants to the technique. By bringing together researchers interested in the development of this type of modeling, the workshop not only teaches a valuable skill but also encourages new collaborations. The practical skills participants acquire can be applied immediately to their research, deepening knowledge and accelerating discoveries.

Intended Audience and Level
Researchers and students with basic Python knowledge interested in applying constraint-based modeling to biological systems.

Materials

Schedule

13:00-13:15 Introduction to Constraint-Based Modeling
  • Briefly explain the core concepts of CBM: stoichiometry, constraints, objective functions, flux balance analysis (FBA)
  • Briefly introduce COBRApy as a powerful and user-friendly Python package for CBM.
13:15-14:00 Introduction to CBM using COBRApy and Working with Models
  • Introduction to Google Colab.
  • Installing COBRApy and its dependencies.
  • Creating a model and understanding basic COBRApy objects (reactions, metabolites, genes):
    • Importing and exploring existing metabolic models (e.g., E. coli core model).
    • Understanding the structure of a COBRApy model object: reactions, metabolites, genes.
    • Basic model manipulation: adding/removing reactions and metabolites.
14:00-15:00 Performing Flux Balance Analysis (FBA)
  • Setting up an FBA problem: defining the objective function (e.g., biomass production).
  • Genome-scale modelling.
  • Studying the model:
    • Inspecting the model's numbers
    • Inspecting the system's boundaries
    • Running a Flux Balance Analysis (FBA).
  • Hands-on exercises.
15:00-15:30 In-silico gene knockouts
  • Single knockout study.
  • Systems-wide knockout study.
15:30-16:00 Working with Experimental Data and Model Integration
  • Discuss how to integrate experimental data (e.g., transcriptomics, metabolomics) with CBM models.
  • Example of integrating gene expression data to constrain model fluxes.
16:00-16:15 Wrap-up
  • Open discussion for questions and troubleshooting.
  • Summary of key concepts and resources for further learning.
  • Potential future directions and advanced topics in CBM.
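
The concepts in the schedule above (stoichiometry, flux bounds, a steady-state constraint, an objective, and knockouts) can be sketched without any solver for the special case of an unbranched pathway. This is a toy illustration in plain Python, not the COBRApy API and not the tutorial's own material; the reaction names, bounds, and helper functions are hypothetical. Genome-scale models need a linear-programming solver, which is what COBRApy provides.

```python
# Toy pathway: uptake -> A -> B -> biomass (all names are hypothetical).
# Rows are internal metabolites, columns are reactions.
reactions = ["uptake", "convert", "biomass"]
S = [
    [1, -1, 0],  # A: produced by uptake, consumed by convert
    [0, 1, -1],  # B: produced by convert, consumed by biomass
]
upper_bounds = {"uptake": 10.0, "convert": 8.0, "biomass": 1000.0}


def is_steady_state(S, v, tol=1e-9):
    """Check the mass-balance constraint S . v = 0 for every metabolite."""
    return all(abs(sum(s * x for s, x in zip(row, v))) <= tol for row in S)


def max_chain_flux(upper_bounds, knocked_out=()):
    """Maximal biomass flux of an unbranched chain.

    Steady state forces every reaction in a chain to carry the same flux,
    so the optimum is the tightest upper bound. A knockout is modelled by
    setting the bound of the affected reaction to zero.
    """
    bounds = {r: (0.0 if r in knocked_out else ub) for r, ub in upper_bounds.items()}
    return min(bounds.values())


wild_type = max_chain_flux(upper_bounds)
fluxes = [wild_type] * len(reactions)
knockout = max_chain_flux(upper_bounds, knocked_out={"convert"})

print(wild_type, is_steady_state(S, fluxes), knockout)
```

In the wild type the `convert` reaction is the bottleneck, so the biomass flux equals its bound; knocking it out drops the achievable flux to zero, which is the pattern a single-knockout study looks for at genome scale.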


Tutorial IP5: Building agentic workflows for bioinformatics

Room: Pacific 1
Date: April 17, 2025
Time: 09:00-17:00

Organizers
Dionizije Fa, Entropic j.d.o.o.
Mateo Čupić, Entropic j.d.o.o.
Bruno Pandža, Entropic j.d.o.o.

Max Participants: 25

Description
An agentic workflow is a process of interacting with Large Language Models (LLMs) to complete complex tasks, allowing practitioners to build pipelines that integrate data retrieval, reasoning, and execution steps. This tutorial will guide participants through the conceptual and practical foundations of setting up their own agentic workflows. By combining prompt engineering techniques, retrieval-augmented generation, tool use, and deployment strategies that safeguard data privacy, participants will learn how to build, deploy, and tune their own personal copilots for use in bioinformatics workflows.

The capabilities of agentic workflows, driven by improving LLMs, are rapidly expanding, while cloud offerings are making these advanced computational tools more accessible than ever before. By integrating agentic workflows into bioinformatics pipelines, practitioners can significantly reduce their time-to-analysis. By lowering the barrier to entry for novices and allowing expert practitioners to scale their work with greater efficiency, these workflows democratize cutting-edge computational methods and help tutorial participants leverage the latest advances in their work and careers. This tutorial will integrate state-of-the-art prompting techniques and retrieval augmentation strategies, add context to model selection, and explore the fundamentals of choosing among the different techniques and current trends.
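
The control flow behind an agentic workflow can be sketched in a few lines: a model decides which tool to call, the tool runs, and the observation is fed back until the model returns an answer. The sketch below is an assumption-laden illustration, not the tutorial's code. In particular, `stub_model` is a hard-coded stand-in for a real LLM call, and the tool names and message format are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


TOOLS = {"gc_content": gc_content, "reverse_complement": reverse_complement}


def stub_model(task, history):
    """Stand-in for an LLM: chooses the next action from the history so far."""
    if not history:
        return {"tool": "reverse_complement", "input": task["sequence"]}
    if len(history) == 1:
        return {"tool": "gc_content", "input": history[-1]["output"]}
    return {
        "final": f"Reverse complement {history[0]['output']} "
                 f"has GC content {history[1]['output']:.2f}"
    }


def run_agent(task, model, tools, max_steps=5):
    """Loop: ask the model, execute the chosen tool, record the observation."""
    history = []
    for _ in range(max_steps):
        decision = model(task, history)
        if "final" in decision:
            return decision["final"], history
        output = tools[decision["tool"]](decision["input"])
        history.append({"tool": decision["tool"], "input": decision["input"], "output": output})
    raise RuntimeError("agent did not finish within max_steps")


answer, history = run_agent({"sequence": "ATGCGC"}, stub_model, TOOLS)
print(answer)
```

Replacing `stub_model` with a function that sends the task and history to an LLM and parses its reply into the same dictionary shape would leave `run_agent` unchanged. The `max_steps` limit is a simple guard against a model that never produces a final answer.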

Learning Objectives

Intended Audience and Level
The tutorial is aimed at bioinformaticians who are beginners in AI and LLMs. However, programming experience in Python and familiarity with command-line tools and bioinformatics software are strongly recommended.

Attendees should have:

Participants will be provided with installation instructions in advance to ensure a smoother experience.

Materials

Download tutorial materials

Schedule

  Introduction & Overview
  • Overview of LLMs and prompt engineering concepts
  • Introduction to agentic workflows
  • Use-cases and examples in bioinformatics to motivate learning
  Foundational Concepts and Setup
  • Environment setup
  • Setting up the software packages and outline of the pipeline
  • Q&A for clarification of key concepts
  Building a Simple Agentic Workflow
  • Step-by-step construction of a basic workflow: prompting, and refining responses
  • Demonstration of an example pipeline using a provided dataset (e.g., sequence data)
  • Discussion on integrating workflows into existing bioinformatics infrastructures
  Advanced Techniques and Troubleshooting
  • Refinement: improving prompt quality, adding more complex tools
  • Handling complex multi-step analyses
  • Troubleshooting common errors and optimizing workflows for performance
  Wrap-up and Future Directions
  • Recap of key takeaways and practical resources
  • Future trends in agentic workflows and integrating emerging tools
  • Open discussion, participant feedback, and next steps for continued learning


Tutorial VT1: Multiomics Data Integration using Graph Based Machine Learning

Date: April 10, 2025
Time: 12:00-16:00

Organizers
Loni Taylor, PMP, CETL, PhD Candidate, Meharry Medical College, Nashville, TN, USA.
Bishnu Sarker, PhD, Assistant Professor of Computer Science and Data Science, Meharry Medical College, Nashville, TN, USA.
Animesh Acharjee, PhD, Assistant Professor, University of Birmingham, UK.

Max Participants: 40

Description
This tutorial introduces participants to the integration of multiomics data from genomics, proteomics, transcriptomics, and metabolomics, focusing on computational approaches to uncover hidden relationships between biological entities. The session will cover techniques such as Non-negative Matrix Factorization (NMF), machine learning, and Graph Neural Networks (GNNs) to model multi-layered biological interactions and predict biological outcomes such as disease classification, drug responses, and biomarker discovery. Attendees will gain hands-on experience in processing and analyzing real-world multiomics datasets using open-source tools such as Python, pandas, and scikit-learn.
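
As a rough illustration of what NMF does, the sketch below factorizes a small non-negative matrix V (samples by features) into W (samples by latent factors) and H (factors by features) using the classical multiplicative update rules. It is written in plain Python to stay self-contained and is not the tutorial's code; the toy matrix, the rank, and the iteration count are assumptions. In practice a library implementation such as the NMF class in scikit-learn would normally be used.

```python
import random


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - WH."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, WH) for v, r in zip(vr, rr)) ** 0.5


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF; returns W, H and the error before training."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    initial_error = frobenius_error(V, W, H)
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H, initial_error


# Hypothetical toy matrix with two blocks of co-varying features.
V = [
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
]
W, H, initial_error = nmf(V, rank=2)
final_error = frobenius_error(V, W, H)
print(round(initial_error, 3), round(final_error, 3))
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what makes the learned factors interpretable as additive parts.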

Learning Objectives
By the end of this tutorial, attendees will be able to:

Intended Audience and Level
This tutorial is designed for intermediate to advanced learners with an interest in multiomics data integration and its applications in machine learning. It is ideal for bioinformaticians, computational biologists, data scientists, and machine learning practitioners who want to expand their knowledge of graph-based methods in multiomics analysis.
The tutorial is targeted at professionals, researchers, and graduate students with at least a basic understanding of:


Tutorial VT2: Machine Learning Models for Drug Response Prediction

Room: TBD
Date: April 11, 2025
Time: 14:00-16:00

Organizers
Dennis Wang, A*STAR Bioinformatics Institute, A*STAR Institute for Human Development and Potential, National Hear
Yurui Chen, Institute for Human Development and Potential (IHDP), Agency for Science, Technology and Research (A*STAR), Department of Mathematics, National University of Singapore, Singapore, Republic of Singapore
Dr. Evelyn Lau, Institute for Human Development and Potential (IHDP), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
Dr. Juan Jose Giraldo Gutierrez, National Heart and Lung Institute, Imperial College London, London, Department of Computer Science, The University of Sheffield

Max Participants: 50

Description
This tutorial provides a comprehensive overview of machine learning techniques applied to drug response prediction on cancer cell lines, with a focus on Graph Neural Networks (GNNs) and Gaussian processes (GPs). Participants will gain both theoretical knowledge and practical experience through interactive lectures and hands-on demonstrations.
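
For readers new to GNNs, the core operation is message passing: each node updates its features by aggregating those of its neighbours, and a readout pools the node features into one graph-level vector. The sketch below shows a single mean-aggregation step on a hypothetical three-atom graph in plain Python. It has no trainable weights or nonlinearities, and it is not the PyTorch or torch_geometric code used in the tutorial; the graph, the feature encoding, and the function names are assumptions.

```python
def message_passing_step(features, edges):
    """One round of mean aggregation over each node and its neighbours."""
    neighbours = {node: [node] for node in features}  # include a self-loop
    for a, b in edges:  # undirected edges
        neighbours[a].append(b)
        neighbours[b].append(a)
    updated = {}
    for node, nbrs in neighbours.items():
        dim = len(features[node])
        updated[node] = [sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)]
    return updated


def readout(features):
    """Graph-level embedding: mean of all node feature vectors."""
    nodes = list(features.values())
    return [sum(vec[d] for vec in nodes) / len(nodes) for d in range(len(nodes[0]))]


# Hypothetical toy molecule C-C-O with one-hot features [is_carbon, is_oxygen].
features = {0: [1, 0], 1: [1, 0], 2: [0, 1]}
edges = [(0, 1), (1, 2)]

updated = message_passing_step(features, edges)
embedding = readout(updated)
print(updated, embedding)
```

After one step the middle carbon already carries information about the oxygen next to it. Practical GNN layers add learned weight matrices and nonlinearities to this aggregation, and a prediction head maps the pooled embedding to a drug response value.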

Learning Objectives

Intended Audience and Level
This tutorial is designed for bioinformatics researchers, data scientists, and professionals in computational biology with a basic understanding of machine learning concepts. Prior experience with Python programming will be beneficial but not mandatory.

Materials
Participants will receive access to:

Schedule

14:00-14:15 Welcome and Overview
- Introduction to tutorial objectives and schedule.
14:15-14:45 Session 1: Introduction to Machine Learning for Drug Response and Development
- Overview of machine learning applications in drug development.
- Key concepts and terminology.
14:45-16:00 Session 2: Graph Neural Networks (GNNs) and Deep Learning for Drug Response Prediction
- Introduction to GNNs and their relevance in biomedical research.
- Case studies of GNN applications in drug response prediction.
- Walkthrough of a pre-prepared GNN model for drug response prediction using PyTorch and torch_geometric.
- Discussion on model evaluation techniques.
- Materials at GitHub
