In-person Tutorials (All times SAST)
- Tutorial IP1: Genotype Imputation and Data Analysis for African Populations: A Practical Tutorial Using AfriGen-D Resources
- Tutorial IP2: Simulation-Based Inference for Computational Biology: Integrating AI, Bayesian Modeling, and HPC
- Tutorial IP4: Introductions to constraint-based modeling using cobrapy
- Tutorial IP5: Building agentic workflows for bioinformatics
Virtual Tutorials (All times SAST)
- Tutorial VT1: Multiomics Data Integration using Graph Based Machine Learning
- Tutorial VT2: Machine Learning Models for Drug Response Prediction
Tutorial IP1: Genotype Imputation and Data Analysis for African Populations: A Practical Tutorial Using AfriGen-D Resources
Room: Atlantic 1
Date: April 17, 2025
Time: 13:00-17:00
Organizers
Mamana Mbiyavanga, University of Cape Town
Lyndon Zass, University of Cape Town
Sumir Panji, University of Cape Town
Nicola Mulder, University of Cape Town
Max Participants: 30
Description
The African Genomics Data Hub (AfriGen-D) provides essential resources for analyzing African genetic data, addressing unique challenges posed by the continent's exceptional genetic diversity. This hands-on tutorial focuses on genotype imputation and downstream analysis using AfriGen-D resources.
Through practical exercises, participants will master data quality control specific to African genetic data, execute imputation and basic GWAS workflows, and learn to interpret results using the AfriGen-D Imputation Service, African Genomics Medicine Portal (AGMP), and African Genomics Variation Database (AGVD).
Enhance your African genomics research capabilities with this practical tutorial. Using AfriGen-D resources, learn to prepare data, perform genotype imputation, conduct basic GWAS analysis, and interpret results with tools optimized for African genetic diversity.
Learning Objectives
- Navigate AfriGen-D catalogues for data discovery
- Master data preparation and quality control using the AfriGen-D Imputation Service
- Execute and monitor imputation workflows using the AfriGen-D Imputation Service
- Perform post-imputation quality assessment
- Perform basic GWAS analysis using the AfriGen-D Imputation Service
- Annotate and interpret genetic variants using African-specific resources (AGMP and AGVD)
Materials
- Personal laptop with internet connection
- Basic command-line knowledge
- Familiarity with genetic data formats
Schedule
13:00-13:30 | Introduction to AfriGen-D resources and data discovery |
13:30-14:30 | Data preparation and quality control (hands-on) |
14:30-14:45 | Break |
14:45-15:45 | Imputation workflow and monitoring (hands-on) |
15:45-16:30 | Post-imputation quality assessment and basic GWAS |
16:30-17:00 | Variant annotation and interpretation using AGMP/AGVD |
Tutorial IP2: Simulation-Based Inference for Computational Biology: Integrating AI, Bayesian Modeling, and HPC
Room: Pacific 2
Date: April 17, 2025
Time: 09:00-13:00
Organizers
Alina Bazarova, Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
Jose Ignacio Robledo, Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
Stefan Kesselheim, Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
Max Participants: 25
Description
This tutorial introduces Simulation-Based Inference (SBI), a framework that combines Bayesian modeling, AI techniques, and high-performance computing (HPC) to address key challenges in computational biology. By using AI-based approximate Bayesian computation, SBI supports reliable inference from limited data. It also handles intractable likelihood functions, which makes Bayesian inference possible for biological systems with multiple sources of stochasticity. The tutorial further demonstrates how to leverage HPC environments to drastically reduce inference runtimes, making it highly relevant for large-scale biological problems.
The tutorial bridges theoretical foundations with hands-on applications in computational biology. Participants will learn to implement SBI frameworks using diverse biological models, such as molecular dynamics simulations, agent-based tumor growth models, count data modeling, and Lotka-Volterra systems. Practical exercises in Jupyter notebooks guide attendees through SBI workflows, from simple coin-flipping examples to more complex biological simulations, ensuring accessibility for participants with varied backgrounds. The tutorial covers cutting-edge methods such as Sequential Neural Posterior Estimation and emphasizes parallelization and HPC scalability, in line with the scientific community's focus on innovation in computational biology.
A previous iteration of the tutorial at the Helmholtz AI Conference 2024 received excellent reviews and led to interdisciplinary discussions, highlighting its broad applicability and impact. For this conference, the content has been further refined with additional examples relevant to the bioinformatics research community.
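To give a flavour of the coin-flipping starting point mentioned in the description, here is a minimal sketch of likelihood-free inference. It is not taken from the tutorial materials and does not use neural estimators such as SNPE. Instead it uses plain rejection ABC (approximate Bayesian computation), the simplest method in the same family, with only the Python standard library. The function names, the uniform prior, and the observed data (7 heads in 10 flips) are illustrative assumptions.

```python
import random


def simulate(theta, n_flips, rng):
    """Simulator: number of heads in n_flips tosses of a coin with bias theta."""
    return sum(rng.random() < theta for _ in range(n_flips))


def rejection_abc(observed_heads, n_flips, n_draws=20000, tolerance=0, seed=0):
    """Rejection ABC (illustrative, not the tutorial's code).

    Draw theta from a Uniform(0, 1) prior, run the simulator, and keep the
    draws whose simulated head count is within `tolerance` of the observation.
    The likelihood is never evaluated; only the simulator is called.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # sample a candidate bias from the prior
        if abs(simulate(theta, n_flips, rng) - observed_heads) <= tolerance:
            accepted.append(theta)
    return accepted


# Hypothetical observation: 7 heads out of 10 flips.
posterior_samples = rejection_abc(observed_heads=7, n_flips=10)
posterior_mean = sum(posterior_samples) / len(posterior_samples)
print(f"accepted {len(posterior_samples)} draws, posterior mean ~ {posterior_mean:.2f}")
```

With a uniform prior, the exact posterior for this toy problem is Beta(8, 4), whose mean is 8/12, so the sample mean can be checked against a known answer. Rejection ABC wastes most simulations as the data grow richer, which is the kind of cost that sequential neural methods and HPC parallelization aim to reduce.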
Learning Objectives
- Understand the Principles of Simulation-Based Inference (SBI): learn the theoretical foundations of SBI, including its relationship with Bayesian inference and its advantages in handling complex biological systems.
- Explore SBI Methods (SNPE, SNLE, and SNRE): gain an understanding of Sequential Neural Posterior Estimation (SNPE), Sequential Neural Likelihood Estimation (SNLE), and Sequential Neural Ratio Estimation (SNRE) and their applications in computational biology.
- Learn how to design and implement SBI frameworks for representative biological scenarios, such as molecular dynamics, cell growth, count data modeling, and Lotka-Volterra systems.
- Leverage HPC for SBI Workflows: understand how to use high-performance computing (HPC) environments to scale SBI workflows and efficiently distribute computational workloads.
Intended Audience and Level
This tutorial is designed for researchers working in computational biology and bioinformatics, modeling natural processes, and applying AI or Bayesian inference techniques. It is well-suited for:
- Researchers seeking to infer model parameters from sparse or simulated data.
- Scientists interested in uncertainty quantification and critical assessment of model fits using Bayesian techniques.
- Researchers experienced in Bayesian statistics looking to address intractable likelihoods or optimize inference workflows using AI-driven methods.
- AI researchers interested in advanced applications of Deep Learning architectures, such as normalizing flows and likelihood ratio estimation.
- Users of HPC systems or those interested in leveraging HPC for scaling simulations and training distributed AI models.
The tutorial is intermediate in content level. While no in-depth knowledge of statistical or Deep Learning methods is required, participants should have basic familiarity with these concepts and have experience in using Python. Experience with HPC systems is beneficial but not mandatory. Attendees are required to have a laptop for accessing the HPC system. Individual access accounts will be provided prior to or at the tutorial.
Tutorial IP4: Introductions to constraint-based modeling using cobrapy
Room: Pacific 2
Date: April 17, 2025
Time: 13:00-17:00
Organizers
Alia Benkahla, Institut Pasteur de Tunis
Feryel Guennich, Institut Pasteur de Tunis
Oussema Souiai, Institut Pasteur de Tunis
Emna Harigua-Souiai, Institut Pasteur de Tunis
Max Participants: 25
Description
COBRApy is a user-friendly, open-source Python package that makes constraint-based modeling accessible and convenient to learn. This hands-on workshop, which includes exercises and problem-solving, introduces participants to the technique. By bringing together researchers interested in developing this type of modeling, the workshop not only teaches a valuable skill but also encourages new collaborations. Participants can apply the practical skills they acquire to their research immediately, deepening knowledge and accelerating discoveries.
Intended Audience and Level
Researchers and students with basic Python knowledge interested in applying constraint-based modeling to biological systems.
Materials
- Slides with key concepts and code examples.
- Google Colab notebooks with hands-on exercises.
- Pre-prepared environment with COBRApy and necessary data.
Schedule
13:00-13:15 | Introduction to Constraint-Based Modeling |
13:15-14:00 | Introduction to CBM using COBRApy and Working with Models |
14:00-15:00 | Performing Flux Balance Analysis (FBA) |
15:00-15:30 | In-silico gene knockouts |
15:30-16:00 | Working with Experimental Data and Model Integration |
16:00-16:15 | Wrap-up |
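For readers new to the FBA and knockout sessions listed in the schedule, the underlying idea can be sketched as a small linear program. This is not COBRApy code and not part of the tutorial materials: it uses `scipy.optimize.linprog` as a stand-in solver, and the four-reaction network, its bounds, and the reaction names are hypothetical. FBA maximizes an objective flux (here `biomass`) subject to the steady-state constraint S v = 0 and per-reaction flux bounds. A knockout is modelled by forcing a reaction's flux to zero.

```python
from scipy.optimize import linprog

# Hypothetical toy network (illustrative only):
#   uptake:  -> A      conv1: A -> B      conv2: A -> B      biomass: B ->
REACTIONS = ["uptake", "conv1", "conv2", "biomass"]

# Stoichiometric matrix S: one row per metabolite, one column per reaction.
S = [
    [1, -1, -1, 0],  # metabolite A: produced by uptake, consumed by conv1/conv2
    [0, 1, 1, -1],   # metabolite B: produced by conv1/conv2, consumed by biomass
]

# Lower and upper flux bounds per reaction (arbitrary units).
BOUNDS = {
    "uptake": (0, 10),
    "conv1": (0, 1000),
    "conv2": (0, 4),
    "biomass": (0, 1000),
}


def max_biomass(knockouts=()):
    """Maximize the biomass flux at steady state, with optional knockouts."""
    bounds = [(0, 0) if r in knockouts else BOUNDS[r] for r in REACTIONS]
    # linprog minimizes, so the objective coefficient of biomass is negated.
    c = [0, 0, 0, -1]
    result = linprog(c, A_eq=S, b_eq=[0, 0], bounds=bounds, method="highs")
    return -result.fun


wild_type = max_biomass()
knockout = max_biomass(knockouts=("conv1",))
print(f"wild type: {wild_type:.1f}, conv1 knockout: {knockout:.1f}")
```

In this toy network the wild-type optimum is limited by the uptake bound, while knocking out `conv1` leaves only the capped `conv2` route, so the achievable biomass flux drops. Genome-scale tools such as COBRApy apply the same principle to models with thousands of reactions and map genes to reactions before applying knockouts.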
Tutorial IP5: Building agentic workflows for bioinformatics
Room: Pacific 1
Date: April 17, 2025
Time: 09:00-17:00
Organizers
Dionizije Fa, Entropic j.d.o.o.
Mateo Čupić, Entropic j.d.o.o.
Bruno Pandža, Entropic j.d.o.o.
Max Participants: 25
Description
An agentic workflow is a process of interacting with Large Language Models (LLMs) to complete complex tasks, allowing practitioners to build pipelines that integrate data retrieval, reasoning, and execution steps. This tutorial will guide participants through the conceptual and practical foundations of setting up their own agentic workflows. By combining prompt engineering techniques, retrieval-augmented generation, tool use, and deployment strategies that safeguard data privacy, participants will learn how to build, deploy, and tune their own personal copilots for use in bioinformatics workflows.
The capabilities of agentic workflows, driven by improving LLMs, are rapidly expanding, while cloud offerings are making these advanced computational tools more accessible than ever before. By integrating agentic workflows into bioinformatics pipelines, practitioners can significantly reduce their time-to-analysis. These workflows lower the barrier to entry for novices and allow expert practitioners to scale their work with greater efficiency, democratizing cutting-edge computational methods so that participants can apply the latest advances in their work and careers. This tutorial will cover state-of-the-art prompting techniques and retrieval augmentation strategies, add context to model selection, and explore the fundamentals of choosing among the different techniques and current trends.
Learning Objectives
- Develop a theoretical and practical understanding of how to integrate and automate bioinformatics analyses using LLM agents
- Gain hands-on experience building agentic workflows for bioinformatics
- Learn current trends in agentic workflows, the fundamentals of and differences between LLMs, and best practices for writing prompts and deploying local LLM agents
- Understand how to extend and customize existing tools to fit specific research domains or specialized datasets
Intended Audience and Level
The tutorial is aimed at bioinformaticians who are beginners in AI and LLMs. However, it is strongly recommended to have programming experience in Python as well as using command line tools and bioinformatics software.
Attendees should have:
- Working knowledge of Python, package management and the command line (Unix)
- Familiarity with standard bioinformatics data formats and tools (e.g., FASTA, FASTQ)
- PC with a Python environment set up prior to the tutorial (exact requirements to be defined later)
- Comfortable understanding of basic bioinformatics concepts and common analysis pipelines
Participants will be provided with installation instructions in advance to ensure a smoother experience.
Materials
- Slides and Documentation: Detailed slides summarizing key concepts will be shared.
- Code Examples and Repositories: A public code repository containing example workflows, scripts, and configuration files will enable participants to continue experimenting independently.
Schedule
- Introduction & Overview
- Foundational Concepts and Setup
- Building a Simple Agentic Workflow
- Advanced Techniques and Troubleshooting
- Wrap-up and Future Directions
Tutorial VT1: Multiomics Data Integration using Graph Based Machine Learning
Date: April 10, 2025
Time: 12:00-16:00
Organizers
Loni Taylor, PMP, CETL, PhD Candidate, Meharry Medical College, Nashville, TN, USA.
Bishnu Sarker, PhD, Assistant Professor of Computer Science and Data Science, Meharry Medical College, Nashville, TN, USA.
Animesh Acharjee, PhD, Assistant Professor, University of Birmingham, UK.
Max Participants: 40
Description
This tutorial introduces participants to the integration of multiomics data from genomics, proteomics, transcriptomics, and metabolomics, focusing on computational approaches to uncover hidden relationships between biological entities. The session will cover techniques such as Non-negative Matrix Factorization (NMF), machine learning, and Graph Neural Networks (GNNs) to model multi-layered biological interactions and predict biological outcomes such as disease classification, drug responses, and biomarker discovery. Attendees will gain hands-on experience in processing and analyzing real-world multiomics datasets using open-source tools such as Python, pandas, and scikit-learn.
Learning Objectives
By the end of this tutorial, attendees will be able to:
- Understand key computational approaches for integrating and analyzing multiomics data, including NMF, machine learning, and GNNs.
- Apply open-source tools to implement predictive models for disease classification, biomarker identification, and drug response analysis.
- Evaluate model performance and interpret results to derive meaningful biological insights.
Intended Audience and Level
This tutorial is designed for intermediate to advanced learners with an interest in multiomics data integration and its applications in machine learning. It is ideal for bioinformaticians, computational biologists, data scientists, and machine learning practitioners who want to expand their knowledge of graph-based methods in multiomics analysis.
The tutorial is targeted at professionals, researchers, and graduate students with at least a basic understanding of:
- Omics Data (Genomics, Proteomics, Metabolomics)
- Machine Learning, especially Neural Networks
- Programming (Python, familiarity with ML libraries like PyTorch or TensorFlow)
Tutorial VT2: Machine Learning Models for Drug Response Prediction
Room: TBD
Date: April 11, 2025
Time: 14:00-16:00
Organizers
Dennis Wang, A*STAR Bioinformatics Institute, A*STAR Institute for Human Development and Potential, National Hear
Yurui Chen, Institute for Human Development and Potential (IHDP), Agency for Science, Technology and Research (A*STAR), Department of Mathematics, National University of Singapore, Singapore, Republic of Singapore
Dr. Evelyn Lau, Institute for Human Development and Potential (IHDP), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
Dr. Juan Jose Giraldo Gutierrez, National Heart and Lung Institute, Imperial College London, London, Department of Computer Science, The University of Sheffield
Max Participants: 50
Description
This tutorial provides a comprehensive overview of machine learning techniques applied to drug response prediction on cancer cell lines, with a focus on Graph Neural Networks (GNNs) and Gaussian processes (GPs). Participants will gain both theoretical knowledge and practical experience through interactive lectures and hands-on demonstrations.
Learning Objectives
- Understand Machine Learning Applications in Drug Development: Learn how machine learning models predict drug responses and facilitate drug development.
- Explain Graph Neural Networks (GNNs): Grasp the fundamentals of GNNs and their specific applications in biomedical data analysis.
- Develop and Evaluate GNN Models for Drug Prediction: Acquire skills in building and assessing GNN models for drug response prediction using tools like PyTorch and torch_geometric.
- Explain Probabilistic Models for Drug Prediction: Grasp the importance of probabilistic models to quantify uncertainty when predicting drug response curves.
- Build a Probabilistic Model Based on Gaussian Processes (GPs) for Drug Prediction: Learn to apply Gaussian process models for predicting dose responses.
Intended Audience and Level
This tutorial is designed for bioinformatics researchers, data scientists, and professionals in computational biology with a basic understanding of machine learning concepts. Prior experience with Python programming will be beneficial but not mandatory.
Materials
Participants will receive access to:
- Presentation slides.
- A take-home Jupyter Notebook (Google Colab) with:
  - A step-by-step tutorial on building a basic GNN model for drug response prediction.
  - Guidance on understanding and preparing biomedical data for GNNs.
  - A Colab tutorial on a GP model that predicts dose-response curves.
  - Insights into model training, evaluation, and interpretation.
- Additional resources for further exploration of the subject.
Schedule
14:00-14:15 | Welcome and Overview - Introduction to tutorial objectives and schedule. |
14:15-14:45 | Session 1: Introduction to Machine Learning for Drug Response and Development - Overview of machine learning applications in drug development. - Key concepts and terminology. |
14:45-16:00 | Session 2: Graph Neural Networks (GNNs) and Deep Learning for Drug Response Prediction - Introduction to GNNs and their relevance in biomedical research. - Case studies of GNN applications in drug response prediction. - Walkthrough of a pre-prepared GNN model for drug response prediction using PyTorch and torch_geometric. - Discussion on model evaluation techniques. - Materials at GitHub |