In-person Tutorials (All times EDT)
- Tutorial IP1: Agentic AI System for In Silico Team Science: From LLM Basics to Lab Assistant Agents
- Tutorial IP2: Large Language Models and Agentic AI for Biomedical Informatics
- Tutorial IP3: From Trees to Networks
- Tutorial IP4: Generating realistic synthetic biological data using phylogenetics
- Tutorial IP5: Computational Network Analysis of Omics Data
- Tutorial IP6: Network Biology Workflows with Cytoscape and NDEx: Visualization, Analysis, Sharing, Automation, and Apps
- Tutorial IP7: Quantum Computing for Multi-omics analyses
- Tutorial IP8: Metagenomic sequence analysis using k-mer based methods
- Tutorial IP9: Learning models of molecular sequence recognition from NGS data using biophysical machine learning
Virtual Tutorials (All times EDT): Presented through the conference virtual platform
- Tutorial VT1: Genomic LLMs in Practice: A Hands-On Introduction with Hugging Face
- Tutorial VT2: Building Interactive Visualizations of Single-Cell and Spatial Data in Python
- Tutorial VT3: Hello Nextflow: Getting started with workflows for bioinformatics
- Tutorial VT4: Multimodal Integration and Multimodal Causal Inference using R/Bioconductor
- Tutorial VT5: Bridging the Gap: Single-Cell Analysis using No-Code AI Workflows
- Tutorial VT6: Programming robust LLM Agents for assisting scientific tasks
- Tutorial VT7: Hello nf-core: Level up your workflows with community-curated best practices and developer resources
- Tutorial VT8: Foundation model and graph learning for modeling, analyzing, and interpreting single-cell omics and histopathology data
Tutorial IP1: Agentic AI System for In Silico Team Science: From LLM Basics to Lab Assistant Agents
Room: TBD
Date: July 12, 2026
Start Time: 09:00
End Time: 13:00
Max Participants: 60
Organizers
- Jason H. Moore, PhD – Research Data Scientist, Cedars-Sinai Medical Center
- Binglan Li, PhD – Research Data Scientist, Cedars-Sinai Medical Center
- Philip J. Freda, PhD – Research Scientist I, Cedars-Sinai Medical Center
Speakers
- Jason H. Moore, PhD – Research Data Scientist, Cedars-Sinai Medical Center
- Binglan Li, PhD – Research Data Scientist, Cedars-Sinai Medical Center
- Philip J. Freda, PhD – Research Scientist I, Cedars-Sinai Medical Center
Description
Large language models (LLMs) are now widely used in biomedical research for code generation and decision support. However, going from chatbots to reliable lab assistant AI agents that can work closely with biomedical data and tools remains challenging. This tutorial aims to bridge the gap between concepts and production.
This tutorial balances conceptual introductions to topics in biomedical agentic AI with hands-on exercises. The first half focuses on the basics. We begin with an introduction to agentic AI and the untapped opportunities it offers biomedical research. We then dive into common LLM techniques and the building blocks of agentic AI systems, whose combination shapes an agentic AI system's performance and behavior. Attendees will experiment with various LLM and agentic AI techniques through hands-on exercises.
The second half of the tutorial guides attendees in developing their own agentic AI systems. Attendees will have the opportunity to construct an agent that parses a natural-language task, selects the appropriate tools, runs the data analyses, and produces a useful report. Finally, we discuss advanced topics: multi-agent systems, human-in-the-loop workflows, lightweight logging, and cost tracking.
Through guided exercises, participants will:
- Configure an agent for their preferred LLM provider.
- Register bioinformatics-relevant tools (e.g., local Python scripts for sequence analysis or variant annotation) and versatile biomedical MCP servers.
- Orchestrate single-agent, multi-agent, and human-in-the-loop agentic AI systems.
- Build a simple end-to-end “bioinformatics assistant” or “bioinformatics team” that can propose, explain, and automate a small computational biology workflow.
All code (Python scripts, Jupyter notebooks, example tools) and slides will be shared via an open GitHub repository.
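To make the tool-selection step above concrete, here is a minimal, library-free sketch of the pattern an agent like this often uses: tools are registered in a lookup table, and a dispatch step (standing in for the LLM's tool choice) routes a task to the right one. All names here (`register_tool`, `gc_content`, `dispatch`) are illustrative, not the tutorial's actual API.

```python
# Minimal sketch of a tool registry for a lab-assistant agent.
# The dispatch() function stands in for the LLM step that maps a
# natural-language task to a registered tool.
from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {}

def register_tool(name: str):
    """Decorator that adds a function to the agent's tool registry."""
    def wrap(fn: Callable[[str], str]):
        TOOLS[name] = fn
        return fn
    return wrap

@register_tool("gc_content")
def gc_content(seq: str) -> str:
    """Toy sequence-analysis tool: fraction of G/C bases."""
    seq = seq.upper()
    gc = sum(base in "GC" for base in seq)
    return f"GC content: {gc / len(seq):.2f}"

def dispatch(task: str, argument: str) -> str:
    """Stand-in for the LLM's tool-selection step."""
    if task not in TOOLS:
        return f"No tool registered for '{task}'"
    return TOOLS[task](argument)

print(dispatch("gc_content", "ATGCGC"))  # GC content: 0.67
```

In a real agent, the registry entries would carry JSON schemas describing each tool so the LLM can choose among them; the hands-on blocks cover that wiring.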
Learning Objectives
After this tutorial, participants will be able to:
- Explain the core concepts of agentic AI
- Understand key LLM capabilities relevant to agents
- Configure and run a basic agent
- Integrate domain-specific tools and resources
- Build and extend a small agentic pipeline for a bioinformatics task
Intended Audience and Level
This is an intermediate-level tutorial intended for computational biologists, bioinformaticians, and data scientists interested in applying agentic AI systems in their daily research workflows. Attendees do not need prior LLM or agentic AI experience. However, attendees should be familiar with, and ideally proficient in, Python (e.g., editing code and reading functions), the command line, and Git/GitHub.
Schedule
| 09:00-09:30 | [Lecture] |
| 09:30-09:45 | [Lecture + Demo] |
| 09:45-10:00 | [Hands-on Block 1] Environment and LLM Setup |
| 10:00-10:15 | Coffee Break |
| 10:15-10:45 | [Lecture + Demo] |
| 10:45-11:15 | [Hands-on Block 2] Implementing LLM techniques and agentic AI system designs |
| 11:15-11:30 | Coffee Break |
| 11:30-11:45 | [Lecture] Background: Engineering a Research-Ready Agent |
| 11:45-12:15 | [Hands-on Block 3] Building a Bioinformatics Agent from a Template |
| 12:15-12:30 | [Lecture] Advanced Agentic AI Features |
| 12:30-12:45 | [Hands-on Block 4] Unlocking advanced agentic AI features (multiagent, reliability, cost tracking) |
| 12:45-13:00 | [Concluding Remark] Wrap-Up, Best Practices, and Q&A |
Tutorial IP2: Large Language Models and Agentic AI for Biomedical Informatics
Room: TBD
Date: July 12, 2026
Start Time: 14:00
End Time: 18:00
Max Participants: 50
Organizers
- Robert Xiangru Tang, Postdoc, Yale University, USA
- Mark Gerstein, Professor, Yale University, USA
- Xuan Wang, Assistant Professor, Virginia Tech, USA
- Wenqi Shi, Assistant Professor, University of Texas Southwestern Medical Center, USA
Speakers
- Robert Xiangru Tang, Postdoc, Yale University, USA
- Mark Gerstein, Professor, Yale University, USA
- Xuan Wang, Assistant Professor, Virginia Tech, USA
- Wenqi Shi, Assistant Professor, University of Texas Southwestern Medical Center, USA
Description
Large Language Models (LLMs) such as ChatGPT have demonstrated strong capabilities in understanding, generating, and reasoning over natural language. In bioinformatics and biomedical informatics, these models are rapidly emerging as a new computational paradigm with the potential to transform literature mining, data integration, workflow automation, and biomedical reasoning. This tutorial provides a practical, introductory, and hands-on guide to understanding and applying LLMs and agentic AI systems in biomedical data science research.
The tutorial begins with a concise introduction to LLMs and their evolution, followed by an overview of both commercial and open-source models commonly used in scientific applications. Participants will learn core techniques including prompt design, retrieval-augmented generation for biomedical literature and databases, text-to-SQL query generation, and bioinformatics code generation in Python. Building on these foundations, the tutorial introduces agentic AI systems that enable multi-step reasoning, tool use, and workflow orchestration for complex bioinformatics and biomedical informatics tasks.
Through guided hands-on exercises and real-world case studies, participants will gain practical experience applying LLMs and agentic AI to realistic biomedical research scenarios. The tutorial also emphasizes responsible and rigorous use of these models, addressing limitations such as hallucination, bias, robustness, and reproducibility in scientific and clinical contexts. By the end of the tutorial, attendees will be equipped with foundational knowledge, practical skills, and best practices to responsibly integrate LLMs and agentic AI into biomedical research workflows.
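The retrieval-augmented generation (RAG) step mentioned above can be sketched in miniature: retrieve the most relevant document for a question, then build a prompt grounded in that document. The abstracts, IDs, and word-overlap scoring below are toy stand-ins; a real biomedical RAG setup would use embedding models and a vector store.

```python
# Toy RAG sketch: retrieve the best-matching abstract by cosine
# similarity over word counts, then assemble a grounded prompt.
# The documents and PMIDs are made up for illustration.
from collections import Counter
import math

ABSTRACTS = {
    "pmid1": "BRCA1 mutations increase breast cancer risk",
    "pmid2": "TP53 is a tumor suppressor frequently mutated in cancer",
}

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str) -> str:
    """Return the ID of the abstract most similar to the query."""
    qv = Counter(query.lower().split())
    return max(ABSTRACTS,
               key=lambda k: cosine(qv, Counter(ABSTRACTS[k].lower().split())))

def build_prompt(query: str) -> str:
    """Ground the LLM's answer in the retrieved context."""
    pmid = retrieve(query)
    return (f"Context [{pmid}]: {ABSTRACTS[pmid]}\n"
            f"Question: {query}\n"
            f"Answer using only the context.")

print(retrieve("breast cancer risk"))  # pmid1
```

The same retrieve-then-prompt structure underlies RAG over PubMed or local databases; only the retriever changes.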
Learning Objectives
By the end of the tutorial, participants will be able to:
- Understand key characteristics of large-scale biomedical data
- Apply LLMs to biomedical literature mining and data querying
- Use LLMs to accelerate bioinformatics programming workflows
- Implement retrieval-augmented generation for biomedical QA
- Understand agentic AI systems for complex biomedical tasks
- Critically assess limitations and risks of LLM-based methods
Intended Audience and Level
This tutorial is intended for graduate students, researchers, and practitioners in bioinformatics, computational biology, and biomedical informatics. It is suitable for beginners or users with limited prior experience with LLMs, while intermediate users will benefit from the advanced topics and case studies. Basic familiarity with Python and data analysis concepts is recommended but not required.
Schedule
| 14:00-14:10 | Welcome and Overview - Robert Xiangru Tang |
| 14:10-15:10 | Foundations of LLMs for Biomedical Data Science (with hands-on) - Robert Xiangru Tang |
| 15:10-15:45 | Retrieval-Augmented Generation for Biomedical Literature and Databases (hands-on) - Wenqi Shi |
| 15:45-16:00 | Coffee Break |
| 16:00-16:45 | Agentic AI Systems for Bioinformatics and Biomedical Informatics - Xuan Wang |
| 16:45-17:10 | QA / Interactive Discussion - All Speakers |
| 17:10-17:55 | Hands-on: Bioinformatics Coding and Agentic Workflows - Robert Xiangru Tang |
| 17:55-18:00 | Limitations, Responsible AI, and Wrap-Up - Mark Gerstein, Wenqi Shi |
Tutorial IP3: From Trees to Networks
Room: TBD
Date: July 12, 2026
Start Time: 09:00
End Time: 13:00
Max Participants:
Organizers
- Prof. Daniel H. Huson (Professor of Algorithms in Bioinformatics, IBMI)
- Dr. Anupam Gautam (Postdoc, IBMI and Max Planck Institute for Biology Tübingen)
- Ms. Banu Cetinkaya (PhD candidate, IBMI)
Speakers
- Prof. Daniel H. Huson (Professor of Algorithms in Bioinformatics, IBMI)
- Dr. Anupam Gautam (Postdoc, IBMI and Max Planck Institute for Biology Tübingen)
- Ms. Banu Cetinkaya (PhD candidate, IBMI)
Description
Phylogenetic trees are the standard representation of evolutionary relationships, yet many biological datasets contain conflicting signals that cannot be explained by a single tree. Processes such as incomplete lineage sorting, recombination, hybridization, and horizontal gene transfer require more general evolutionary models based on phylogenetic networks. In parallel, Bayesian phylogenetic inference has shifted the focus from single optimal trees to distributions of plausible trees, increasing the demand for methods that summarize and interpret complex tree sets.
This half-day tutorial introduces participants to the complete workflow from sequence data to phylogenetic networks. The first half covers gene tree reconstruction using maximum-likelihood methods and Bayesian phylogenetic analysis. The second half focuses on the construction, visualization, and interpretation of phylogenetic networks. We will explore how to compute networks from distances, sequences, and trees using the latest SplitsTree release. Participants will learn how to use PhyloSketch to sketch, capture, and lay out trees and networks. We will also look into the use of phylogeny GPTs to plan analyses.
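The distance-based network methods in SplitsTree take a pairwise distance matrix as their starting point. As a minimal illustration of that input, here is how uncorrected p-distances can be computed from a toy alignment (the sequences and taxon names are made up):

```python
# Compute a pairwise p-distance matrix from a toy alignment.
# This is the kind of matrix that distance-based network methods
# (e.g., NeighborNet in SplitsTree) consume.
ALIGNMENT = {
    "taxonA": "ACGTACGT",
    "taxonB": "ACGTACGA",
    "taxonC": "ACGAACGA",
}

def p_distance(s1: str, s2: str) -> float:
    """Fraction of aligned sites at which two sequences differ."""
    diffs = sum(a != b for a, b in zip(s1, s2))
    return diffs / len(s1)

taxa = sorted(ALIGNMENT)
matrix = {
    (t1, t2): p_distance(ALIGNMENT[t1], ALIGNMENT[t2])
    for t1 in taxa for t2 in taxa
}
print(matrix[("taxonA", "taxonB")])  # 0.125
```

In practice one would use model-corrected distances (e.g., Jukes-Cantor corrected) rather than raw p-distances, but the matrix structure handed to the network method is the same.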
The tutorial combines conceptual foundations with practical application. By the end of the session, participants will be able to assess when network approaches are appropriate, compute networks from real data, and critically interpret network structures in a biological context. The tutorial is aimed at researchers with basic knowledge of phylogenetic analysis who wish to extend their analyses toward modern network-based evolutionary inference.
Learning Objectives
After completing the tutorial, participants will be able to:
- Understand when and why phylogenetic networks are required beyond trees.
- Perform gene tree inference using likelihood and Bayesian approaches.
- Construct phylogenetic networks from alignments and tree sets.
- Use SplitsTree and PhyloSketch for network analysis and visualization.
- Use a phylogeny GPT chat to plan analyses.
- Critically interpret reticulation and uncertainty in networks.
Intended Audience and Level
Graduate students, postdoctoral researchers, and practitioners in bioinformatics, computational biology, and evolutionary biology with basic experience in sequence analysis and phylogenetic trees.
Level: Intermediate
Schedule
| Part I: From Sequence Data to Gene Trees (110 minutes) |
| Coffee Break (20 min) |
| Part II: From Trees to Networks (110 minutes) |
Tutorial IP4: Generating realistic synthetic biological data using phylogenetics
Room: TBD
Date: July 12, 2026
Start Time: 14:00
End Time: 18:00
Max Participants: 20
Organizers
- Sungsik Kong, Research Scientist, RIKEN Center for Interdisciplinary Theoretical and Mathematical Sciences, Saitama, Japan
- Max Hill, Temporary Assistant Professor, Department of Mathematics, University of Hawai’i at Manoa, HI, USA
Speakers
- Sungsik Kong, Research Scientist, RIKEN Center for Interdisciplinary Theoretical and Mathematical Sciences, Saitama, Japan
- Max Hill, Temporary Assistant Professor, Department of Mathematics, University of Hawai’i at Manoa, HI, USA
Description
Synthetic data are essential for developing bioinformatics tools and for constructing training datasets for artificial intelligence models. More specifically, synthetic biological data with known true parameter values are crucial for evaluating the performance of newly developed methods. However, selecting appropriate parameters for producing biologically realistic data requires substantial biological knowledge. In this tutorial, we provide foundational theoretical and biological background for generating synthetic biological data, including species trees, gene trees within those species trees, and sequence data derived from those gene trees, along with hands-on experience in computationally generating them. We discuss the Yule model and the birth-death model for species tree generation, the multispecies coalescent for gene tree generation, and models of sequence evolution ranging from the simplest Jukes-Cantor to more complex General Time Reversible models for DNA sequence generation. We use the R package phytools for species tree generation, ms for gene tree generation, and seq-gen for DNA sequence generation. Finally, we examine critical assumptions underlying these simulations and how altering parameter settings such as speciation and extinction rates, population sizes and speciation times, and base frequencies and transition rates produces datasets that convey specific biological scenarios.
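To show the mechanics behind the sequence-generation step, here is a library-free sketch of Jukes-Cantor evolution along a single branch: each site changes to a uniformly chosen different base with probability p = (3/4)(1 - exp(-4d/3)), where d is the branch length in expected substitutions per site. The ancestral sequence and branch length are arbitrary; seq-gen does this at scale along whole trees.

```python
# Evolve a sequence along one branch under the Jukes-Cantor (JC69) model.
# branch_length is the expected number of substitutions per site.
import math
import random

def jc69_evolve(seq: str, branch_length: float, rng: random.Random) -> str:
    # JC69 probability that a site is observed in a different state
    p_change = 0.75 * (1 - math.exp(-4 * branch_length / 3))
    out = []
    for base in seq:
        if rng.random() < p_change:
            # mutate to one of the three other bases, uniformly
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)

rng = random.Random(42)                 # fixed seed for reproducibility
ancestor = "ACGT" * 250                 # 1000-site ancestral sequence
descendant = jc69_evolve(ancestor, 0.1, rng)
observed = sum(a != b for a, b in zip(ancestor, descendant)) / len(ancestor)
```

With d = 0.1 the per-site change probability is about 0.094, so `observed` should land near that value, illustrating how a chosen branch length maps to observable sequence divergence.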
Learning Objectives
- Generate synthetic biological data, including species trees, gene trees, and DNA sequence data, using popular computational tools.
- Gain an overview of the input structure of the software tools used to generate synthetic biological data, including their capabilities and limitations.
- Understand the basic statistical models used in phylogenetics and population genetics, along with the underlying biological assumptions.
- Develop foundational knowledge of how variations in parameter values in synthetic data generation relate to biological scenarios, such as:
- Branch lengths (coalescent units, expected number of mutations, generations)
- Speciation and extinction rates
- Population size
- Mutation and/or nucleotide substitution rates, transition and transversion rates
- Sequence length and number of loci
- Recognize potential applications and uses of the generated synthetic data (e.g., machine learning training)
Intended Audience and Level
We welcome those interested in using synthetic biological data for various purposes, including phylogenetic or bioinformatic method development and machine learning model development, even with little or no prior knowledge of biology. The course will provide a brief introduction to the relevant theoretical background needed to understand the procedure. Some experience with a command-line interface (or the R, Python, or Julia programming languages) is highly preferred. Laptop computers with an Internet connection will be required.
Schedule
| 14:00-14:45 | Introduction and preliminaries: Basic background on phylogenetics |
| 14:45-15:45 | Hands-on activity: Generating species trees and gene trees |
| 15:45-16:00 | Coffee break |
| 16:00-18:00 | Hands-on activity: Generating DNA sequences |
Tutorial IP5: Computational Network Analysis of Omics Data
Room: TBD
Date: July 12, 2026
Start Time: 09:00
End Time: 13:00
Max Participants: 40
Organizers
- Anthony Gitter - Associate Professor, Biostatistics and Medical Informatics, University of Wisconsin-Madison and Morgridge Institute for Research
- Alexander Morin - Ph.D. Student, Computer Science, Virginia Tech
- T. M. Murali - Professor and Associate Department Head of Research, Computer Science, Virginia Tech. Director, NSF COMPASS Center
- Anna Ritz - Associate Professor, Biology Department, Reed College
- Neha Talluri - PhD Student, Biomedical Data Science, University of Wisconsin-Madison and Morgridge Institute for Research
Speakers
- Anthony Gitter - Associate Professor, Biostatistics and Medical Informatics, University of Wisconsin-Madison and Morgridge Institute for Research
- Alexander Morin - Ph.D. Student, Computer Science, Virginia Tech
- T. M. Murali - Professor and Associate Department Head of Research, Computer Science, Virginia Tech. Director, NSF COMPASS Center
- Anna Ritz - Associate Professor, Biology Department, Reed College
- Neha Talluri - PhD Student, Biomedical Data Science, University of Wisconsin-Madison and Morgridge Institute for Research
Description
Gene regulatory networks capture how transcription factors and other regulators control gene expression in cells. Biological pathways, such as signaling cascades or metabolic routes, describe coordinated molecular interactions that drive cellular responses to internal and external stimuli. Curated pathways and gene regulatory network models provide useful high-level views of molecular organization, but they are often incomplete and fail to capture the condition-specific interactions and regulatory processes that arise in response to particular stimuli or disease states. Both gene regulatory networks and pathways can be studied through network analysis, specifically through gene regulatory network inference methods and pathway reconstruction algorithms, which use high-throughput omics data (from genomic, proteomic, transcriptomic, or metabolomic assays) to construct and analyze condition-specific networks.
Pathway reconstruction algorithms identify plausible condition-specific pathways by combining omics datasets with prior knowledge of molecular interactions. For example, given a protein-protein interaction network and measurements of changes in protein phosphorylation or abundance in response to a treatment, these algorithms can infer a pathway that explains how the observed changes arise. Gene regulatory network inference methods identify condition-specific regulatory relationships from gene expression data. Given transcriptomic measurements, such as single-cell RNA sequencing data, these methods infer regulatory interactions between transcription factors and their target genes that explain observed expression patterns. Dozens of both pathway reconstruction and gene regulatory network inference approaches have been developed across the network biology community.
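A heavily simplified illustration of the pathway-reconstruction idea: connect "hit" proteins (e.g., those with changed phosphorylation) through a prior interaction network via shortest paths, and take the union as a candidate condition-specific pathway. The interaction edges and hit set below are toy data, and real algorithms (including those wrapped by SPRAS) use far more sophisticated objectives than shortest paths.

```python
# Toy pathway reconstruction: union of pairwise shortest paths
# between hit proteins over a prior interaction network.
from collections import deque
from itertools import combinations

EDGES = [("RAS", "RAF"), ("RAF", "MEK"), ("MEK", "ERK"),
         ("RAS", "PI3K"), ("PI3K", "AKT")]
HITS = {"RAS", "ERK"}          # proteins with observed changes

# Build an undirected adjacency map
adj: dict[str, set] = {}
for u, v in EDGES:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def shortest_path(src, dst):
    """Breadth-first search; returns the node list from src to dst."""
    prev, queue = {src: None}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nbr in adj.get(node, ()):
            if nbr not in prev:
                prev[nbr] = node
                queue.append(nbr)
    return None

subnetwork = set()
for a, b in combinations(sorted(HITS), 2):
    subnetwork.update(shortest_path(a, b) or [])
print(sorted(subnetwork))  # ['ERK', 'MEK', 'RAF', 'RAS']
```

Note how the reconstruction pulls in intermediate nodes (RAF, MEK) that were never measured as hits; recovering such hidden intermediates is a key motivation for these methods.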
Despite their potential and prevalence, pathway reconstruction algorithms and gene regulatory network inference methods can be difficult to use in practice. Fragmented software environments, inconsistent or unmaintained implementations, limited data standards, and complex parameter tuning all create barriers for new and experienced users alike. The current challenges make it difficult to select appropriate tools, manage inputs, compare outputs across methods, and interpret the resulting subnetworks. These barriers limit the broader adoption of pathway reconstruction approaches, even though the underlying methods are powerful and widely applicable. Recent efforts have begun to address these problems by developing software tools and frameworks that aim to make these types of network analysis more accessible and easier to use. SPRAS and BEELINE are software frameworks that tackle these issues for pathway reconstruction and gene regulatory networks, respectively.
This tutorial will provide participants with an introduction to pathway reconstruction and gene regulatory network inference, highlight two open source platforms (SPRAS and BEELINE) that simplify running and comparing different algorithms, and give examples of how predicted pathways and gene regulatory networks can be used to generate biological insights. It is designed for participants with entry-level computational skills, such as running command-line operations and editing YAML files, along with a basic understanding of omics data. No prior experience with pathway reconstruction algorithms or gene regulatory network inference methods is required. The tutorial will offer practical guidance for incorporating pathway reconstruction tools and gene regulatory network inference tools into real biological research. All tutorial material and the underlying software are available online with open source licenses.
Learning Objectives
By the end of this tutorial, participants will be able to:
- Understand the pathway reconstruction and gene regulatory network inference problems
- Learn about algorithms that integrate omics data with prior knowledge networks
- Prepare omics and network data for pathway reconstruction and gene regulatory network analyses, including the role of parameter choices
- Run Docker images and the Snakemake software on their local machine
- Use SPRAS and BEELINE to run, manage, compare, and evaluate multiple methods
- Apply pathway reconstruction methods and gene regulatory network inference methods to real datasets through guided hands-on exercises
- Interpret and evaluate reconstructed pathways and gene regulatory networks using real datasets through guided hands-on exercises
Intended Audience and Level
This tutorial is intended for:
- Researchers who work with high throughput omics datasets, want to infer condition-specific pathways or gene regulatory networks, and want to learn more about how biological networks can help generate hypotheses
- Bioinformaticians, computational biologists, and systems biologists interested in network-based methods
No prior experience with pathway reconstruction algorithms is required. Familiarity with omics data and basic command-line and YAML editing skills is helpful. The tutorial will include brief introductions to Docker/Apptainer, Snakemake, and Anaconda, though prior exposure to these tools is beneficial. Participants should have a laptop that already allows them to download and install software without needing administrative approval.
Schedule
| 09:00-09:30 | Software Setup and Environment Preparation. Participants who already have everything installed may join at 09:30. |
| 09:30-10:15 | Introduction to Pathway Reconstruction and Gene Regulatory Network Inference. This session introduces the conceptual foundations of pathway reconstruction and gene regulatory network inference, setting the stage for the later hands-on sessions. |
| 10:15-10:45 | SPRAS part 1 |
| 10:45-11:00 | Coffee break |
| 11:00-11:30 | SPRAS part 2 |
| 11:30-11:45 | Brief session break |
| 11:45-12:45 | Benchmarking Gene Regulatory Network Inference Using BEELINE |
| 12:45-13:00 | Open Q&A and Troubleshooting Session |
Tutorial IP6: Network Biology Workflows with Cytoscape and NDEx: Visualization, Analysis, Sharing, Automation, and Apps
Room: TBD
Date: July 12, 2026
Start Time: 14:00
End Time: 18:00
Max Participants: 100
Organizers
- John "Scooter" Morris, PhD, Executive Director, Resource for Biocomputing, Visualization, and Informatics (RBVI), University of California, San Francisco (UCSF); the “Roving Engineer” for Cytoscape, and an Adjunct Assistant Professor of Pharmaceutical Chemistry at UCSF
- Christopher Churas, Senior Software Engineer, NDEx Project and Cytoscape Core Developer since 2018 at University of California, San Diego (UCSD)
- Jing Chen, Director of Software Development in Ideker Lab at UC San Diego, Architect of NDEx Project and a Cytoscape Core Developer
Speakers
- John "Scooter" Morris, PhD, Executive Director, Resource for Biocomputing, Visualization, and Informatics (RBVI), University of California, San Francisco (UCSF); the “Roving Engineer” for Cytoscape, and an Adjunct Assistant Professor of Pharmaceutical Chemistry at UCSF
- Christopher Churas, Senior Software Engineer, NDEx Project and Cytoscape Core Developer since 2018 at University of California, San Diego (UCSD)
- Jing Chen, Director of Software Development in Ideker Lab at UC San Diego, Architect of NDEx Project and a Cytoscape Core Developer
Description
Biological networks provide a powerful framework to organize and interpret high-throughput data, from protein interactions and pathway models to disease-associated networks and knowledge graphs. Cytoscape is a widely used open-source platform for interactive network visualization and analysis, supported by an ecosystem of interoperable tools and services, including NDEx (Network Data Exchange) for publishing and sharing networks, Cytoscape Web for browser-based visualization, and automation interfaces for integrating network analysis into reproducible pipelines.
This half-day, in-person tutorial is structured as two consecutive blocks. Block 1 is geared toward participants who are new to Cytoscape and/or network biology. Through short concept overviews and guided exercises, attendees will import a simple edge list dataset, apply data-driven visual styles and layouts, retrieve a biologically relevant network from a public source (e.g., STRING or NDEx), run a basic analysis workflow, and publish results to NDEx (including exporting CX2 and Cytoscape session files) for sharing and reuse in desktop and web contexts.
Block 2 is geared toward bioinformaticians and tool/workflow developers. Participants will learn practical integration patterns (e.g., launching Cytoscape from an external application), drive Cytoscape programmatically from Python using py4cytoscape and the Cytoscape Automation/REST interfaces, and extend Cytoscape via a hands-on Service App exercise using a provided template repository and an optional AI-assisted coding demonstration. We will also briefly situate Cytoscape and NDEx within the broader network-analysis landscape (e.g., igraph/networkx, Gephi) and discuss when interactive visual analytics versus scripted workflows are most appropriate.
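Block 1 starts from a TSV edge list. As a stand-alone preview of that input format, the sketch below parses a tab-separated edge list and computes the node-degree summary that topology tools like NetworkAnalyzer report. The file content is made up; in the tutorial itself, py4cytoscape drives these steps against a running Cytoscape instance.

```python
# Parse a TSV edge list and compute a basic topology summary
# (node count, edge count, average degree) for an undirected network.
import csv
import io

# Stand-in for a two-column TSV file of interactions (hypothetical genes)
TSV = "geneA\tgeneB\ngeneA\tgeneC\ngeneB\tgeneC\ngeneC\tgeneD\n"

degree: dict[str, int] = {}
for source, target in csv.reader(io.StringIO(TSV), delimiter="\t"):
    degree[source] = degree.get(source, 0) + 1
    degree[target] = degree.get(target, 0) + 1

nodes = len(degree)
edges = sum(degree.values()) // 2   # each undirected edge counted twice
avg_degree = sum(degree.values()) / nodes
print(nodes, edges, avg_degree)  # 4 4 2.0
```

The same two-column structure is what Cytoscape's "Import Network from File" expects; additional columns become edge attributes that can drive visual styles.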
Learning Objectives
- Explain core concepts in network biology and recognize common biological network types (interaction networks, pathways, knowledge graphs).
- Import an edge list (TSV) and associated attributes into Cytoscape and apply data-driven visual styles and layouts to reveal structure in the data.
- Identify appropriate network sources (e.g., STRING, GeneMANIA, NDEx, WikiPathways, Pathway Commons) and import at least one external network for analysis.
- Run a basic network analysis workflow in Cytoscape (topology summary with NetworkAnalyzer; optional clustering/enrichment via selected apps).
- Share and publish networks and visualizations using NDEx, CX2, Cytoscape sessions, and Cytoscape Web to support reproducible research and collaboration.
- Integrate external tools and web applications with Cytoscape using practical patterns (e.g., 'Open in Cytoscape' actions and web-based viewing).
- Automate Cytoscape from Python with py4cytoscape to support reproducible, scriptable workflows and pipeline integration.
- Implement and test a minimal Cytoscape Service App from a provided template, adding a new REST endpoint and validating end-to-end behavior in Cytoscape.
Intended Audience and Level
This tutorial is designed for two complementary audiences: (i) biologists and bioinformaticians who are new to Cytoscape and want a practical introduction to network biology workflows (Block 1), and (ii) bioinformaticians, workflow authors, and tool developers who want to integrate Cytoscape/NDEx into pipelines or extend Cytoscape via apps (Block 2). Block 1 assumes general familiarity with biological datasets but does not require prior Cytoscape experience. Block 2 assumes participants can run a Jupyter notebook and make small edits to Python code. R users are welcome; we will briefly introduce RCy3 and provide pointers to analogous workflows.
Schedule
| 14:00-14:15 | Welcome; quick audience poll; what is network biology?; Cytoscape + ecosystem overview; where Cytoscape/NDEx fit among other network tools. |
| 14:15-14:45 | Import a small TSV edge list (galFiltered dataset); map node attributes to color/size; apply a layout; export a figure (PNG/SVG). |
| 14:45-15:20 | Find/import a relevant network from ONE source (e.g., STRING or NDEx); run a basic analysis pipeline (NetworkAnalyzer + optional clustering with cyCommunityDetection/clusterMaker2). |
| 15:20-15:45 | Save/share/publish: publish a network to NDEx (shareable link); save Cytoscape session; export CX2; load/view the CX2 network in Cytoscape Web. |
| 15:45-16:00 | Coffee break |
| 16:00-16:10 | Implement an 'Open in Cytoscape' action from an external application via Cytoscape Automation/cyREST; web-viewing options with Cytoscape Web. |
| 16:10-16:50 | Cytoscape automation in Python: run and modify a Jupyter notebook using py4cytoscape to load a network (e.g., from NDEx), apply style/layout, run analysis, and export CX2/PNG; discuss calling scripts from larger workflows. |
| 16:50-17:00 | LLM-assisted Cytoscape scripting via MCP or cyREST (optional for participants; no account required to follow). |
| 17:00-17:15 | Cytoscape app model overview (Desktop, Web, Service apps); App Store tour and example apps. |
| 17:15-18:00 | Build a minimal Cytoscape Service App (Python) from a template repo; add one REST endpoint (e.g., filter edges by confidence threshold); register and test end-to-end in Cytoscape. Solution branch provided. |
Tutorial IP7: Quantum Computing for Multi-omics analyses
Room: TBD
Date: July 12, 2026
Start Time: 09:00
End Time: 18:00
Max Participants: 50
Organizers
- Aritra Bose, PhD, Staff Research Scientist, IBM Research, Yorktown Heights, NY
- Filippo Utro, PhD, Senior Research Scientist, IBM Research, Yorktown Heights, NY
- Laxmi Parida, PhD, IBM Fellow, ISCB Fellow, Yorktown Heights, NY
Speakers
- Aritra Bose, PhD, Staff Research Scientist, IBM Research, Yorktown Heights, NY
- Filippo Utro, PhD, Senior Research Scientist, IBM Research, Yorktown Heights, NY
- Laxmi Parida, PhD, IBM Fellow, ISCB Fellow, Yorktown Heights, NY
Description
Single-cell and population-level multi-omics analyses are transforming our understanding of biological complexity by integrating genomics, proteomics, and transcriptomics to reveal the molecular underpinnings of disease. As these datasets grow in scale and dimensionality, emerging technologies like artificial intelligence and quantum computing (QC) are poised to overcome classical computational limits. Recent breakthroughs in QC have demonstrated remarkable potential for machine learning, optimization, and biomedical discovery, spanning applications from biomarker identification, clinical trials, to therapeutic design [1-5]. This tutorial introduces participants to the fundamentals of quantum computing and its practical application in multi-omics analysis. Through guided, hands-on exercises, participants will learn how to preprocess and encode biological data for quantum algorithms, explore quantum machine learning and tensor decomposition, and use data complexity measures to understand when quantum approaches can outperform classical ones. The session concludes with quantum–classical hybrid workflows applied to real-world multi-omics datasets, offering actionable insights into how QC can accelerate therapeutic discovery and precision medicine.
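The "encode biological data for quantum algorithms" step above commonly starts with amplitude encoding: a feature vector of length 2^n, normalized to unit length, becomes the state vector of n qubits. The sketch below shows just the classical preparation step, with made-up expression values; in the tutorial, Qiskit handles the actual circuit construction.

```python
# Amplitude encoding, classical preparation step: normalize a feature
# vector so its squared entries sum to 1 (a valid quantum state).
# Four features -> a 2-qubit state. The expression values are made up.
import math

expression = [3.0, 1.0, 1.0, 1.0]

norm = math.sqrt(sum(x * x for x in expression))
amplitudes = [x / norm for x in expression]

# A valid quantum state: squared amplitudes sum to 1
total = sum(a * a for a in amplitudes)
print(round(amplitudes[0], 4), round(total, 4))  # 0.866 1.0
```

This also shows why dimensionality matters for the preprocessing discussed above: n qubits hold 2^n amplitudes, so high-dimensional omics profiles must be reduced or padded to a power-of-two length before encoding.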
Learning Objectives
Participants will learn a new paradigm on analyzing multi-omics data with hands-on experience with a quantum computer. Specifically, the major takeaways of this tutorial would be:
- Understand the fundamentals of quantum computing, including how to implement algorithms on quantum hardware with quantum gates and circuits using Qiskit.
- Gain practical experience pre-processing multi-omics data and preparing it for a quantum hardware experiment.
- Analyze machine learning methods on multi-omics data, understand their shortcomings, and review the impact of data complexity measures on ML models.
- Apply QML and other quantum algorithms to multi-omics data for binary classification tasks and multi-omics data integration.
- Learn to design experiments for biomedical data on quantum computers by gaining in-depth knowledge of quantum-classical hybrid workflows.
- Understand when to apply QML models and how to benchmark them against classical ML models.
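The gate-and-circuit formalism covered in the first objective can be illustrated without any quantum SDK. The sketch below uses plain Python (no Qiskit; the tutorial itself uses Qiskit) to simulate a two-qubit statevector and apply a Hadamard followed by a CNOT, producing a Bell state:

```python
import math

def apply_1q(state, gate, t):
    """Apply a 2x2 gate to qubit t of a statevector (little-endian basis order)."""
    new = [0j] * len(state)
    for i in range(len(state)):
        b = (i >> t) & 1          # value of qubit t in basis state i
        j = i ^ (1 << t)          # partner basis state with qubit t flipped
        new[i] = gate[b][b] * state[i] + gate[b][1 - b] * state[j]
    return new

def apply_cnot(state, ctrl, targ):
    """Flip qubit targ in every basis state where qubit ctrl is 1."""
    new = list(state)
    for i in range(len(state)):
        if (i >> ctrl) & 1:
            new[i] = state[i ^ (1 << targ)]
    return new

H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]   # Hadamard gate

state = [1 + 0j, 0j, 0j, 0j]                  # |00>
state = apply_1q(state, H, t=0)               # superposition on qubit 0
state = apply_cnot(state, ctrl=0, targ=1)     # entangle: (|00> + |11>)/sqrt(2)
```

In Qiskit the same circuit is built with `QuantumCircuit(2)` followed by `qc.h(0)` and `qc.cx(0, 1)`; the point here is only to show what those gates do to the amplitudes.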
Intended Audience and Level
This tutorial is aimed at computational biologists, bioinformaticians, clinicians, practitioners, and data analysts, from early-career to senior researchers in healthcare and the life sciences, who are eager to explore new frontiers of computational biology. The prerequisites are few:
- Create an IBM Quantum account on the IBM Quantum Learning website: click "Create an IBMid" and follow the instructions.
- Watch the Qiskit Global Summer School videos – QML 2021 (optional).
- Entry-level knowledge of multi-omics data analyses and machine learning concepts.
- Review https://github.com/IBM/QBioCode for the tutorial content from ISMB 2024 and 2025.
Schedule
| 09:00-09:15 | Introduction |
| 09:15-10:15 | Quantum computing fundamentals with Qiskit |
| 10:15-10:45 | Classical machine learning applications in multi-omics data |
| 10:45-11:00 | Coffee Break |
| 11:00-12:00 | Current state of quantum algorithms for multi-omics data analysis |
| 12:00-13:00 | Data complexity measures and learning algorithms |
| 13:00-14:00 | Lunch Break |
| 14:00-14:30 | Quantum-classical hybrid framework for machine learning applications in multi-omics |
| 14:30-16:00 | Implement your own quantum-classical machine learning hybrid framework |
| 16:00-16:15 | Coffee Break |
| 16:15-17:15 | Reviewing results from the practical session |
| 17:15-17:45 | Interactive Q&A |
| 17:45-18:00 | Concluding Remarks |
Tutorial IP8: Metagenomic sequence analysis using k-mer based methods
Room: TBD
Date: July 12, 2026
Start Time: 09:00
End Time: 18:00
Max Participants: 50
Organizers
- Shayesteh Arasti, Department of Computer Science and Engineering, UC San Diego
- David Koslicki, Associate Professor of Computer Science & Biology, The Pennsylvania State University
- Ben Langmead, Professor of Computer Science, Johns Hopkins University
- Siavash Mirarab, Professor of Electrical and Computer Engineering, UC San Diego
- Ali Osman Berk Sapci, Bioinformatics and Systems Biology Program, UC San Diego
- Fengzhu Sun, Professor of Quantitative and Computational Biology and Mathematics, USC
- Yun William Yu, Associate Professor of Computational Biology, School of Computer Science, Carnegie Mellon University
Speakers
- Shayesteh Arasti, Department of Computer Science and Engineering, UC San Diego
- David Koslicki, Associate Professor of Computer Science & Biology, The Pennsylvania State University
- Ben Langmead, Professor of Computer Science, Johns Hopkins University
- Siavash Mirarab, Professor of Electrical and Computer Engineering, UC San Diego
- Ali Osman Berk Sapci, Bioinformatics and Systems Biology Program, UC San Diego
- Fengzhu Sun, Professor of Quantitative and Computational Biology and Mathematics, USC
- Yun William Yu, Associate Professor of Computational Biology, School of Computer Science, Carnegie Mellon University
Description
Modern metagenomics has greatly benefited from k-mer–based methods due to their scalability and accuracy on large and complex datasets. A variety of methods with somewhat different goals and philosophies fit under the broad banner of “k-mer–based”, and this breadth can be difficult for users to navigate. This tutorial provides a comprehensive overview of k-mer–based approaches for analyzing metagenomic samples, focusing on both methodological foundations and practical applications. The tutorial brings together developers of widely used tools to present complementary perspectives on tasks such as taxonomic profiling and classification, phylogenetic placement, and virus identification, together with tips on downstream analysis and comparison of metagenomic samples. Fundamentally, these methods use k-mers to avoid alignment; however, they differ in the specific computational tasks they address and in their approaches.
Participants will learn about core concepts underlying both short and long k-mer–based analysis, including sketching, minimizers, and distance estimation, and how choices affect accuracy, resolution, and computational efficiency. They will learn about methods that analyze each read, then summarize results, methods that jointly analyze all reads of a sample, and methods that bridge the two approaches. Hands-on exercises will allow participants to apply the tools they are interested in to example datasets. Emphasis will be placed on understanding the strengths, limitations, and appropriate applications of each approach.
By the end of the tutorial, participants will gain a sufficient understanding of the key differences between approaches and the trade-offs involved. They will learn how to evaluate which approaches are most suitable for specific metagenomic tasks, and gain insights for designing analyses for their own applications and datasets.
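As a concrete illustration of the sketching and distance-estimation ideas mentioned above, the following minimal Python sketch builds a bottom-s MinHash of a sequence's k-mer set and estimates Jaccard similarity from two such sketches. This is a simplification of what tools like sourmash do (which use FracMinHash, scaled sketches, and abundance tracking); the hash function and parameters here are illustrative only.

```python
import hashlib

def kmers(seq, k):
    """The set of k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def h(kmer):
    """Stable 64-bit hash of a k-mer."""
    return int.from_bytes(hashlib.blake2b(kmer.encode(), digest_size=8).digest(), "big")

def bottom_sketch(seq, k, s):
    """Bottom-s MinHash sketch: the s smallest k-mer hash values."""
    return sorted(h(m) for m in kmers(seq, k))[:s]

def jaccard_estimate(sk_a, sk_b, s):
    """Estimate Jaccard similarity of two k-mer sets from their bottom-s sketches."""
    union = set(sorted(set(sk_a) | set(sk_b))[:s])   # bottom-s of the sketch union
    shared = len(union & set(sk_a) & set(sk_b))      # hashes seen in both sketches
    return shared / len(union)
```

For example, a sketch compared against itself estimates Jaccard 1.0, and sketches of sequences sharing no k-mers estimate 0.0; with larger s, the estimate for partially overlapping sequences converges to the true Jaccard index.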
Learning Objectives
- Understanding core algorithmic concepts underlying k-mer–based metagenomic analysis, including sketching, minimizers, and distance and containment calculations.
- Understanding differing problem formulations related to metagenomic characterization.
- Comparing major classes of k-mer–based methods for those metagenomic tasks and understanding differences between approaches, such as per-read taxonomic classification, sketching-based profiling, containment analysis, and distance estimation.
- Learning how different approaches and parameter choices affect accuracy, resolution, and scalability for each specific method.
- Learning how (and how not) to interpret the outputs of different tools and integrate results into larger metagenomic analysis frameworks.
- Learning how to capture important biological signals from potentially noisy and often complex results in downstream analyses by consolidation of noisy read-level assignments, detection of the presence or absence of novel taxa, and identification of differentially abundant taxa.
- Selecting appropriate tools for specific metagenomic applications based on dataset characteristics (e.g., read length, sequencing depth, complexity, presence of novel taxa) and computational constraints.
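One of the core concepts in the objectives above, minimizers, can be sketched in a few lines: select one representative k-mer per window of w consecutive k-mers, so that overlapping reads sample the same positions. Real tools order k-mers by a hash value rather than lexicographically; this toy version uses lexicographic order for readability.

```python
def minimizers(seq, k, w):
    """(k-mer, position) minimizers of seq: the smallest k-mer in each window
    of w consecutive k-mers, deduplicating consecutive repeats."""
    selected = []
    n_kmers = len(seq) - k + 1
    for start in range(n_kmers - w + 1):
        window = [(seq[i:i + k], i) for i in range(start, start + w)]
        best = min(window)               # smallest k-mer; ties go to the leftmost
        if not selected or selected[-1] != best:
            selected.append(best)
    return selected
```

Because adjacent windows usually share their minimum, far fewer k-mers are retained than exist in the sequence, which is what makes minimizer-based indexes compact.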
Intended Audience and Level
This tutorial is intended for both practitioners (e.g., biologists and bioinformaticians working on metagenomic data) and method developers interested in metagenomic sequence analysis. Participants are expected to have prior exposure to metagenomic data and workflows, as well as proficiency with the Unix-based command line. Some familiarity with computational concepts is helpful but not required.
Schedule
| 09:00-09:15 | Siavash Mirarab: Introduction, schedule, and logistics |
| 09:15-09:45 | Siavash Mirarab: Background, overview of different problems and k-mer-based approaches |
| 09:45-10:45 | Ben Langmead: Taxonomic classification using Kraken2 |
| 10:45-11:00 | Coffee Break |
| 11:00-12:00 | Fengzhu Sun: Phage identification using VirFinder and phage-host interaction using d2* |
| 12:00-13:00 | David Koslicki: k-mer sketching for fast metagenomic analysis using sourmash and FMH |
| 13:00-14:00 | Lunch Break |
| 14:00-15:00 | Yun William Yu: Abundance profiling using Sylph and ANI calculation using skani |
| 15:00-16:00 | Ali Osman Berk Sapci: Estimating distances from reads to genomes and phylogenetic placement using krepp |
| 16:00-16:15 | Coffee Break |
| 16:15-17:15 | Shayesteh Arasti: Consolidating read placements into a few phylogenetic placements using DecoDiPhy |
| 17:15-17:30 | Siavash Mirarab: Discussion of applications & closing remarks |
| 17:30-17:40 | Audience survey on the effectiveness of the course |
Tutorial IP9: Learning models of molecular sequence recognition from NGS data using biophysical machine learning
Room: TBD
Date: July 12, 2026
Start Time: 09:00
End Time: 18:00
Max Participants: 30
Organizers
- H. Tomas Rube, Assistant Professor, University of California, Merced
- Harmen J. Bussemaker, Professor, Columbia University
- Lucas A. N. Melo, PhD Candidate, Stanford University
- Shaoxun Liu, PhD Candidate, Columbia University
Speakers
- H. Tomas Rube, Assistant Professor, University of California Merced
- Harmen J. Bussemaker, Professor, Columbia University
- Lucas A. N. Melo, PhD Candidate, Stanford University
- Shaoxun Liu, PhD Candidate, Columbia University
Description
Biophysical machine learning is an emerging approach for analyzing data from high-throughput genomics assays based on next-generation sequencing (NGS). In this tutorial we will provide the conceptual framework and hands-on experience to analyze such data. The examples will be focused on the DNA binding specificity of transcription factors, but the approach can be used to characterize many other classes of proteins. There will be three modules, each of which combines a short lecture with a hands-on exercise. The first module focuses on how sequence specificity can be represented in a biophysically meaningful way, and reviews the relationships between binding affinity, binding free energy, and on/off rates, along with a discussion of symmetries and invariances that can hamper interpretation if not properly taken into account; the hands-on exercise will focus on how the impact of non-coding variants can be predicted. In the second module we will discuss how sequence-to-affinity models can be used to understand input-to-output enrichment patterns associated with selection assays such as SELEX-seq and ChIP-seq, and how to interpret these assays biophysically. We will also discuss how this framework naturally leads to a principled approach to inferring the sequence-to-affinity models from raw NGS data associated with the selection assays. During the last module, we will give a broader overview of applications of this approach, including chromatin accessibility data (ATAC-seq) and assays of kinase specificity and peptide binding specificity based on protein display libraries. Participants are encouraged to bring their own NGS datasets and get help getting started on their analysis, but they can also work on predefined exercises.
Learning Objectives
Participants will learn how to use biophysical machine learning to analyze high-throughput sequencing data and build accurate sequence-to-function models. Following the tutorial, students will be able to:
- Obtain an overview of different experimental methods that generate training data suitable for quantitative free-energy regression, including ChIP-seq, (Epi)-SELEX-seq, RNA Bind-n-Seq, and protein-display screens.
- Obtain an overview of different types of protein-sequence interactions. These proteins include transcription factors, RNA-binding proteins, nucleases, kinases, and peptide-binding domains; the sequences include normal and chemically modified DNA and RNA, as well as normal and phosphorylated peptides.
- Formulate linear and non-linear models mapping sequence to binding affinity or enzyme efficiency.
- Know how to pre-process high-throughput sequencing reads into data tables amenable to free-energy regression.
- Understand how to model multiple sequencing libraries and the relationship between this modeling and parameter identifiability.
- Gain experience implementing free-energy regression using the ProBound (and PyProbound) machine learning framework.
- Know how to display sequence-to-affinity models as logos and learn how to scan biological sequences for interaction sites.
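To make the additive-model and sequence-scanning objectives concrete, here is a minimal sketch (not ProBound itself) of an additive binding free-energy model: each position of a site contributes an independent ΔΔG term, a Boltzmann factor converts energy to relative affinity, and a sliding window scans a longer sequence for high-affinity sites. The coefficient values below are invented purely for illustration.

```python
import math

# Hypothetical per-position free-energy penalties (in kT units) relative to
# the optimal base at each of 4 positions; 0.0 marks the preferred base.
ddg = [
    {"A": 0.0, "C": 1.2, "G": 0.9, "T": 1.5},
    {"A": 1.1, "C": 0.0, "G": 1.4, "T": 0.8},
    {"A": 0.7, "C": 1.3, "G": 0.0, "T": 1.0},
    {"A": 1.6, "C": 0.6, "G": 1.1, "T": 0.0},
]

def site_ddg(site):
    """Additive free energy of one binding site (kT units)."""
    return sum(ddg[p][base] for p, base in enumerate(site))

def relative_affinity(site):
    """Boltzmann weight relative to the optimal site: K(site)/K(optimal)."""
    return math.exp(-site_ddg(site))

def scan(seq):
    """Slide the model along a sequence; return (offset, relative affinity) pairs."""
    k = len(ddg)
    return [(i, relative_affinity(seq[i:i + k])) for i in range(len(seq) - k + 1)]
```

Here the optimal site is "ACGT" (all-zero penalties), so `scan("AACGTT")` peaks at offset 1. The per-position coefficients are exactly what gets rendered as an energy logo, and the mononucleotide model shown can be extended with dinucleotide terms as in the tutorial's frameworks.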
Intended Audience and Level
This tutorial is designed for participants at a wide range of levels, including graduate students, early-career researchers, and industry scientists. For students who come with a sequencing dataset, the course will provide all the skills needed to formulate a free-energy regression problem and implement the appropriate analysis. For students with a theoretical background, the course will also examine the connections between free-energy regression and neural networks such as CNNs. To engage efficiently with the hands-on activities, we recommend that students have basic familiarity with programming and the Python programming language.
The slides and materials will be available online after the course through a dedicated GitHub repository.
Schedule
| 09:00-09:45 | Welcome, overview of biophysical machine learning. |
| 09:45-10:00 | Short break & discussion. |
| 10:00-10:45 | Lecture 1: Biophysical representations of sequence-to-function relationships and the compatibility with CNNs. |
| 10:45-11:00 | Coffee break. |
| 11:00-12:00 | Exercise 1: Scoring biological sequences, displaying additive free-energy models as sequence logos, interpreting coefficients. |
| 12:00-13:00 | Lecture 2: High-throughput experiments for quantifying sequence recognition, how modeling enables reconstruction of thermodynamics, setting up a biophysical supervised learning problem. |
| 13:00-14:00 | Lunch break. |
| 14:00-16:00 | Exercise 2: Pre-processing sequencing reads to a data table amenable to free-energy regression, learning well-calibrated TF binding models. |
| 16:00-16:15 | Coffee break. |
| 16:15-17:00 | Lecture 3: Overview of strategies adopted by different research groups, TF-TF cooperativity, learning from complementary datasets. |
| 17:00-17:45 | Exercise 3: A smorgasbord of applications. |
| 17:45-18:00 | Wrap up and Q&A. |
Tutorial VT1: Genomic LLMs in Practice: A Hands-On Introduction with Hugging Face
Room: Virtual
Date: July 6 & 7, 2026 (Full Day tutorial split across two days)
Start Time: 09:00
End Time: 13:00
Max Participants: 30
Organizers
- Megha Hegde – PhD Researcher, Kingston University London
- Dr Farzana Rahman – MSc Data Science Course Leader, Kingston University London
- Professor Jean-Christophe Nebel, Professor of Computer Science, Kingston University London, UK
- Dr Ragothaman M. Yennamalli - Senior Assistant Professor, SASTRA Deemed to be University, India
- Shashank Ravichandran - Member of Technical Staff, Athenahealth, India
Speakers
- Megha Hegde – PhD Researcher, Kingston University London
- Dr Farzana Rahman – MSc Data Science Course Leader, Kingston University London
- Professor Jean-Christophe Nebel, Professor of Computer Science, Kingston University London, UK
- Dr Ragothaman M. Yennamalli - Senior Assistant Professor, SASTRA Deemed to be University, India
- Shashank Ravichandran - Member of Technical Staff, Athenahealth, India
Description
Though originally developed for natural language processing, large language models (LLMs) are also well suited to modelling the language of life. DNA sequences can be treated as sentences over a four-letter alphabet, with overlapping k-mer segments acting as “words”, hence enabling natural-language approaches to extracting biological information. Transformer-based LLMs are especially well-suited to this task, using multi-head attention to capture long-range interactions within genomic sequences, and avoiding the vanishing gradient problems common in recurrent networks. Contemporary genomic LLMs, such as DNABERT and its successors, leverage this Transformer architecture in conjunction with specialised tokenisation methods to represent DNA and, after fine-tuning, have proven effective on tasks such as enhancer classification and splice site annotation. Recent work extends these models to multimodal omics data, integrating genomic sequences with transcriptomics and spatial data to tackle a whole new range of tasks. Such models have made strides in areas such as variant effect prediction, with DeepMind’s AlphaMissense achieving 89% accuracy in classifying missense variant effects by combining structural context with evolutionary conservation data.
This full-day tutorial will provide attendees with a theoretical and practical foundation in LLMs for genomics research. Instructors will explain why Transformers are well-suited to modelling DNA, review tokenisation strategies (single-nucleotide, k-mer, and byte-pair encoding), and discuss how to deal with long-context genomic data. Participants will gain hands-on experience in loading genomic datasets, tokenising sequences, and using pre-trained models from Hugging Face for inference and fine-tuning on DNA variant effect prediction problems. A guided walkthrough of the Hugging Face Trainer API will cover setting training arguments, monitoring loss and applying early stopping, and saving fine-tuned models. The tutorial will conclude with a session where attendees will work through a mini-project applying these skills, with instructors on hand to answer questions and provide guidance. By the end of this tutorial, participants will understand the principles behind genomic LLMs and be able to implement state-of-the-art Transformer-based models for DNA variant effect prediction.
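The k-mer "words" idea from the description can be sketched in a few lines. The toy tokenizer below mimics DNABERT-style overlapping k-mer tokenization with a fixed 4^k vocabulary plus BERT-like special tokens; in the tutorial itself this step is handled by each model's own Hugging Face tokenizer, and newer genomic LLMs use byte-pair encoding instead.

```python
from itertools import product

def kmer_tokenize(seq, k=6, stride=1):
    """Split a DNA sequence into overlapping k-mer 'words' (DNABERT-style)."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]

def build_vocab(k):
    """Fixed vocabulary: BERT-like special tokens followed by all 4**k k-mers."""
    specials = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    return {tok: i for i, tok in enumerate(specials + all_kmers)}

def encode(seq, k=6):
    """Map a sequence to token ids, wrapped in [CLS] ... [SEP]."""
    vocab = build_vocab(k)
    tokens = ["[CLS]"] + kmer_tokenize(seq, k) + ["[SEP]"]
    return [vocab.get(t, vocab["[UNK]"]) for t in tokens]
```

For instance, `kmer_tokenize("ACGTAC", k=3)` yields `["ACG", "CGT", "GTA", "TAC"]`, so a sequence of length L produces L - k + 1 overlapping tokens, which is why context length becomes a concern for long genomic inputs.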
Learning Objectives
- Explain why Transformer-based LLMs are well-suited to modelling DNA sequences.
- Understand how Transformer-based LLMs model DNA sequences.
- Be able to implement state-of-the-art Transformer-based genomic LLMs from Hugging Face for DNA variant effect prediction problems.
Intended Audience and Level
This tutorial is aimed at academics and researchers who work with genomics data and would like to integrate LLMs into their workflow. It will be delivered at a level appropriate for beginners to LLMs and will help learners upskill to an intermediate level.
Prerequisites: Good knowledge of Python and at least basic knowledge of machine learning/deep learning. No previous experience with LLMs or Transformer-based models is required.
Schedule
| 09:00-10:00 | Introduction to Genomic LLMs - Jean-Christophe Nebel / Ragothaman Yennamalli |
| 10:00-10:45 | Introduction to Hugging Face for Genomic LLMs - Megha Hegde / Farzana Rahman |
| 10:45-11:00 | Coffee Break |
| 11:00-12:00 | Training Genomic LLMs - Megha Hegde / Farzana Rahman |
| 12:00-13:00 | Training Genomic LLMs Using PyTorch and the Hugging Face Trainer API - Megha Hegde / Shashank Ravichandran |
| 13:00-14:00 | Lunch Break |
| 14:00-15:00 | Training Genomic LLMs Using PyTorch and the Hugging Face Trainer API (continued) - Megha Hegde / Shashank Ravichandran |
| 15:00-15:15 | Introduction to Workshop - Megha Hegde / Farzana Rahman: Introduction to and explanation of a hands-on activity for participants using the skills acquired in the morning sessions |
| 15:15-16:00 | Workshop: Try It Yourself - Shashank Ravichandran / Megha Hegde |
| 16:00-16:15 | Coffee Break |
| 16:15-17:15 | Workshop: Try It Yourself (continued) - Shashank Ravichandran / Megha Hegde |
| 17:15-18:00 | Recap & Reflection - Farzana Rahman / Ragothaman Yennamalli |
Tutorial VT2: Building Interactive Visualizations of Single-Cell and Spatial Data in Python
Room: Virtual
Date: July 6, 2026
Start Time: 14:00
End Time: 18:00
Max Participants: 40
Organizers
- Mark Keller, Postdoctoral Research Fellow, Harvard Medical School
- Eric Mörth, Postdoctoral Research Fellow, Harvard Medical School
- Nils Gehlenborg, Associate Professor, Harvard Medical School
Speakers
- Mark Keller, Postdoctoral Research Fellow, Harvard Medical School
- Eric Mörth, Postdoctoral Research Fellow, Harvard Medical School
- Nils Gehlenborg, Associate Professor, Harvard Medical School
Description
As single-cell and spatial biology datasets grow larger and more complex, with more modalities and types of downstream analysis results, tasks such as visualization also become more difficult. Using interactive techniques can help to alleviate challenges, enabling users to navigate to regions of interest in large spatial datasets, select genes of interest among all of those profiled, and select particular cells or cell types for follow-up analyses. However, creating such interactive visualizations often requires using unfamiliar tools, and it can be challenging to switch between producing static and interactive outputs. Once created, sharing interactive visualizations with collaborators can be challenging, as the corresponding data and software (as opposed to plain image files) must be made accessible to the visualization recipients.
The Vitessce visualization framework for single-cell and 2D/3D spatial biology data (https://vitessce.io/) provides building-blocks to compose scalable and interactive visualizations and share them on the web, with support for common data formats defined in the Scverse and Open Microscopy Environment (OME) ecosystems. Vitessce is deployed in numerous data portals to provide thousands of single-cell and spatial data visualizations, e.g., in the NIH HuBMAP Data Portal, in the NIH Cellular Senescence Data Portal, in the NIH Kidney Precision Medicine Project Atlas, and many other academic and commercial resources (see https://vitessce.io/docs/showcase/).
In Vitessce, multiple interactive visualizations can be linked together to construct applications that display information from multiple data types and modalities. The EasyVitessce Python package is designed to enable computational biologists to quickly author Vitessce-based visualizations with their own data using familiar Scverse plotting syntax.
In our tutorial, we introduce core concepts of single-cell and spatial biology data visualizations and illustrate how they can be applied through hands-on training with Vitessce and EasyVitessce in Jupyter Notebooks.
Learning Objectives
- Understand the importance of interactive visualization during analysis of single-cell and 2D/3D spatial biology data.
- Learn approaches to create interactive visualizations of different data types for single-cell, and 2D/3D spatial biology data using Python and the Vitessce framework.
- Learn approaches to enhance the interactivity of visualizations by using multiple views and coordination of different properties.
- Learn approaches to deploy and share interactive visualizations on the web.
- Learn about open problems and innovations in visualization for single-cell and spatial datasets.
Intended Audience and Level
This tutorial targets an audience familiar with Python and single-cell data analysis using tools in the Scverse ecosystem, such as AnnData and Scanpy. Audience members should be comfortable setting up Python environments, installing packages and managing dependencies, and working in Jupyter notebooks. (Audience members are not expected to have experience with creation of interactive visualizations.)
Schedule
| 09:00-09:40 | Introduction (40 mins) |
| 09:40-10:15 | Hands-on Session 1: Single Spatial View (50 mins) |
| 10:15-10:30 | Coffee Break |
| 10:30-11:00 | Hands-on Session 2: Multiple Views (30 mins) |
| 11:00-11:45 | Hands-on Session 3: Available View Types (45 mins) |
| 11:45-12:00 | Coffee Break |
| 12:00-12:30 | Hands-on Session 4: Multiple Coordinated Views (30 mins) |
| 12:30-12:50 | Hands-on Session 5: Share Your Visualizations (20 mins) |
| 12:50-13:00 | Closing |
Tutorial VT3: Hello Nextflow: Getting started with workflows for bioinformatics
Room: Virtual
Date: July 6, 2026
Start Time: 09:00
End Time: 18:00
Max Participants: N/A
Organizers
- Geraldine Van der Auwera, PhD
Speakers
- Geraldine Van der Auwera, PhD
Description
Nextflow is a powerful and flexible open-source workflow management system that simplifies the development, execution, and scalability of data-driven computational pipelines. It is widely used in bioinformatics and related scientific fields to automate complex analyses, making it easier to manage and reproduce large-scale data analysis workflows.
Hello Nextflow is intended as a “getting started” course for students and early-career researchers who are completely new to Nextflow. The tutorial aims to equip participants with foundational knowledge and skills in three key areas: (1) understanding the logic of how data analysis workflows are constructed, (2) Nextflow language proficiency, and (3) command-line interface (CLI) execution.
Through guided, goal-oriented exercises, participants will learn to:
- Use core components of the Nextflow language to construct simple multi-step workflows effectively.
- Launch Nextflow workflows locally, navigate output directories to access results, interpret log outputs for insights into workflow execution, and troubleshoot basic issues that may arise during workflow execution.
By the end of the tutorial, participants will be well-prepared for tackling the next steps in their journey to develop and apply reproducible workflows for their scientific computing needs. Additional study-at-home materials will be provided for them to continue learning and developing their skills further.
The training materials are open-source and freely available on the Nextflow training portal at https://training.nextflow.io/latest/hello_nextflow
Learning Objectives
This tutorial aims to teach foundational skills for building and running pipelines with Nextflow:
- Launch a Nextflow workflow locally
- Find and interpret outputs (results) and log files generated by Nextflow
- Troubleshoot common issues
- Utilize core Nextflow components sufficient to build a simple multi-step workflow
- Describe next-step concepts such as channels and operators
- Configure pipeline execution to run on a variety of common computing platforms including HPC and cloud
- Apply best practices for reproducibility, portability and code re-use that make pipelines FAIR, including code modularity and software containers
Intended Audience and Level
This tutorial is designed for learners who are completely new to Nextflow. Some familiarity with the command line, basic scripting concepts and common file formats is assumed. The exercises are all domain-agnostic, so no prior scientific knowledge is required.
Schedule
| 09:00-09:10 | Welcome & introductions |
| 09:10-09:50 | Hello World |
| 09:50-10:30 | Hello Channels: Introduction to Nextflow channels and operators for processing large or complex inputs and parallelizing execution effortlessly. |
| 10:30-10:45 | Coffee Break |
| 10:45-11:45 | Hello Workflow: Expanding the use of channels to chaining multiple steps together and handling transfer of data between steps. |
| 11:45-12:00 | Morning wrap-up: Recap and open Q&A |
| 12:00-13:00 | Lunch Break |
| 13:00-13:30 | Hello Modules: Applying code modularity principles to increase reusability and decrease maintenance burden. |
| 13:30-14:30 | Hello Containers: Using containers as a mechanism for managing software dependencies in the context of reproducible bioinformatics workflows. |
| 14:30-14:45 | Coffee Break |
| 14:45-15:35 | Hello Config: Setting up and managing a pipeline’s configuration to customize its behavior, adapt it to different environments, and optimize resource usage. |
| 15:35-16:00 | Afternoon wrap-up: Recap, open Q&A and next steps |
Tutorial VT4: Multimodal Integration and Multimodal Causal Inference using R/Bioconductor
Room: Virtual
Date: July 6, 2026
Start Time: 09:00
End Time: 18:00
Max Participants:
Organizers
- Himel Mallick, Cornell University
Speakers
- Himel Mallick, Cornell University
- Saptarshi Roy, Texas A&M University
- Sreya Sarkar, University of Iowa
Description
Biological and biomedical studies increasingly generate multimodal datasets that combine multiple molecular layers, such as genomics, transcriptomics, epigenomics, microbiome profiles, metabolomics, imaging data, and clinical or environmental measurements. While the availability of such data creates new opportunities to study complex biological systems, it also raises fundamental analytical challenges. Researchers must decide how to represent heterogeneous data types, how to integrate signals across modalities, and how to move beyond association toward mechanistic understanding.
Many existing multimodal analysis workflows focus primarily on predictive performance or exploratory associations, often treating integration as a purely technical exercise. However, in many biological and clinical applications, the central goal is not only prediction but also understanding how and why an exposure, perturbation, or intervention affects an outcome through multiple biological layers. Addressing such questions requires principled approaches to multimodal integration that are compatible with causal reasoning and uncertainty quantification.
This tutorial aligns directly with the ISMB Tutorials program by providing hands-on, educational training in multimodal integration and multimodal causal inference using open-source tools in R and Bioconductor. The emphasis is on methodological foundations, reproducible workflows, and balanced comparison of widely used integration strategies rather than on presenter-specific software. By combining practical instruction on multimodal data representation with causal mediation concepts, the tutorial fills an important gap for ISMB participants seeking to translate integrated multi-omics analyses into biologically interpretable and mechanistically grounded insights.
Learning Objectives
By the end of this tutorial, participants will be able to:
- Represent heterogeneous multimodal datasets using standard Bioconductor data structures.
- Distinguish between early, intermediate, and late fusion strategies and understand their practical implications.
- Apply diagonal, horizontal, and vertical integration paradigms to common biological study designs.
- Use latent factor and representation learning models to identify shared and modality-specific signals.
- Formulate and interpret multimodal causal mediation analyses involving multiple biological layers.
- Assess uncertainty and interpret effect estimates in high-dimensional, multimodal settings.
- Construct reproducible end-to-end multimodal analysis workflows in R.
Intended Audience and Level
This tutorial is intended for computational biologists, bioinformaticians, statisticians, and data scientists who work with multimodal biological data. It is particularly relevant for researchers interested in integrating multiple data layers to study biological mechanisms or treatment effects. Participants are expected to have basic familiarity with R and standard statistical concepts. No prior experience with causal inference, mediation analysis, or advanced multimodal methods is required.
Schedule
| 09:00-09:30 | Introduction and Motivation: Overview of multimodal biological data and motivating examples from multi-omics, microbiome, imaging, and clinical studies. Discussion of why integration is necessary, common pitfalls of single-modality analyses, and the distinction between associative and causal questions in multimodal research. |
| 09:30-10:15 | Multimodal Data Structures and Study Design: Principles for representing multimodal datasets in R/Bioconductor. Introduction to SummarizedExperiment and MultiAssayExperiment, sample alignment, partially observed modalities, and metadata organization. Emphasis on how study design choices affect downstream integration and inference. |
| 10:15-10:45 | Preprocessing and Harmonization Across Modalities: Practical considerations for preprocessing heterogeneous data types, including normalization, scaling, feature filtering, and handling missing data. Discussion of modality-specific preprocessing versus joint harmonization strategies. |
| 10:45-11:00 | Coffee Break |
| 11:00-11:45 | Multimodal Integration Paradigms: Overview of early, intermediate, and late fusion strategies. Introduction to diagonal, horizontal, and vertical integration paradigms, with examples illustrating when each approach is appropriate in biological studies. |
| 11:45-12:30 | Latent Factor and Representation Learning Models: Introduction to latent factor models and representation learning approaches for multimodal data. Discussion of shared versus modality-specific structure, interpretability, and the role of dimensionality reduction in integrated analyses. |
| 12:30-13:30 | Lunch Break |
| 13:30-14:15 | Foundations of Causal Inference for Multimodal Data: Review of core causal inference concepts relevant to multimodal studies, including potential outcomes, confounding, and identifiability. Discussion of how causal questions arise naturally in multi-omics and systems biology settings. |
| 14:15-15:00 | Multimodal Causal Mediation Analysis: Extension of causal mediation analysis to multimodal settings with multiple, high-dimensional mediators across biological layers. Discussion of assumptions, challenges, and interpretation of mediation effects in integrated analyses. |
| 15:00-15:15 | Afternoon Break |
| 15:15-16:15 | Hands-On Multimodal Integration Workflow: Hands-on session guiding participants through a complete multimodal analysis pipeline using a real multi-omics dataset. Participants will perform data import, preprocessing, integration, and latent representation learning using reproducible R workflows. |
| 16:15-17:00 | Hands-On Multimodal Causal Mediation Analysis: Continuation of the hands-on analysis focusing on causal mediation. Participants will implement mediation analyses, examine uncertainty in effect estimates, and interpret results in a biological context. |
| 17:00-17:30 | Best Practices, Assumptions, and Common Pitfalls: Synthesis of best practices for multimodal integration and causal analysis. Discussion of common pitfalls, sensitivity considerations, and how to communicate causal findings responsibly. |
| 17:30-18:00 | Open Discussion, Q&A, and Wrap-Up: Open discussion with participants, questions, and feedback. Summary of key takeaways and pointers to additional resources and training materials. |
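To make the mediation concepts above concrete: under linear structural models and the usual no-unmeasured-confounding assumptions, the natural indirect effect of a treatment through a single mediator is the product of the treatment-to-mediator and mediator-to-outcome coefficients. The minimal NumPy simulation below (illustrative only, not part of the tutorial materials) recovers both effects from simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulate treatment T -> mediator M -> outcome Y, plus a direct T -> Y path.
T = rng.binomial(1, 0.5, n).astype(float)
M = 0.8 * T + rng.normal(0, 1, n)             # a = 0.8 (T -> M path)
Y = 0.5 * T + 1.2 * M + rng.normal(0, 1, n)   # c' = 0.5 (direct), b = 1.2 (M -> Y)

# OLS slopes via least squares: regress M on [1, T], then Y on [1, T, M].
a = np.linalg.lstsq(np.column_stack([np.ones(n), T]), M, rcond=None)[0][1]
coef = np.linalg.lstsq(np.column_stack([np.ones(n), T, M]), Y, rcond=None)[0]
direct, b = coef[1], coef[2]

# Natural indirect effect under linear models is the product a * b.
indirect = a * b
print(f"direct ~= {direct:.2f}, indirect ~= {indirect:.2f}")  # true values: 0.5 and 0.96
```

With high-dimensional mediators across omics layers, the same estimand requires regularized or latent-variable extensions, which is exactly the gap the afternoon sessions address.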
Room: Virtual
Date: July 7, 2026
Start Time: 14:00
End Time: 18:00
Max Participants: 50
Organizers
- Tawaun Lucas - Senior Computational Biologist, Biohub
- Meg Urisko - Staff User Experience Researcher-Single Cell, Biohub
Speakers
- Tawaun Lucas - Senior Computational Biologist, Biohub
- Meg Urisko - Staff User Experience Researcher-Single Cell, Biohub
Description
Single-cell RNA sequencing (scRNA-seq) has become an indispensable tool for studying biology. While computational biologists are adept at making complex processing decisions, handling large file sizes, and wrangling high-dimensional data, this reliance on specialists often creates a bottleneck when collaborating with experimentalists.
This tutorial presents a "hybrid" collaborative workflow designed to democratize access to single-cell genomics and accelerate discovery. Participants will learn to leverage the AI Workspace, a no-code web platform developed by the Biohub that integrates automated workflows and AI models. We will demonstrate how to deploy automated pipelines, including the ‘Standard’, ‘Comparison’, and ‘Co-embedding’ workflows to handle heavy-duty processing, quality control, and annotation without writing code.
Specifically, we will utilize state-of-the-art models integrated into the platform, such as scVI, a variational autoencoder with pre-trained models for human and mouse data that supports cell type prediction, and TranscriptFormer, a transformer-based model that predicts gene-gene interactions. Crucially, this tutorial balances automation with customization. We will guide participants through data “handoff”: exporting processed, annotated AnnData objects (H5AD) from the platform into a variety of tools for visualization and deeper analysis. This session aims to facilitate no-code analyses for biologists and to enable computational biologists to set up accessible, interactive environments for their collaborators, while retaining the flexibility of code for final analysis and visualization.
Learning Objectives
By the end of this tutorial, participants will be able to:
- Deploy Automated Pipelines: Execute standard scRNA-seq workflows (QC, normalization, dimensionality reduction) using a no-code interface to rapidly process raw counts.
- Apply Generative AI Models: Utilize advanced AI models (scVI for cell type prediction and TranscriptFormer for embedding generation) without requiring deep machine learning expertise.
- Facilitate Data Exploration: Set up interactive environments based on CELLxGENE Explorer that allow non-coding collaborators to explore embeddings and gene expression independently.
- Bridge to Custom Code: Export processed objects (H5AD) and import them into software of choice.
- Create Custom Visualizations: Write scripts to generate publication-ready figures (e.g., custom UMAPs, dot plots) that leverage the AI-generated annotations.
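To illustrate the “handoff” objective, the sketch below mimics the shape of an exported AnnData object with a plain dataclass (an illustrative stand-in only; in practice the real object is loaded with `anndata.read_h5ad`, and the field names `X`, `obs`, and `obsm` follow AnnData's conventions):

```python
import numpy as np
from dataclasses import dataclass, field

# Minimal stand-in for an exported AnnData/H5AD layout (illustrative only;
# a real object would come from anndata.read_h5ad("export.h5ad"), where
# the filename here is hypothetical).
@dataclass
class MiniAnnData:
    X: np.ndarray                             # cells x genes expression matrix
    obs: dict = field(default_factory=dict)   # per-cell metadata, e.g. AI-generated labels
    obsm: dict = field(default_factory=dict)  # per-cell arrays, e.g. UMAP coordinates

adata = MiniAnnData(
    X=np.random.default_rng(0).poisson(1.0, size=(5, 3)).astype(float),
    obs={"cell_type": ["T cell", "B cell", "T cell", "NK", "B cell"]},
    obsm={"X_umap": np.random.default_rng(1).normal(size=(5, 2))},
)

# A custom figure would color obsm["X_umap"] points by obs["cell_type"]
# (e.g. with matplotlib); here we just confirm the pieces line up.
assert adata.X.shape[0] == len(adata.obs["cell_type"]) == adata.obsm["X_umap"].shape[0]
print("cells:", adata.X.shape[0], "genes:", adata.X.shape[1])
```

The point of the handoff is that everything a custom script needs (matrix, annotations, embeddings) travels together in one file.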
Intended Audience and Level
- Audience: Experimental biologists, computational biologists, bioinformaticians, and core facility staff who are interested in using AI workflows for single-cell analysis.
- Level: Beginner.
- Prerequisites: Beginner-level understanding of scRNA-seq analysis concepts (clustering, PCA, UMAP) and basic familiarity with R or Python. No prior experience with the AI Workspace platform is required.
Schedule
| 09:00-09:30 | Introduction & The Collaboration Bottleneck |
| 09:30-10:45 | HANDS-ON BLOCK 1: Automated Workflows & AI Modeling (AI Workspace) |
| 10:45-11:00 | Coffee Break |
| 11:00-12:15 | HANDS-ON BLOCK 2: The "Handoff" to R & Custom Visualization |
| 12:15-12:45 | Advanced Integration & Wrap-up |
| 12:45-13:00 | Exploration and Dedicated Help |
Room: Virtual
Date: July 7, 2026
Start Time: 09:00
End Time: 13:00
Max Participants: 30
Organizers
- Mgr. Vlastimil Martinek, PhD, University of Malta
- Andrea Gariboldi, University of Malta
- Panagiotis Alexiou, University of Malta
Speakers
- Mgr. Vlastimil Martinek, PhD, University of Malta
- Andrea Gariboldi, University of Malta
- Panagiotis Alexiou, University of Malta
Description
Work in computational biology can often benefit from LLM assistance, especially for tasks like programming, file management, search, retrieval, and summarizing scientific texts. When we augment LLMs with tools, they become LLM Agents, gaining the ability to perform tasks beyond simple text generation and to assist us more directly, with a larger degree of autonomy. However, tool use and autonomous operation come with issues, such as security risks and error propagation in multi-step tasks, that hinder the usefulness of such agents. To use agents in a reliable and secure way, we need to run them in secured, isolated environments, enforce structured outputs, and programmatically validate intermediate steps. Combining these approaches, we can create reliable agents that produce outputs with the correct format and functionality, avoiding time-consuming manual verification.
In this tutorial, you will use Python, PydanticAI, and Docker to learn the basics of using LLM Agents to programmatically automate simple tasks. From there, you will automate more complicated tasks that require multiple steps, and ensure the robustness of the results with custom agent validators. You will learn to automate tasks like machine learning experimentation, creating reusable scripts, and extracting structure from scientific literature. You will also get hands-on experience with an agentic system.
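The structured-output idea described above can be sketched without any agent framework: declare the schema you expect, then validate the model's raw text against it before accepting it. The snippet below uses only the Python standard library (PydanticAI replaces the dataclass with a Pydantic model and manages the LLM call itself); the JSON string is a hypothetical stand-in for an agent's response:

```python
import json
from dataclasses import dataclass

# The schema we require the agent's output to match.
@dataclass
class PaperSummary:
    title: str
    organisms: list
    key_finding: str

def parse_summary(raw: str) -> PaperSummary:
    """Validate an LLM's raw output against the expected schema, raising on mismatch."""
    data = json.loads(raw)            # malformed JSON raises here
    summary = PaperSummary(**data)    # missing or unexpected keys raise here
    if not isinstance(summary.organisms, list):
        raise TypeError("organisms must be a list")
    return summary

# Hypothetical stand-in for an agent response (a real agent would produce this text).
llm_output = '{"title": "Example study", "organisms": ["mouse"], "key_finding": "X regulates Y"}'
summary = parse_summary(llm_output)
print(summary.title)
```

Because invalid output raises an exception rather than silently passing through, downstream steps only ever see data in the declared shape.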
Learning Objectives
- Create a custom LLM agent with PydanticAI
- Learn when to use a single agent or multiple agents
- Program single-step agent to extract custom structure from a scientific paper
- Program multi-step agent to create a correct and reusable python script
- Learn how to make agents secure with Docker
- Learn how to ensure that agent produces valid outputs with validators and structured outputs
- Understand differences between custom agents and existing tools like Claude Code and Copilot
Intended Audience and Level
Participants should be comfortable programming in Python. No knowledge of LLMs, APIs, or Agents is required. A laptop with an Internet connection and a Google account is required for this tutorial.
Schedule
| 09:00-09:30 | Introduction to LLM Agents and Tools This block will introduce the concept of LLM Agents and how they differ from simply using LLMs. Concepts like tool calls and special tokens will be explained to understand the inner workings of agents. |
| 09:30-10:00 | Pydantic AI and single-step agents This block will introduce Pydantic AI, a Python library for easily creating custom LLM agents while allowing us to have full control of their inner workings. |
| 10:00-10:45 | Programming assignment 1: Scientific paper structured summary In this block participants will use Pydantic AI to define their own structure of how a paper summary should look, and program an agent that will ingest a scientific paper and be guaranteed to output this structure. |
| 10:45-11:00 | Coffee Break |
| 11:00-11:30 | Multi-step agents and validators This block will cover multi-step agents and ways to prevent issues that arise when agents operate autonomously. Docker will be introduced to cover security, and custom validators and checkpoints will be explained for handling errors and agent mistakes. |
| 11:30-12:15 | Programming assignment 2: Multi-step agent for creating robust scripts In this block, participants will program a secure multi-step agent that operates autonomously and adheres to constraints like required script functionality, required script parameters, and other user requirements. |
| 12:15-13:00 | Programming assignment 3: Co-developing machine learning model with an agentic system In this block, participants will use an agent built with Pydantic AI to quickly create a machine learning model prototype for a biomedical dataset of choice (DNA,RNA,Proteins, or Small molecules) |
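The validator pattern from assignment 2 can be reduced to a small loop: generate a draft, run programmatic checks, and feed the errors back for another attempt. The sketch below uses a stub in place of a real LLM call (all names are hypothetical; PydanticAI provides output validators that trigger retries in a similar way):

```python
# Sketch of the validate-and-retry pattern that keeps multi-step agents on track.
# make_flaky_agent() is a hypothetical stand-in for a real LLM-backed agent.
def make_flaky_agent():
    attempts = {"n": 0}
    def agent(task, feedback=None):
        attempts["n"] += 1
        # The first draft "forgets" the required argparse interface; the retry fixes it.
        if attempts["n"] == 1:
            return "print('hello')"
        return "import argparse\nparser = argparse.ArgumentParser()\nprint('hello')"
    return agent

def validate(script: str) -> list:
    """Programmatic checks on intermediate output: required script features."""
    errors = []
    if "argparse" not in script:
        errors.append("script must expose a CLI via argparse")
    return errors

def run_with_validation(agent, task, max_retries=3):
    feedback = None
    for _ in range(max_retries):
        draft = agent(task, feedback)
        errors = validate(draft)
        if not errors:
            return draft
        feedback = "; ".join(errors)  # fed back to the agent on the next attempt
    raise RuntimeError("agent failed validation after retries")

script = run_with_validation(make_flaky_agent(), "write a greeting script")
print("validated after retries")
```

The validator encodes the user's constraints (here, a required CLI) as code, so acceptance no longer depends on manually reading the agent's output.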
Tutorial VT7: Hello nf-core: Level up your workflows with community-curated best practices and developer resources
Room: Virtual
Date: July 7, 2026
Start Time: 14:00
End Time: 18:00
Max Participants: 40
Organizers
- Geraldine Van der Auwera, PhD
Speakers
- Geraldine Van der Auwera, PhD
Description
The nf-core project is a widely recognized community resource that provides a rich collection of modular, reusable workflows and developer tools, along with best practices for building high-quality bioinformatics pipelines. While these resources are highly valuable, the breadth and sophistication of nf-core can be daunting for newcomers, making it challenging to know where to start.
Hello nf-core is a hands-on tutorial designed for students and early-career researchers who have some familiarity with Nextflow and want to take the next step in workflow development. The tutorial aims to equip participants with foundational skills in three key areas: (1) understanding the structure and conventions of nf-core pipelines, (2) integrating community-curated modules, and (3) creating their own nf-core-compatible modules.
Through guided, goal-oriented exercises, participants will learn to:
- Find and run existing nf-core pipelines, exploring their directory structure and configuration systems.
- Adapt a simple Nextflow workflow into an nf-core-compatible pipeline using template scaffolds.
- Integrate pre-built nf-core modules and create new modules following community standards.
By the end of the tutorial, participants will have transformed a basic Nextflow workflow into an nf-core-style pipeline with standardized structure and reusable components, gaining practical skills to confidently leverage nf-core resources in their own bioinformatics projects. Additional study-at-home materials will be provided to continue building expertise after the course.
The training materials are open-source and freely available on the Nextflow training portal at https://training.nextflow.io/latest/hello_nf-core
Learning Objectives
This tutorial aims to teach participants to use and develop nf-core compatible modules and pipelines, and to utilize nf-core tooling effectively.
By the end of this training, participants will be able to:
- Find and run nf-core pipelines
- Describe the code structure and project organization of nf-core pipelines
- Create a basic nf-core compatible pipeline from a template
- Convert basic Nextflow modules to nf-core compatible modules
- Add nf-core modules to an nf-core compatible pipeline
Intended Audience and Level
This tutorial is designed for learners who have some familiarity with Nextflow and want to build on that foundation using nf-core. Some experience with the command line, basic scripting concepts, and workflow development is assumed. Prerequisites can be satisfied by completing the “Hello Nextflow” tutorial. Some of the exercises demonstrate the use of domain-specific bioinformatics pipelines, but the focus is on the mechanics, so specific scientific domain knowledge is not required. No prior experience with nf-core or advanced programming is assumed.
Schedule
| 09:00-09:10 | Welcome & introductions |
| 09:10-09:30 | Run a demo pipeline Retrieve and run an existing nf-core pipeline, and examine its code structure to understand what makes these pipelines different from basic Nextflow workflows. |
| 09:30-10:45 | Rewrite Hello for nf-core Adapt an existing workflow to the nf-core template scaffold, starting from the simple workflow produced in the Hello Nextflow tutorial. |
| 10:45-11:00 | Coffee Break |
| 11:00-11:45 | Use an nf-core module: Integrate pre-built, tested modules that wrap common bioinformatics tools. |
| 11:45-12:45 | Make an nf-core module: Create an nf-core-style module using the specific structure, naming conventions, and metadata requirements that make modules shareable and maintainable by the community. |
| 12:45-13:00 | Wrap-up: Recap, open Q&A and next steps |
Tutorial VT8: Foundation model and graph learning for modeling, analyzing, and interpreting single-cell omics and histopathology data
Room: Virtual
Date: July 7, 2026
Start Time: 09:00
End Time: 18:00
Max Participants: N/A
Organizers
- Dr. Juexin Wang, Assistant Professor, Indiana University Indianapolis, United States
- Dr. Dong Xu, Professor, University of Missouri, United States
- Dr. Qin Ma, Professor, Ohio State University, United States
- Dr. Michael Eadon, Associate Professor, Indiana University, United States
- Dr. Guangyu Wang, Associate Professor, Houston Methodist, United States
- Dr. Pinaki Sarder, Associate Professor, University of Florida, United States
Speakers
- Dr. Juexin Wang, Assistant Professor, Indiana University Indianapolis, United States
- Dr. Dong Xu, Professor, University of Missouri, United States
- Dr. Qin Ma, Professor, Ohio State University, United States
- Dr. Michael Eadon, Associate Professor, Indiana University, United States
- Dr. Guangyu Wang, Associate Professor, Houston Methodist, United States
- Dr. Pinaki Sarder, Associate Professor, University of Florida, United States
Description
Emerging single-cell and spatial omics technologies, together with next-generation AI/ML-driven digital histopathology, present both unprecedented opportunities and challenges for integrating molecular biology and pathology. Central questions in this area include how to model vast sequencing and imaging data across diverse modalities, how to perform computational analyses, and how to reveal and interpret biological and pathological mechanisms by identifying biologically and pathologically meaningful cell types, niches, and key markers.
Computational methods and tools, particularly foundation models and graph learning approaches, offer a promising route to addressing these challenges. Loki is built on OmiCLIP, a visual–omics foundation model designed to bridge omics data and hematoxylin and eosin (H&E) images. Targeting single-cell multi-omics data, scPEFT introduces a flexible parameter-efficient fine-tuning framework to enhance the adaptation of single-cell large language models. TrimNN uses a graph learning approach to infer conserved cellular community patterns in spatial omics. FUSION provides a one-stop web application platform for in-depth exploration of multi-omics data with brightfield histology.
Our tutorial will cover key advancements in foundation models and graph learning developed for single-cell omics and histopathology research over the past few years, emphasizing new opportunities in bioinformatics enabled by these advancements. We will start with a technical talk about the machine learning algorithms underlying these approaches, including Loki, scPEFT, TrimNN, and FUSION, from model training to model interpretation (discovery of cell types, niches, and key markers). We will then demonstrate the impact of machine learning on uncovering the underlying mechanisms of complex diseases, such as cancer, Alzheimer’s disease, and kidney disease.
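As a flavor of the graph construction that underlies methods like TrimNN, the sketch below builds a k-nearest-neighbor graph over synthetic 2-D cell coordinates using NumPy only (illustrative; real spatial omics pipelines use dedicated libraries, and the function name is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(50, 2))  # synthetic 2-D cell centroids

def knn_edges(coords: np.ndarray, k: int = 5) -> np.ndarray:
    """Return an (n_edges, 2) array linking each cell to its k nearest spatial neighbors."""
    # Pairwise Euclidean distances between all cells.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)             # exclude self-loops
    nbrs = np.argsort(d, axis=1)[:, :k]     # indices of the k closest cells per cell
    src = np.repeat(np.arange(len(coords)), k)
    return np.column_stack([src, nbrs.ravel()])

edges = knn_edges(coords, k=5)
print(edges.shape)  # (250, 2): 50 cells x 5 neighbors each
```

Graph neural networks then propagate cell features along these edges, which is how neighborhood and niche structure enters the model.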
Learning Objectives
- To understand the basic principles of foundation models, graph representation learning, and model interpretation.
- To understand the specifics of computational tools such as OmiCLIP, scPEFT, scBSP, SpaGFT, TrimNN, FUSION, and become aware of the appropriate tools to use in different applications in single-cell multi-omics, spatial transcriptomics, and histopathology studies.
- To gain hands-on experience in applying tools and interpreting results using the standalone and cloud-native Python-based software OmiCLIP, R-based scBSP, and webserver-based FUSION.
Intended Audience and Level
The target audience is graduate students, researchers, scientists, and practitioners in both academia and industry who are interested in the applications of foundation models and deep learning in bioinformatics (Broad Interest). The tutorial is aimed toward entry-level participants with knowledge of the fundamentals of biology and machine learning (beginner). Basic experience with the Python and R programming languages is recommended for participants. The tutorial slides and materials for hands-on exercises (e.g., links to demos, code implementations, and datasets) will be posted online prior to the tutorial and made available to all participants.
Schedule
| 09:00-09:30 | Part 1: Overview: Introduction to deep graph representation learning for single-cell/spatial omics/histopathology data. (Qin Ma) We will discuss how deep graph representation learning (DGRL), including but not limited to graph neural networks, provides a powerful and scalable framework for modeling multi-scale cellular interactions, integrating heterogeneous features, and uncovering biological insights that extend beyond the capabilities of traditional single-task computational tools. |
| 09:30-10:00 | Part 2: Overview of foundation models in bioinformatics. (Dong Xu) This overview introduces the concept of foundation models, and reviews recent developments of foundation models in single-cell and histopathology data. |
| 10:00-10:45 | Part 3: Introduction to visual-omics foundation models: (hands-on exercises) (Guangyu Wang) Overview of the OmiCLIP algorithm, a contrastive learning approach building a foundation model linking H&E images and transcriptomics using tissue patches from 10X Visium data. |
| 10:45-11:00 | Coffee Break |
| 11:00-12:00 | Part 4: Application #1: Spatial omics and histopathology analysis with multimodal foundation models and their downstream analysis: (hands-on exercises) (Guangyu Wang’s group) Introduce Loki and Thor platforms for tissue alignment, annotation via bulk RNA sequencing or marker genes, cell-type decomposition, image-transcriptomics retrieval, and spatial transcriptomics gene expression prediction from H&E images. |
| 12:00-13:00 | Part 5: Application #2: Single-cell analysis with efficiently fine-tuned single-cell foundation models, covering data acquisition, model training, and interpretation (hands-on exercises). (Dong Xu’s group) Introduction to scPEFT, a toolbox built on a single-cell parameter-efficient fine-tuning framework that incorporates learnable, low-dimensional adapters into single-cell large language models. |
| 13:00-14:00 | Lunch Break |
| 14:00-14:30 | Part 6: Introduction of graph learning approaches. (Juexin Wang) This topic introduces the principles of graph learning in modeling and interpreting single-cell multi-omics. Starting from graph neural networks, this topic reviews several graph clustering models and the TrimNN model in niche discoveries in spatial omics research. |
| 14:30-15:00 | Part 7: Applications #3: Function-related cellular neighborhood analysis with data acquisition, model training, and interpretation (hands-on exercises). (Juexin Wang’s group) This application demos dataset acquisition, modeling, and analysis of cellular neighborhood analysis. In downstream analysis, we will demonstrate data visualization, performance comparison, and functional analysis, including gene ontology/pathway enrichment analysis and cell-cell communication analysis. The exercises are based on publicly available spatial transcriptomics and proteomics data, including mouse Alzheimer’s disease STARmap PLUS data. |
| 15:00-15:30 | Part 8: Applications #4: Functional tissue unit analysis with data acquisition, model training, and interpretation (hands-on exercises). (Qin Ma’s group) The application demos spatial omics data representation using graph signal processing theory. The exercises include running R-based BioGSP and Python SpaGFT on CODEX/CosMX tumor data. |
| 15:30-16:00 | Part 9: Application #5: Spatially variable features analysis with data acquisition, model training, and interpretation (hands-on exercises). (Dong Xu’s group or Juexin Wang’s group) This application demonstrates the process of identifying spatially variable genes using spatial transcriptomics and spatial ATAC-seq data. The exercises include running R-based scBSP software on mouse brain HDST data. |
| 16:00-16:15 | Coffee Break |
| 16:15-17:05 | Part 10: Application #6: Explore multi-omics data with brightfield histology with web-based applications for data acquisition, model fitting, and analysis (hands-on exercises). (Pinaki Sarder) This application demos integration of brightfield histology and associated spatial transcriptomics data, including data obtained via 10X Visium and 10X Xenium technologies. We will demo functional tissue unit (FTU) segmentation from histology images, cell deconvolution and annotation using FUSION, cell abundance analysis, and integrated analysis comparing histology and spatial omics on the same platform, generating figures as needed for manuscripts. |
| 17:05-17:55 | Part 11: Application #7: A case study to explore neighborhoods in kidney health and disease. (Michael Eadon) This application will highlight large repositories of publicly available single-cell, spatial omics, and histopathology data from the Human Biomolecular Atlas Program (HuBMAP) and Kidney Precision Medicine Project (KPMP) freely available for download. A use case will explore multi-omics integration within the merged HuBMAP and KPMP atlas version 2, demonstrating how the approaches learned throughout the day have extended our knowledge of kidney disease. |
| 17:55-18:00 | Summary (Juexin Wang) |

