Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide



There will be a series of in-person and virtual tutorials prior to the start of the conference. Tutorial registration fees are shown at: https://www.iscb.org/ismb2024/register#tutorials

In-person Tutorials (All times EDT)

Virtual Tutorials: (All times EDT) Presented through the conference platform


Tutorial IP1: Advanced machine learning methods for modeling, analyzing, and interpreting single-cell omics and spatial transcriptomics data
SOLD OUT

Room: 518
Date: Friday, July 12, 2024 9:00 – 18:00 EDT

Organizer:
Juexin Wang

Speakers:
Mauminah Raina, (Ph.D. student) Indiana University Indianapolis, United States
Yi Jiang, (Ph.D. student) Ohio State University, United States
Lei Jiang, (Ph.D. student) University of Missouri, United States
Michael Eadon, Indiana University Indianapolis, United States
Juexin Wang, Indiana University Indianapolis, United States
Qin Ma, Ohio State University, United States
Dong Xu, University of Missouri, United States

Max Participants: 50

Website
https://github.com/juexinwang/Tutorial_ISMB2024

Description
Emerging single-cell omics and spatial transcriptomics technologies provide unprecedented opportunities and challenges for molecular biology studies. How to model these vast sequencing data in different modalities, perform computational analyses, and interpret mechanisms by identifying biologically and pathologically meaningful cell types, regulatory relations, and key markers are central questions in this area.

Advanced machine learning methods and tools provide a promising approach to address these challenges. scGNN (https://github.com/juexinwang/scGNN) is a graph neural network-based framework for clustering and imputing scRNA-seq data by modeling the single cells as a cell graph. Targeting single-cell multi-omics data, DeepMAPS (https://bmblx.bmi.osumc.edu/) introduces a heterogeneous graph transformer to infer single-cell biological networks. BSP (https://github.com/juexinwang/BSP) proposes a granularity-based statistical approach to identify spatially variable genes on 2D and 3D spatial transcriptomics.
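
To make the idea of "modeling the single cells as a cell graph" concrete, here is a minimal sketch of one common way to build such a graph: connect each cell to its k nearest neighbours in expression space. This is an illustration under stated assumptions, not scGNN's actual implementation; the function name `knn_cell_graph`, the Euclidean distance, and the toy matrix are all hypothetical choices.

```python
import math

def knn_cell_graph(expr, k):
    """Return {cell index: sorted list of its k nearest cells}.

    expr is a list of per-cell expression vectors. Euclidean distance is
    an assumption made for this sketch; real tools may use other metrics.
    """
    graph = {}
    for i, row_i in enumerate(expr):
        # Distance from cell i to every other cell.
        dists = [
            (math.dist(row_i, row_j), j)
            for j, row_j in enumerate(expr)
            if j != i
        ]
        dists.sort()
        graph[i] = sorted(j for _, j in dists[:k])
    return graph

# Toy matrix: 4 cells x 3 genes (hypothetical values).
expression = [
    [5.0, 0.0, 1.0],
    [4.0, 0.0, 2.0],
    [0.0, 6.0, 0.0],
    [0.0, 5.0, 1.0],
]
graph = knn_cell_graph(expression, k=1)
print(graph)
```

With k=1, cells 0 and 1 point at each other, as do cells 2 and 3, which is the kind of neighbourhood structure a graph neural network can then operate on.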

Our tutorial will cover key advancements in machine learning methods developed for single-cell multi-omics and spatial transcriptomics research over the past few years, emphasizing new opportunities in bioinformatics enabled by such advancements. We will start with a technical talk about the machine learning algorithms behind the covered approaches, including scGNN, DeepMAPS, and BSP, from model training to model interpretation (discovery of cell types, regulatory relations, and key markers). We will then demonstrate the impact of machine learning on discovering

Learning Objectives

  • To understand the basic principles of deep learning, graph representation learning, and model interpretation.
  • To understand the specifics of computational tools such as scGNN, DeepMAPS, and BSP, and become aware of the appropriate tools to use in different applications in single-cell multi-omics and spatial transcriptomics studies.
  • To gain hands-on experience in applying tools and interpreting results using the standalone Python-based software scGNN, R-based BSP, webserver-based DeepMAPS, and an integrated AI-ready platform.

Intended Audience and Level
The target audiences are graduate students, researchers, scientists, and practitioners in both academia and industry who are interested in applications of deep learning in bioinformatics (Broad Interest). The tutorial is aimed towards entry-level participants with knowledge of the fundamentals of biology and machine learning (beginner). Basic experience with Python and R programming languages is recommended for the participants.

The tutorial slides and materials for hands-on exercises (e.g., links to demo, code implementation, and datasets) will be posted online prior to the tutorial and made available to all participants.

Schedule

9:00

Part 1: Overview: Introduction to single-cell multi-omics and spatial transcriptomics and corresponding challenges.

  • Recent developments in single-cell multi-omics and spatial transcriptomics sequencing.
  • General computational approaches in data modeling.
  • The impact of advanced machine learning approaches on discovering cell types, regulatory relations, and key markers in complex diseases and other biological phenomena.
9:45

Part 2: Introduction to biological analyzing methods.

  • Data visualization on single cells
    • UMAP and t-SNE
    • Sankey plot, Circos plot, and Dot plot
  • Cell-cell communications on scRNA-seq data and spatial transcriptomics data
    • CellChat, CellChatDB, and COMMOT
10:45 Coffee Break
11:00

Part 3: Clustering-based single-cell analysis and scGNN on AI-ready platform.

  • Modeling with graph neural networks.
    • Concepts of graph neural networks
    • Stacked graph autoencoders in scGNN 1.0
    • Integrating scRNA-seq and bulk RNA-seq in scGNN 2.0
  • AI-ready platform for single-cell analysis
12:00

Part 4: Applications #1: Single-cell RNA-seq dataset acquisition, model training, and analysis.

  • Dataset acquisition, including methods that convert the data to the required format of scGNN.
    • Data type: scRNA-seq
    • Format: comma-separated values (CSV), 10X sparse format and hdf5 file.
  • Modeling, including deep learning models and automated hyperparameter tuning.
    • Models
      • scGNN model
      • Quick mode
      • Including LTMG as regulatory priors
    • Default option
      • Basic models with hyperparameters.
    • Advanced option
      • Selecting the number of clusters with parameter resolution
      • Tuning the models by automated hyperparameter tuning algorithm.
    • Downstream analysis of Alzheimer’s disease
      • Application of Alzheimer’s disease using single-cell RNA-seq data
      • Cell type annotation and validation
13:00 Lunch
14:00

Part 5: Network analysis on single-cell multi-omics and DeepMAPS.

  • Network analysis using single-cell multi-omics.
    • Classical network analysis in integrating modalities
    • Identifying cell-specific regulatory relations
  • Modeling with heterogeneous graph transformer.
    • Transformer model
    • Various graph transformer models
    • Heterogeneous graph transformer for modeling heterogeneous graphs
14:30

Part 6: Applications #2: Single-cell multi-omics dataset acquisition, model training, and analysis.

  • Dataset acquisition, including methods that convert the data to the required format of DeepMAPS.
    • Data type: scRNA-seq, scATAC-seq, CITE-seq
    • Format: comma-separated values (CSV), 10X sparse format and hdf5 file.
  • Modeling, including deep learning models and automated hyperparameter tuning.
    • Models
      • DeepMAPS model
    • Default option
      • Basic models with hyperparameters tuned by us.
    • Job management
      • Monitoring the training steps via the interface.
      • Visualizing the learning curve during the model training.
      • Comparing the performance of different models.
      • Downstream analysis
  • Applications.
    • Analysis of human IFNB-stimulated and control PBMCs with multiple scRNA-seq data
    • Analysis of human PBMC and lung tumor leukocytes with CITE-seq data
16:00 Coffee Break
16:15

Part 7: Marker analysis on spatial transcriptomics and BSP.

  • Spatially variable gene identification on spatial transcriptomics.
    • Definition of spatially variable genes
    • Statistical approaches in identifying spatially variable genes
  • Granularity-based statistical approach.
    • BSP model
16:45

Part 8: Applications #3: Spatial transcriptomics dataset acquisition, model fitting, and analysis.

  • Dataset acquisition, including methods that convert the data to the required format of BSP.
    • Data type: 10X Visium, MERFISH, Seq-FISH, Slide-seq, and Slide-seq V2.
    • Format: comma-separated values (CSV), 10X sparse format and hdf5 file.
  • Model fitting.
    • Models
      • BSP model with lognormal and beta distributions
    • Default option
      • Basic models with different granularities.
    • Applications
      • Kidney research on 10X Visium
      • Rheumatoid Arthritis research on 3D spatial omics


Tutorial IP2: Just-in-time compiled Python for bioinformatics research
SOLD OUT

Room: 524c
Date: Friday, July 12, 2024 9:00 – 18:00 EDT

Organizer:
Sven Rahmann

Speakers:
Johanna Schmitz, Center for Bioinformatics Saar and Saarland University, Saarland Informatics Campus, Saarbrücken, Germany; Saarbrücken Graduate School of Computer Science
Jens Zentgraf, Center for Bioinformatics Saar and Saarland University, Saarland Informatics Campus, Saarbrücken, Germany; Saarbrücken Graduate School of Computer Science
Sven Rahmann, Center for Bioinformatics Saar and Saarland University, Saarland Informatics Campus, Saarbrücken, Germany

Max Participants: 20

Description
Python has a reputation for being a clean and easy-to-learn language, but slow when it comes to execution, and difficult concerning multi-threaded execution. Nonetheless, it is one of the most popular languages in science, including bioinformatics, because for many tasks, efficient libraries exist, and Python acts as a glue language. In this tutorial, we explore how to write efficient multi-threaded applications in Python using the numba just-in-time compiler. In this way, we can use Python’s flexibility and the existing packages to handle high-level functionality (e.g., design the user interface, run machine learning models), and then use compiled Python for additional custom compute-heavy tasks; these parts can even run in parallel.

Over a full tutorial day, we introduce a small (but still interesting and relevant) problem as an example: efficient search for bipartite DNA motifs. We develop an efficient tool that outputs every match in a reference genome in a matter of seconds. Starting with an introduction to the problem and a (slow) pure Python implementation, we learn how to write more jit-compiler-friendly code, transition towards a compiled version and observe speed increases until we obtain C-like speed. We parallelize the tool to make it even faster, and add more options for more flexible searching. Finally, we add a simple but effective GUI, which can increase the potential user-base of such a tool by an order of magnitude.
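
As a rough illustration of the starting point described above (a slow, pure Python implementation), the sketch below searches for a bipartite motif, taken here to mean two fixed half-sites separated by a gap of variable length. The function name `find_bipartite`, the exact-match half-sites, and the toy genome are assumptions made for this example; the tutorial's own motif mini-language and algorithms are not reproduced here.

```python
def find_bipartite(genome, left, right, min_gap, max_gap):
    """Return (start, gap) for every occurrence of left + gap bases + right.

    Exact string matching only; a hypothetical simplification of real
    bipartite motif descriptions, which may allow ambiguity codes.
    """
    hits = []
    n = len(genome)
    # Last start position that still leaves room for the shortest match.
    for start in range(n - len(left) - min_gap - len(right) + 1):
        if genome[start:start + len(left)] != left:
            continue
        for gap in range(min_gap, max_gap + 1):
            r = start + len(left) + gap  # where the right half-site would begin
            if r + len(right) > n:
                break
            if genome[r:r + len(right)] == right:
                hits.append((start, gap))
    return hits

# Toy genome (hypothetical): TTGA..TATA with a gap of 2, then with a gap of 4.
genome = "CCTTGAGGTATACCTTGACCCCTATA"
narrow = find_bipartite(genome, "TTGA", "TATA", min_gap=1, max_gap=3)
wide = find_bipartite(genome, "TTGA", "TATA", min_gap=1, max_gap=4)
print(narrow, wide)
```

Nested loops over plain indices like these are typically the kind of code that a just-in-time compiler can accelerate once strings are replaced by arrays of integers.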

Learning Objectives

  • Understand the difference between interpretation, lazy and eager/early compilation
  • Understand the possibilities and limitations of the numba just-in-time compiler
  • Explore several examples about when numba can accelerate your code (and when it cannot)
  • Understand prerequisites for compiling a function
  • Learn the differences between compilable and non-compilable Python code
  • Learn about parallelizing Python in spite of the Global Interpreter Lock (GIL) with compiled functions
  • Learn how to scale up a prototype to handle large data
  • Get an understanding of DNA motif search

Intended Audience and Level
The tutorial addresses active bioinformatics researchers, from graduate students to principal investigators, who write software tools as part of their research. In particular, we address researchers who are looking for an easier transition from research prototype software to software that scales to large datasets and is usable by a large non-technical user-base. Therefore, our participants should have at least some experience developing bioinformatics research software.

Prior experience with the Python programming language is required, as well as some experience with managing environments with installed software, ideally using (bio)conda / mamba.

Schedule

9:00 Introduction to the numba just-in-time compiler for Python; small examples, possibilities, limitations, how the compilation works. Last 30 minutes are short hands-on exercises (timing iterated execution of a small function in pure vs. compiled Python).
10:45 Coffee break
11:00 Introduction to DNA motif search and a “motif description” mini-language, with examples from the literature. Automaton-based pattern search and a bit-parallel algorithm. Hands-on: Implementation in pure Python (45 min, 15-20 lines).
13:00 Lunch break
14:00 Transforming a Python implementation to a numba-compiled implementation; separation of high-level and low-level code parts; managing memory allocations; introduction of type annotations (1 hour principles, 1 hour supervised coding).
16:00 Coffee break
16:15 Parallelization: Using threads to parallelize the application (e.g. parallel search across chromosomes); Replacing the command-line interface by a simple but effective GUI using streamlit. Hands-on coding: Splitting the task, collecting and visualizing the results.
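
The final schedule block (threads searching chromosomes in parallel) might look roughly like the sketch below. It assumes numba's `njit(nogil=True)` option, which lets a compiled function release the Global Interpreter Lock so that threads can run concurrently. The names `count_motif` and `search_all` and the toy chromosomes are hypothetical, and sequences are passed as byte strings to keep the sketch dependency-free (the tutorial's own code may well use NumPy arrays). If numba is not installed, a fallback decorator keeps the function as plain Python, so the results are identical but the threads no longer run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(*args, **kwargs):
        """Fallback (assumption for this sketch): leave the function uncompiled."""
        def decorator(func):
            return func
        return decorator

@njit(nogil=True)
def count_motif(seq, motif):
    """Count (possibly overlapping) exact occurrences of motif in seq (both bytes)."""
    count = 0
    m = len(motif)
    for i in range(len(seq) - m + 1):
        j = 0
        while j < m and seq[i + j] == motif[j]:
            j += 1
        if j == m:
            count += 1
    return count

def search_all(chromosomes, motif, workers=4):
    """Run count_motif on each chromosome in its own thread; return {name: count}."""
    names = list(chromosomes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda name: count_motif(chromosomes[name], motif), names)
        return dict(zip(names, counts))

# Toy "chromosomes" (hypothetical data).
chromosomes = {"chr1": b"ACGTACGTAC", "chr2": b"TTTT", "chr3": b"GACGACG"}
result = search_all(chromosomes, b"ACG")
print(result)
```

Splitting by chromosome keeps each task independent, so collecting the results is a simple dictionary merge.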


Tutorial IP3: Multi-omic data integration for microbiome research using scikit-bio

Room: 524a
Date: Friday, July 12, 2024 9:00 – 18:00 EDT

Organizer:
Qiyun Zhu

Speakers:
Qiyun Zhu
James Morton
Daniel McDonald
Matthew Aton
Lars Hunger

Max Participants: 40

Description
Modern microbiome research is marked by the extensive use of high-throughput, multi-omic data derived from complex biological systems, such as amplicons, metagenomes, metatranscriptomes, metaproteomes, and metabolomes, as well as data and metadata of the host or environment. The complexity and richness of data demand robust, scalable, and reproducible integration and analysis methods. Our full-day tutorial offers an essential guide to leveraging the expanded capabilities of scikit-bio, alongside the broader Python data science ecosystem. Scikit-bio is a core library behind the widely used QIIME 2 project, and provides various data structures, metrics and algorithms commonly used in bioinformatics. This tutorial is designed to provide researchers, educators, and developers with an overview of current trends, foundational principles, and analytical strategies in microbiome research. Participants will engage in hands-on exercises on handling data and metadata, analyzing communities and features, as well as correlating and predicting biological traits. This tutorial aims to equip attendees with knowledge and practical skills that are adaptable to various applications in microbiome research and beyond.

Exercises will be delivered through Jupyter Notebooks with clear code and documentation. Tutorial materials, including data, slides, and notebooks, will be hosted in a public GitHub repository under a BSD open-source license.

Learning Objectives
Participants will learn how to use scikit-bio and other common Python libraries to analyze and integrate multiple types of omic data that are usually involved in studies of microbiomes and their roles in the host or natural environment. Specifically, participants will:

  • Understand and work with various summarized omic data types.
  • Handle sparse, high-dimensional data tables and associated metadata.
  • Analyze community composition using ecological, phylogenetic and statistical approaches.
  • Identify important microbial or functional features associated with sample properties.
  • Construct supervised learning models to predict traits of hosts or environments.
  • Develop reusable workflows for microbiome research.

By the end of the full-day tutorial, each participant will have completed an analytical workflow based on a demo dataset that can be customized and extended to other datasets.

Intended Audience and Level
This tutorial is for researchers, educators and developers interested in analyzing various types of biological “omic” data, such as metagenomics, metabolomics, and host transcriptomics. Attendees should have basic skills in Python (preferred), or any other programming language (such as R or C/C++). Experience with the Linux command line is not required. Optionally, attendees may benefit from basic knowledge in bioinformatics, biostatistics, and any specific biological research fields, such as microbiology, ecology, molecular biology, and epidemiology.

Each participant should bring their own laptop or tablet (with keyboard). The exercises will be conducted using Google Colab or a local Jupyter environment, depending on the participant’s preference.

Schedule

9:00

Introduction and software setup
Lecture: Current trends in microbiome research

  1. Give an overview of the latest developments in microbiome research, emphasizing the shift towards high-throughput, multi-omics and meta-analysis.
  2. Introduce scikit-bio within the Python ecosystem, explaining that it is a core library of the widely used QIIME 2 project, highlighting its role in facilitating reproducible, scalable, and expandable bioinformatics research.

Exercise: Setting up the software environment.

  1. Walk through the setup of scikit-bio in Google Colab or a local Jupyter environment, depending on participant preference.
  2. Test the basic functionalities of scikit-bio, ensuring that the software is installed correctly and participants are comfortable with its interface and capabilities.
  3. Briefly demonstrate the basics of Python, in order to (re)familiarize participants with various technical backgrounds.
10:00

Working with various omic data types
Lecture: Omic data types in microbiome research

  1. Review typical and emerging omic data types in microbiome research, such as 16S, ITS, metagenomics, metatranscriptomics, metaproteomics, metabolomics, and corresponding host data.
  2. Discuss derived data types, such as taxonomic, functional, genetic and metabolic profiles.
  3. Navigate key public data sources, like Qiita, GNPS, NMDC, ENA and SRA.
  4. Discuss the challenges and techniques for integrating different omics. Examples include combining 16S and shotgun data using Greengenes2 and WoL2, and integrating sequencing and mass spectrometry data via KEGG.

Exercise: A real-world multi-omic dataset

  1. Download a demo dataset, which is subsampled from a representative study, consisting of several omic data types, metadata, and biological questions.
  2. Explore the components of the dataset using Python.
10:45 Coffee break
11:00

Working with sparse, high-dimensional data tables
Lecture: Nuances of omic data tables

  1. Introduce the structure of feature tables commonly used in omics.
  2. Discuss the nature of omic data, highlighting its high-dimensionality and sparsity, and their implications on data processing and analysis.
  3. Discuss and compare sparse vs. dense matrices, and the strategies for handling them.
  4. Introduce the BIOM-format, a Genomics Standards Consortium standard format for representing sparse feature data.

Exercise: Working with omic data tables

  1. Navigate the basic techniques for loading, viewing, and editing data tables using scikit-bio and BIOM-format.
  2. Manipulate data tables according to the statistical properties of data, and the biological properties of samples informed by metadata.
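
To illustrate the sparse vs. dense distinction from this session, the following minimal sketch stores only the non-zero entries of a small feature table and reports how sparse it is. It is a plain-Python illustration, not the BIOM-format or scikit-bio API; the helper names `to_sparse` and `sparsity` and the toy counts are hypothetical.

```python
def to_sparse(table):
    """Convert a dense features x samples table to {(row, col): count}, dropping zeros."""
    return {
        (i, j): value
        for i, row in enumerate(table)
        for j, value in enumerate(row)
        if value != 0
    }

def sparsity(table):
    """Fraction of entries in the dense table that are zero."""
    total = sum(len(row) for row in table)
    zeros = sum(value == 0 for row in table for value in row)
    return zeros / total

# Toy feature table: 3 features x 4 samples (hypothetical counts).
dense = [
    [0, 12, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 7],
]
sparse = to_sparse(dense)
print(sparse, sparsity(dense))
```

Here 9 of 12 entries are zero, so the sparse form keeps three values instead of twelve; real omic tables are often far sparser, which is why dedicated sparse formats matter.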
12:00

Analyzing microbial community structures
Lecture: Microbial community ecology

  1. Introduce the fundamentals of microbial communities, explaining the notions of alpha and beta diversity.
  2. Introduce phylogeny, gene ontology, and the general notion of knowledge graph, addressing their roles in community analysis.
  3. Discuss the compositionality of omic data, explaining its implications in data analysis.

Exercise: Community diversity analyses

  1. Calculate alpha and beta diversity metrics, with or without phylogeny.
  2. Construct diversity distance matrices.
  3. Perform ordination of communities, visualize, and interpret.
  4. Compare matrices and ordinations across different omics.
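
As a small worked example of the alpha and beta diversity notions above, the sketch below computes a Shannon index per sample and a Bray-Curtis dissimilarity between samples in plain Python. scikit-bio offers comparable metrics through its `skbio.diversity` module, but this sketch does not call it; the natural-log base, the function names, and the toy counts are assumptions made for illustration.

```python
import math

def shannon(counts):
    """Shannon index of one sample (natural log; the base is an assumption here)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples with the same features."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den

# Toy counts: 3 samples x 4 features (hypothetical data).
samples = {"A": [10, 10, 0, 0], "B": [10, 10, 0, 0], "C": [0, 0, 5, 15]}

alpha = {name: shannon(counts) for name, counts in samples.items()}
ids = sorted(samples)
# Square distance matrix in the order given by ids.
beta = [[bray_curtis(samples[p], samples[q]) for q in ids] for p in ids]
print(alpha, beta)
```

Samples A and B are identical (dissimilarity 0), while A and C share no features (dissimilarity 1); a matrix like `beta` is the input to ordination methods.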
13:00 Lunch break
14:00

Inferring and associating critical features
Lecture: Microbial signatures and their biological roles

  1. Discuss the role of microbial signatures in biological processes, explaining that signatures may emerge at different levels: taxonomy, function, and molecules.
  2. Introduce multivariate statistical tests (such as PERMANOVA).
  3. Introduce differential abundance analysis (such as ANCOM).

Exercise: Statistical modeling and tests

  1. Perform multivariate statistical tests to correlate microbial community composition with sample metadata.
  2. Perform differential abundance analyses to identify important microbial features that may play roles in specific biological processes.
  3. Perform canonical analyses to associate features across different omic data types, revealing interconnections and dependencies.
  4. Utilize metadata for flexible and sophisticated statistical modeling.
15:00

Predicting host and environmental traits
Lecture: Microbiomes are predictive of biology

  1. Give an overview of the interactions between microbiomes, hosts and environments, and the importance of understanding these relationships.
  2. Introduce the principles of supervised machine learning, and interpretable machine learning.

Exercise: Constructing predictive models

  1. Bridge scikit-bio data structures with machine learning models in scikit-learn.
  2. Construct predictive models for categorical and numeric traits of samples.
  3. Navigate and interpret the results in a biological context.
16:00 Coffee break
16:15

Developing an analytical protocol for publication
Lecture: Good practices in scientific data analysis

  1. Introduce the FAIR principles in scientific research.
  2. Discuss good practices for developing and publishing analytical protocols, addressing key considerations such as documentation, accessibility and reproducibility.

Exercise: Assembling an analytical protocol

  1. Wrap up the analyses in one Jupyter Notebook and document them.
  2. Test and validate the protocol using both the provided dataset and different data, emphasizing the adaptability and robustness of the protocol.
  3. Discuss how the workflows can be adapted or expanded for the participants’ specific research needs.
17:15

Debugging, wrapping-up and open questions
Exercise: Troubleshooting and expansion

  1. Address participants’ specific questions and challenges in completing the practice on the demo dataset.
  2. Address participants’ questions in analyzing specific data types with specific research goals.

Lecture: Looking beyond

  1. Open discussion on the current and future trends of multi-omic biological research and scientific computing.
  2. Welcome participants to contribute to open-source projects.


Tutorial IP4: Quantum-enabled multi-omics analysis

Room: 522
Date: Friday, July 12, 2024 9:00 – 18:00 EDT

Organizer:
Aritra Bose
Laxmi Parida

Speakers:
Aritra Bose, PhD, Research Scientist, IBM Research, Yorktown, NY
Hakan Doga, PhD, Postdoctoral Researcher, IBM Research, Cleveland, OH
Filippo Utro, PhD, Senior Research Scientist, IBM Research, Yorktown, NY
Laxmi Parida, PhD, ISCB Fellow, IBM Fellow

Max Participants: 50

Description
Single-cell and -omic analyses have provided profound insights into the heterogeneity of complex tissues by measuring multiple cells together, drawing on a wide array of multi-omics data such as genomics, proteomics, transcriptomics, etc. Single-cell analysis is often plagued by many challenges, such as missingness, the difficulty of developing robust machine learning algorithms for discovering complex features, finding patterns in the spatial structure of single-cell transcriptomics or proteomics, and, most importantly, integrating multi-omics data to create meaningful embeddings for the cells. Machine learning (ML) techniques have been extensively used in analyzing, predicting, and understanding multi-omics data. For the purposes of this tutorial, we will use the term classical ML to refer to these techniques. Quantum machine learning (QML), by contrast, has the potential to overcome many of the above limitations of ML in single-cell analysis. This tutorial will be structured into five sessions as follows:

  • In the first session we will introduce quantum computing fundamentals such as notations, operations, quantum states, entanglement, quantum gates, and circuits.
  • In the second session, we will set up Qiskit, an open-source quantum computing toolkit based on Python and run a demo algorithm.
  • In the third session, we will process and analyze single-cell multi-omics data from the Single Cell atlas or TCGA, etc. using classical ML algorithms to create a baseline.
  • In the fourth session, we will set up the data in Qiskit and run a QML algorithm to classify disease subtypes.
  • In the fifth and concluding session, we will summarize the tutorial and do an interactive Q&A session with the attendees.
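
To give a flavour of the fundamentals covered in the first session (quantum states, gates, and entanglement), here is a minimal state-vector sketch in plain Python that prepares a two-qubit Bell state with a Hadamard gate followed by a CNOT gate. It does not use Qiskit; the function names and the amplitude ordering (|q1 q0>, with qubit 0 as the rightmost bit) are choices made for this illustration. In Qiskit the same circuit would typically be written with `QuantumCircuit(2)`, `h(0)`, and `cx(0, 1)`.

```python
import math

def hadamard_q0(state):
    """Apply a Hadamard gate to qubit 0 of [a00, a01, a10, a11] (bits are q1 q0)."""
    s = 1 / math.sqrt(2)
    a00, a01, a10, a11 = state
    return [s * (a00 + a01), s * (a00 - a01), s * (a10 + a11), s * (a10 - a11)]

def cnot_q0_q1(state):
    """CNOT with control qubit 0 and target qubit 1: swaps |01> and |11>."""
    a00, a01, a10, a11 = state
    return [a00, a11, a10, a01]

# Start in |00>, then entangle the two qubits.
state = [1.0, 0.0, 0.0, 0.0]
bell = cnot_q0_q1(hadamard_q0(state))

# Measurement probabilities for |00>, |01>, |10>, |11>.
probabilities = [abs(a) ** 2 for a in bell]
print(probabilities)
```

The result puts roughly half the probability on |00> and half on |11>, so measuring one qubit determines the other, which is the signature of entanglement.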

Learning Objectives
Participants in this tutorial will learn a new paradigm of analyzing multi-omics data with hands-on experience with a quantum computer. More objectively, the major takeaways of this tutorial would be:

  • Understand the basics of quantum computing including hands-on experience on quantum gates and circuits using Qiskit.
  • Identify the class of problems: applying machine learning methods to multi-omics data for biomarker discovery, disease subtyping, etc.
  • How to process single cell data for quantum-enabled algorithms.
  • How to apply QML algorithms on single cell data.
  • How to design experiments for healthcare and life sciences data using quantum computers

Intended Audience and Level
This tutorial is aimed at computational biologists, bioinformaticians, clinicians, practitioners, data analysts, including early-career to senior researchers in the fields of healthcare and life sciences enthusiastic to learn about new frontiers of computational biology. There are very few prerequisites for the tutorial, listed as follows:

  • Create an IBM Quantum account on the IBM Quantum Learning website: click on “Create an IBMid” and follow the instructions.
  • Watch the Qiskit Global Summer School videos – QML 2021 or (optional)
  • Entry-level knowledge of single-cell data and multi-omics analyses.

Schedule

9:00 Session I: Quantum Information and Fundamentals
10:45 Coffee Break
11:00 Session II: Hello Qiskit!: Writing your first program in Qiskit
12:30 Session III: Processing multi-omics data with classical ML algorithms
13:00 Lunch
14:00 Session IV, Part I: Design and implement QML algorithm for single-cell data in Qiskit.
16:00 Coffee Break
16:15 Session IV, Part II: Analyze QML algorithm and compare with classical ML
17:00 Session V: Interactive Q&A session with the participants.


Tutorial IP5: Modelling Multi-Modal Biomedical Data Using Networks

Room: 521
Date: Friday, July 12, 2024 9:00 – 18:00 EDT

Organizer:
Ian Simpson

Speakers:
Ian Simpson, Professor of Biomedical Informatics, School of Informatics, University of Edinburgh
Barry Ryan, PhD Student, UKRI Centre for Doctoral Training in Biomedical Artificial Intelligence, School of Informatics, University of Edinburgh
Sebestyén Kamp, PhD Student, UKRI Centre for Doctoral Training in Biomedical Artificial Intelligence, School of Informatics, University of Edinburgh

Max Participants: 30

Description
Network structures allow us to model complex data in an extremely flexible way, enabling a wide range of downstream analytic approaches to help us gain insight into the biological processes and systems we model. The ability of networks to capture myriad features of the primary data and explore high order relationships between them makes them highly suitable to address questions that are not easily answered by classical statistical approaches that typically only look at first-order interactions. Networks have been widely used in the biomedical sciences to study gene and protein expression profiles, protein-protein interactions, metabolic processes, dynamic pathway models, and diseases amongst others. The emergence of multi-modal data in the biomedical setting has gathered pace significantly over recent years whereby several different types of data are measured from the same sample source. Integration of these data is proving incredibly valuable at increasing the breadth and depth of our understanding of the underlying systems by reducing noise, increasing information content, facilitating our handling of missing and/or incomplete data, and crucially, increasing our predictive power beyond that of uni-modal data analysis.

In this comprehensive tutorial we will introduce participants to network analysis from first principles using real-world multi-modal data derived from the Generation Scotland study, a world-leading longitudinal research programme and an excellent use case for biomedical network analysis. Participants will perform hands-on end-to-end network construction and computational analysis using a ground-up approach which will give them the skills, experience, and confidence to develop their own network analytic pipelines in the future. We will work in the context of human disease using both molecular and clinical data and introduce analysis approaches for network-based tasks including clustering, functional annotation analysis, and classification using graph neural networks.

Learning Objectives
Participants will learn how to analyse biological datasets using networks. They will gain hands-on experience with a real-world dataset as an exemplar that can be directly transferred to their own work in the future. Following the course they will be able to:

  • Understand core network concepts and fundamentals
  • Construct networks using Python and R
  • Develop network models for uni-modal and multi-modal data (e.g. gene expression + DNA methylation)
  • Perform functional annotation analysis and community clustering
  • Implement simple Graph Neural Network based approaches for classification tasks
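
As a flavour of what network construction and community clustering can look like, here is a minimal Python sketch. It is not the tutorial's own pipeline: the feature names, the toy values, and the 0.9 correlation threshold are all hypothetical, and connected components are used as a crude stand-in for a real community detection method.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def build_network(profiles, threshold):
    """Return adjacency sets; two features are linked when |r| >= threshold."""
    adj = {name: set() for name in profiles}
    for a, b in combinations(profiles, 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def connected_components(adj):
    """Group mutually reachable nodes (a very simple form of clustering)."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(comp)
    return components


# Hypothetical toy data: each feature is measured in the same five samples.
profiles = {
    "gene_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "meth_C": [5.0, 3.0, 4.0, 1.0, 2.0],
    "meth_D": [9.0, 7.5, 8.1, 5.2, 6.0],
}
network = build_network(profiles, threshold=0.9)
modules = connected_components(network)
print(modules)
```

In practice the threshold is a modelling decision in its own right, and packages such as NetworkX (Python) or igraph (R) would replace the hand-written graph code.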

Intended Audience and Level
Introductory Level.

This tutorial is aimed at an audience who have little prior experience working with and analysing data using networks. They will need at least a basic level of knowledge in Python and R programming. Specifically, participants are expected to be familiar with the Python packages Pandas, NumPy, and Matplotlib and the R packages ggplot2 and dplyr.

The workshop will be conducted in both R and Python. We will communicate with participants in advance so that they have installed Visual Studio Code (Python) and RStudio (R) prior to the tutorial, but we can troubleshoot minor installation issues on the day and provide cloud compute instances of these if needed. All materials and data will be made available open-source through a dedicated GitHub repository. All analyses will be streamlined so that there are no challenging compute requirements for participants; a standard modern laptop will be suitable to take part.

Schedule

9:00 Welcome & Introduction
9:10 “An Introduction to Networks”
9:40 Practical Session 1
10:45 Coffee Break
11:00 “The Do’s and Don’ts of Biomedical Network Construction”
11:30 Practical Session 2
13:00 Lunch
14:00 “Common Approaches to the Analysis of Biomedical Networks”
14:30 Practical Session 3
16:00 Coffee Break
16:15 “An Introduction to Network Inference Using Graph Neural Networks”
16:45 Practical Session 4
17:50 Closing Remarks

- top -

Tutorial IP6: Creating and running cloud-native pipelines with WDL, Dockstore, and Terra

Room: 519
Date: Friday, July 12, 2024 9:00 – 13:00 EDT

Organizer:
David Steinberg

Speakers:
Denis Yuen, Team Lead, Dockstore, Ontario Institute for Cancer Research
David Charles Steinberg, University of California, Santa Cruz
Leyla Tarhan, PhD, Senior Science Writer, Data Sciences Platform, Broad Institute of MIT and Harvard
Aseel Awdeh, PhD, Computational Biologist, Data Sciences Platform, Broad Institute of MIT and Harvard

Max Participants: 40

Description
With the advent of efficient sequencing technology, the scientific community produces petabytes of data daily. These data are prepared to answer diverse biological questions, each requiring unique sequencing approaches. To combine these disparate datasets and transform them into meaningful insights, researchers are turning to cloud-based approaches that adhere to Findable, Accessible, Interoperable, and Reusable (FAIR) practices. These include cloud-computing environments that allow for efficient resource-sharing and scalability. While the potential of these new resources is thrilling, the migration to cloud computing might feel daunting, as it requires new pipelines that harness the expanse of cloud tools. In this half-day tutorial, we introduce participants to key components that help them create cloud-native pipelines, including portable workflows written in the Workflow Description Language (WDL; pronounced “widdle”), portable packages of software and dependencies known as Docker containers, and Dockstore, a public platform for sharing Docker-based workflows. Participants will get hands-on experience with these resources by developing their own simple WDL workflow and Docker image for genomic analysis. They will push their workflows to Dockstore and export them to the cloud-based Terra platform so that they can run their workflow on real data.

Learning Objectives
In this tutorial, participants will learn how to:

  • Write a basic WDL workflow with inputs and outputs
  • Make a Docker image from a Dockerfile
  • Navigate Dockstore, a platform for Docker-based workflows
  • Find, evaluate, and share workflows in Dockstore
  • Automatically integrate GitHub WDL with Dockstore
  • Export a WDL workflow from Dockstore to Terra
  • Set up and run a workflow in Terra
  • Find resources for writing advanced WDL workflows

Intended Audience and Level
Researchers and tool developers interested in bringing their analyses to the cloud. A basic understanding of the command line and a GitHub account are required, and participants are encouraged to have basic familiarity with genomics terminology and standard high-throughput sequencing data formats. The introduction to basic WDL syntax is designed for novice WDL writers and starts with a basic hello-world script.

Schedule

9:00 Welcome/opening remarks/review agenda and learning goals
9:05 Introduction to Docker
● How dockers improve software and scientific reproducibility
● Docker and Dockerfile basics
● Finding and using Dockers
9:15 Building and Using Dockers
● Pull and use an existing Docker
● Create a Dockerfile to build a Docker
9:45 Introduction to WDL
● Anatomy of a WDL
● Where to find and run existing WDLs
10:00 Basic WDL scripting
● Writing your first WDL Hello-world script for Terra
● Running WDLs in Terra
10:45 Coffee Break
11:00 Introduction to Dockstore
● Finding and assessing the quality of workflows on Dockstore
● Launching workflows from Dockstore
11:30 Integrate your GitHub with Dockstore
● Use GitHub apps to streamline the development cycle
12:00 Real genomics example: Modify, export and run a WDL
12:30 Wrap-up and Q&A

- top -

Tutorial IP7: Federated Ensemble Learning for Biomedical Data

Room: 519
Date: Friday, July 12, 2024 14:00 – 18:00 EDT

Organizer:
Hryhorii Chereda

Speakers:
Prof. Dr. Anne-Christin Hauschild, Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
Hryhorii Chereda, Ph.D., Medical Bioinformatics, University Medical Center Göttingen, Göttingen, Germany
Dr. Youngjun Park, Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
Maryam Moradpour (MSc), Medical Informatics, University Medical Center Göttingen, Göttingen, Germany

Max Participants: 15

Description
The digital revolution in healthcare, fostered by novel high-throughput sequencing technologies and electronic health records (EHRs), is moving the field of medical bioinformatics towards an era of big data. While machine learning (ML) methods have proven to be advantageous in such settings for a multitude of medical applications, they generally depend on a centralization of datasets. Unfortunately, this is not suited to sensitive medical data, which is often distributed across different institutions, exhibits intrinsic distribution shifts, and cannot be easily shared due to high privacy or security concerns.

Initially proposed by Google in 2017, federated learning allows the training of machine learning models on geographically or legally divided data sets without sharing sensitive data. When combined with additional privacy-enhancing techniques, such as differential privacy or homomorphic encryption, it is a privacy-aware alternative to central data collections while still enabling the training of machine learning models on the whole data set. However, in such federated settings, both infrastructure and algorithms become much more complex compared to centralized machine learning approaches. Some of the most intuitive implementations rely on ensemble learning approaches, where only the model parameters are transferred. For example, we can exchange split values of tree nodes as in federated random forest or combine local subgraph-based graph neural network (GNN) models into a global federated Ensemble-GNN.
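
The idea of transferring only model parameters can be illustrated with a deliberately simplified Python sketch. This is not the tutorial's federated random forest implementation: each client's "forest" is reduced here to a single one-split tree (a decision stump), the three sites and their data are hypothetical, and the global model is a plain majority vote over the shared split values.

```python
def train_stump(rows, labels):
    """Fit a one-split 'tree' locally: choose the split with the fewest errors."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({r[feature] for r in rows}):
            for high_label in (0, 1):
                preds = [high_label if r[feature] > threshold else 1 - high_label
                         for r in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, {"feature": feature,
                                     "threshold": threshold,
                                     "high_label": high_label})
    return best[1]  # only these parameters leave the client, never the rows


def stump_predict(stump, row):
    """Apply one shared split to a single sample."""
    high = stump["high_label"]
    return high if row[stump["feature"]] > stump["threshold"] else 1 - high


def federated_predict(stumps, row):
    """Global ensemble: majority vote over all clients' shared splits."""
    votes = sum(stump_predict(s, row) for s in stumps)
    return 1 if votes * 2 > len(stumps) else 0


# Hypothetical local datasets: [expression_value, age] with binary labels.
sites = {
    "site_1": ([[1.0, 40], [2.0, 55], [7.0, 60], [8.0, 45]], [0, 0, 1, 1]),
    "site_2": ([[3.0, 70], [4.0, 30], [6.5, 50], [9.0, 65]], [0, 0, 1, 1]),
    "site_3": ([[2.5, 35], [3.5, 62], [6.0, 48], [7.5, 58]], [0, 0, 1, 1]),
}

# Each site trains on its own data; the coordinator only collects parameters.
global_model = [train_stump(rows, labels) for rows, labels in sites.values()]
print(global_model)
print(federated_predict(global_model, [5.0, 50]))
```

Sharing split values is not automatically private, since thresholds can still leak information about individual samples; that is where techniques such as differential privacy come in.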

This tutorial covers the general theory of federated learning and the practice of federated ensemble learning. We will explain the concepts and benefits of federated ensemble learning, and demonstrate how to use Python to implement two state-of-the-art methods: federated random forest and Ensemble-GNN. The participants will learn how to apply these methods to breast cancer data, including clinical and gene expression features, and how to deploy the models in a federated setup. By the end of this tutorial, the participants will have both theoretical and practical skills in federated ensemble learning and privacy-preserving techniques for biomedical data analysis.

Availability of the tutorial’s material: https://gitlab.gwdg.de/cdss/tutorial-federated-ensemblelearning- for-biomedical-data

Learning Objectives

  1. Participants will learn the basics of federated machine learning theory and will be introduced to federated ensemble learning:
    1. Participants will learn about federated random forest.
    2. Participants will be introduced to GNNs, which utilize a molecular subnetwork structuring input genomic data, and they will learn how GNNs can be combined into an ensemble (Ensemble-GNN).
  2. Participants will learn how to practically implement and apply a federated random forest.
  3. Participants will learn how to use GPUs to train Ensemble-GNN and how to apply it in both centralized and federated scenarios.
    1. Optionally, participants can learn how to implement their own GNN as a new base learner for Ensemble-GNN.

Intended Audience and Level
The intended audience is bioinformaticians, data scientists, and medical informaticians who already have beginner-level experience in machine learning. Participants should have a laptop with Linux, macOS, or Windows and an internet connection. Access to a computational environment will be provided by the organisers.

Level requirements are the following:

  1. Basic knowledge of machine learning.
  2. Basic knowledge of Python.

Schedule

14:00

Lecture: Federated ensemble learning in biomedical health data
The basic concepts of federated ensemble learning are introduced. Advantages and challenges of central machine learning are discussed based on practical examples. Finally, privacy-assuring techniques such as differential privacy and homomorphic encryption are explained.

Anne-Christin Hauschild

14:30

Hands-on tutorial: how to develop and implement a federated random forest

  1. Intro to federated ensemble learning with decision trees and random forests
  2. Implementation of federated random forest model with scikit-learn

Hryhorii Chereda, Maryam Moradpour, Youngjun Park

15:45 Coffee Break
16:00

Continuation of hands-on tutorial: how to develop and implement a federated random forest

  1. Evaluation and comparison of the global model with client-specific local models

Maryam Moradpour, Youngjun Park

16:15

Lecture: Federated ensemble learning with graph neural networks

GNNs are specifically developed to perform different tasks on graphs. For instance, a patient can be represented by a biological network where the nodes contain patient-specific omics features. In this case, GNNs perform graph classification to predict a patient's clinical endpoint. The Ensemble-GNN approach builds predictive models utilizing PPI networks containing various node features such as gene expression and/or DNA methylation. To do this, Ensemble-GNN derives relevant PPI network communities and trains an ensemble of GNN models based on the inferred communities. Sharing local GNN models allows for the deployment of a federated ensemble of GNNs.

Hryhorii Chereda

16:30

Hands-on tutorial: how to train and apply federated Ensemble-GNN

  1. Intro to a ChebNet GNN model as a base learner of Ensemble-GNN
  2. Training Ensemble-GNN (using PyTorch) in centralized and federated setups
  3. Evaluation and comparison of the global model with client-specific local models
  4. Showcase of a federated scenario where data distributions substantially differ across the clients

Hryhorii Chereda, Maryam Moradpour, Youngjun Park

- top -

Tutorial VT1: A Practical Introduction to Large Language Models in Biomedical Data Science Research SOLD OUT

Part 1: Monday, July 8, 2024 14:00 – 18:00 EDT
Part 2: Tuesday, July 9, 2024 14:00 – 18:00 EDT

Organizer:
Robert Xiangru Tang

Speakers:
Robert Xiangru Tang, Yale University, USA.
Qiao Jin, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), USA.
Hufeng Zhou, Biostatistics Department, Harvard T. H. Chan School of Public Health, Harvard University, USA.
Shubo Tian, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), USA.
Zhiyong Lu, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), USA.
Mark Gerstein, Yale University, USA.

Max Participants: 50
Website: https://llm4biomed.github.io/

Description
Large Language Models (LLMs) like ChatGPT have exhibited remarkable capabilities in understanding and generating language across diverse disciplines. In the realm of biomedical data science and computational biology, LLMs can significantly aid the processes of information accessibility, data analysis, and knowledge discovery. In this tutorial, we offer an introductory-level, hands-on guide to understanding and utilizing these LLMs in the field of biomedical data science. Our tutorial begins with leveling the learning ground by providing introductions to LLMs and Biomedical Data Science. Subsequently, we delve into the core applications of LLMs in biomedical data science/computational biology via retrieval-augmented generation, database functionalities, and code generation. To facilitate thought-provoking discussions, pertinent case studies will be discussed, emphasizing how to harness the power of LLMs to bridge the gap between technical feasibility and practical utility in biomedical data science. Furthermore, hands-on exercises are included to enable participants to apply their learning in real time. Participants will also get acquainted with OpenAI's ChatGPT and open-source LLMs, as well as their design, use cases, limitations, and prospects.

Our topics include:

  • Introduction
    • Large Language Models (LLMs) and their evolution from RNNs and LSTMs to Transformers and the GPT family.
    • In-depth interaction with OpenAI’s ChatGPT, learning about its overview, capabilities, and implementation, focusing on Chain-of-Thought Prompting.
    • Open-source LLMs
  • Novel applications of LLMs in computational biology and biomedical data sciences
    • Database query generation with LLMs.
    • Retrieval-augmented generation.
    • Language agents and code generation.
  • Advanced topics of LLMs for bioinformatics
    • Biomedical text retrieval and literature mining
    • Gene set analysis
    • Developing Representations of Disease-Relevant Molecules
  • Guided hands-on exercises using provided datasets and problem statements for practical understanding and implementation.
  • Limitations and challenges (e.g. hallucination, fairness, and safety) of using LLMs for science.
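
To make the retrieval-augmented generation topic more concrete, here is a minimal Python sketch of the retrieve-then-prompt pattern. It is an illustration under stated assumptions, not the tutorial's material: the three-sentence corpus is a placeholder, retrieval uses simple bag-of-words cosine similarity rather than learned embeddings, and no LLM is actually called; the sketch stops at building the prompt that would be sent to one.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(documents,
                    key=lambda d: cosine(q, Counter(tokenize(d))),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, passages):
    """Assemble retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return ("Answer using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {question}")


# Placeholder corpus standing in for retrieved literature snippets.
corpus = [
    "BRCA1 is a tumor suppressor gene involved in DNA repair.",
    "Hemoglobin carries oxygen in red blood cells.",
    "Insulin regulates blood glucose levels.",
]
question = "Which gene is involved in DNA repair?"
passages = retrieve(question, corpus, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

Grounding the prompt in retrieved text is one common way to reduce hallucination, although the model can still misread or ignore the supplied context.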

Learning Objectives

  • Familiarizing with the key aspects of large-scale biomedical data.
  • Leveraging LLMs to handle and interpret vast amounts of biomedical data.
  • Learning cutting-edge research topics from two invited talks.
  • Utilizing OpenAI APIs for GPTs and open-source LLMs in Python.
  • Integrating LLMs to enhance their coding efficiency in bioinformatics.
  • Deploying LLMs for biomedical question-answering and academic literature exploration.

Intended Audience and Level
This tutorial is designed for graduate students, researchers, data analysts, and practitioners in the domains of bioinformatics, computational biology, and biomedical informatics who are seeking to harness the potential of Large Language Models (LLMs) in their work. The didactic content would be chiefly beneficial for individuals who are keen on enhancing the breadth and depth of their analytical skills.

While the focus of the workshop lies in catering to beginners or users with little experience in LLMs, intermediates will find the advanced topics and in-depth case studies enriching as well. Participants should ideally possess a basic understanding of Python programming and machine learning concepts. Preliminary experience with Linux-based operating systems or interacting with APIs would provide an added advantage but is not a prerequisite.

Our discussion on using OpenAI's ChatGPT and other open-source LLMs, such as LLaMA, along with hands-on exercises and case studies, will offer an immersive learning experience that spans theory and practice. Researchers looking to streamline their data analysis processes and improve the efficiency and accuracy of their results will find this tutorial particularly useful.

Relevant resources and tutorial materials for hands-on activities will be shared online before the commencement of the tutorial, ensuring an unhampered learning experience for all attendees.

Schedule

Part 1
14:00 Overview and Welcome
14:10 Introduction to LLMs with a focus on Biomedical Data Science
14:40 How to use GPT-3.5 and GPT-4 with Python
15:10 How to use Open-source LLMs with Python
15:30 Break
15:45 Database Query Generation with LLMs
16:10 Retrieval-augmented Generation with Large Language Models
16:35 Code generation in Bioinformatics
Part 2
14:00 Large Language Models for Biomedicine: from PubMed Search to Gene Set Analysis
14:45 AI in Biomedicine: Developing Representations of Disease-Relevant Molecules
15:30 Break
15:45 Integrating Biomedical Data Database Development with LLMs
16:10 Querying PubMed with RAG to answer biomedical questions with GPT-4
16:35 Code generation in Bioinformatics with Open-source LLMs
16:55 Closing Remarks

- top -

Tutorial VT2: BioViz: Interactive data visualisation and ML for omics data SOLD OUT

Part 1: Monday, July 8, 2024 14:00 – 18:00 EDT
Part 2: Tuesday, July 9, 2024 14:00 – 18:00 EDT

Organizer:
Ragothaman M Yennamalli

Speakers:
Ragothaman M. Yennamalli - Assistant Professor, SASTRA Deemed to be University, Thanjavur, India
Dr Farzana Rahman – Assistant Professor, Kingston University London, UK.
Shashank Ravichandran - Senior Software Engineer, Incedo Inc, India
Megha Hegde, PhD Researcher, Kingston University London, UK.
Jean-Christophe Nebel, Professor of Computer Science, Kingston University London, UK.

Max Participants: 30

Description
Data Science and Machine Learning are intricately connected, particularly in computational biology. At a time when biological data, encompassing genomic sequences, protein interactions, and metabolic pathways, is being produced on an unprecedented scale, meeting the demand for its analysis has never been more crucial.

Data visualisation plays a crucial role in biological data sciences since it allows the transformation of complex, often incomprehensible raw data into visual formats that are easier to understand and interpret. This allows biologists to recognise patterns, anomalies, and correlations that would otherwise be lost in the sheer volume of data. In addition, machine learning (ML) has brought about a revolution in the analysis of biological data. Exploiting extensive datasets, ML provides tools to model complex systems and generate predictions. Indeed, ML algorithms excel at uncovering subtle patterns in data, contributing to tasks like predicting protein structures, comprehending genetic variations and their implications for diseases, and even facilitating drug discovery by predicting molecular interactions.

The integration of data visualisation and machine learning is particularly powerful. In particular, visualisation may aid in interpreting machine learning models, allowing biologists to understand and trust their predictions. It could also help fine-tune these models by identifying outliers or anomalies in the data.

Owing to the remarkable capability of this combination, there has been a surge in the development and application of tools that combine data visualisation and machine learning in biology. Platforms that integrate these technologies enable biologists to conduct comprehensive analyses without needing deep expertise in computer science. This democratisation of data science and ML has empowered more and more biologists to engage in sophisticated, data-driven research.

Learning Objectives
This tutorial is divided into two parts. In the first part of the tutorial, the participants will learn how to install and use tools for data visualisation using Python.  The second part will focus on installing and using ML tools for feature selection, model training, and model optimisation using Python.  By the end of this tutorial, the participants will be able to:

  1. Explain the role and significance of data visualisation in the context of scientific research.
  2. Apply fundamental principles of data visualisation to create clear and informative visual representations of data.
  3. Create a variety of data visualisations using Python libraries, i.e., Matplotlib, Seaborn, and Plotly.
  4. Understand the basics of colour theory and its implications for creating accessible and aesthetically pleasing visualisations.
  5. Design data visualisations that are accessible to a diverse audience, including those with colour vision deficiencies.
  6. Gain practical skills in preprocessing data and selecting appropriate features for machine learning models.
  7. Build, train, and evaluate machine learning models using Python libraries like Scikit-learn and TensorFlow/Keras.
  8. Implement machine learning algorithms on real-world biological datasets, demonstrating an understanding of the application of these techniques in biology.
  9. Create integrated visualisations of machine learning results using tools like Yellowbrick, Bokeh, and TensorBoard.
  10. Critically evaluate and discuss the applications, challenges, and implications of data visualisation and machine learning in scientific research, particularly in biology.

Intended Audience and Level
The tutorial is aimed towards entry-level participants (Graduate students, researchers, and scientists) in both academia and industry who are interested in Data Visualisation and ML. Prerequisites: Basic knowledge of computer programming (preferably Python) and machine learning (Beginner). There is no prerequisite to have any knowledge about Art and Aesthetics.

Schedule

Part 1
14:00 Lecture Introduction to Data Visualisation: Importance and Basic principles of data visualization in scientific research
Jean-Christophe Nebel
15:00 Hands-on Python Libraries for Visualization: Matplotlib, Seaborn, Plotly and others
Farzana Rahman, Ragothaman Yennamalli, Shashank Ravichandran, and Megha Hegde
15:45 Coffee/Tea Break
16:00 Lecture Colour theory in Visualization: Colour palettes, Accessible and Inclusive Visualisations
Ragothaman Yennamalli
17:00 Hands-on Creating various types of charts, plots for clarity and aesthetics. Case studies with real world datasets
Farzana Rahman, Ragothaman Yennamalli, Shashank Ravichandran, and Megha Hegde
Part 2
14:00 Lecture Fundamentals of Machine Learning: Types of ML, Data preprocessing and feature selection, model selection and training
Ragothaman Yennamalli and Farzana Rahman
15:00 Hands-on Python libraries for Machine Learning: Scikit-learn, Pandas, NumPy, TensorFlow/Keras. Building models using real-world biological data
Shashank Ravichandran and Megha Hegde
16:00 Coffee/Tea Break
16:15 Hands-on Integrating Data Viz and ML: Yellowbrick, Bokeh, TensorBoard, Scikit-plot, etc.
Farzana Rahman and Megha Hegde
17:15 Question and Answer session

- top -

Tutorial VT3: Using LinkML (Linked data Modeling Language) to model your data

Date: Monday, July 8, 2024 14:00 – 18:00 EDT

Organizer:
Sierra A.T. Moxon

Speakers:
Sierra Moxon, software developer, Lawrence Berkeley National Laboratory
Kevin Schaper, software developer, University of Colorado
Patrick Kalita, software developer, Lawrence Berkeley National Laboratory

Max Participants: 30

Description
LinkML (Linked data Modeling Language; linkml.io) is an open, extensible modeling framework that allows computers and people to work cooperatively to model, validate, and distribute data that is reusable and interoperable. It is designed to create interoperable data from the start without the overhead normally required for doing this. LinkML can help even non-techies create better, FAIRer, more reusable data models backed by ontologies.

Collecting and organizing biomedical data for an individual project presents a huge challenge; doing so in a way that allows for later reanalysis and reuse across projects is even harder. Many data standards are not machine-actionable, or are defined in isolation, leading to siloization. The quantity and variety of data being generated in biomedical fields is increasing rapidly, but is still often captured in unstructured formats like publications, posters, lab notebooks, or spreadsheets. Researchers at all levels struggle with collecting, managing, and analyzing data and complex knowledge, due to a confusing landscape of schemas, standards, and tools. These challenges impede scientific progress and limit our ability to tailor treatments based on data (precision medicine). AI and ML increasingly enable large-scale data analysis, but lack of data harmonization limits cross-disciplinary applications.

LinkML addresses these issues, weaving together elements of the Semantic Web with aspects of conventional modeling languages to provide a pragmatic way to work with a broad range of data types, maximizing interoperability and computability across sources and domains. LinkML meets data producers where they are technically, and speaks many different modeling languages. Data models can be authored in a variety of languages including YAML, JSON Schema, or even spreadsheets. LinkML supports all steps of the data analysis workflow: data generation, submission, cleaning, annotation, integration, and dissemination. LinkML enables even non-developers to create data models that are understandable and usable across the layers from data stores to user interfaces, reducing translation issues and increasing efficiency.

LinkML is an easy-to-use framework that both emerging and established data-generating communities can use to generate interoperable, reusable datasets and workflows. It has already seen wide uptake by projects across the biomedical spectrum and beyond, including the German Human Genome-Phenome archive, Critical Path Institute, iSample project, National Microbiome Data Collaborative, Center for Cancer Data Harmonization, INCLUDE project, NCATS Biomedical Data Translator, Reactome, Alliance of Genome Resources, Open Microscopy Environment (Next Generation File Format), and Genomics Standards Consortium.

In this tutorial, we will discuss best practices for data modeling; introduce LinkML as a modeling framework and tool suite; work together to set up a LinkML project from scratch; develop a model and validate it with test data; and auto-generate model documentation. If time permits, we will discuss the LinkML tool, Schema Automator, and use of LLMs with LinkML models.

Learning Objectives

  • Learn how to author a new data model that exercises some of the main LinkML modeling components.
  • Understand common LinkML schema best practices.
  • Generate documentation for the new model, and get familiar with generating the model in different formats.
  • Time permitting, get familiar with LinkML’s bootstrapping tools that help migrate existing models to LinkML.

Intended Audience and Level
This tutorial is aimed at anyone who generates or works with data: biologists, biocurators, data scientists, and data modelers. No programming or data modeling expertise is required. Listening through the hands-on aspects is encouraged with or without participating directly. To participate in hands-on training, we assume that participants have basic familiarity with running commands from the command line (in a terminal)--for example, calling Python scripts or running simple commands like “cat” and “grep”--and they should have a GitHub account and basic familiarity with using GitHub.

Schedule

Time (EDT) | Topic | Presenter | Hands-on?
14:00 | Introduction | Sierra Moxon | No
14:20 | Section 1: Set up a LinkML repository | Patrick Kalita | Yes
14:50 | Section 2: Authoring a LinkML Model (A. Model components; B. Classes and slots) | Sierra Moxon | Yes
15:10 | BREAK | |
15:25 | Section 2: Authoring a LinkML Model, cont. (C. Mappings, definitions, enumerations) | Sierra Moxon | Yes
15:40 | Section 3: Schema best practices, including linting | Patrick Kalita | Yes
15:55 | Section 4: Generating code from your model (A. Pydantic, JSONSchema; B. Generating documentation) | Kevin Schaper | Yes
16:35 | BREAK | |
15:45 | Section 5: LinkML Validate | Patrick Kalita | Yes
17:05 | Section 6 (Time permitting): Schema Automator (LLM + LinkML) | Sierra Moxon | No
17:35 | Wrap up/Questions | Sierra Moxon | No

- top -

Tutorial VT4: Computational Approaches for Identifying Context-Specific Transcription Factors using Single-Cell Multi-Omics Datasets SOLD OUT

Date: Tuesday, July 9, 2024 14:00 – 18:00 EDT

Organizer:
Hatice Ulku Osmanbeyoglu

Speakers:
Hatice Ulku Osmanbeyoglu, Assistant Professor, University of Pittsburgh, USA
Merve Sahin, Computational Biologist, Memorial Sloan Kettering Cancer Center, USA
Parham Hadikhani, Postdoctoral fellow, University of Pittsburgh, USA
Linan Zhang, Assistant Professor, Ningbo University, China

Max Participants: 30

Description
The development of specialized cell types and their functions is controlled by external signals that initiate and propagate cell-type-specific transcriptional programs. Activation or repression of genes by key combinations of transcription factors (TFs) drives these transcriptional programs and controls cellular identity and functional state. For example, ectopic expression of the TFs Oct4, Sox2, Klf4 and c-Myc is sufficient to reprogram fibroblasts into induced pluripotent stem cells. Conversely, disruption of TF activity can cause a broad range of diseases including cancer. Hence, identifying context-specific TFs is particularly relevant to human health and disease.

Systematically identifying key TFs for each cell type represents a formidable challenge. Determination of TF activity in bulk tissue is confounded by cell-type heterogeneity. Single-cell technologies now measure different modalities from individual cells such as RNA, protein, and chromatin states. For example, recent technological breakthroughs have coupled the relatively sparse single-cell RNA sequencing (scRNA-seq) signal with robust detection of highly abundant and well-characterized surface proteins using index sorting and barcoded antibodies such as cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq). But these approaches are limited to surface proteins, whereas TFs are intracellular. Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) measures genome-wide chromatin accessibility and reveals cellular memory and response to stimuli or developmental decisions. Recently, several computational methods have leveraged these omics datasets to systematically estimate TF activity influencing cell states. We will cover these TF activity inference methods using scRNA-seq, scATAC-seq, Multiome and CITE-seq data through hybrid lectures and hands-on training sessions. We will cover the principles underlying these methods, their assumptions and trade-offs. We will apply multiple methods, interpret results and discuss strategies for further in silico validation. The audience will be equipped with the practical knowledge and essential skills to conduct TF activity inference independently on their own datasets and interpret the results.
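
As a rough intuition for what "estimating TF activity" means, the Python sketch below scores a TF in each cell by the mean z-scored expression of its presumed target genes. This is one of the simplest possible scoring schemes and is not claimed to be any of the methods covered in the tutorial; the gene names, the TF-to-target mapping, and the expression values are all hypothetical.

```python
from statistics import mean, pstdev


def zscore_rows(expr):
    """Standardise each gene across cells; constant genes become all zeros."""
    z = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        z[gene] = [(v - mu) / sd if sd else 0.0 for v in values]
    return z


def tf_activity(expr, regulons):
    """Per-cell TF score: mean z-score over the TF's measured target genes."""
    z = zscore_rows(expr)
    n_cells = len(next(iter(expr.values())))
    activity = {}
    for tf, targets in regulons.items():
        measured = [g for g in targets if g in z]
        activity[tf] = [mean([z[g][c] for g in measured])
                        for c in range(n_cells)]
    return activity


# Hypothetical expression values: one list per gene, one entry per cell.
expr = {
    "target_1": [1, 1, 5, 5],
    "target_2": [2, 2, 6, 6],
    "other": [4, 4, 4, 4],
}
# Hypothetical regulon: which genes a TF is assumed to regulate.
regulons = {"TF_X": ["target_1", "target_2"]}

activity = tf_activity(expr, regulons)
print(activity)
```

Published methods typically go further, for example by weighting targets, using chromatin accessibility to refine the regulons, or accounting for the sparsity of single-cell counts.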

Learning Objectives for Tutorial
At the completion of the tutorial, participants will gain an understanding of the basic concepts and recent advances in transcription factor inference methods for single-cell omics datasets including scRNA-seq, scATAC-seq, CITE-seq and Multiome. Four learning objectives are proposed:

  1. Understand the basic principles underlying TF activity inference from single-cell omics
  2. Understand the specific methodologies, assumptions, and trade-offs between computational inference methods
  3. Gain hands-on experience in applying tools and interpreting results using multiple TF activity inference methods on public scRNA-seq, scATAC-seq, multiome and CITE-seq datasets
  4. Discuss current bottlenecks, gaps in the field, and opportunities for future work.

Intended Audience and Level
This tutorial is designed for individuals at the beginner to intermediate level, specifically targeting bioinformaticians or computational biologists with some prior experience in analyzing single-cell RNA sequencing (scRNA-seq), single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq), Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq), and Multiome data, or those familiar with next-generation sequencing (NGS) methods. A foundational understanding of basic statistics is assumed.

While participants are expected to be beginners, a minimum level of experience in handling NGS datasets is required. The workshop will be conducted using Python and JupyterLab, necessitating prior proficiency in Python programming and familiarity with command-line tools.

To facilitate the learning process, participants will be provided with pre-processed count matrices derived from real datasets. All analyses, including JupyterLab notebooks and tutorial steps, will be available on GitHub for reference.

The tutorial will employ publicly accessible data, with examples showcased using datasets that will be made available through repositories such as the Gene Expression Omnibus or similar public platforms. This hands-on workshop aims to equip participants with practical skills and knowledge, enabling them to navigate and analyze complex datasets in the field of single-cell omics.

Schedule

14:00 Welcome remarks and tutorial overview
Hatice
14:05

Basic principles behind TF activity inference methods

  • Overview of the importance of context-specific TF regulation in biological systems.
  • Significance of TF dynamics in health and disease.
  • Single-cell multi-omics technologies for TF activity inference (scRNA-seq, scATAC-seq, Multiome)

Hatice

14:45 Overview of computational TF inference methods based on single cell omics
Hatice, Merve
15:45 Break
16:00 Hands-on experience in applying multiple TF activity inference methods to public scRNA-seq data and interpreting the results
Linan and Merve
16:45 Hands-on experience in applying multiple TF activity inference methods to public scATAC-seq and Multiome data and interpreting the results
Parham and Merve
17:30 Hands-on experience in applying TF activity inference methods to public CITE-seq data and interpreting the results
Parham and Hatice
17:55 Discuss current bottlenecks, gaps in the field, and opportunities for future work
Hatice


Tutorial VT5: Explainability in Graph Deep Learning for Biomedicine SOLD OUT

Date: Monday, July 8 14:00 – 18:00 EDT

Organizer:
Guadalupe Gonzalez

Speakers:
Guadalupe Gonzalez, Prescient, Genentech Computational Sciences, Genentech.
Chirag Agarwal, Harvard University.

Max Participants: 50

Description
In the rapidly evolving field of biomedical research, graph deep learning (DL) has emerged as a powerful tool for analyzing complex biological data like molecular graphs, protein-protein interaction networks, and patient similarity networks. However, modern graph DL models are complex black-box neural networks comprising millions of parameters, and it is crucial to understand their model predictions before employing them in life-critical applications. Our proposed tutorial is designed to address the above challenge by providing a brief overview of explainability research in the context of graph neural networks (GNNs) and their applications to biomedical problems.

The tutorial will start with an introduction to graph DL, focusing on its relevance and potential in biomedicine. We will discuss why explainability is not just a desirable trait but a necessity in this domain, where model decisions can have significant implications for both model developers and relevant stakeholders.

The second part of the tutorial delves into the core of explainability research in GNNs. We will define what constitutes an explanation in GNN models, introduce post-hoc explainers, and explore metrics for evaluating explanations and criteria for assessing their quality. We will also introduce explanation-directed message passing – a novel approach that integrates post-hoc explanations directly into the training pipeline of GNNs. Finally, we will introduce existing interpretable graph models in biomedicine.

In the third part, we will apply these concepts to high-stakes biomedical applications like predicting molecular properties, discovering new drug targets, and analyzing patient data. We will discuss each application in depth, demonstrating how explainability enhances our understanding of modern GNNs and drives decision-making in biomedicine.

Finally, the tutorial will feature interactive demonstrations and a hands-on practical session. Participants will engage with real-world biomedical datasets, applying explainability techniques to GNN models. This session aims to provide attendees with practical experience and insights into developing and utilizing explainability techniques and interpretable GNN models effectively in their research.

By the end of this tutorial, participants will have a solid understanding of the importance, methods, and applications of explainability in GNNs within the biomedical sphere, equipped with the knowledge and skills to implement these techniques in their work.

Learning objectives

  1. Understand the fundamentals of graph deep learning:
    • Gain a solid understanding of graph DL and GNNs.
    • Recognize the significance and applications of graph DL in biomedicine.
  2. Learn the importance of the explainability and interpretability of machine learning models in biomedical applications:
    • Learn why the explainability and interpretability of machine learning models are crucial in biomedical research.
    • Appreciate the implications of model predictions in healthcare and research settings.
  3. Learn methods and metrics for explainability:
    • Understand different approaches to generating explanations for GNN model predictions.
    • Get acquainted with various metrics and desiderata used to assess the quality and effectiveness of explanations.
  4. Explore post-hoc explanation techniques and explanation-directed message passing:
    • Discover methods for post-hoc analysis of GNN predictions.
    • Delve into explanation-directed message passing and its role in enhancing model interpretability.
  5. Gain hands-on experience with explainability in GNN models:
    • Participate in interactive demonstrations and hands-on exercises to learn how to generate explanations of GNN model predictions for the tasks of molecular property prediction, drug target discovery, and patient data analysis.
    • Understand how explainability aids in the decision-making process in these applications.

Intended Audience and Level
This tutorial is primarily intended for:

  • Researchers and academics: Individuals working in the fields of bioinformatics, computational biology, biomedical research, and related areas. This includes both experienced researchers and graduate students who are exploring interpretability in the context of graph machine learning techniques in biomedicine.
  • Data scientists and machine learning practitioners: Professionals in data science and machine learning working on graph DL and seeking to expand their knowledge into the interpretability domain.
  • Industry professionals: Individuals from biotech, pharmaceutical, and healthcare technology companies who are involved in research and development, particularly in areas intersecting with AI and machine learning.

The tutorial is designed to be intermediate. Participants are expected to have:

  • A basic understanding of machine learning concepts.
  • Familiarity with the fundamentals of DL.
  • Some knowledge of Python programming, as practical exercises will involve coding.

No prior expertise in graph DL or specific biomedical applications is required. The tutorial will provide an introduction to these areas, but will also delve into more advanced topics suitable for attendees with existing knowledge in graph DL or bioinformatics.

Schedule

14:00

Part 1: Introduction to graph deep learning in biomedicine

  • Overview of graph DL: Basics of graph DL and GNNs
  • Importance of explainability and interpretability: Exploring the significance of explainability and interpretability in biomedical applications.
14:30

Part 2: Understanding and measuring explainability in GNNs

  • What are explanations?: Defining explanations in the context of graph DL.
  • Metrics for goodness of explanations: Discuss various metrics for evaluating the faithfulness of explanations and criteria used to evaluate their quality.
  • Post-hoc explainers: Introduction to post-hoc methods for explaining GNN model predictions.
  • Explanation-directed message passing: Exploring advanced techniques that incorporate explanation into the message-passing mechanism of GNNs.
  • Towards interpretable GNN models: Overview of existing interpretable GNN models in biomedicine.
15:45 Coffee break
16:00

Part 3: Applying explainability techniques to GNN model predictions in biomedical contexts

  • Molecular property prediction: This section will demonstrate the application of explainability techniques to pretrained GNN models for molecular property prediction.
  • Drug target discovery: In this section, we will showcase the use of explainability techniques on GNN model predictions for the discovery and validation of drug targets.
  • Patient data analysis: Illustrating the application of explainability techniques to GNNs analyzing patient data, focusing on how explanations can enhance personalized treatment strategies and disease comprehension.
16:45 Coffee break
17:00

Part 4: Hands-on demonstrations and practical session

  • Interactive demos: Real-time demonstrations showcasing the application of explainability techniques in graph DL.
  • Hands-on exercises: Participants engage in practical exercises applying explainability methods to state-of-the-art GNN models trained on biomedical datasets, such as those included in the MoleculeNet benchmark (https://moleculenet.org/) and Chemprop (https://github.com/chemprop/chemprop).
  • Practical advice: Sharing best practices for developing and utilizing interpretable graph models in biomedicine.


ISMB 2024 will be held in Montreal, Quebec, July 12-16, 2024, and is seeking Event Staff (formerly volunteer) applications. Volunteers must be ISCB members with memberships expiring on or after Tuesday, July 16, 2024.

Volunteers are expected to assist as scheduled for approximately 20-24 hours during the conference dates of July 12-16, 2024 (generally in shifts of 5-6 hours).
Volunteers are asked to participate in a training session in the afternoon of Thursday, July 11. The session will last approximately 90 minutes.
Volunteers should be available for scheduled shifts on all dates beginning Friday, July 12, 2024 through the end of the conference day on Tuesday, July 16, 2024. A schedule of shift allocations will be provided prior to the conference start date.

In return for working as event staff, those selected are provided with a complimentary conference registration, time-based pay, and a conference T-shirt. Regarding registration, we ask that you DO NOT register in advance; we will send a code for your registration after decisions have been made. If you are not selected, a discount code will be sent to you to allow you to register at the early bird rate.

Some volunteer roles:

  • Technical Moderator (TM) - responsible for assisting the speakers in loading their presentations on the laptop, managing the audio and visual recording of presentations from the room in the virtual platform, moderating the online chatroom, and answering attendee questions as needed. Assists the Scientific Session Moderator as required.
  • Room Monitor (RM) - monitor the doors once the session starts to ensure late-arriving delegates enter quietly. Track and record the number of delegates for each session and submit the counts to ISCB staff. Provide backup to the Technical Moderator as needed.
  • Information Desk (ID) - assist delegates with general questions related to the venue and the conference to include room directions, programme timeline and details, and social activities.
  • Registration Desk (RD) - assist by handing out delegate badges and additional items as needed, ensuring the orderly and smooth operation of the registration area and flow of delegate item collection.
  • Poster Hall (PH) - set up poster numbers, provide pins and directions, and answer other questions about posters as required by attendees and ISCB staff.

In addition to the specific roles above, all event staff are asked to assist with general directions and other duties as required.

Application deadline is Monday, May 20, 2024.
Notifications will be sent on Friday, May 24, 2024.

Click here to submit an application

Join us for an exciting and innovative networking experience at ISCB's Success Circles event! Success Circles is a unique take on traditional thought-leader sessions, designed to foster meaningful connections and facilitate knowledge sharing among attendees.

What to Expect

  • Networking Reinvented: Success Circles is a dynamic networking event where participants are grouped into small circles, each led by a knowledgeable facilitator. This setup encourages engaging conversations and the exchange of valuable insights.
  • Expert-Led Discussions: Our expert facilitators will guide discussions on various topics related to bioinformatics, computational biology, and beyond. Whether you're an expert in the field or just starting, there's something for everyone.
  • Personalized Learning: Connect with peers who share your interests, challenges, and aspirations. Explore new ideas, gain fresh perspectives, and form lasting connections.
  • Interactive and Engaging: Break away from traditional conference formats and enjoy a lively, interactive experience that encourages open dialogue and collaboration.



Who Should Attend

Success Circles is open to all ISCB conference attendees looking to expand their professional networks, share knowledge, and gain insights from experts in the field. Attendance is limited and registration is required. Ensure you save your spot by including this in your conference registration.

Don't miss this opportunity to make meaningful connections, share your expertise, and be a part of a dynamic networking event. Success Circles promises to be a memorable and valuable addition to your ISCB conference experience.

Join us and be a part of the future of networking at ISCB's Success Circles!



Event Support

Success Circles is a dynamic opportunity to connect and collaborate with conference attendees by sponsoring one of the expert-led discussions. Support this exciting and innovative event by sponsoring a topic table. Contact Veronika Hotton to learn more about this and other opportunities.



Agenda subject to change without notice.

Friday, July 12, 2024
Start Time End Time 517d 518 519 521 522 520a 520b 520c 525 524ab 524c
09:00 10:45   Tutorial IP1 Tutorial IP6 Tutorial IP5 Tutorial IP4 SCS SCS Posters     Tutorial IP3 Tutorial IP2
10:45 11:00 Coffee Break
11:00 13:00   Tutorial IP1 Tutorial IP6 Tutorial IP5 Tutorial IP4 SCS SCS Posters     Tutorial IP3 Tutorial IP2
13:00 14:00 Lunch Break YBS  
14:00 16:00   Tutorial IP1 Tutorial IP7 Tutorial IP5 Tutorial IP4 SCS SCS Posters YBS   Tutorial IP3 Tutorial IP2
16:00 16:15 Coffee Break
16:00 18:00 Career Fair in 517c (pre-registration required)
16:15 18:00   Tutorial IP1 Tutorial IP7 Tutorial IP5 Tutorial IP4 SCS SCS Posters YBS   Tutorial IP3 Tutorial IP2
18:15 18:30 Welcome                    
18:30 19:30 Keynote - Fiona Brinkman                    
19:30 21:00 Welcome Reception - in room 517c
 
Saturday, July 13, 2024
Start Time End Time 517d 518 519 521 522 520a 520b 520c 525 524ab 524c
07:30 08:00 Serene Stretch Symposium - Yoga
08:40 09:00 Welcome                    
09:00 10:00 Keynote - Tandy Warnow                    
10:00 10:40 Caffeinate and Connect with Exhibitors - Coffee Break
10:40 12:20 HitSeq RegSys iRNA Education Bio-Ontologies NIH/ODSS Function MICROBIOME CompMS   Tech Track
12:20 14:20 Poster Session with Lunch
14:20 16:00 HitSeq RegSys iRNA Education Bio-Ontologies NIH/ODSS Function MICROBIOME CompMS   Tech Track
16:00 16:40 Caffeinate and Connect with Exhibitors - Coffee Break
16:40 18:00 HitSeq RegSys iRNA Education Bio-Ontologies NIH/ODSS Function MICROBIOME CompMS    
18:00 20:00 Celebrating 25 Years of Bioinformatics.ca
18:00 22:00 Explore Montreal
 
Sunday, July 14, 2024
Start Time End Time 517d 518 519 521 522 520a 520b 520c 525 524ab 524c
07:30 08:00 Serene Stretch Symposium - Yoga
08:40 09:00 Welcome                    
09:00 10:00 Keynote - Guillaume Bourque                    
10:00 10:40 Caffeinate and Connect with Exhibitors - Coffee Break
10:40 12:20 HitSeq RegSys iRNA BioVis Bio-Ontologies Bioinformatics in Canada Function MICROBIOME BioInfo-Core Text Mining iCn3D
12:20 14:20
14:20 15:00 HitSeq RegSys iRNA BioVis Bio-Ontologies Bioinformatics in Canada Function MICROBIOME BioInfo-Core Text Mining iCn3D
15:00 15:20 HitSeq RegSys iRNA BioVis Bio-Ontologies Bioinformatics in Canada Function   BioInfo-Core Text Mining iCn3D
15:20 16:00 HitSeq RegSys iRNA BioVis Bio-Ontologies Bioinformatics in Canada Function NetBio BioInfo-Core Text Mining iCn3D
16:00 16:40 Caffeinate and Connect with Exhibitors - Coffee Break
16:40 18:00 HitSeq RegSys iRNA BioVis Bio-Ontologies Bioinformatics in Canada Function NetBio BioInfo-Core Text Mining iCn3D
18:00 19:30 Success Circles - Ticketed Event in room 517c
19:30 23:00 President's Reception - INVITE ONLY in room 720a
 
Monday, July 15, 2024
Start Time End Time 517d 518 519 521 522 520a 520b 520c 525 524ab 524c
07:30 08:00 Serene Stretch Symposium - Yoga

Green Task Force Meeting - in room 523b
08:40 09:00 Welcome                    
09:00 10:00 Keynote - Martin Steinegger                    
10:00 10:40 Caffeinate and Connect with Exhibitors - Coffee Break
10:40 12:20 MLCSB EvolCompGen General Computational Biology VarI Equity and Diversity in Computational Biology Research 3DSIG CAMDA NetBio WEB BOSC Tech Track
12:20 14:20
  • Poster Session with Lunch
  • Bioinformatics Editorial Board Meeting - INVITE ONLY in room 445, lunch provided
  • Education Committee Meeting - INVITE ONLY in room 523b, lunch provided
14:20 16:00 MLCSB EvolCompGen General Computational Biology VarI   3DSIG CAMDA NetBio WEB BOSC Tech Track
16:00 16:40 Caffeinate and Connect with Exhibitors - Coffee Break
16:40 18:00 MLCSB EvolCompGen General Computational Biology VarI TransMed 3DSIG CAMDA NetBio   BOSC  
18:15 19:15   ISCB Town Hall                  
  COSI Dinners - RSVP in advance is required
 
Tuesday, July 16, 2024
Start Time End Time 517d 518 519 521 522 520a 520b 520c 525 524ab 524c
07:30 08:00 Serene Stretch Symposium - Yoga
08:40 10:00 MLCSB EvolCompGen TransMed   Computational and Systems Immunology 3DSIG CAMDA Digital Agriculture SysMod BOSC  
10:00 10:40 Caffeinate and Connect with Exhibitors - Coffee Break
10:40 12:20 MLCSB EvolCompGen TransMed   Computational and Systems Immunology 3DSIG CAMDA Digital Agriculture SysMod BOSC Demystifying the World of Scientific Publishing
12:20 14:20
14:20 15:40 MLCSB EvolCompGen TransMed   Computational and Systems Immunology 3DSIG CAMDA Digital Agriculture SysMod BOSC  
15:40 16:00 Grab and Go - Quick Coffee Break
16:00 17:00 Keynote - Su-In Lee                    
17:00 17:30 Awards Presentation                    

Click here to download Abridged Agenda PDF
Click here to download full schedule by track XLSX
Click here to download Detailed Agenda PDF

Fellowship Committee:
Anne Christin Hauschild
Luis Pedro Coelho
R. Gonzalo Parra
Farzana Rahman
Kana Shimizu

FUNDING INFORMATION:

ISCB is pleased to offer conference fellowships, including registration waivers for virtual participants, to ISMB 2024 for students and postdoctoral fellows to present a talk or poster at the conference in Montreal, Canada. Funding sources for Conference Fellowships are very limited and we regret that we are not able to fund all applicants. The conference organizers are committed to providing support to as many eligible applicants as possible. Conference Fellowship consideration is based on membership and accepted work to ISMB 2024.

Conference Fellowship Application Invitations are sent directly to eligible individuals after acceptance of scientific submissions to Proceedings, Abstracts, and/or Posters.

CONFERENCE FELLOWSHIP APPLICATION OVERVIEW:
  1. The submitting author will be sent the invitation and is responsible for getting the invitation to the presenting author if the work is not being presented by the submitting author;
  2. Applicant must be a current member of ISCB prior to submitting an application. The membership must be valid through December 31, 2024;
  3. Applicant must be listed as an author and be the presenter of an accepted Oral or Poster presentation (excluding accepted Late Posters) in order to be eligible to apply for conference fellowship funds through ISMB 2024;
  4. All applicants must attend all four (4) conference days and secure additional funding from other sources in order to be able to cover the full costs of attending the conference;
  5. The deadline to submit a fellowship application is May 20, 2024. No exceptions will be made.

Travel Fellowship Key Dates
May 14, 2024 Conference Fellowship invitations sent for Early Abstract accepted talks and posters.
May 20, 2024 Conference Fellowship Application Deadline
May 31, 2024 Conference Fellowship Acceptance Notification
June 12, 2024 Conference Applicant Registration Deadline
MAXIMUM AWARD AMOUNTS

The maximum fellowship award is determined based on the geographical location of the applicant and upon submission of appropriate receipts. Please note that funded applicants will only be able to cover approximately 50% of the expense of travel and registration fees with these fellowship amounts. Thus all applicants must seek and secure additional funding sources (e.g., from your home institution/university, or grant funding). For ISMB 2024 maximum awards are as follows:

MAXIMUM FUNDS TO BE AWARDED PER REGION OF APPLICANT
Africa 1500 USD
Asia (excluding Middle East) 1000 USD
Canada 750 USD
Europe 1000 USD
Mexico / Central America / South America 1000 USD
Middle East 1500 USD
Oceania 2000 USD
United States 750 USD

Application Process

Application is by invitation only, sent automatically via email to the submitting author of an accepted Proceeding, Abstract Talk, and/or Poster (excluding accepted Late Posters) submission. This invitation email will arrive after notification of acceptance of one of these submission types as a separate email. IF YOU HAVE AN ACCEPTED PRESENTATION AND HAVE NOT RECEIVED AN INVITATION BY END OF DAY MAY 14 – PLEASE CHECK YOUR SPAM FOLDER AND THEN CONTACT US IF AN INVITATION IS NOT THERE.

Each invitation will include a URL linking to the travel fellowship application. The application must be submitted by the presenting author only, and only if the eligibility requirements are met. If the submitting author is not the presenting author, it is the responsibility of the submitting author to forward the invitation to the presenting author if the eligibility requirements are met. Each application URL can only be used one time and no application will be accepted after the deadline of May 20, 2024.

Eligibility Requirements

1. Applicant must be a current ISCB member whose membership does not expire prior to December 31, 2024. Applications will not be accepted from non-members; pending memberships do not qualify and must be paid in full prior to submission of an application.

2. Applicant must be listed as an author or co-author on the original submission of an accepted ISMB 2024 Proceedings paper, Abstract, or Poster (excluding accepted Late Posters), and, per the requirements of the funding agencies, the funded applicant must be the presenting author of the work. (Submitters to the "Call for Late Posters" are not eligible for fellowship funding.)

3. Applicant must, at the time of the conference, be registered in a degree program (undergraduate or graduate) or as a postdoctoral research fellow* at an accredited educational institution, or be an early-career researcher from a low- to upper-middle-income country. Postdocs and employees of any US federal agency are ineligible for funding from US federal funds; currently we have only US federal funds for this travel fellowship program. (*The period of eligibility for a postdoc is five (5) years from the PhD completion date.)

4. Applicant must be prepared to register for ISMB 2024 by June 12, 2024, and plan to attend all four conference days. If attendance at the conference is dependent on receipt of fellowship funds, please do not register until after the notification of travel fellowship funding. Any funded applicant failing to register for the conference by June 12, 2024 will automatically forfeit the funds so that another applicant can be awarded from among the original pool of applicants.

5. Applicant must be able to pay all expenses of attending the conference up front, including conference registration fee (as noted in #4 above), travel, accommodations and meals. Travel Fellowship funding will be provided via the ISCB payment system (bill.com) via secure electronic funds transfer (wire or ACH) approximately 6-8 weeks after the conference. 

Eligible expenses

Eligible expenses toward fellowship funds include registration for ISMB 2024, Student Council Symposium or Tutorials, transportation (air or land transportation from home region to conference city), hotel accommodations (booked within the ISCB official block) and a maximum of $250.00 in meal expenses. In order to receive the full awarded amount, receipts for registration, transportation, and hotel accommodations that equal or exceed the awarded amount are required.
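
As a rough illustration of the arithmetic described above, the following Python sketch estimates a payout from a set of receipts. It is not ISCB's actual procedure: the function name, the category labels, the use of a single currency, and the rule that the payout is the smaller of the award and the eligible receipt total are all assumptions made for illustration. Only the $250 meal cap and the expense categories come from the text above.

```python
# Hypothetical sketch: estimate a fellowship payout from submitted receipts.
# Assumption (not stated by ISCB): payout = min(award, eligible receipt total),
# with meal receipts counted only up to the stated $250 cap, and all amounts
# expressed in the same currency.

MEAL_CAP = 250.00  # maximum meal expenses, taken from the text above
ELIGIBLE = {"registration", "transportation", "hotel", "meals"}


def estimate_payout(award, receipts):
    """Return the estimated payout for an award, given receipts by category."""
    unknown = set(receipts) - ELIGIBLE
    if unknown:
        raise ValueError(f"ineligible expense categories: {sorted(unknown)}")
    # Meals count toward the total only up to the cap.
    meals = min(receipts.get("meals", 0.0), MEAL_CAP)
    other = sum(v for k, v in receipts.items() if k != "meals")
    # Receipts below the award reduce the payout; receipts above it do not raise it.
    return min(award, other + meals)


if __name__ == "__main__":
    receipts = {"registration": 400.0, "transportation": 300.0, "meals": 320.0}
    print(estimate_payout(1000.0, receipts))
```

Under these assumptions, an applicant whose receipts fall short of the award would receive less than the maximum, which is why receipts that equal or exceed the awarded amount matter.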

Notification

Applicants will be notified of their funding status no later than May 31, 2024. In some cases applicants may be notified that they are on a waitlist for funding. This means that ISCB fully expects, but is still awaiting, formal confirmation of a grant award from one or more granting agencies, and that those funds cannot be awarded until the grant needed to fund the travel fellowship is confirmed. Any waitlisted applicant who is eventually awarded funds will be offered the opportunity to register at the early registration rate. Therefore, please do not register for the conference if your attendance is fully dependent on being awarded a travel fellowship, as any cancellation of an applicant's registration will be subject to the full regular registration cancellation policy.

Funded applicants will be required to present evidence of their eligibility status (such as a student identification card) when signing in at the Conference Fellowships Desk to record their attendance. In all cases, funds will be sent to funded applicants after the conference per the details noted in Eligibility Requirements #5 above.

Contacts

Questions regarding fellowships should be addressed to ISCB.

The information on this page is subject to change without notice, and all changed information will be considered final for the purposes of awarding and funding ISMB 2024 Conference Fellowships.

Contributors

The Conference Fellowships are made possible by generous donations from:

Brandeis University Online





Venue Information

The conference will take place at the
Palais des congrès de Montréal

The address is:
1001 Place Jean-Paul-Riopelle
Montréal, QC
H2Z 1X7
https://congresmtl.com/en/


Book your Official Accommodation

Showcare is the official Housing Bureau for ISCB's ISMB Conference. A link to book your hotel room online will be provided when you complete your conference registration. It is recommended that you book your hotel room early in order to take advantage of the special room rates that are subject to availability. ISMB 2024 success depends on attendees, sponsors, and exhibitors booking the conference hotels through the official Housing Bureau. Unfilled rooms create a financial risk in the form of penalties and can jeopardize the success of the association.

Please do not contact the hotels or make a reservation directly with the hotels. Discounted rates are only available through Showcare, the official Housing Bureau.
Please register for the conference before booking your accommodations.


Conference Accommodations

Le Westin Montreal (HQ)

Transform any trip into a relaxing getaway at Le Westin Montréal. Our Old Montreal hotel is full of modern amenities designed to elevate your stay no matter what time of year. It is surrounded by centuries of history in architecture, art, and French culture. Many of the city's prominent destinations can be found within walking distance.

Visit the Notre-Dame Basilica of Montreal, the cobblestone streets of Montreal's famous Parisian-style historic district, the Old Port, and the Palais des congrès. Get around town on foot or bike, with many bicycle rentals scattered throughout the city.

After a day exploring the sights, retreat to our spacious rooms and suites equipped with free Wi-Fi, pillowtop mattresses, and marble bathrooms. Our on-site gaZette restaurant features a mouth-watering menu with take-out options. Whether traveling for leisure or business or a bit of both, Le Westin Montréal provides a refined experience to restore balance and control.

Distance to Convention Center:
5 min walk
Rate: $289 CAD (single/double)

Hotel Monville

Hotel Monville is a four-star hotel that targets both businesspeople and tourists seeking to immerse themselves in the Montréal experience. Remarkable for its abundant windows that offer panoramic views of the metropolis, the Monville, created in 2018, is a hotel with an original design that combines state-of-the-art technology, ecological practices, and attentive service in a friendly atmosphere. At Monville, we don’t just practice the art of receiving well, but the art of receiving better.

Distance to Convention Center: 5 min walk 
Rate: $259 CAD (single/double)

Hotel Dauphin Montreal Downtown

Le Dauphin Hotels is a family-owned business, now in the proud third generation of a hotel tradition that started in 1963. As an eco-friendly hotel, it is working to improve this aspect of the hotel business.

The property in Montreal Downtown features 114 rooms and suites, all of which were designed with the comfort of our guests in mind. Whether traveling for business or pleasure, Le Dauphin is an affordable decision and the best for any occasion.


Distance to Convention Center: 5 min walk 
Rate: $269 CAD (single/double)

Delta Hotels by Marriott Montreal

Enjoy a simply perfect stay at Delta Hotels Montreal, on business or with the family. Our pet-friendly hotel is in downtown Montreal, near McGill University and the Montreal Convention Centre, making us the ideal destination for conferences and events. Find the most renowned local attractions within walking distance of the hotel in Montreal's entertainment district, such as Sainte-Catherine Street or the Eaton Centre.

Relax in modern, stylish hotel rooms with sleek workspaces. Select rooms include balconies. Club-level rooms allow access to our 23rd-floor Club Lounge to enjoy complimentary breakfast and evening appetizers with stunning views of the Montreal skyline.

Distance to Convention Center: 14 min walk 
Rate: $255 CAD (single/double)

DoubleTree by Hilton Montreal

We're connected to the shops and restaurants of Complexe Desjardins, with underground access to the Montreal Convention Centre and two metro stations. Place des Arts is around the corner, and we're a kilometre from Old Montreal. Enjoy our indoor pool, fitness center, and a warm DoubleTree welcome cookie on arrival.
 

Distance to Convention Center: 6 min walk 
Rate: $289 CAD (single/double)

Hampton Inn by Hilton Montreal Downtown

Our hotel is just half a kilometre from the Montreal Convention Center in the heart of downtown. We are surrounded by restaurants, government offices, museums, theatres, and historical attractions. Subway and bus stations are within three blocks, and we are just off A-720. Our rooftop terrace and expansive meeting spaces are ideal for Montreal events. Your stay includes a hot American Buffet daily.

Distance to Convention Center: 6 min walk 
Rate: $286 CAD (single/double)


Housing Policies

The ISMB 2024 Housing Bureau, Showcare, will accept new hotel reservations, changes and cancellations until 5 pm EST on Monday, June 17, 2024. If you made a reservation, it is being held for you in the inventory of rooms the hotels have blocked for this conference. The reservations will be transferred to the hotels on Thursday, June 20, 2024. To ensure the hotels have the most up-to-date information, we ask that all hotel cancellations and changes be made by 5 pm EST on Monday, June 17, 2024.

Guarantee & Deposit Policy: All hotel rates are quoted in CAD and exclude tax. Hotel room rates are subject to applicable taxes that are in effect at check-in time. A credit card is required for each reservation and must have an expiration date on or after November 2024. Your room is not reserved if you do not provide a valid credit card. The hotels may charge a one-night room & tax deposit using the credit card on file with Showcare prior to check-in. Each guest must present a valid credit card or an appropriate amount of cash for subsequent room nights and incidental charges for the entire stay upon check-in.
*You must complete a credit card authorization form if the credit card on file is not in your name. 

Cancellation & Changes Policy: All cancellations or changes must be made online by re-accessing your housing account on or before 5 pm EST on Monday, June 17, 2024.

No cancellations or changes will be made between 5 pm EST on Monday, June 17, 2024 and Thursday, June 20, 2024, while reservation information is being prepared and transferred to the hotels. Cancellations or changes as of 5 pm EST on Friday, June 21, 2024, must be made directly with the hotels. Change requests will be made on a space-available basis. 

Cancellation requests received by the hotel 48-72 hours or less prior to arrival (refer to your hotel's policy on its website), as well as no-shows, will forfeit the one-night room & tax deposit, and the rest of the stay will be cancelled. ISMB 2024 is not responsible for no-shows or early departure fees charged by the hotels or rooms resold due to non-arrival.

ISMB 2024 takes no responsibility should a room preference not be available at check-in. Please visit the hotel websites for check-in and check-out times. 

Housing Confirmation: You will receive an email from your hotel with your hotel confirmation number approximately 2 weeks prior to arrival. If you do not receive it, please check your spam folder before contacting your hotel. 

Group Housing (10 rooms+/night): Please email [email address protected]

Questions About Hotel Reservations? 
Contact: [email address protected]


Student Accommodations

In addition to the official housing block, ISCB has secured a block of rooms at McGill University for student attendees.

La Citadelle Residence at McGill University

All student rooms are fully furnished and include private bathrooms, air conditioning, and a flatscreen TV. In addition to the shared kitchen, there is a large common area on the first floor and a quiet study room, both surrounded by windows. Situated in the center of the downtown area, La Citadelle is a recently renovated, hotel style residence building that opened its doors for move-in weekend of 2012.  La Citadelle is located two blocks east of McGill campus.

  • Room Rate: $189.00 CAD per night
    • $219.00 CAD Triple, $249.00 CAD Quad
  • Breakfast is included

Royal Victoria Dormitory at McGill University

Located immediately across the street from campus and minutes from downtown Montreal. Dormitory-style with shared washroom facilities centrally located on each floor. Shared kitchenettes throughout the building. Common rooms include 2 TV rooms, a games room, aerobics room, study room and a large lounge. 

  • Room Rate: $80.00 CAD
  • Breakfast is included

Carrefour Sherbrooke at McGill University

All student rooms are fully furnished and include private bathrooms, air conditioning, a flatscreen TV, and an in-room mini refrigerator. Find first-class shopping, restaurants and art galleries, outdoor cafés and street festivals all within walking distance at this centrally located hotel-style property.

  • Room Rate: $159.00 CAD per night
    • $189.00 CAD Triple, $219.00 CAD Quad
  • Breakfast is included


Please use one of the following to book your reservation at any of the McGill University accommodations:

  • Phone: 514-398-5200
  • E-mail: [email address protected]

Ensure you mention the International Society for Computational Biology ISMB 2024 Conference.

Student room reservations at McGill University must be received by May 27, 2024.



Travel

Delta

Delta Air Lines is pleased to offer special discounts for ISCB.

Please click here to book your flights.

You may also call Conferences and Events® at 1(800)328-1111* Monday–Friday, 8:00 a.m. – 6:30 p.m. (EST) and refer to Meeting Event Code: NM3UP
*Please note there is no service fee for reservations booked and ticketed via our reservation 800 number.

Air Canada

When booking a flight to Montréal with Air Canada be sure to use the following discount code: QPE6YYJ1

United

When booking a flight to Montréal with United be sure to use the following discount code: ZPQF421521


Links within this page: Fiona S. L. Brinkman | Tandy Warnow | Guillaume Bourque | Martin Steinegger | Su-In Lee



Fiona S. L. Brinkman

Simon Fraser University
Canada
https://brinkmanlab.ca/

Introduced by: Terry Gaasterland
Time: Friday, July 12, 2024 at 18:30
Room: 517d

Sensitive Sustainable Science

How do we sustainably maintain and further develop bioinformatics and computational biology (BCB) software, databases and tools, in the face of short (<5 year) periods of funding support? How do we promote open data and open science in a way that best effects positive change and avoids causing unwitting harm to communities? Using some historical data and also my recent research as examples, I’ll review how open science is evolving, building on FAIR (findable, accessible, interoperable, reusable) with, for example, the CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) Principles for Indigenous Data Governance. I’ll review these and other principles in the context of both microbial data and human cohort data, presenting some approaches to research that can support more sustainable, inclusive science that can potentially better lead to positive change. While there is no one-size-fits-all solution, there are some common themes and considerations that we as a BCB community should discuss - and ideally incorporate into BCB training programs.

Biography

Fiona Brinkman is a Distinguished Professor in Bioinformatics and Genomics at Simon Fraser University, interested in developing more preventative, sustainable, and holistic approaches for infectious disease control and supporting health. She is most known for R&D of software and databases aiding analysis of microbial and human omics data, including PSORT, IslandViewer, Pseudomonas.com, and InnateDB.com. She leads data integration for the CHILD Cohort Study – the largest multidisciplinary, longitudinal, population-based birth cohort study in Canada, including diverse omics data. She has co-led development of the IRIDA.ca platform, which is now the primary platform for Canada’s Public Health Agency to analyze infectious disease outbreaks using combined epidemiological/lab/genomics data. She contributed to the pandemic response, co-leading Data Analytics for the Canadian COVID-19 Genomics Network and more recently CoVaRR-Net. She has a strong interest in bioinformatics education and mentoring young scientists. She is on several committees/Boards, including the ELIXIR and European Nucleotide Archive Scientific Advisory Boards. Her awards include a TR100 award from MIT, Thomson Reuters “World’s Most Influential Scientific Minds”, and most recently she received a University of Waterloo Distinguished Alumni Award and became a Fellow of the Royal Society of Canada.



ISCB Accomplishments by a Senior Scientist Award Winner
Tandy Warnow

University of Illinois Urbana-Champaign
USA
https://cs.illinois.edu/about/people/faculty/warnow

Introduced by: Aïda Ouangraoua
Time: Saturday, July 13, 2024 at 09:00
Room: 517d

Progress in Large-Scale Phylogenomic Estimation Methods

Over the last several years, interest in computing and then using large-scale phylogenies has increased for multiple reasons, including basic science (how did life evolve on earth) and applications in biomedicine and public health (e.g., understanding the evolution of SARS-CoV-2). The estimation of these large phylogenies, with potentially millions of leaves, presents fascinating mathematical, statistical, and computational challenges, ranging from computing multiple-sequence alignments and developing effective heuristics for NP-hard optimization problems (e.g., maximum likelihood tree estimation on large datasets) to estimating species trees from genome-scale data while addressing biological causes for heterogeneity across the genome (e.g., gene duplication and loss and incomplete lineage sorting). There are also many fascinating and difficult problems that have to do with “post-tree” analyses, such as rooting gene trees and species trees, or estimating branch lengths in species trees and dates at internal nodes, that are needed for many downstream analyses. In this talk I will describe progress on these questions, and I will also present some open problems where new techniques are needed.

Biography

Dr. Warnow received her PhD in Mathematics at UC Berkeley (1991) under the direction of Gene Lawler, and did postdoctoral training with Simon Tavare and Michael Waterman at the University of Southern California (1991-1992). After positions at Sandia National Laboratories (1992-1993), University of Pennsylvania (1993-1998), and the University of Texas (1998-2014), she joined the University of Illinois at Urbana-Champaign as a Founder Professor of Engineering. She is now Associate Head for Computer Science, and has affiliate faculty appointments in Bioengineering, Electrical and Computer Engineering, Mathematics, Statistics, and several biology departments.



Guillaume Bourque

McGill University
Canada
https://computationalgenomics.ca/BourqueLab/

Introduced by: Francis Ouellette
Time: Sunday, July 14, 2024 at 09:00
Room: 517d

Human genome 2.0: why a pangenome graph is better for genetic and epigenetic analyses

Genomic analyses often start by mapping reads to a reference genome. But, in every individual, there are DNA variants and sequences that are unique to that individual and reads coming from those regions will often be ignored. Thankfully, progress in long-read technologies and assembly can now efficiently deliver telomere-to-telomere genomes. Applying such approaches to a diverse panel of individuals combined with the development of graph-based genomic tools, the Human Pangenome Reference Consortium has just released the first human pangenome reference graph. This new resource is meant to alleviate the limitations of relying on a single linear human genome as the first step of most genetic and epigenetic analyses. In this talk, I will summarize some of the benefits of using the pangenome reference. In particular, I will show how this new reference can be used to extract missing signal when looking for genetic variants in a rare disease cohort called Genomic Answers for Kids. I will also describe the results of a new study using a genome-graph looking at epigenetic changes before and after influenza infection in monocyte-derived macrophages extracted from more than 30 individuals of different ancestry. Finally, considering the importance of data sharing in genomics, I will introduce a project called the Pan-Canadian Genome Library, which will establish the framework for Canada’s management and sharing of human genomic data.

Biography

Dr. Bourque is a Professor in the Department of Human Genetics, a Canada Research Chair in Computational Genomics and Medicine and the Director of Bioinformatics at the McGill Genome Center. He leads the Canadian Center for Computational Genomics (C3G) and the Epigenomics Mapping Center at McGill. He is on the External Consultant Panel of two functional genomics consortia funded by the National Human Genome Research Institute in the US (ENCODE and IGVF). Dr. Bourque is also on the Scientific Steering Committee of the International Human Epigenome Consortium (IHEC) and on the Steering Committee of the Global Alliance for Genomics and Health (GA4GH). Dr. Bourque’s research interests are in comparative and functional genomics with a special emphasis on applications of next-generation sequencing technologies and transposable elements.



ISCB Overton Award Winner
Martin Steinegger

Seoul National University
South Korea
https://steineggerlab.com/en/

Introduced by: Christine Orengo
Time: Monday, July 15, 2024 at 09:00
Room: 517d

Supercharged Protein Analysis in the Era of Accurate Structure Prediction

Protein analysis has witnessed a revolution through machine-learning methods. At the forefront are highly accurate structure prediction methods such as AlphaFold2 and ESMFold. These have generated an avalanche of publicly available protein structures. The AlphaFold database and ESMatlas contain over 214 and 620 million predicted structures, respectively, covering nearly every protein sequence in our largest protein reference databases. This unprecedented access to structural information is not just critical for structural biology but impacts most fields of biology. In this talk, I will discuss how this data is revolutionizing genomic and proteomic annotations and introduce fast and sensitive methods to search and cluster this data to extract new biological insights.

Biography

Dr. Steinegger is an Assistant Professor in the Biology Department at Seoul National University, with a joint appointment to the Interdisciplinary Program in Bioinformatics. He conducted his doctoral studies at the Max Planck Institute for Biophysical Chemistry and was awarded a Ph.D. in computer science with summa cum laude honors from the Technical University of Munich in 2018, followed by a postdoctoral fellowship at Johns Hopkins University. Dr. Steinegger has published more than 40 papers covering a wide range of topics in bioinformatics, from detecting genomic assembly contamination to organizing the protein structure space.

He started his research group in 2020, focusing on the development of methods to analyze massive genomic and proteomic datasets. The group's contributions to bioinformatics include widely used tools for predicting structures (ColabFold/AlphaFold2), clustering (Linclust), assembling (Plass), and searching sequences (MMseqs2) and protein structures (Foldseek). His group's software and web services have been installed and used millions of times. Dr. Steinegger is an advocate for internationality at his home institution, open science, and open source.



ISCB Innovator Award Winner
Su-In Lee

University of Washington
USA
https://suinlee.cs.washington.edu/su-in-lee

Introduced by: Karin Verspoor
Time: Tuesday, July 16, 2024 at 16:00
Room: 517d

Explainable AI for health: where we are and how to move forward

The first part of my talk delves into various research endeavors conducted by my lab, focusing on explainable AI's application across diverse biomedical domains. I will demonstrate how explainable AI can elucidate novel scientific inquiries, with a primary emphasis on understanding neurodegenerative diseases and biological age.

In the second part, we will explore the evolving landscape of explainable AI, uncovering its potential to chart new scientific directions in biomedicine, exemplified by our recent work in dermatology, emergency medicine, and precision cancer medicine. This discussion aims to shed light on the necessary enhancements for explainable AI to effectively tackle a wide array of real-world challenges in biomedicine.

Biography

Prof. Su-In Lee, the Paul G. Allen Professor of Computer Science at UW, earned her PhD from Stanford University in 2009 under the mentorship of Prof. Daphne Koller. She joined UW in 2010 after serving as a visiting Assistant Professor in the Computational Biology Department at Carnegie Mellon University School of Computer Science. Recognized for her groundbreaking contributions to AI, biology, and medicine, Prof. Lee has received prestigious accolades including the National Science Foundation (NSF) CAREER Award, the International Society for Computational Biology (ISCB) Innovator Award, and the Samsung Ho-Am Prize, often referred to as the "Korean Nobel Prize," as well as designation as an American Cancer Society (ACS) Research Scholar and a Fellow of the American Institute for Medical and Biological Engineering (AIMBE). Notably, she is recognized as a pioneer and trailblazer in explainable AI (XAI), significantly enhancing ML model interpretability.

Prof. Lee's recent contributions revolve around essential XAI principles and techniques, including her groundbreaking SHAP framework. Her innovative biomedical research spans basic biology to clinical medicine, enabled by XAI advancements. Conceptually advancing the integration of AI with biomedicine, her work addresses forward-looking scientific questions, enabling novel discoveries from high-throughput molecular data and electronic health records and advancing healthcare. This pioneering line of work has led to highly cited publications across foundational AI, computational molecular biology, and clinical medicine.

 


Exclusively for members

  • Member Discount

    ISCB Members enjoy discounts on conference registration (up to $150), journal subscriptions, books (25% off), and job center postings (free).

  • Why Belong

    Connecting, Collaborating, Training, the Lifeblood of Science. ISCB, the professional society for computational biology!

     

Supporting ISCB

Donate and Make a Difference

Giving never felt so good! Consider donating today.