There will be a series of in-person and virtual tutorials prior to the start of the conference. Tutorial registration fees are shown at: https://www.iscb.org/ismb2024/register#tutorials
In-person Tutorials (All times EDT)
Virtual Tutorials: (All times EDT) Presented through the conference platform
Room: 518
Date: Friday, July 12, 2024 9:00 – 18:00 EDT
Organizer:
Juexin Wang
Speakers:
Mauminah Raina, (Ph.D. student) Indiana University Indianapolis, United States
Yi Jiang, (Ph.D. student) Ohio State University, United States
Lei Jiang, (Ph.D. student) University of Missouri, United States
Michael Eadon, Indiana University Indianapolis, United States
Juexin Wang, Indiana University Indianapolis, United States
Qin Ma, Ohio State University, United States
Dong Xu, University of Missouri, United States
Max Participants: 50
Website
https://github.com/juexinwang/Tutorial_ISMB2024
Description
Emerging single-cell omics and spatial transcriptomics technologies provide unprecedented opportunities and challenges for molecular biology studies. How to model these vast sequencing data in different modalities, perform computational analyses, and interpret mechanisms by identifying biologically and pathologically meaningful cell types, regulatory relations, and key markers are central questions in this area.
Advanced machine learning methods and tools provide a promising approach to address these challenges. scGNN (https://github.com/juexinwang/scGNN) is a graph neural network-based framework for clustering and imputing scRNA-seq data by modeling the single cells as a cell graph. Targeting single-cell multi-omics data, DeepMAPS (https://bmblx.bmi.osumc.edu/) introduces a heterogeneous graph transformer to infer single-cell biological networks. BSP (https://github.com/juexinwang/BSP) proposes a granularity-based statistical approach to identify spatially variable genes on 2D and 3D spatial transcriptomics.
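As a rough illustration of what "modeling the single cells as a cell graph" can mean, the sketch below links each cell to its k nearest neighbours in expression space. It is not taken from scGNN: the function name `knn_cell_graph`, the toy expression values, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math


def knn_cell_graph(expression, k):
    """Return {cell index: [indices of its k nearest cells]} by Euclidean distance."""
    graph = {}
    for i, cell in enumerate(expression):
        # Distance from cell i to every other cell, paired with that cell's index.
        distances = [
            (math.dist(cell, other), j)
            for j, other in enumerate(expression)
            if j != i
        ]
        distances.sort()
        graph[i] = [j for _, j in distances[:k]]
    return graph


# Hypothetical toy data: 4 cells x 3 genes.
cells = [
    [5.0, 0.1, 0.0],
    [4.8, 0.0, 0.2],
    [0.1, 3.9, 4.1],
    [0.0, 4.2, 4.0],
]
graph = knn_cell_graph(cells, k=1)
print(graph)
```

A graph neural network would then operate on such a neighbour structure; the all-pairs loop here is quadratic in the number of cells and is only meant for small examples.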
Our tutorial will cover key advancements in machine learning methods developed for single-cell multi-omics and spatial transcriptomics research over the past few years, emphasizing new opportunities in bioinformatics enabled by such advancements. We will start with a technical talk about the machine learning algorithms behind the covered approaches, including scGNN, DeepMAPS, and BSP, spanning model training through model interpretation (discovery of cell types, regulatory relations, and key markers). We will then demonstrate the impact of machine learning on discovering
Learning Objectives
Intended Audience and Level
The target audiences are graduate students, researchers, scientists, and practitioners in both academia and industry who are interested in applications of deep learning in bioinformatics (Broad Interest). The tutorial is aimed towards entry-level participants with knowledge of the fundamentals of biology and machine learning (beginner). Basic experience with Python and R programming languages is recommended for the participants.
The tutorial slides and materials for hands-on exercises (e.g., links to demo, code implementation, and datasets) will be posted online prior to the tutorial and made available to all participants.
Schedule
9:00 | Part 1: Overview: Introduction to single-cell multi-omics and spatial transcriptomics and corresponding challenges. |
9:45 | Part 2: Introduction to biological analyzing methods. |
10:45 | Coffee Break |
11:00 | Part 3: Clustering-based single-cell analysis and scGNN on AI-ready platform. |
12:00 | Part 4: Applications #1: Single-cell RNA-seq dataset acquisition, model training, and analysis. |
13:00 | Lunch |
14:00 | Part 5: Network analysis on single-cell multi-omics and DeepMAPS. |
14:30 | Part 6: Applications #2: Single-cell multi-omics dataset acquisition, model training, and analysis. |
16:00 | Coffee Break |
16:15 | Part 7: Marker analysis on spatial transcriptomics and BSP. |
16:45 | Part 8: Applications #3: Spatial transcriptomics dataset acquisition, model fitting, and analysis. |
Room: 524c
Date: Friday, July 12, 2024 9:00 – 18:00 EDT
Organizer:
Sven Rahmann
Speakers:
Johanna Schmitz, Center for Bioinformatics Saar and Saarland University, Saarland Informatics Campus, Saarbrücken, Germany; Saarbrücken Graduate School of Computer Science
Jens Zentgraf, Center for Bioinformatics Saar and Saarland University, Saarland Informatics Campus, Saarbrücken, Germany; Saarbrücken Graduate School of Computer Science
Sven Rahmann, Center for Bioinformatics Saar and Saarland University, Saarland Informatics Campus, Saarbrücken, Germany
Max Participants: 20
Description
Python has a reputation for being a clean and easy-to-learn language, but slow when it comes to execution, and difficult concerning multi-threaded execution. Nonetheless, it is one of the most popular languages in science, including bioinformatics, because for many tasks, efficient libraries exist, and Python acts as a glue language. In this tutorial, we explore how to write efficient multi-threaded applications in Python using the numba just-in-time compiler. In this way, we can use Python’s flexibility and the existing packages to handle high-level functionality (e.g., design the user interface, run machine learning models), and then use compiled Python for additional custom compute-heavy tasks; these parts can even run in parallel.
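To give a flavour of the compiled-Python approach described above, here is a minimal sketch of a numba-compiled, multi-threaded loop. `njit`, `prange`, and `parallel=True` are part of numba; the function names `gc_count` and `encode` and the GC-counting task are hypothetical choices for illustration and are not taken from the tutorial material. A fallback is included so the sketch still runs (uncompiled) where numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # fallback: run as plain Python if numba is unavailable
    prange = range

    def njit(*args, **kwargs):
        # Support both the bare @njit form and the @njit(...) form.
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(parallel=True)
def gc_count(seq):
    """Count G and C bases in a uint8 array of ASCII codes."""
    total = 0
    for i in prange(seq.shape[0]):  # iterations may be split across threads
        if seq[i] == 67 or seq[i] == 71:  # ord('C'), ord('G')
            total += 1  # numba treats this as a parallel reduction
    return total


def encode(dna):
    """Convert a DNA string into a writable uint8 NumPy array."""
    return np.frombuffer(dna.encode("ascii"), dtype=np.uint8).copy()


print(gc_count(encode("ACGTGGCC")))
```

The high-level string handling stays in ordinary Python (`encode`), while the inner loop works on a typed array. That is roughly the separation of high-level and low-level code that a just-in-time compiler benefits from.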
Over a full tutorial day, we introduce a small (but still interesting and relevant) problem as an example: efficient search for bipartite DNA motifs. We develop an efficient tool that outputs every match in a reference genome in a matter of seconds. Starting with an introduction to the problem and a (slow) pure Python implementation, we learn how to write more jit-compiler-friendly code, transition towards a compiled version and observe speed increases until we obtain C-like speed. We parallelize the tool to make it even faster, and add more options for more flexible searching. Finally, we add a simple but effective GUI, which can increase the potential user-base of such a tool by an order of magnitude.
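A slow, pure-Python starting point for bipartite motif search might look like the sketch below. It assumes a bipartite motif is two fixed half-sites separated by a gap of variable length; the function name `find_bipartite`, its parameters, and the example sequence are hypothetical, and the tutorial's own motif description mini-language is not reproduced here.

```python
def find_bipartite(seq, left, right, min_gap, max_gap):
    """Return (start, gap) for each match of left + <gap bases> + right in seq."""
    hits = []
    for start in range(len(seq) - len(left) + 1):
        if seq[start:start + len(left)] != left:
            continue
        # The left half-site matched; try every allowed gap length.
        for gap in range(min_gap, max_gap + 1):
            right_start = start + len(left) + gap
            if seq[right_start:right_start + len(right)] == right:
                hits.append((start, gap))
    return hits


# Hypothetical example sequence containing one match with a 3-base gap.
sequence = "AATTGACAGGGTATAATCC"
print(find_bipartite(sequence, "TTGACA", "TATAAT", min_gap=2, max_gap=4))
```

String slicing creates many temporary objects, which is one reason a version like this is slow on genome-sized input and a natural candidate for rewriting over integer arrays before compilation.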
Learning Objectives
Intended Audience and Level
The tutorial addresses active bioinformatics researchers, from graduate students to principal investigators, who write software tools as part of their research. In particular, we address researchers who are looking for an easier transition from research prototype software to software that scales to large datasets and is usable by a large non-technical user-base. Therefore, our participants should have at least some experience developing bioinformatics research software.
Prior experience with the Python programming language is required, as well as some experience with managing environments with installed software, ideally using (bio)conda / mamba.
Schedule
9:00 | Introduction to the numba just-in-time compiler for Python; small examples, possibilities, limitations, how the compilation works. Last 30 minutes are short hands-on exercises (timing iterated execution of a small function in pure vs. compiled Python). |
10:45 | Coffee break |
11:00 | Introduction to DNA motif search and a “motif description” mini-language, with examples from the literature. Automaton-based pattern search and a bit-parallel algorithm. Hands-on: Implementation in pure Python (45 min, 15-20 lines). |
13:00 | Lunch break |
14:00 | Transforming a Python implementation to a numba-compiled implementation; separation of high-level and low-level code parts; managing memory allocations; introduction of type annotations (1 hour principles, 1 hour supervised coding). |
16:00 | Coffee break |
16:15 | Parallelization: Using threads to parallelize the application (e.g., parallel search across chromosomes); Replacing the command-line interface with a simple but effective GUI using streamlit. Hands-on coding: Splitting the task, collecting and visualizing the results. |
Room: 524a
Date: Friday, July 12, 2024 9:00 – 18:00 EDT
Organizer:
Qiyun Zhu
Speakers:
Qiyun Zhu
James Morton
Daniel McDonald
Matthew Aton
Lars Hunger
Max Participants: 40
Description
Modern microbiome research is marked by the extensive use of high-throughput, multi-omic data derived from complex biological systems, such as amplicons, metagenomes, metatranscriptomes, metaproteomes, and metabolomes, as well as data and metadata of the host or environment. The complexity and richness of data demand robust, scalable, and reproducible integration and analysis methods. Our full-day tutorial offers an essential guide to leveraging the expanded capabilities of scikit-bio, alongside the broader Python data science ecosystem. Scikit-bio is a core library behind the widely used QIIME 2 project, and provides various data structures, metrics and algorithms commonly used in bioinformatics. This tutorial is designed to provide researchers, educators, and developers with an overview of current trends, foundational principles, and analytical strategies in microbiome research. Participants will engage in hands-on exercises on handling data and metadata, analyzing communities and features, as well as correlating and predicting biological traits. This tutorial aims to equip attendees with knowledge and practical skills that are adaptable to various applications in microbiome research and beyond.
Exercises will be delivered through Jupyter Notebooks with clear code and documentation. Tutorial materials, including data, slides, and notebooks, will be hosted in a public GitHub repository under a BSD open-source license.
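For readers unfamiliar with the kind of community metrics mentioned above, the sketch below computes one alpha-diversity and one beta-diversity measure by hand. It deliberately avoids calling scikit-bio, which provides such metrics in its `skbio.diversity` module; the function names and the toy counts are assumptions for illustration. The sketch uses the natural logarithm, whereas library implementations may default to a different base.

```python
import math


def shannon(counts):
    """Shannon entropy (natural log) of a vector of feature counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples with the same feature order."""
    return sum(abs(a - b) for a, b in zip(u, v)) / (sum(u) + sum(v))


# Hypothetical feature counts for two samples over four taxa.
sample_a = [10, 10, 10, 10]
sample_b = [37, 1, 1, 1]

print(shannon(sample_a), shannon(sample_b))
print(bray_curtis(sample_a, sample_b))
```

An evenly distributed community (`sample_a`) has higher Shannon diversity than one dominated by a single taxon (`sample_b`), and Bray-Curtis ranges from 0 for identical samples to 1 for samples sharing no features.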
Learning Objectives
Participants will learn how to use scikit-bio and other common Python libraries to analyze and integrate multiple types of omic data that are usually involved in studies of microbiomes and their roles in the host or natural environment. Specifically, participants will:
By the end of the full-day tutorial, each participant will have completed an analytical workflow based on a demo dataset, which can be customized and extended to other datasets.
Intended Audience and Level
This tutorial is for researchers, educators and developers interested in analyzing various types of biological “omic” data, such as metagenomics, metabolomics, and host transcriptomics. Attendees should have basic skills in Python (preferred), or any other programming language (such as R or C/C++). Experience with the Linux command line is not required. Optionally, attendees may benefit from basic knowledge in bioinformatics, biostatistics, and any specific biological research fields, such as microbiology, ecology, molecular biology, and epidemiology.
Each participant should bring their own laptop or tablet (with keyboard). The exercises will be conducted using Google Colab or a local Jupyter environment, depending on the participant’s preference.
Schedule
9:00 | Introduction and software setup. Exercise: Setting up the software environment. |
10:00 | Working with various omic data types. Exercise: A real-world multi-omic dataset. |
10:45 | Coffee break |
11:00 | Working with sparse, high-dimensional data tables. Exercise: Working with omic data tables. |
12:00 | Analyzing microbial community structures. Exercise: Community diversity analyses. |
13:00 | Lunch break |
14:00 | Inferring and associating critical features. Exercise: Statistical modeling and tests. |
15:00 | Predicting host and environmental traits. Exercise: Constructing predictive models. |
16:00 | Coffee break |
16:15 | Developing an analytical protocol for publication. Exercise: Assembling an analytical protocol. |
17:15 | Debugging, wrapping-up and open questions. Lecture: Looking beyond. |
Room: 522
Date: Friday, July 12, 2024 9:00 – 18:00 EDT
Organizer:
Aritra Bose
Laxmi Parida
Speakers:
Aritra Bose, PhD, Research Scientist, IBM Research, Yorktown, NY
Hakan Doga, PhD, Postdoctoral Researcher, IBM Research, Cleveland, OH
Filippo Utro, PhD, Senior Research Scientist, IBM Research, Yorktown, NY
Laxmi Parida, PhD, ISCB Fellow, IBM Fellow
Max Participants: 50
Description
Single-cell and -omic analyses have provided profound insights into the heterogeneity of complex tissues by measuring multiple cells together, including a wide array of multi-omics data such as genomics, proteomics, transcriptomics, etc. Single-cell analysis is often plagued by many uncertainties such as missingness, developing robust machine learning algorithms for discovering complex features across, finding patterns in the spatial structure of single-cell transcriptomics or proteomics, and most importantly integrating multi-omics data to create meaningful embeddings for the cells. Machine Learning (ML) techniques have been extensively used in analyzing, predicting, and understanding multi-omics data. For the purposes of this tutorial, we will use the term classical ML to refer to these the potential to overcome a lot of the above limitations of ML in single-cell analysis. This tutorial will be structured into five sessions as follows:
Learning Objectives
Participants in this tutorial will learn a new paradigm of analyzing multi-omics data with hands-on experience with a quantum computer. More objectively, the major takeaways of this tutorial would be:
Intended Audience and Level
This tutorial is aimed at computational biologists, bioinformaticians, clinicians, practitioners, data analysts, including early-career to senior researchers in the fields of healthcare and life sciences enthusiastic to learn about new frontiers of computational biology. There are very few prerequisites for the tutorial, listed as follows:
Schedule
9:00 | Session I: Quantum Information and Fundamentals |
10:45 | Coffee Break |
11:00 | Session II: Hello Qiskit!: Writing your first program in Qiskit |
12:30 | Session III: Processing multi-omics data with classical ML algorithms |
13:00 | Lunch |
14:00 | Session IV, Part I: Design and implement QML algorithm for single-cell data in Qiskit. |
16:00 | Coffee Break |
16:15 | Session IV, Part II: Analyze QML algorithm and compare with classical ML |
17:00 | Session V: Interactive Q&A session with the participants. |
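As background for the "first program" session in the schedule above, the sketch below simulates in plain Python what a typical two-qubit introductory circuit computes: a Hadamard gate followed by a CNOT, producing an entangled Bell state. This is not Qiskit code and not the tutorial's material; the helper names `apply_hadamard` and `apply_cnot` and the list-based statevector are assumptions chosen to keep the example free of dependencies.

```python
import math


def apply_hadamard(state, qubit):
    """Apply a Hadamard gate to one qubit of a statevector (list of amplitudes)."""
    new_state = [0j] * len(state)
    scale = 1 / math.sqrt(2)
    for index, amp in enumerate(state):
        bit = (index >> qubit) & 1
        flipped = index ^ (1 << qubit)
        # |0> -> (|0> + |1>)/sqrt(2) and |1> -> (|0> - |1>)/sqrt(2)
        new_state[index] += scale * amp * (-1 if bit else 1)
        new_state[flipped] += scale * amp
    return new_state


def apply_cnot(state, control, target):
    """Flip the target qubit wherever the control qubit is 1."""
    new_state = list(state)
    for index in range(len(state)):
        if (index >> control) & 1 and not (index >> target) & 1:
            partner = index | (1 << target)
            new_state[index], new_state[partner] = state[partner], state[index]
    return new_state


# Start in |00>, then build a Bell state: H on qubit 0, CNOT from qubit 0 to 1.
state = [1 + 0j, 0j, 0j, 0j]
state = apply_cnot(apply_hadamard(state, 0), control=0, target=1)
probabilities = [abs(amp) ** 2 for amp in state]
print(probabilities)
```

The two non-zero probabilities sit on the basis states where both qubits agree, which is the signature of the entanglement that quantum machine learning circuits build on.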
Room: 521
Date: Friday, July 12, 2024 9:00 – 18:00 EDT
Organizer:
Ian Simpson
Speakers:
Ian Simpson, Professor of Biomedical Informatics, School of Informatics, University of Edinburgh
Barry Ryan, PhD Student, UKRI Centre for Doctoral Training in Biomedical Artificial Intelligence, School of Informatics, University of Edinburgh
Sebestyén Kamp, PhD Student, UKRI Centre for Doctoral Training in Biomedical Artificial Intelligence, School of Informatics, University of Edinburgh
Max Participants: 30
Description
Network structures allow us to model complex data in an extremely flexible way, enabling a wide range of downstream analytic approaches to help us gain insight into the biological processes and systems we model. The ability of networks to capture myriad features of the primary data and explore high order relationships between them makes them highly suitable to address questions that are not easily answered by classical statistical approaches that typically only look at first-order interactions. Networks have been widely used in the biomedical sciences to study gene and protein expression profiles, protein-protein interactions, metabolic processes, dynamic pathway models, and diseases amongst others. The emergence of multi-modal data in the biomedical setting has gathered pace significantly over recent years whereby several different types of data are measured from the same sample source. Integration of these data is proving incredibly valuable at increasing the breadth and depth of our understanding of the underlying systems by reducing noise, increasing information content, facilitating our handling of missing and/or incomplete data, and crucially, increasing our predictive power beyond that of uni-modal data analysis.
In this comprehensive tutorial, we will introduce participants to network analysis from first principles using real-world multi-modal data derived from the Generation Scotland study, a world-leading longitudinal research programme and an excellent use case for biomedical network analysis. Participants will perform hands-on end-to-end network construction and computational analysis using a ground-up approach which will give them the skills, experience, and confidence to develop their own network analytic pipelines in the future. We will work in the context of human disease using both molecular and clinical data and introduce analysis approaches for network-based tasks including clustering, functional annotation analysis, and classification using graph neural networks.
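As a small taste of building a network from the ground up, the sketch below builds an undirected network from an edge list and extracts its connected components, a simple form of grouping. The node names, the edge list, and the function names are hypothetical placeholders and are not drawn from the Generation Scotland data or the tutorial material.

```python
from collections import deque


def build_network(edges):
    """Build an undirected adjacency map {node: set of neighbours} from edge pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def connected_components(adjacency):
    """Return a list of node sets, one per connected component (breadth-first search)."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        queue, component = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical interaction edges between placeholder features.
edges = [("geneA", "geneB"), ("geneB", "geneC"), ("geneD", "geneE")]
network = build_network(edges)
degrees = {node: len(neighbours) for node, neighbours in network.items()}
print(degrees)
print(connected_components(network))
```

Node degree and component membership are among the first-order summaries one would usually inspect before moving on to clustering or graph neural network classification.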
Learning Objectives
Participants will learn how to analyse biological datasets using networks. They will gain hands-on experience with a real-world dataset as an exemplar that can be directly transferred to their own work in the future. Following the course they will be able to:
Intended Audience and Level
Introductory Level.
This tutorial is aimed at an audience who have little prior experience working with and analysing data using networks. They will need at least a basic level of knowledge in Python and R programming. Specifically, participants are expected to be familiar with the Python packages Pandas, NumPy, and Matplotlib and the R packages ggplot2 and dplyr.
The workshop will be conducted in both R and Python. We will communicate with participants in advance so that they have installed Visual Studio Code (Python) and RStudio (R) prior to the tutorial, but we can troubleshoot minor installation issues on the day and provide cloud compute instances of these if needed. All materials and data will be made available open-source through a dedicated GitHub repository. All analyses will be streamlined so that there are no challenging compute requirements for participants; a standard modern laptop will be suitable to take part.
Schedule
9:00 | Welcome & Introduction |
9:10 | “An Introduction to Networks” |
9:40 | Practical Session 1 |
10:45 | Coffee Break |
11:00 | “The Do’s and Don’ts of Biomedical Network Construction” |
11:30 | Practical Session 2 |
13:00 | Lunch |
14:00 | “Common Approaches to the Analysis of Biomedical Networks” |
14:30 | Practical Session 3 |
16:00 | Coffee Break |
16:15 | “An Introduction to Network Inference Using Graph Neural Networks” |
16:45 | Practical Session 4 |
17:50 | Closing Remarks |
Room: 519
Date: Friday, July 12, 2024 9:00 – 13:00 EDT
Organizer:
David Steinberg
Speakers:
Denis Yuen, Team Lead, Dockstore, Ontario Institute for Cancer Research
David Charles Steinberg, University of Santa Cruz
Leyla Tarhan, PhD, Senior Science Writer, Data Sciences Platform, Broad Institute of MIT and Harvard
Aseel Awdeh, PhD, Computational Biologist, Data Sciences Platform, Broad Institute of MIT and Harvard
Max Participants: 40
Description
With the advent of efficient sequencing technology, the scientific community produces petabytes of data daily. These data are prepared to answer diverse biological questions, each requiring unique sequencing approaches. To combine these disparate datasets and transform them into meaningful insights, researchers are turning to cloud-based approaches that adhere to Findable, Accessible, Interoperable, and Reusable (FAIR) practices. These include cloud-computing environments that allow for efficient resource-sharing and scalability. While the potential of these new resources is thrilling, the migration to cloud computing might feel daunting, as it requires new pipelines that harness the expanse of cloud tools. In this half-day tutorial, we introduce participants to key components that help them create cloud-native pipelines, including portable workflows written in the Workflow Description Language (WDL; pronounced “widdle”), portable packages of software and dependencies known as Docker containers, and Dockstore, a public platform for sharing Docker-based workflows. Participants will get hands-on experience with these resources by developing their own simple WDL workflow and Docker image for genomic analysis. They will push their workflows to Dockstore and export them to the cloud-based Terra platform so that they can run their workflow on real data.
Learning Objectives
In this tutorial, participants will learn how to:
Intended Audience and Level
Researchers and tool developers interested in bringing their analyses to the cloud. A basic understanding of the command line and a GitHub account are required, and participants are encouraged to have basic familiarity with genomics terminology and standard high-throughput sequencing data formats. The introduction to basic WDL syntax is designed for novice WDL writers and starts with a basic hello-world script.
Schedule
9:00 | Welcome/opening remarks/review agenda and learning goals |
9:05 | Introduction to Docker ● How dockers improve software and scientific reproducibility ● Docker and Dockerfile basics ● Finding and using Dockers |
9:15 | Building and Using Dockers ● Pull and use an existing Docker ● Create a Dockerfile to build a Docker |
9:45 | Introduction to WDL ● Anatomy of a WDL ● Where to find and run existing WDLs |
10:00 | Basic WDL scripting ● Writing your first WDL Hello-world script for Terra ● Running WDLs in Terra |
10:45 | Coffee Break |
11:00 | Introduction to Dockstore ● Finding and assessing the quality of workflows on Dockstore ● Launching workflows from Dockstore |
11:30 | Integrate your GitHub with Dockstore ● Use GitHub apps to streamline the development cycle |
12:00 | Real genomics example: Modify, export and run a WDL |
12:30 | Wrap-up and Q&A |
Room: 519
Date: Friday, July 12, 2024 14:00 – 18:00 EDT
Organizer:
Hryhorii Chereda
Speakers:
Prof. Dr. Anne-Christin Hauschild, Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
Hryhorii Chereda, Ph.D., Medical Bioinformatics, University Medical Center Göttingen, Göttingen, Germany
Dr. Youngjun Park, Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
Maryam Moradpour (MSc), Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
Max Participants: 15
Description
The digital revolution in healthcare, fostered by novel high-throughput sequencing technologies and electronic health records (EHRs), transitions the field of medical bioinformatics towards an era of big data. While machine learning (ML) methods have proven to be advantageous in such settings for a multitude of medical applications, they generally depend on a centralization of datasets. Unfortunately, this is not suited to sensitive medical data, which is often distributed across different institutions, comprises intrinsic distribution shifts, and cannot be easily shared due to high privacy or security concerns.
Initially proposed by Google in 2017, federated learning allows the training of machine learning models on geographically or legally divided data sets without sharing sensitive data. When combined with additional privacy-enhancing techniques, such as differential privacy or homomorphic encryption, it is a privacy-aware alternative to central data collections while still enabling the training of machine learning models on the whole data set. However, in such federated settings, both infrastructure and algorithms become much more complex compared to centralized machine learning approaches. Some of the most intuitive implementations rely on ensemble learning approaches, where only the model parameters are transferred. For example, we can exchange split values of tree nodes as in federated random forest or combine local subgraph-based graph neural network (GNN) models into a global federated Ensemble-GNN.
This tutorial covers the general theory of federated learning and the practice of federated ensemble learning. We will explain the concepts and benefits of federated ensemble learning, and demonstrate how to use Python to implement two state-of-the-art methods: federated random forest and Ensemble-GNN. The participants will learn how to apply these methods to breast cancer data, including clinical and gene expression features, and how to deploy the models in a federated setup. By the end of this tutorial, the participants will have both theoretical and practical skills in federated ensemble learning and privacy-preserving techniques for biomedical data analysis.
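The core idea of federated ensemble learning, that each site trains locally and only model parameters leave the site, can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the tutorial's federated random forest or Ensemble-GNN: each hypothetical site fits a single-threshold classifier (a decision stump) on its own data, shares only that threshold, and the global model takes a majority vote. All names and data values are assumptions for illustration.

```python
def train_stump(samples):
    """Fit a one-feature threshold classifier on local (value, label) pairs."""
    best_threshold, best_errors = None, None
    for threshold in sorted({x for x, _ in samples}):
        # Predict label 1 when the value is at or above the threshold.
        errors = sum((x >= threshold) != bool(y) for x, y in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def federated_predict(thresholds, x):
    """Majority vote over the stumps contributed by all sites."""
    votes = sum(x >= t for t in thresholds)
    return int(votes * 2 > len(thresholds))


# Hypothetical local datasets; the raw samples never leave their site.
site_a = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]
site_b = [(0.5, 0), (2.5, 1), (3.5, 1)]
site_c = [(1.5, 0), (2.8, 0), (5.0, 1)]

# Only the learned split values are sent to the coordinator.
shared_thresholds = [train_stump(site) for site in (site_a, site_b, site_c)]
print(shared_thresholds)
print([federated_predict(shared_thresholds, x) for x in (2.0, 3.2, 6.0)])
```

Sharing split values rather than samples mirrors the description above of exchanging tree-node split values. A real deployment would add secure communication and, where required, further privacy-enhancing techniques.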
Availability of the tutorial’s material: https://gitlab.gwdg.de/cdss/tutorial-federated-ensemblelearning-for-biomedical-data
Learning Objectives
Intended Audience and Level
The intended audience comprises bioinformaticians, data scientists, and medical informaticians who have at least beginner-level experience in machine learning. Participants should have a laptop with Linux, macOS, or Windows and an internet connection. Access to the computational environment will be provided by the organisers.
Level requirements are the following:
Schedule
14:00 | Lecture: Federated ensemble learning in biomedical health data (Anne-Christin Hauschild) |
14:30 | Hands-on tutorial: how to develop and implement a federated random forest (Hryhorii Chereda, Maryam Moradpour, Youngjun Park) |
15:45 | Coffee Break |
16:00 | Continuation of hands-on tutorial: how to develop and implement a federated random forest (Maryam Moradpour, Youngjun Park) |
16:15 | Lecture: Federated ensemble learning with graph neural networks. GNNs are specifically developed to perform different tasks on graphs. For instance, a patient can be represented by a biological network whose nodes contain patient-specific omics features. In this case, GNNs perform graph classification to predict a patient's clinical endpoint. The Ensemble-GNN approach builds predictive models utilizing PPI networks containing various node features such as gene expression and/or DNA methylation. To do this, Ensemble-GNN derives relevant PPI network communities and trains an ensemble of GNN models based on the inferred communities. Sharing local GNN models allows for the deployment of a federated ensemble of GNNs. (Hryhorii Chereda) |
16:30 | Hands-on tutorial: how to train and apply federated Ensemble-GNN (Hryhorii Chereda, Maryam Moradpour, Youngjun Park) |
Part 1: Monday, July 8, 2024 14:00 – 18:00 EDT
Part 2: Tuesday, July 9, 2024 14:00 – 18:00 EDT
Organizer:
Robert Xiangru Tang
Speakers:
Robert Xiangru Tang, Yale University, USA.
Qiao Jin, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), USA.
Hufeng Zhou, Biostatistics Department, Harvard T. H. Chan School of Public Health, Harvard University, USA.
Shubo Tian, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), USA.
Zhiyong Lu, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), USA.
Mark Gerstein, Yale University, USA.
Max Participants: 50
Website: https://llm4biomed.github.io/
Description
Large Language Models (LLMs) like ChatGPT have exhibited remarkable capabilities in understanding and generating language across diverse disciplines. In the realm of biomedical data science and computational biology, LLMs can significantly aid the processes of information accessibility, data analysis, and knowledge discovery. In this tutorial, we offer an introductory-level hands-on guide to understanding and utilizing these LLMs in the field of biomedical data science. Our tutorial begins with leveling the learning ground by providing introductions to LLMs and Biomedical Data Science. Subsequently, we delve into the core applications of LLMs in biomedical data science/computational biology via retrieval-augmented generation, database functionalities, and code generation. To facilitate thought-provoking discussions, pertinent case studies will be discussed, emphasizing how to harness the power of LLMs to bridge the gap between technical feasibility and practical utility in biomedical data science. Furthermore, hands-on exercises are included to enable participants to apply their learning in real-time. Participants will also get acquainted with OpenAI's ChatGPT and open-source LLMs, as well as their design, use cases, limitations, and prospects.
Our topics include:
Learning Objectives
Intended Audience and Level
This tutorial is designed for graduate students, researchers, data analysts, and practitioners in the domains of bioinformatics, computational biology, and biomedical informatics who are seeking to harness the potential of Large Language Models (LLMs) in their work. The didactic content would be chiefly beneficial for individuals who are keen on enhancing the breadth and depth of their analytical skills.
While the focus of the workshop lies in catering to beginners or users with little experience in LLMs, intermediates will find the advanced topics and in-depth case studies enriching as well. Participants should ideally possess a basic understanding of Python programming and machine learning concepts. Preliminary experience with Linux-based operating systems or interacting with APIs would provide an added advantage but is not a prerequisite.
Our discussion on using OpenAI's ChatGPT and other open-source LLMs, such as LLaMA, along with hands-on exercises and case studies, will offer an immersive learning experience that spans theory and practice. Researchers looking to streamline their data analysis processes and improve the efficiency and accuracy of their results will find this tutorial particularly useful.
Relevant resources and tutorial materials for hands-on activities will be shared online before the commencement of the tutorial, ensuring an unhampered learning experience for all attendees.
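For orientation, retrieval-augmented generation (one of the core applications named in the description) can be reduced to two steps: retrieve the passages most relevant to a question, then place them in the prompt sent to the model. The sketch below shows only those two steps with a simple bag-of-words similarity and makes no call to any LLM API; the function names, the prompt wording, and the three toy passages are assumptions for illustration, and a real system would typically use learned embeddings and a literature index.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case bag-of-words representation of a text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Assemble a prompt that grounds the question in retrieved context."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy passages standing in for a literature index.
passages = [
    "BRCA1 is a tumor suppressor gene associated with breast cancer.",
    "Insulin regulates blood glucose levels.",
    "Hemoglobin carries oxygen in red blood cells.",
]
question = "Which gene is associated with breast cancer?"
prompt = build_prompt(question, passages)
print(prompt)
```

The resulting `prompt` string is what would be passed to a chat model; keeping retrieval separate from generation makes it possible to inspect which evidence the model was given.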
Schedule
Part 1 | |
14:00 | Overview and Welcome |
14:10 | Introduction to LLMs with a focus on Biomedical Data Science |
14:40 | How to use GPT-3.5 and GPT-4 with Python |
15:10 | How to use Open-source LLMs with Python |
15:30 | Break |
15:45 | Database Query Generation with LLMs |
16:10 | Retrieval-augmented Generation with Large Language Models |
16:35 | Code generation in Bioinformatics |
Part 2 | |
14:00 | Large Language Models for Biomedicine: from PubMed Search to Gene Set Analysis |
14:45 | AI in Biomedicine: Developing Representations of Disease-Relevant Molecules |
15:30 | Break |
15:45 | Integrating Biomedical Data Database Development with LLMs |
16:10 | Querying PubMed with RAG to answer biomedical questions with GPT-4 |
16:35 | Code generation in Bioinformatics with Open-source LLMs |
16:55 | Closing Remarks |
Part 1: Monday, July 8, 2024 14:00 – 18:00 EDT
Part 2: Tuesday, July 9, 2024 14:00 – 18:00 EDT
Organizer:
Ragothaman M Yennamalli
Speakers:
Ragothaman M. Yennamalli - Assistant Professor, SASTRA Deemed to be University, Thanjavur, India
Dr Farzana Rahman – Assistant Professor, Kingston University London, UK.
Shashank Ravichandran - Senior Software Engineer, Incedo Inc, India
Megha Hegde, PhD Researcher, Kingston University London, UK.
Jean-Christophe Nebel, Professor of Computer Science, Kingston University London, UK.
Max Participants: 30
Description
Data Science and Machine Learning are intricately connected, particularly in computational biology. At a time when biological data (genomic sequences, protein interactions, and metabolic pathways) is being produced on an unprecedented scale, meeting the demand for analysing it has never been more crucial.
Data visualisation plays a crucial role in biological data sciences since it allows the transformation of complex, often incomprehensible raw data into visual formats that are easier to understand and interpret. This allows biologists to recognise patterns, anomalies, and correlations that would otherwise be lost in the sheer volume of data. In addition, machine learning (ML) has brought about a revolution in the analysis of biological data. Exploiting extensive datasets, ML provides tools to model complex systems and generate predictions. Indeed, ML algorithms excel at uncovering subtle patterns in data, contributing to tasks like predicting protein structures, comprehending genetic variations and their implications for diseases, and even facilitating drug discovery by predicting molecular interactions.
The integration of data visualisation and machine learning is particularly powerful. In particular, visualisation may aid in interpreting machine learning models, allowing biologists to understand and trust their predictions. It could also help fine-tune these models by identifying outliers or anomalies in the data.
Because this combination is so capable, there has been a surge in the development and application of tools that combine data visualisation and machine learning in biology. Platforms that integrate these technologies enable biologists to conduct comprehensive analyses without needing deep expertise in computer science. This democratisation of data science and ML has empowered more and more biologists to engage in sophisticated, data-driven research.
Learning Objectives
This tutorial is divided into two parts. In the first part of the tutorial, the participants will learn how to install and use tools for data visualisation using Python. The second part will focus on installing and using ML tools for feature selection, model training, and model optimisation using Python. By the end of this tutorial, the participants will be able to:
Intended Audience and Level
The tutorial is aimed at entry-level participants (graduate students, researchers, and scientists) in both academia and industry who are interested in Data Visualisation and ML. Prerequisites: basic knowledge of computer programming (preferably Python) and machine learning (beginner level). No prior knowledge of art or aesthetics is required.
Schedule
Part 1 | |
14:00 | Lecture Introduction to Data Visualisation: Importance and Basic principles of data visualization in scientific research Jean-Christophe Nebel |
15:00 | Hands-on Python Libraries for Visualization: Matplotlib, Seaborn, Plotly and others Farzana Rahman, Ragothaman Yennamalli, Shashank Ravichandran, and Megha Hegde |
15:45 | Coffee/Tea Break |
16:00 | Lecture Colour theory in Visualization: Colour palettes, Accessible and Inclusive Visualisations Ragothaman Yennamalli |
17:00 | Hands-on Creating various types of charts, plots for clarity and aesthetics. Case studies with real world datasets Farzana Rahman, Ragothaman Yennamalli, Shashank Ravichandran, and Megha Hegde |
Part 2 | |
14:00 | Lecture Fundamentals of Machine Learning: Types of ML, Data preprocessing and feature selection, model selection and training Ragothaman Yennamalli and Farzana Rahman |
15:00 | Hands-on Python libraries for Machine Learning: Scikit-learn, Pandas, NumPy, TensorFlow/Keras. Building models using real-world biological data Shashank Ravichandran and Megha Hegde |
16:00 | Coffee/Tea Break |
16:15 | Hands-on Integrating Data Viz and ML: Yellowbrick, Bokeh, Tensorboard, Scikit-plot, etc. Farzana Rahman and Megha Hegde |
17:15 | Question and Answer session |
Date: Monday, July 8, 2024 14:00 – 18:00 EDT
Organizer: Sierra A.T. Moxon
Speakers:
Sierra Moxon, software developer, Lawrence Berkeley National Laboratory
Kevin Schaper, software developer, University of Colorado
Patrick Kalita, software developer, Lawrence Berkeley National Laboratory
Max Participants: 30
Description
LinkML (Linked data Modeling Language; linkml.io) is an open, extensible modeling framework that allows computers and people to work cooperatively to model, validate, and distribute data that is reusable and interoperable. It is designed to create interoperable data from the start without the overhead normally required for doing this. LinkML can help even non-techies create better, FAIRer, more reusable data models backed by ontologies.
Collecting and organizing biomedical data for an individual project presents a huge challenge; doing so in a way that allows for later reanalysis and reuse across projects is even harder. Many data standards are not machine-actionable, or are defined in isolation, leading to siloization. The quantity and variety of data being generated in biomedical fields is increasing rapidly, but is still often captured in unstructured formats like publications, posters, lab notebooks, or spreadsheets. Researchers at all levels struggle with collecting, managing, and analyzing data and complex knowledge, due to a confusing landscape of schemas, standards, and tools. These challenges impede scientific progress and limit our ability to tailor treatments based on data (precision medicine). AI and ML increasingly enable large-scale data analysis, but lack of data harmonization limits cross-disciplinary applications.
LinkML addresses these issues, weaving together elements of the Semantic Web with aspects of conventional modeling languages to provide a pragmatic way to work with a broad range of data types, maximizing interoperability and computability across sources and domains. LinkML meets data producers where they are technically, and speaks many different modeling languages. Data models can be authored in a variety of languages including YAML, JSON Schema, or even spreadsheets. LinkML supports all steps of the data analysis workflow: data generation, submission, cleaning, annotation, integration, and dissemination. LinkML enables even non-developers to create data models that are understandable and usable across the layers from data stores to user interfaces, reducing translation issues and increasing efficiency.
LinkML is an easy-to-use framework that both emerging and established data-generating communities can use to generate interoperable, reusable datasets and workflows. It has already seen wide uptake by projects across the biomedical spectrum and beyond, including the German Human Genome-Phenome archive, Critical Path Institute, iSample project, National Microbiome Data Collaborative, Center for Cancer Data Harmonization, INCLUDE project, NCATS Biomedical Data Translator, Reactome, Alliance of Genome Resources, Open Microscopy Environment (Next Generation File Format), and Genomics Standards Consortium.
In this tutorial, we will discuss best practices for data modeling; introduce LinkML as a modeling framework and tool suite; work together to set up a LinkML project from scratch; develop a model and validate it with test data; and auto-generate model documentation. If time permits, we will discuss the LinkML tool, Schema Automator, and use of LLMs with LinkML models.
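To make the idea of developing a model and validating it with test data more concrete, here is a minimal plain-Python sketch of schema-driven validation. It is an illustration only and does not use the LinkML toolkit or its API: the `SCHEMA` dictionary, the `Sample` class, its slots, and the `validate` function are hypothetical names chosen for this example, whereas a real LinkML model would typically be authored in YAML and checked with LinkML's own tooling.

```python
# Hypothetical schema: one class ("Sample") whose slots declare a type,
# whether they are required, and optionally a set of permitted values.
SCHEMA = {
    "Sample": {
        "id": {"range": str, "required": True},
        "organism": {"range": str, "required": True},
        "tissue": {
            "range": str,
            "required": False,
            "permissible_values": {"blood", "liver", "brain"},
        },
        "age_years": {"range": int, "required": False},
    }
}


def validate(record, class_name, schema=SCHEMA):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    slots = schema[class_name]
    for name, rule in slots.items():
        if name not in record:
            if rule["required"]:
                errors.append(f"missing required slot '{name}'")
            continue
        value = record[name]
        if not isinstance(value, rule["range"]):
            errors.append(f"slot '{name}' expects {rule['range'].__name__}")
            continue
        allowed = rule.get("permissible_values")
        if allowed is not None and value not in allowed:
            errors.append(f"slot '{name}' has non-permitted value '{value}'")
    # Flag fields that the schema does not know about.
    for name in record:
        if name not in slots:
            errors.append(f"unknown slot '{name}'")
    return errors


if __name__ == "__main__":
    good = {"id": "S1", "organism": "Homo sapiens", "tissue": "blood"}
    bad = {"id": "S2", "tissue": "skin", "age_years": "40"}
    print(validate(good, "Sample"))
    print(validate(bad, "Sample"))
```

The point of the sketch is the separation of concerns: the model is declared once as data, and the same declaration drives validation of every record.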
Learning Objectives
Intended Audience and Level
This tutorial is aimed at anyone who generates or works with data: biologists, biocurators, data scientists, and data modelers. No programming or data modeling expertise is required. Listening through the hands-on aspects is encouraged with or without participating directly. To participate in hands-on training, we assume that participants have basic familiarity with running commands from the command line (in a terminal)--for example, calling Python scripts or running simple commands like “cat” and “grep”--and they should have a GitHub account and basic familiarity with using GitHub.
Schedule
Time (EDT) | Topic | Presenter | Hands-on? |
---|---|---|---|
14:00 | Introduction | Sierra Moxon | No |
14:20 | Section 1: Set up a LinkML repository | Patrick Kalita | Yes |
14:50 | Section 2: Authoring a LinkML Model A. Model components B. Classes and slots | Sierra Moxon | Yes |
15:10 | BREAK | | |
15:25 | Section 2: Authoring a LinkML Model (cont.) C. Mappings, definitions, enumerations | Sierra Moxon | Yes |
15:40 | Section 3: Schema best practices, including linting | Patrick Kalita | Yes |
15:55 | Section 4: Generating code from your model A. Pydantic, JSONSchema B. Generating documentation | Kevin Schaper | Yes |
16:35 | BREAK | | |
15:45 | Section 5: LinkML Validate | Patrick Kalita | Yes |
17:05 | Section 6 (Time permitting): Schema Automator (LLM + LinkML) | Sierra Moxon | No |
17:35 | Wrap up/Questions | Sierra Moxon | No |
Date: Tuesday, July 9, 2024 14:00 – 18:00 EDT
Organizer:
Hatice Ulku Osmanbeyoglu
Speakers:
Hatice Ulku Osmanbeyoglu, Assistant Professor, University of Pittsburgh, USA
Merve Sahin, Computational Biologist, Memorial Sloan Kettering Cancer Center, USA
Parham Hadikhani, Postdoctoral fellow, University of Pittsburgh, USA
Linan Zhang, Assistant Professor, Ningbo University, China
Max Participants: 30
Description
The development of specialized cell types and their functions is controlled by external signals that initiate and propagate cell-type-specific transcriptional programs. Activation or repression of genes by key combinations of transcription factors (TFs) drives these transcriptional programs and controls cellular identity and functional state. For example, ectopic expression of the TFs Oct4, Sox2, Klf4 and c-Myc is sufficient to reprogram fibroblasts into induced pluripotent stem cells. Conversely, disruption of TF activity can cause a broad range of diseases including cancer. Hence, identifying context-specific TFs is particularly relevant to human health and disease.
Systematically identifying key TFs for each cell type represents a formidable challenge. Determination of TF activity in bulk tissue is confounded by cell-type heterogeneity. Single-cell technologies now measure different modalities from individual cells such as RNA, protein, and chromatin states. For example, recent technological breakthroughs have coupled the relatively sparse single-cell RNA sequencing (scRNA-seq) signal with robust detection of highly abundant and well-characterized surface proteins using index sorting and barcoded antibodies, as in cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq). But these approaches are limited to surface proteins, whereas TFs are intracellular. Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) measures genome-wide chromatin accessibility and reveals cellular memory and response to stimuli or developmental decisions. Recently, several computational methods have leveraged these omics datasets to systematically estimate TF activity influencing cell states. We will cover these TF activity inference methods using scRNA-seq, scATAC-seq, Multiome and CITE-seq data through hybrid lectures and hands-on training sessions. We will cover the principles underlying these methods, their assumptions and trade-offs. We will apply multiple methods, interpret results and discuss strategies for further in silico validation. The audience will be equipped with the practical knowledge and essential skills to conduct TF activity inference independently on their own datasets and interpret the results.
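One simple way to picture TF activity inference from expression data is to score each cell by the average standardized expression of a TF's putative target genes. The sketch below does only that and is a toy under stated assumptions: the gene names, the `REGULONS` mapping, and the scoring rule are hypothetical, and it is not one of the methods covered in the tutorial, which model TF activity in more principled ways.

```python
from statistics import mean, pstdev

# Hypothetical expression matrix: gene -> expression value in each of 3 cells.
EXPRESSION = {
    "GENE_A": [1.0, 5.0, 9.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [7.0, 7.0, 7.0],
}

# Hypothetical regulon: TF -> list of putative target genes.
REGULONS = {"TF_X": ["GENE_A", "GENE_B"]}


def zscore(values):
    """Standardize values across cells; constant genes map to all zeros."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def tf_activity(expression, regulons):
    """Score each TF per cell as the mean z-scored expression of its targets."""
    z = {gene: zscore(values) for gene, values in expression.items()}
    n_cells = len(next(iter(expression.values())))
    scores = {}
    for tf, targets in regulons.items():
        present = [g for g in targets if g in z]
        if not present:
            scores[tf] = [0.0] * n_cells
            continue
        scores[tf] = [mean(z[g][cell] for g in present) for cell in range(n_cells)]
    return scores


if __name__ == "__main__":
    for tf, per_cell in tf_activity(EXPRESSION, REGULONS).items():
        print(tf, [round(s, 3) for s in per_cell])
```

A score of this kind ignores the sign and strength of regulation and the sparsity of single-cell data, which is part of why dedicated inference methods exist.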
Learning Objectives for Tutorial
On completing the tutorial, participants will have gained an understanding of the basic concepts and recent advances in transcription factor inference methods for single-cell omics datasets including scRNA-seq, scATAC-seq, CITE-seq and Multiome. Four learning objectives are proposed:
Intended Audience and Level
This tutorial is designed for individuals at the beginner to intermediate level, specifically targeting bioinformaticians or computational biologists with some prior experience in analyzing single-cell RNA sequencing (scRNA-seq), single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq), Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq), and Multiome data, or those familiar with next-generation sequencing (NGS) methods. A foundational understanding of basic statistics is assumed.
While participants are expected to be beginners, a minimum level of experience in handling NGS datasets is required. The workshop will be conducted using Python and JupyterLab, necessitating prior proficiency in Python programming and familiarity with command-line tools.
To facilitate the learning process, participants will be provided with pre-processed count matrices derived from real datasets. All analyses, including JupyterLab notebooks and tutorial steps, will be available on GitHub for reference.
The tutorial will employ publicly accessible data, with examples showcased using datasets that will be made available through repositories such as the Gene Expression Omnibus or similar public platforms. This hands-on workshop aims to equip participants with practical skills and knowledge, enabling them to navigate and analyze complex datasets in the field of single-cell omics.
Schedule
14:00 | Welcome remarks and tutorial overview Hatice |
14:05 | Basic principles behind TF activity inference methods Hatice |
14:45 | Overview of computational TF inference methods based on single cell omics Hatice, Merve |
15:45 | Break |
16:00 | Hands-on experience in applying tools and interpreting results using multiple TF activity inference methods using public scRNA-seq Linan and Merve |
16:45 | Hands-on experience in applying tools and interpreting results using multiple TF activity inference methods using public scATAC-seq and multiome Parham and Merve |
17:30 | Hands-on experience in applying tools and interpreting results using TF activity inference methods using public CITE-seq Parham and Hatice |
17:55 | Discuss current bottlenecks, gaps in the field, and opportunities for future work Hatice |
Date: Monday, July 8 14:00 – 18:00 EDT
Organizer:
Guadalupe Gonzalez
Speakers:
Guadalupe Gonzalez, Prescient, Genentech Computational Sciences, Genentech
Chirag Agarwal, Harvard University
Max Participants: 50
Description
In the rapidly evolving field of biomedical research, graph deep learning (DL) has emerged as a powerful tool for analyzing complex biological data like molecular graphs, protein-protein interaction networks, and patient similarity networks. However, modern graph DL models are complex black-box neural networks comprising millions of parameters, and it is crucial to understand their model predictions before employing them in life-critical applications. Our proposed tutorial is designed to address the above challenge by providing a brief overview of explainability research in the context of graph neural networks (GNNs) and their applications to biomedical problems.
The tutorial will start with an introduction to graph DL, focusing on its relevance and potential in biomedicine. We will discuss why explainability is not just a desirable trait but a necessity in this domain, where model decisions can have significant implications for both model developers and relevant stakeholders.
The second part of the tutorial delves into the core of explainability research in GNNs. We will define what constitutes an explanation in GNN models, introduce post-hoc explainers, explore metrics for evaluating explanations, and criteria to assess the quality of explanations. We will also introduce explanation-directed message passing – a novel approach that integrates post-hoc explanations directly into the training pipeline of GNNs. Finally, we will introduce existing interpretable graph models in biomedicine.
In the third part, we will apply these concepts to high-stakes biomedical applications like predicting molecular properties, discovering new drug targets, and analyzing patient data. We will be discussing each application in depth, demonstrating how explainability enhances our understanding of modern GNNs and drives decision-making in biomedicine.
Finally, the tutorial will feature interactive demonstrations and a hands-on practical session. Participants will engage with real-world biomedical datasets, applying explainability techniques to GNN models. This session aims to provide attendees with practical experience and insights into developing and utilizing explainability techniques and interpretable GNN models effectively in their research.
By the end of this tutorial, participants will have a solid understanding of the importance, methods, and applications of explainability in GNNs within the biomedical sphere, equipped with the knowledge and skills to implement these techniques in their work.
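As a small illustration of what a post-hoc explanation of a graph model can look like, the sketch below runs one round of mean-aggregation message passing on a tiny graph and then scores each edge by how much the prediction for a target node changes when that edge is removed. This is a toy perturbation-based explainer under stated assumptions, not one of the explainers presented in the tutorial: the graph, the features, and the function names are hypothetical, and there are no learned weights.

```python
# Hypothetical 4-node undirected graph with one scalar feature per node.
EDGES = [(0, 1), (0, 2), (2, 3)]
FEATURES = {0: 1.0, 1: 7.0, 2: -2.0, 3: 10.0}


def neighbors(node, edges):
    """Return the nodes that share an edge with the given node."""
    result = []
    for u, v in edges:
        if u == node:
            result.append(v)
        elif v == node:
            result.append(u)
    return result


def predict(node, edges, features):
    """One message-passing round: mean of the node's and its neighbors' features."""
    group = [node] + neighbors(node, edges)
    return sum(features[n] for n in group) / len(group)


def edge_importance(node, edges, features):
    """Score each edge by the absolute prediction change when it is removed."""
    baseline = predict(node, edges, features)
    importance = {}
    for edge in edges:
        remaining = [e for e in edges if e != edge]
        importance[edge] = abs(baseline - predict(node, remaining, features))
    return importance


if __name__ == "__main__":
    for edge, score in edge_importance(0, EDGES, FEATURES).items():
        print(edge, round(score, 3))
```

With a single message-passing round, the edge between nodes 2 and 3 lies outside the receptive field of node 0, so removing it leaves the prediction unchanged and it receives zero importance.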
Learning objectives
Intended Audience and Level
This tutorial is primarily intended for:
The tutorial is designed to be intermediate. Participants are expected to have:
Schedule
14:00 | Part 1: Introduction to graph deep learning in biomedicine |
14:30 | Part 2: Understanding and measuring explainability in GNNs |
15:45 | Coffee break |
16:00 | Part 3: Applying explainability techniques to GNN model predictions in biomedical contexts |
16:45 | Coffee break |
17:00 | Part 4: Hands-on demonstrations and practical session |
ISMB 2024 will be held in Montreal, Quebec July 12-16, 2024 and is seeking Event Staff (formerly volunteer) applications. Volunteers must be ISCB members with memberships expiring on or after Tuesday, July 16, 2024.
Volunteers are expected to assist as scheduled for approximately 20 - 24 hours during the conference dates of July 12-16, 2024 (generally for a shift of 5 - 6 hours).
Volunteers are asked to participate in a training session in the afternoon of Thursday, July 11. The session will last approximately 90 minutes.
Volunteers should be available for scheduled shifts on all dates beginning Friday, July 12, 2024 through the end of the conference day on Tuesday, July 16, 2024. A schedule of shift allocations will be provided prior to the conference start date.
In return for working as event staff, those selected are provided with a complimentary conference registration, time-based pay, and a conference T-shirt. Regarding registration, we ask that you DO NOT register in advance; we will send a code for your registration after decisions have been made. If you are not selected, a discount code will be sent to you to allow you to register at the early bird rate.
Some volunteer roles:
In addition to the specific roles above, all event staff are asked to assist with general overall directions and other duties as required.
Application deadline is Monday, May 20, 2024.
Notifications will be sent on Friday, May 24, 2024.
Join us for an exciting and innovative networking experience at ISCB's Success Circles event! Success Circles is a unique take on traditional thought-leader sessions, designed to foster meaningful connections and facilitate knowledge sharing among attendees.
Success Circles is open to all ISCB conference attendees looking to expand their professional networks, share knowledge, and gain insights from experts in the field. Attendance is limited and registration is required. Ensure you save your spot by including this in your conference registration.
Don't miss this opportunity to make meaningful connections, share your expertise, and be a part of a dynamic networking event. Success Circles promises to be a memorable and valuable addition to your ISCB conference experience.
Join us and be a part of the future of networking at ISCB's Success Circles!
Success Circles is a dynamic opportunity to connect and collaborate with conference attendees by sponsoring one of the expert-led discussions. Support this exciting and innovative event by sponsoring a topic table. Contact Veronika Hotton to learn more about this and other opportunities.
Click link within a given cell to go to the relevant page within the scientific programme for a detailed list of presentations. Agenda subject to change without notice.
Fellowship Committee:
Anne Christin Hauschild
Luis Pedro Coelho
R. Gonzalo Parra
Farzana Rahman
Kana Shimizu
ISCB is pleased to offer conference fellowships, including registration waivers for virtual participants, to ISMB 2024 for students and postdoctoral fellows to present a talk or poster at the conference in Montreal, Canada. Funding sources for Conference Fellowships are very limited and we regret that we are not able to fund all applicants. The conference organizers are committed to providing support to as many eligible applicants as possible. Conference Fellowship consideration is based on membership and accepted work to ISMB 2024.
Conference Fellowship Application Invitations are sent directly to eligible individuals after acceptance of scientific submissions to Proceedings, Abstracts, and/or Posters.
Travel Fellowship Key Dates | |
---|---|
May 14, 2024 | Conference Fellowship invitations sent for Early Abstract accepted talks and posters. |
May 20, 2024 | Conference Fellowship Application Deadline |
May 31, 2024 | Conference Fellowship Acceptance Notification |
June 12, 2024 | Conference Applicant Registration Deadline |
The maximum fellowship award is determined based on the geographical location of the applicant and upon submission of appropriate receipts. Please note that funded applicants will only be able to cover approximately 50% of the expense of travel and registration fees with these fellowship amounts. Thus all applicants must seek and secure additional funding sources (e.g., from your home institution/university, or grant funding). For ISMB 2024 maximum awards are as follows:
MAXIMUM FUNDS TO BE AWARDED PER REGION OF APPLICANT | |
---|---|
Africa | 1500 USD |
Asia (excluding Middle East) | 1000 USD |
Canada | 750 USD |
Europe | 1000 USD |
Mexico / Central America / South America | 1000 USD |
Middle East | 1500 USD |
Oceania | 2000 USD |
United States | 750 USD |
Application is by invitation only, sent automatically via email to the submitting author of an accepted Proceeding, Abstract Talk, and/or Poster (excluding accepted Late Posters) submission. This invitation email will arrive after notification of acceptance of one of these submission types as a separate email. IF YOU HAVE AN ACCEPTED PRESENTATION AND HAVE NOT RECEIVED AN INVITATION BY END OF DAY MAY 14 – PLEASE CHECK YOUR SPAM FOLDER AND THEN CONTACT US IF AN INVITATION IS NOT THERE.
Each invitation will include a travel fellowship application URL to link to an application. The application URL must be submitted by the presenting author only, if the qualifying requirements are met. If the submitting author is not the presenting author, it is the responsibility of the submitting author to forward the invitation to the presenting author if the eligibility requirements are met. Each application URL can only be used one time and no application will be accepted after the deadline of May 20, 2024.
1. Applicant must be a current ISCB member whose membership does not expire prior to December 31, 2024. Applications will not be accepted from non-members; pending memberships do not qualify and must be paid in full prior to submission of an application.
2. Applicant must be listed as an author or co-author on the original submission of an accepted ISMB 2024 Proceedings paper, Abstract, or Poster (excluding accepted Late Posters), and, per the requirements of the funding agencies, the funded applicant must be the presenting author of the work. (Submitters to the "Call for Late Posters" are not eligible for fellowship funding.)
3. Applicant must be registered in a degree program (undergraduate or graduate) or be a postdoctoral research fellow* at an accredited educational institution at the time of the conference, or be an early career researcher from a low- to upper-middle-income country. Postdocs and employees of any US federal agency are ineligible for funding from US federal funds; currently we have only US federal funds for this travel fellowship program. (*The period of eligibility for a postdoc is five (5) years from their PhD completion date.)
4. Applicant must be prepared to register for ISMB 2024 by June 12, 2024, and plan to attend all four conference days. If attendance at the conference is dependent on receipt of fellowship funds, please do not register until after the notification of travel fellowship funding. Any funded applicant failing to register for the conference by June 12, 2024 will automatically forfeit the funds so that another applicant can be awarded from among the original pool of applicants.
5. Applicant must be able to pay all expenses of attending the conference up front, including conference registration fee (as noted in #4 above), travel, accommodations and meals. Travel Fellowship funding will be provided via the ISCB payment system (bill.com) via secure electronic funds transfer (wire or ACH) approximately 6-8 weeks after the conference.
Eligible expenses toward fellowship funds include registration for ISMB 2024, Student Council Symposium or Tutorials, transportation (air or land transportation from home region to conference city), hotel accommodations (booked within the ISCB official block) and a maximum of $250.00 in meal expenses. In order to receive the full-awarded amount, receipts for registration, transportation, and hotel accommodations that equal or exceed the awarded amount are required.
Applicants will be notified of their funding status no later than May 31, 2024. In some cases applicants may be notified that they are on a waitlist for funding. This means that ISCB fully expects, but is still awaiting, formal confirmation of a grant award from one or more granting agencies, and that those funds cannot be awarded until the grant needed to fund the travel fellowship is confirmed. Any waitlisted applicant who is eventually awarded funds will be offered the opportunity to register at the early registration rate. Therefore, please do not register for the conference if your attendance is fully dependent on being awarded a travel fellowship, as any cancellation of an applicant's registration will be subject to the full regular registration cancellation policy.
Funded applicants will be required to present evidence of their eligibility status (such as student identification card) when signing in with the Conference Fellowships Desk to record their attendance. In all cases, funds will be mailed to funded applicants after the conference per the details noted in Eligibility Requirements #5 above.
Questions regarding fellowships should be addressed to ISCB.
The information on this page is subject to change without notice, and all changed information will be considered final for the purposes of awarding and funding ISMB 2024 Conference Fellowships.
The Conference Fellowships are made possible by generous donations from:
Links within this page: Venue Information | Book your Official Accommodations | Conference Accommodations | Housing Policies | Student Housing | Travel
The conference will take place in the Palais des Congrès de Montréal.
The address is:
1001 Place Jean-Paul-Riopelle
Montréal, QC
H2Z 1X7
https://congresmtl.com/en/
Showcare is the official Housing Bureau for ISCB's ISMB Conference. A link to book your hotel room online will be provided when you complete your conference registration. It is recommended that you book your hotel room early in order to take advantage of the special room rates that are subject to availability. ISMB 2024 success depends on attendees, sponsors, and exhibitors booking the conference hotels through the official Housing Bureau. Unfilled rooms create a financial risk in the form of penalties and can jeopardize the success of the association.
Please do not contact the hotels or make a reservation directly with the hotels. Discounted rates are only available through Showcare, the official Housing Bureau.
Please register for the conference before booking your accommodations.
Transform any trip into a relaxing getaway at Le Westin Montréal. Our Old Montreal hotel is full of modern amenities designed to elevate your stay no matter what time of year. It is surrounded by centuries of history in architecture, art, and French culture. Many of the city's prominent destinations can be found within walking distance.
Visit the Notre-Dame Basilica of Montreal, the cobblestone streets of Montreal's famous Parisian-style historic district, the Old Port, and the Palais des congrès. Get around town on foot or bike, with many bicycle rentals scattered throughout the city.
After a day exploring the sights, retreat to our spacious rooms and suites equipped with free Wi-Fi, pillowtop mattresses, and marble bathrooms. Our on-site gaZette restaurant features a mouth-watering menu with take-out options. Whether traveling for leisure or business or a bit of both, Le Westin Montréal provides a refined experience to restore balance and control.
Distance to Convention Center: 5 min walk
Rate: $289 CAD (single/double)
Hotel Monville is a four-star hotel that targets both businesspeople and tourists seeking to immerse themselves in the Montréal experience. Remarkable for its abundant windows that offer panoramic views of the metropolis, the Monville, created in 2018, is a hotel with an original design that combines state-of-the-art technology, ecological practices, and attentive service in a friendly atmosphere. At Monville, we don’t just practice the art of receiving well, but the art of receiving better.
Distance to Convention Center: 5 min walk
Rate: $259 CAD (single/double)
Le Dauphin Hotels is a family-owned business, now in the proud third generation of a great hotel tradition that started in 1963. As an eco-friendly hotel, it is working to improve this aspect of the hotel business.
The property in Montreal Downtown features 114 rooms and suites, all of which were designed with the comfort of our guests in mind. Whether traveling for business or pleasure, Le Dauphin is an affordable decision and the best for any occasion.
Distance to Convention Center: 5 min walk
Rate: $269 CAD (single/double)
Enjoy a simply perfect stay at Delta Hotels Montreal, on business or with the family. Our pet-friendly hotel is in downtown Montreal, near McGill University and the Montreal Convention Centre, making us the ideal destination for conferences and events. Find the most renowned local attractions within walking distance of the hotel in Montreal's entertainment district, such as Sainte-Catherine Street or the Eaton Centre.
Relax in modern, stylish hotel rooms with sleek workspaces. Select rooms include balconies. Club-level rooms allow access to our 23rd-floor Club Lounge to enjoy complimentary breakfast and evening appetizers with stunning views of the Montreal skyline.
Distance to Convention Center: 14 min walk
Rate: $255 CAD (single/double)
We're connected to the shops and restaurants of Complexe Desjardins, with underground access to the Montreal Convention Centre and two metro stations. Place des Arts is around the corner, and we're a kilometre from Old Montreal. Enjoy our indoor pool, fitness center, and a warm DoubleTree welcome cookie on arrival.
Distance to Convention Center: 6 min walk
Rate: $289 CAD (single/double)
Our hotel is just half a kilometre from the Montreal Convention Center in the heart of downtown. We are surrounded by restaurants, government offices, museums, theatres, and historical attractions. Subway and bus stations are within three blocks, and we are just off A-720. Our rooftop terrace and expansive meeting spaces are ideal for Montreal events. Your stay includes a hot American Buffet daily.
Distance to Convention Center: 6 min walk
Rate: $286 CAD (single/double)
The ISMB 2024 Housing Bureau, Showcare, will accept new hotel reservations, changes and cancellations until 5 pm EST on Monday, June 17, 2024. If you made a reservation, it is being held for you in the inventory of rooms the hotels have blocked for this conference. The reservations will be transferred to the hotels on Thursday, June 20, 2024. To ensure the hotels have the most up-to-date information, we ask that all hotel cancellations and changes be made by 5 pm EST on Monday, June 17, 2024.
Guarantee & Deposit Policy: All hotel rates are quoted in CAD and exclude tax. Hotel room rates are subject to applicable taxes that are in effect at check-in time. A credit card is required for each reservation and must have an expiration date on or after November 2024. Your room is not reserved if you do not provide a valid credit card. The hotels may charge a one-night room & tax deposit using the credit card on file with Showcare prior to check-in. Each guest must present a valid credit card or an appropriate amount of cash for subsequent room nights and incidental charges for the entire stay upon check-in.
*You must complete a credit card authorization form if the credit card on file is not in your name.
Cancellation & Changes Policy: All cancellations or changes must be made online by re-accessing your housing account on or before 5 pm EST on Monday, June 17, 2024.
No cancellations or changes will be made between 5 pm EST on Monday, June 17, 2024 and Thursday, June 20, 2024, while reservation information is being prepared and transferred to the hotels. Cancellations or changes as of 5 pm EST on Friday, June 21, 2024, must be made directly with the hotels. Change requests will be made on a space-available basis.
Cancellation requests received by the hotel 48-72 hours (refer to your respective hotel policy on their website) or less prior to arrival and no-shows will forfeit the one-night room & tax deposit and the rest of the stay will be cancelled. ISMB 2024 is not responsible for no-shows or early departure fees charged by the hotels or rooms resold due to non-arrival.
ISMB 2024 takes no responsibility should a room preference not be available at check-in. Please visit the hotel websites for check-in and check-out times.
Housing Confirmation: You will receive an email from your hotel with your hotel confirmation number approximately 2 weeks prior to arrival. If you do not receive it, please check your spam folder before contacting your hotel.
Group Housing (10 rooms+/night): Please email [email address protected]
Questions About Hotel Reservations?
Contact: [email address protected]
In addition to the official housing block, ISCB has secured a block of rooms at McGill University for student attendees.
All student rooms are fully furnished and include private bathrooms, air conditioning, and a flatscreen TV. In addition to the shared kitchen, there is a large common area on the first floor and a quiet study room, both surrounded by windows. Situated in the center of the downtown area, La Citadelle is a recently renovated, hotel style residence building that opened its doors for move-in weekend of 2012. La Citadelle is located two blocks east of McGill campus.
Located immediately across the street from campus and minutes from downtown Montreal. Dormitory-style with shared washroom facilities centrally located on each floor. Shared kitchenettes throughout the building. Common rooms include 2 TV rooms, a games room, aerobics room, study room and a large lounge.
All student rooms are fully furnished and include private bathrooms, air conditioning, a flatscreen TV, and an in-room mini refrigerator. Find first-class shopping, restaurants and art galleries, outdoor cafés and street festivals all within walking distance of this centrally located hotel-style property.
Please use one of the following to book your reservation at any of the McGill University accommodations:
Ensure you mention the International Society for Computational Biology ISMB 2024 Conference.
Student room reservations at McGill University must be received by May 27, 2024.
Delta Air Lines is pleased to offer special discounts for ISCB
Please click here to book your flights.
You may also call Conferences and Events® at 1(800)328-1111* Monday–Friday, 8:00 a.m. – 6:30 p.m. (EST) and refer to Meeting Event Code: NM3UP
*Please note there is no service fee for reservations booked and ticketed via our reservation 800 number.
When booking a flight to Montréal with Air Canada be sure to use the following discount code: QPE6YYJ1
When booking a flight to Montréal with United be sure to use the following discount code: ZPQF421521
Links within this page: Fiona S. L. Brinkman | Tandy Warnow | Guillaume Bourque | Martin Steinegger | Su-In Lee
How do we sustainably maintain and further develop bioinformatics and computational biology (BCB) software, databases and tools, in the face of short (<5 year) periods of funding support? How do we promote open data and open science in a way that best effects positive change and avoids causing unwitting harm to communities? Using some historical data and also my recent research as examples, I’ll review how open science is evolving, building on FAIR (findable, accessible, interoperable, reusable) with, for example, CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) as Principles for Indigenous Data Governance. I’ll review this and other principles in the context of both microbial data and human cohort data, presenting some approaches to research that can support more sustainable, inclusive science that can potentially better lead to positive change. While there is no one-size-fits-all solution, there are some common themes and considerations that we as a BCB community should discuss - and ideally incorporate into BCB training programs.
Fiona Brinkman is a Distinguished Professor in Bioinformatics and Genomics at Simon Fraser University, interested in developing more preventative, sustainable, and holistic approaches for infectious disease control and supporting health. She is most known for R&D of software and databases aiding analysis of microbial and human omics data, including PSORT, IslandViewer, Pseudomonas.com, and InnateDB.com. She leads data integration for the CHILD Cohort Study – the largest multidisciplinary, longitudinal, population-based birth cohort study in Canada, including diverse omics data. She has co-led development of the IRIDA.ca platform, which is now the primary platform for Canada’s Public Health Agency to analyze infectious disease outbreaks using combined epidemiological/lab/genomics data. She contributed to the pandemic response, co-leading Data Analytics for the Canadian COVID-19 Genomics Network and more recently CoVaRR-Net. She has a strong interest in bioinformatics education and mentoring young scientists. She is on several committees/Boards, including the ELIXIR and European Nucleotide Archive Scientific Advisory Boards. Her awards include a TR100 award from MIT, Thomson Reuters “World’s Most Influential Scientific Minds”, and most recently she received a University of Waterloo Distinguished Alumni Award and became a Fellow of the Royal Society of Canada.
Over the last several years, interest in computing and then using large-scale phylogenies has increased for multiple reasons, including basic science (how did life evolve on earth) and applications in biomedicine and public health (e.g., understanding the evolution of SARS-CoV-2). The estimation of these large phylogenies, with potentially millions of leaves, presents fascinating mathematical, statistical, and computational challenges, ranging from computing multiple-sequence alignments and developing effective heuristics for NP-hard optimization problems (e.g., maximum likelihood tree estimation) on large datasets to estimating species trees from genome-scale data while addressing biological causes for heterogeneity across the genome (e.g., gene duplication and loss and incomplete lineage sorting). There are also many fascinating and difficult problems that have to do with “post-tree” analyses, such as rooting gene trees and species trees, or estimating branch lengths in species trees and dates at internal nodes, that are needed for many downstream analyses. In this talk I will describe progress on these questions, and I will also present some open problems where new techniques are needed.
Dr. Warnow received her PhD in Mathematics at UC Berkeley (1991) under the direction of Gene Lawler, and did postdoctoral training with Simon Tavare and Michael Waterman at the University of Southern California (1991-1992). After positions at Sandia National Laboratories (1992-1993), University of Pennsylvania (1993-1998), and the University of Texas (1998-2014), she joined the University of Illinois at Urbana-Champaign as a Founder Professor of Engineering. She is now Associate Head for Computer Science, and has affiliate faculty appointments in Bioengineering, Electrical and Computer Engineering, Mathematics, Statistics, and several biology departments.
Genomic analyses often start by mapping reads to a reference genome. But, in every individual, there are DNA variants and sequences that are unique to that individual and reads coming from those regions will often be ignored. Thankfully, progress in long-read technologies and assembly can now efficiently deliver telomere-to-telomere genomes. Applying such approaches to a diverse panel of individuals combined with the development of graph-based genomic tools, the Human Pangenome Reference Consortium has just released the first human pangenome reference graph. This new resource is meant to alleviate the limitations of relying on a single linear human genome as the first step of most genetic and epigenetic analyses. In this talk, I will summarize some of the benefits of using the pangenome reference. In particular, I will show how this new reference can be used to extract missing signal when looking for genetic variants in a rare disease cohort called Genomic Answers for Kids. I will also describe the results of a new study using a genome-graph looking at epigenetic changes before and after influenza infection in monocyte-derived macrophages extracted from more than 30 individuals of different ancestry. Finally, considering the importance of data sharing in genomics, I will introduce a project called the Pan-Canadian Genome Library, which will establish the framework for Canada’s management and sharing of human genomic data.
Dr. Bourque is a Professor in the Department of Human Genetics, a Canada Research Chair in Computational Genomics and Medicine and the Director of Bioinformatics at the McGill Genome Center. He leads the Canadian Center for Computational Genomics (C3G) and the Epigenomics Mapping Center at McGill. He is on the External Consultant Panel of two functional genomics consortia funded by the National Human Genome Research Institute in the US (ENCODE and IGVF). Dr. Bourque is also on the Scientific Steering Committee of the International Human Epigenome Consortium (IHEC) and on the Steering Committee of the Global Alliance for Genomics and Health (GA4GH). Dr. Bourque’s research interests are in comparative and functional genomics with a special emphasis on applications of next-generation sequencing technologies and transposable elements.
Abstract: Protein analysis has witnessed a revolution through machine-learning methods. At the forefront are highly accurate structure prediction methods such as AlphaFold2 and ESMFold. These have generated an avalanche of publicly available protein structures. The AlphaFold database and ESMatlas contain over 214 and 620 million predicted structures, respectively, covering nearly every protein sequence in our largest protein reference databases. This unprecedented access to structural information is not just critical for structural biology but impacts most fields of biology. In this talk, I will discuss how this data is revolutionizing genomic and proteomic annotations and introduce fast and sensitive methods to search and cluster this data to extract new biological insights.
Dr. Steinegger is an Assistant Professor in the Biology Department at Seoul National University, with a joint appointment to the Interdisciplinary Program in Bioinformatics. He conducted his doctoral studies at the Max Planck Institute for Biophysical Chemistry and was awarded a Ph.D. in computer science with summa cum laude honors from the Technical University of Munich in 2018, followed by a postdoctoral fellowship at Johns Hopkins University. Dr. Steinegger has published more than 40 papers covering a wide range of topics in bioinformatics, from detecting genomic assembly contamination to organizing the protein structure space.
He started his research group in 2020, focusing on the development of methods to analyze massive genomic and proteomic datasets. The group's contributions to bioinformatics include widely used tools for predicting structures (ColabFold/AlphaFold2), clustering (Linclust), assembling (Plass), and searching sequences (MMseqs2) and protein structures (Foldseek). His group's software and web services have been installed and used millions of times. Dr. Steinegger is an advocate for internationality at his home institution, open science and open source.
The first part of my talk delves into various research endeavors conducted by my lab, focusing on explainable AI's application across diverse biomedical domains. I will demonstrate how explainable AI can elucidate novel scientific inquiries, with a primary emphasis on understanding neurodegenerative diseases and biological age.
In the second part, we will explore the evolving landscape of explainable AI, uncovering its potential to chart new scientific directions in biomedicine, exemplified by our recent work in dermatology, emergency medicine, and precision cancer medicine. This discussion aims to shed light on the necessary enhancements for explainable AI to effectively tackle a wide array of real-world challenges in biomedicine.
Prof. Su-In Lee, the Paul G. Allen Professor of Computer Science at UW, earned her PhD from Stanford University in 2009 under the mentorship of Prof. Daphne Koller. She joined UW in 2010 after serving as a visiting Assistant Professor in the Computational Biology Department at Carnegie Mellon University School of Computer Science. Recognized for her groundbreaking contributions to AI, biology, and medicine, Prof. Lee has received prestigious accolades including the National Science Foundation (NSF) CAREER Award, the International Society for Computational Biology (ISCB) Innovator Award, and the Samsung Ho-Am Prize, often referred to as the "Korean Nobel Prize." She has also been designated an American Cancer Society (ACS) Research Scholar and a Fellow of the American Institute for Medical and Biological Engineering (AIMBE). Notably, she is recognized as a pioneer and trailblazer in explainable AI (XAI), significantly enhancing ML model interpretability.
Prof. Lee's recent contributions revolve around essential XAI principles and techniques, including her groundbreaking SHAP framework. Her innovative biomedical research spans basic biology to clinical medicine, enabled by XAI advancements. Conceptually advancing the integration of AI with biomedicine, her work addresses forward-looking scientific questions, enabling novel discoveries from high-throughput molecular data and electronic health records and advancing healthcare. This pioneering line of work has led to highly cited publications across foundational AI, computational molecular biology, and clinical medicine.
ISCB Members enjoy discounts on conference registration (up to $150), journal subscriptions, books (25% off), and job center postings (free).
Connecting, Collaborating, Training, the Lifeblood of Science. ISCB, the professional society for computational biology!
Giving never felt so good! Consider donating today.