Schedule subject to change
All times listed are in EDT
Sunday, July 14th
10:40-11:20
Invited Presentation: The Silent Genomes Project: Building the path to equitable genomic care for Indigenous patients, one variant at a time.
Confirmed Presenter: Laura Arbour

Room: 520a
Format: In Person

Moderator(s): Kateryna Kratzer (CIHR)


Authors List:

  • Laura Arbour

Presentation Overview:

There is broad concern that genomic technologies may not reach those with the greatest health disparities. Indigenous people of Canada face numerous barriers to specialty care, including genetic/genomic diagnosis. Of particular relevance is the lack of Indigenous reference data in public databases, which limits interpretation of genetic/genomic sequencing results. However, Indigenous involvement and ongoing Indigenous governance of data have the potential to change the current trajectory, improving access to diagnosis for Indigenous patients with genetic conditions. With Indigenous partners, colleagues and community members, the Silent Genomes Project is building an Indigenous background variant library (IBVL). With Indigenous oversight, an approved web-based variant release mechanism (requiring registration and agreement to certain conditions) is currently being tested for the first phase of the IBVL, which is derived from samples from consenting First Nations communities in Canada, with official release in the months to come. This presentation will provide an overview of the Silent Genomes Project, focusing on the development of the IBVL and its governance model.

11:20-11:40
From Sequences to Reports: A Controlled Approach to Pipeline Validation in Cancer Genomics
Confirmed Presenter: Beatriz Lujan Toro, Ontario Institute for Cancer Research, Canada

Room: 520a
Format: In Person

Moderator(s): Kateryna Kratzer (CIHR)


Authors List:

  • Beatriz Lujan Toro, Ontario Institute for Cancer Research, Canada
  • Alexander Fortuna, Ontario Institute for Cancer Research, Canada
  • Iain Bancarz, Ontario Institute for Cancer Research, Canada
  • Michael Laszloffy, Ontario Institute for Cancer Research, Canada
  • Aqsa Alam, Ontario Institute for Cancer Research, Canada
  • Heather Armstrong, Ontario Institute for Cancer Research, Canada
  • Felix Beaudry, Ontario Institute for Cancer Research, Canada
  • Hannah Driver, Ontario Institute for Cancer Research, Canada
  • Oumaima Hamza, Ontario Institute for Cancer Research, Canada
  • Richard Jovelin, Ontario Institute for Cancer Research, Canada
  • Xuemei Luo, Ontario Institute for Cancer Research, Canada
  • Muna Mohamed, Ontario Institute for Cancer Research, Canada
  • Gavin Peng, Ontario Institute for Cancer Research, Canada
  • Peter Ruzanov, Ontario Institute for Cancer Research, Canada
  • Lawrence Heisler, Ontario Institute for Cancer Research, Canada

Presentation Overview:

The Genomics program at the Ontario Institute for Cancer Research (OICR) specializes in providing genome sequencing and analysis services, offering whole-genome transcriptome (WGTS), plasma whole genome (pWGS), and targeted panels. These assays are supported by assay-specific analysis pipelines, which are maintained under version control on GitHub (https://github.com/oicr-gsi). The assays are CAP/ACD-accredited, CLIA-certified, and ISO 15189-compliant, and therefore any pipeline change must be evaluated for accuracy and version-controlled. Djerba, an open-source software tool developed by our group, generates reports tailored for clinical diagnostics and is used to evaluate the reproducibility of results and the impact of pipeline updates.
As software is updated and alternative analysis tools are identified, our pipelines are subject to change. To facilitate updates without compromising the accuracy of our reporting, we have developed a validation framework that is part of our standard operating procedures. Benchmarking samples, chosen to reflect the variety of biomarkers being reported, are run in parallel through both the production and revised pipelines within a distinct staging environment. Subsequent comparison of outputs, facilitated by our clinical reporting software Djerba, allows a trained genome interpreter to evaluate the results and determine whether the clinical reports are equivalent. By rigorously testing the entire pipeline from raw sequences to variant calls, we can evaluate the suitability and accuracy of pipeline updates. This approach guarantees the reliability of our analytical processes, fulfills the requirements of our accreditation, and underscores our commitment to providing cutting-edge bioinformatics solutions tailored for cancer research and patient care.

11:40-12:00
The Canadian Genomic Data Commons (CGDC): A Platform for National Genomic Data Sharing
Confirmed Presenter: Erika Frangione, Mount Sinai Hospital, Sinai Health; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Canada

Room: 520a
Format: In Person

Moderator(s): Kateryna Kratzer (CIHR)


Authors List:

  • Jordan Lerner-Ellis, Mount Sinai Hospital, Sinai Health; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Canada
  • Erika Frangione, Mount Sinai Hospital, Sinai Health; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Canada
  • Selina Casalino, Mount Sinai Hospital, Sinai Health; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Canada
  • Radhika Mahajan, Mount Sinai Hospital, Sinai Health; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Canada
  • Navneet Aujla, Mount Sinai Hospital, Sinai Health; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Canada
  • Lochana Jayachandran, Mount Sinai Hospital, Sinai Health; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Canada
  • Anthony Philippakis, Broad Institute, United States
  • Heidi Rehm, Broad Institute, United States
  • Marc Fiume, DNAStack, Canada
  • Vincent Ferretti, Université de Montréal, Canada
  • Yann Joly, McGill University, Canada
  • Steven Jones, BC Cancer, Canada
  • Patrick Frosk, University of Manitoba, Canada
  • Sherryl Taylor, University of Alberta, Canada
  • Kym Boycott, University of Ottawa, Canada

Presentation Overview:

Background: Clinical data generated from genome sequencing (GS) is an ever-growing and valuable resource. To date, however, this data has been largely inaccessible to clinical and research communities due to data storage costs. The CGDC will address this by establishing a digital high-performance computing (HPC) infrastructure comprising two novel databases and a suite of bioinformatics tools to put Canadian labs on a centralized platform for sharing and accessing genomic data. Three core facilities in a federated and secure data sharing ecosystem will be developed: 1. The Canadian Open Genetics Repository (COGR); 2. The Canadian genome aggregation database (gnomAD-Canada); and 3. Tools for rare disease (RD) researchers.

Methods: 1) COGR will be developed as a genomic database for standardizing and sharing genetic interpretations and phenotypic information as reported by Canadian diagnostic laboratories. It will develop custom workflows for automated, real-time sharing of interpretations, and will use consensus-building for discrepancy reporting. 2) A Canadian instance of the gnomAD browser (gnomAD-Canada) will harmonize data from 50,000 genomes derived from large-scale GS projects to generate aggregated allele frequencies. 3) A Canadian version of the seqr software will be launched for gene discovery on up to 10,000 exomes and genomes from RD cohorts.
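
As an illustration of what "aggregated allele frequencies" can mean in practice, the following minimal Python sketch pools allele counts across contributing cohorts. It is not the gnomAD or CGDC implementation; the function name, the (allele count, allele number) input layout, and the example numbers are all assumptions made for illustration.

```python
def aggregate_allele_frequency(cohort_counts):
    """Pool per-cohort (allele_count, allele_number) pairs into one frequency.

    allele_count  = observed copies of the alternate allele in a cohort
    allele_number = chromosomes successfully genotyped at the site
    Both names are illustrative, not taken from the CGDC or gnomAD code.
    """
    total_ac = sum(ac for ac, _ in cohort_counts)
    total_an = sum(an for _, an in cohort_counts)
    if total_an == 0:
        return None  # no genotyped chromosomes, so the frequency is undefined
    return total_ac / total_an


# Hypothetical counts for one variant in two cohorts
cohorts = [(5, 100), (15, 300)]
pooled = aggregate_allele_frequency(cohorts)
```

Pooling counts, rather than averaging per-cohort frequencies, weights each cohort by how many chromosomes it contributed at the site.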

Conclusion: The CGDC will bring together a team of global leaders in bioinformatics and database development to advance biomedical research in Canada. It will improve the sharing of genomic data nationally through the creation of core facilities that provide standardized variant interpretations and frequency estimates for the study of gene-disease relationships.

12:00-12:20
Binomify: Unified normalization of ChIP-seq data through negative binomial regression
Confirmed Presenter: Abdul Rahman Diab, Simon Fraser University, Canada

Room: 520a
Format: In Person

Moderator(s): Kateryna Kratzer (CIHR)


Authors List:

  • Abdul Rahman Diab, Simon Fraser University, Canada
  • Maxwell Libbrecht, Simon Fraser University, Canada

Presentation Overview:

ChIP-seq data is crucial for understanding the mechanisms of gene regulation, and there is a need for a statistically principled normalization method that facilitates the quantitative comparison of experiments conducted under different technical conditions. We propose Binomify, a normalization approach for ChIP-seq data that uses a neural network to predict the parameters of a negative binomial distribution conditioned on a set of experimental covariates, including position-specific control signal and GC content, the antibody used during immunoprecipitation, the machine used to perform the sequencing step, and the total sequencing depth.

Binomify first predicts a distribution over read counts at each position in the genome given the associated covariates, then computes the quantile of the observed read count within the predicted distribution; this quantile roughly indicates how “surprising” the observed signal is, given the covariates. Finally, Binomify uses the computed quantile in each bin to match a target distribution using quantile normalization, producing the final normalized signals which can be used for downstream tasks.
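
The two steps described above can be sketched in a few lines of Python. This is a simplified illustration, not the Binomify code: the per-bin negative binomial parameters are hard-coded stand-ins for what the neural network would predict, the target distribution is an arbitrary list of values, and the function names `nb_cdf` and `match_target` are hypothetical.

```python
def nb_cdf(k, mean, size):
    """P(X <= k) for a negative binomial with the given mean and size."""
    p = size / (size + mean)      # success probability in the (size, p) form
    pmf = p ** size               # P(X = 0)
    total = pmf
    for i in range(k):
        # recurrence: P(X = i + 1) = P(X = i) * (i + size) / (i + 1) * (1 - p)
        pmf *= (i + size) / (i + 1) * (1 - p)
        total += pmf
    return min(total, 1.0)


def match_target(quantiles, target_values):
    """Map each quantile onto the empirical distribution of target_values."""
    ordered = sorted(target_values)
    n = len(ordered)
    return [ordered[min(int(q * n), n - 1)] for q in quantiles]


# Hypothetical per-bin inputs: (observed count, predicted mean, predicted size).
bins = [(3, 2.0, 1.0), (10, 2.0, 1.0), (0, 5.0, 2.0)]
quantiles = [nb_cdf(k, mean, size) for k, mean, size in bins]
normalized = match_target(quantiles, target_values=[0.0, 1.0, 2.0, 4.0, 8.0])
```

A quantile near 1 means the observed count is high relative to what the covariates predict, so that bin is mapped to the upper end of the target distribution.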

Evaluations on 14 ENTEx H3K27me3 experiments indicate that the normalized signals produced by the model are more predictive of gene expression levels than the observed read counts when using linear regression (mean R² = 0.12 using normalized signals vs. mean R² = 0.09 using observed reads), and are comparable predictors of externally called peaks (0.97 AUROC using both signal types). Our results suggest that using a normalization method which explicitly accounts for technical noise can empower simple downstream methods to make more accurate predictions using ChIP-seq data.

14:20-15:00
Invited Presentation: Bioinformatics and AI for precision farming
Confirmed Presenter: Abdoulaye Diallo

Room: 520a
Format: Live Stream

Moderator(s): Wesley Oakes (Genome Canada)


Authors List:

  • Abdoulaye Diallo

Presentation Overview:

Precision agriculture/farming is becoming the main approach to meeting the food needs of the world's growing population. Sustainable agriculture is key to addressing these challenges and unlocking market potential. More precisely, cattle well-being and longevity are crucial to a sustainable approach in dairy farming. In this presentation, I will illustrate how bioinformatics methods that exploit the power of artificial intelligence are important for integrating heterogeneous data (omics and non-omics) to generate predictive indicators of welfare, longevity, and profitability at different stages of a cow's life from milk, blood, and movement data. I will present how we can capture emotional state in order to better study welfare. These indicators are derived from ontologies, knowledge graphs, data mining, genomics, metabolomics, computer vision and spatio-temporal deep learning techniques.

15:00-15:20
UseGalaxy Canada now in production
Confirmed Presenter: Pierre-Étienne Jacques, Université de Sherbrooke, Canada

Room: 520a
Format: In Person

Moderator(s): Wesley Oakes (Genome Canada)


Authors List:

  • Carol Gauthier, Université de Sherbrooke, Canada
  • Jonathan Laperle, Université de Sherbrooke, Canada
  • Michel Barrette, Université de Sherbrooke, Canada
  • Ata Roudgar, Simon Fraser University, Canada
  • Tannistha Nandi, University of Calgary, Canada
  • Jean-Francois Lucier, Université de Sherbrooke, Canada
  • Pierre-Étienne Jacques, Université de Sherbrooke, Canada

Presentation Overview:

We are pleased to announce the release of the UseGalaxy Canada initiative (https://starthere.usegalaxy.ca). In collaboration with the international consortium of large Galaxy instances led by the US, Europe and Australia (https://galaxyproject.org/use/), and after more than 10 years of experience with Galaxy through the development, implementation and maintenance of the GenAP platform, over the last year we set up a new stable instance of this well-established free open-source system to analyze, visualize, and share data and workflows. The platform authenticates users through their institution using CILogon, is hosted on the Beluga cloud of the Digital Research Alliance of Canada, and implements a common core set of tools and reference genomes.

15:20-15:40
Apollo: A comprehensive GPU-powered Within-host Viral simulator with tissue and cellular hierarchies for studying viral evolutionary and infection dynamics.
Confirmed Presenter: Deshan Perera, University of Calgary, Canada

Room: 520a
Format: In Person

Moderator(s): Wesley Oakes (Genome Canada)


Authors List:

  • Deshan Perera, University of Calgary, Canada
  • Evan Li, University of Calgary, Canada
  • Christian Huber, The Pennsylvania State University, United States
  • Guido van Marle, University of Calgary, Canada
  • Alexander Platt, University of Pennsylvania, United States
  • Quan Long, University of Calgary, Canada

Presentation Overview:

The advent of high-throughput sequencing technologies, coupled with breakthroughs in third-generation sequencing, has allowed new exploration into within-host viral dynamics. However, a simulation platform to analyze this new world of within-host/tissue/cell viral populations does not exist. We present a solution. Apollo is a state-of-the-art within-host viral simulator developed to comprehensively model viral transmission, replication dynamics, natural selection, and host behaviors such as loss to follow-up, across population, host, tissue, and cell levels. Leveraging CATE (https://doi.org/10.1111/2041-210X.14168), our proven large-scale GPU CUDA-powered parallel processing architecture, Apollo achieves unprecedented speeds and hardware efficiency. Apollo is built on the standard Wright-Fisher (WF) evolutionary model, but, thanks to its scriptable parameter structure, users are able to design simulations that mimic real-world dynamics that extend beyond the WF model. Through rigorous testing, we have been able to demonstrate Apollo's accuracy and resource efficiency. We present a complete simulation of an HIV epidemic with within-tissue factors, recombination, and mutation mechanisms that characterizes HIV viral evolution and dynamics both within hosts and across host levels. The simulations align with clinical findings and enhance the real-world data by providing further insight into the pedigree of viral variants and within-host quasispecies dynamics. Apollo represents a significant advancement in structured viral evolution and offers a powerful new tool for studying complex viral dynamics, with the aim of informing individual therapeutics and public health interventions.
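
For readers unfamiliar with the Wright-Fisher model mentioned above, the following minimal Python sketch simulates one fixed-size population with two variants under selection and drift. It is a CPU-only textbook illustration, not Apollo's code: it has no GPU acceleration, no tissue or cell hierarchy, and no mutation or recombination, and the function name and parameters are hypothetical.

```python
import random


def wright_fisher(pop_size, generations, init_freq, mutant_fitness=1.0, seed=0):
    """Track the frequency of a mutant variant in a fixed-size population.

    Assumes mutant_fitness > 0; wild-type fitness is fixed at 1.0.
    """
    rng = random.Random(seed)
    n_mutant = round(init_freq * pop_size)
    trajectory = [n_mutant / pop_size]
    for _generation in range(generations):
        # Selection: weight each parent class by its relative fitness.
        w_mutant = n_mutant * mutant_fitness
        w_wild = (pop_size - n_mutant) * 1.0
        p_mutant = w_mutant / (w_mutant + w_wild)
        # Drift: every offspring picks its parent independently (binomial sampling).
        n_mutant = sum(1 for _child in range(pop_size) if rng.random() < p_mutant)
        trajectory.append(n_mutant / pop_size)
    return trajectory


neutral = wright_fisher(pop_size=200, generations=50, init_freq=0.1)
selected = wright_fisher(pop_size=200, generations=50, init_freq=0.1, mutant_fitness=1.2)
```

Each call returns the mutant frequency per generation, so the neutral and selected trajectories can be compared directly.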

15:40-16:00
Utanos: A general-purpose shallow whole genome sequencing analysis toolkit identifies interpretable copy number signatures
Confirmed Presenter: J Maxwell Douglas, BC Cancer Research Centre, Canada

Room: 520a
Format: In Person

Moderator(s): Wesley Oakes (Genome Canada)


Authors List:

  • J Maxwell Douglas, BC Cancer Research Centre, Canada
  • Branden Lynch, BC Cancer Research Centre, Canada
  • Jacky Yiu, BC Cancer Research Centre, Canada
  • Nirupama Tamvada, BC CDC, Canada
  • Sameer Shankar, BC Cancer Research Centre, Canada
  • David G. Huntsman, BC Cancer Research Centre, Canada
  • Yongjin Park, BC Cancer Research Centre, Canada

Presentation Overview:

Whole genome sequencing (WGS) is a powerful method for monitoring mutations in cancer genomes. However, the deep sequencing needed in a traditional WGS pipeline is costly, making it infeasible in many translational and population-level studies. In order to strike a balance between depth and breadth, shallow/low-pass WGS methods have been applied to large-sample studies and have successfully identified copy number (CN) variation across hundreds of samples.
Even so, should biological insights be desired beyond relative CN calling, comprehensive downstream analysis is needed. With this need in mind, we developed a general-purpose R package for analysis workflows after relative copy-number calling: UTANOS, short for UTilities for the ANalysis Of Shallow WGS. Utanos provides data-driven quality control routines, absolute CN scaling, cross-cohort CN diversity profiling, CN feature extraction/factorization and annotation, and Homologous Recombination Deficiency (HRD) status prediction.
We applied Utanos to shallow WGS (sWGS) data profiled in multiple cancer genome studies, including two subtypes of endometrial carcinoma and High-Grade Serous Ovarian Cancer, to extract cancer- and individual-specific CN signatures and their activities genome-wide. Across cancer types/subtypes, Utanos identified comparable CN signature sets and highlighted both ubiquitous and unique genomic features and locations. Further, we tested Utanos on down-sampled deep WGS (PCAWG), thus simulating sWGS data using a well-characterized cohort, and confirmed that Utanos effectively captured CN signatures manifested in the cohort. Finally, Utanos scales well to cohort-level analysis, with hundreds of concurrent samples comfortably operable on a laptop.

16:40-17:20
Invited Presentation: The Quebec Genomic Data Center
Confirmed Presenter: Vincent Ferretti, Research Center of the Sainte-Justine University Hospital, Canada

Room: 520a
Format: In Person

Moderator(s): Felipe Pérez-Jvostov (The Alliance)


Authors List:

  • Jean-Philippe Thibert, Research Center of the Sainte-Justine University Hospital, Canada
  • Jeremy Costanza, Research Center of the Sainte-Justine University Hospital, Canada
  • Lucas Lemonnier, Research Center of the Sainte-Justine University Hospital, Canada
  • Adrian Paul, Research Center of the Sainte-Justine University Hospital, Canada
  • Aymeric Toulouse, Research Center of the Sainte-Justine University Hospital, Canada
  • Alek Perron, Research Center of the Sainte-Justine University Hospital, Canada
  • Claudia Roy, Research Center of the Sainte-Justine University Hospital, Canada
  • Francis Lavoie, Research Center of the Sainte-Justine University Hospital, Canada
  • Gaëlle Altefrohne, Research Center of the Sainte-Justine University Hospital, Canada
  • Luc-Frédéric Langis, Research Center of the Sainte-Justine University Hospital, Canada
  • Céline Pelletier, Research Center of the Sainte-Justine University Hospital, Canada
  • Evans Girard, Research Center of the Sainte-Justine University Hospital, Canada
  • Laura Bégin, Research Center of the Sainte-Justine University Hospital, Canada
  • Yann Marcou, Research Center of the Sainte-Justine University Hospital, Canada
  • Karine St-Onge, Research Center of the Sainte-Justine University Hospital, Canada
  • Damien Geneste, Research Center of the Sainte-Justine University Hospital, Canada
  • David Morais, Research Center of the Sainte-Justine University Hospital, Canada
  • Éric Vallée, Research Center of the Sainte-Justine University Hospital, Canada
  • Issam Hannache, Research Center of the Sainte-Justine University Hospital, Canada
  • Sébastien Bonami, Research Center of the Sainte-Justine University Hospital, Canada
  • Denis Beauregard, Research Center of the Sainte-Justine University Hospital, Canada
  • Daniel Tremblay-Sher, Research Center of the Sainte-Justine University Hospital, Canada
  • Catherine Boileau, Research Center of the Sainte-Justine University Hospital, Canada
  • Vincent Ferretti, Research Center of the Sainte-Justine University Hospital, Canada

Presentation Overview:

The Quebec Genomic Data Center (CQDG), launched in April 2024, is a collaborative research infrastructure project whose mission is to enable the development of precision medicine and artificial intelligence by promoting the harmonization and sharing of genomic data produced by clinical and research activities in Quebec. The main objective of the CQDG will be to provide a secure bioinformatics platform for hosting and harmonizing data produced by genomic studies and to disseminate it through a rigorous data access process that ensures the privacy of participants. The CQDG will integrate thousands of clinical whole exomes and genomes annually from consenting patients across Quebec with diverse conditions including rare diseases and cancer. The CQDG data infrastructure is built on open-source principles and leverages our extensive experience from past and ongoing international collaborations, including participation in projects such as the International Cancer Genome Consortium (ICGC), the NCI Genomic Data Commons (GDC), the NIH Gabriella Miller Kids First Data Resource, and the NIH INCLUDE Data Resource Center on Down Syndrome. We will provide a comprehensive overview of the CQDG project, describe in detail its open-source software infrastructure, and introduce the main functionalities of the CQDG web data portal.

17:20-17:40
FraGNNet: A Deep Probabilistic Model for Mass Spectrum Prediction
Confirmed Presenter: Fei Wang, University of Alberta, Canada

Room: 520a
Format: In Person

Moderator(s): Felipe Pérez-Jvostov (The Alliance)


Authors List:

  • Fei Wang, University of Alberta, Canada
  • Adamo Young, University of Toronto, Canada
  • David Wishart, University of Alberta, Canada
  • Bo Wang, University of Toronto, Canada
  • Hannes Röst, University of Toronto, Canada
  • Russ Greiner, University of Alberta, Canada

Presentation Overview:

Tandem mass spectrometry (MS/MS) plays an important role in metabolomics analysis. MS/MS workflows attempt molecular structure inference from mass spectral data: this is a challenging problem, and reliable automated solutions remain elusive. Existing identification strategies often rely on retrieval from spectral libraries; these approaches are limited by poor library compound coverage.

We propose FraGNNet, a novel computational method for high-resolution and interpretable MS/MS prediction. It combines combinatorial fragmentation with deep learning. First, a bond-breaking algorithm generates a large set of plausible fragment structures from the input molecule. Then, a graph neural network predicts a probability distribution over the fragments, determining the ones most likely to occur in the spectrum. This distribution is subsequently mapped to a distribution over chemical formulae, whose masses are used to determine peak m/z values in the spectrum.

FraGNNet achieves over 0.70 cosine similarity (0.01 Da binning resolution) when evaluated on held-out data, outperforming other spectrum predictors. In terms of retrieval-based spectrum identification, FraGNNet performs well as a spectral library generation tool. FraGNNet is highly interpretable, providing fragment annotations for each predicted peak and an estimate of the total peak intensity that the model cannot explain. Through ensembling, unreliable peak intensities and annotations can be identified to increase user confidence in the model’s predictions.
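
To make the evaluation metric concrete, the following Python sketch shows one common way to compute a binned cosine similarity between a predicted and a measured spectrum. It is an illustration under stated assumptions, not the FraGNNet evaluation code: the peak representation as (m/z, intensity) pairs, the nearest-bin rounding rule, and the example peaks are all hypothetical, and any intensity transforms or mass ranges used by the authors are not reproduced here.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.01):
    """Sum peak intensities into m/z bins of the given width (in Da)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        # Assumption: assign each peak to its nearest bin index.
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=0.01):
    """Cosine of the angle between two binned intensity vectors."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical spectra sharing one of their two peaks
predicted = [(100.0, 1.0), (150.0, 1.0)]
measured = [(100.0, 1.0), (200.0, 1.0)]
score = cosine_similarity(predicted, measured)
```

With non-negative intensities the score ranges from 0 (no shared bins) to 1 (identical relative intensities in every bin).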

In summary, FraGNNet is a state-of-the-art spectrum prediction model with a number of unique features that make it a compelling choice for MS/MS-based structure identification.

17:40-18:00
Formation of the Canadian Artificial Intelligence and Mass Spectrometry Consortium (CAN-AIMS)
Confirmed Presenter: Jennifer Geddes-McAlister, University of Guelph, Canada

Room: 520a
Format: In Person

Moderator(s): Felipe Pérez-Jvostov (The Alliance)


Authors List:

  • Jennifer Geddes-McAlister, University of Guelph, Canada
  • Arnaud Droit, Laval University, Canada

Presentation Overview:

Disease spans diverse demographics and negatively impacts human health, affecting each individual in a specific manner. The ability to diagnose, monitor, and treat individuals in a strategic and personalized manner is limited by a lack of disease knowledge and by discrepancies in the accessibility of healthcare. To overcome these limitations, an increased understanding of the causes, regulatory mechanisms, and treatment options for diseases is needed. Importantly, the identification of proteins, metabolites, and pathways responsible for disease presents a critical starting point. However, integrating datasets across biological networks and platforms in a way that ensures comprehensive and robust analyses is challenging. Herein, we introduce the Canadian Artificial Intelligence and Mass Spectrometry Consortium (CAN-AIMS), which brings together researchers from across Canada with diverse expertise in human disease, proteomics, computation, and bioethics. The goals of CAN-AIMS are three-fold, to: i) explore innovative research strategies from discovery to translation, ii) develop a hands-on training platform for the next generation of scientists, and iii) build capacity in leading-edge instrumentation and computational resources for Canada. Together, CAN-AIMS provides the first cohesive group of researchers working collectively to define and mitigate disease within Canada using a combination of mass spectrometry-driven technologies and computational platforms for integration of disease knowledge to improve diagnostics and treatments.