July 16, 2024
8:40-9:00
Proceedings Presentation: Maximum Likelihood Phylogeographic Inference of Cell Motility and Cell Division from Spatial Lineage Tracing Data
Confirmed Presenter: Gary Hu, Princeton University, United States
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Katharina Jahn


Authors List:

  • Uyen Mai, Princeton University
  • Gary Hu, Princeton University
  • Ben Raphael, Princeton University

Presentation Overview:

Recently developed spatial lineage tracing technologies induce somatic mutations at specific genomic loci in growing cells and then measure these mutations in the sampled cells along with their physical locations. These technologies enable high-throughput studies of developmental processes over space and time. However, these applications rely on accurate reconstruction of a spatial cell lineage tree describing the history of both cell divisions and locations. We demonstrate that standard phylogeographic models based on Brownian motion are inadequate to describe the symmetric spatial displacement of cells during cell division. We introduce a new model for cell motility that includes symmetric displacements of daughter cells from the parental cell followed by independent diffusion of daughter cells. We show that this model more accurately describes the locations of cells in a real spatial lineage tracing of Drosophila melanogaster embryos. Combining the spatial model with an evolutionary model of DNA mutations, we obtain a comprehensive model for spatial lineage tracing, namely spalin. Using this model, we estimate time-resolved branch lengths, spatial diffusion rate, and mutation rate. On both simulated and real data, we show that the proposed method accurately estimates all parameters while the Brownian motion model overestimates spatial diffusion rate in all test cases. In addition, the inclusion of spatial information improves accuracy of branch length estimation compared to sequence data alone, suggesting that augmenting lineage tracing technologies with spatial information is useful to overcome the limitations of genome-editing in developmental systems.

July 16, 2024
9:10-9:20
Interpretable variational encoding of genotypes identifies comprehensive clonality and lineages in single cells geometrically
Confirmed Presenter: Hoi Man Chung, The University of Hong Kong, Hong Kong
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Katharina Jahn


Authors List:

  • Hoi Man Chung, The University of Hong Kong
  • Yuanhua Huang, The University of Hong Kong

Presentation Overview:

Despite the wide accessibility of genetic information in multiple omics assays, analyzing single-cell genomics remains a challenge due to its diverse high-dimensional macrostructures and many missing signals. For the sake of numerical convergence in diverse macrostructures, existing statistical methods often impose strong constraints on the form of predicted mutation patterns, and therefore easily identify underfitted or overfitted local or global optima that are biologically incomprehensive in complex contexts. To solve this problem, we developed SNPmanifold, a Python package that detects flexible mutation patterns with a shallow binomial variational autoencoder and UMAP (schematic shown in Figure 1). After reducing the allele count matrix to a lower-dimensional latent space, SNPmanifold performs three downstream analyses on the genomic geometrical manifold: (1) clustering of cells with similar genotypes, (2) ranking of important SNPs, and (3) phylogenetic tree construction. Based on nuclear or mitochondrial variants, we demonstrated that SNPmanifold can effectively identify a large number of multiplexed donors of origin (k = 18), a setting in which all existing methods fail, as well as lineages of somatic clones with promising biological interpretation (detailed results of an example dataset shown in Figure 2). Compared to existing methods, SNPmanifold can better identify the optimal degree of fitting with enhanced generalizability and human-interpretability. SNPmanifold can therefore reveal insights into single-cell clonality and lineages more comprehensively and straightforwardly.

July 16, 2024
9:20-9:40
Genome streamlining: effect of mutation rate and population size on genome size reduction
Confirmed Presenter: Juliette Luiselli, INRIA Lyon, INSA Lyon
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Katharina Jahn


Authors List:

  • Juliette Luiselli, INRIA Lyon
  • Jonathan Rouzaud-Cornabas, INRIA Lyon
  • Nicolas Lartillot, CNRS
  • Guillaume Beslon, INRIA Lyon

Presentation Overview:

Genome size reduction, also known as genome streamlining, is observed in bacteria with very different life traits, including endosymbiotic bacteria and several marine bacteria, raising the question of its evolutionary origin. None of the hypotheses proposed in the literature is firmly established, mainly due to the many confounding factors related to the diverse habitats of species with streamlined genomes. Computational models may help overcome these difficulties and rigorously test hypotheses. We use Aevol, a platform designed to study the evolution of genome architecture, to test two main hypotheses: that an increase in population size (N) or mutation rate (μ) could cause genome reduction. Pre-evolved individuals were transferred into new conditions, characterized by an increase in population size or mutation rate. In our experiments, both conditions lead to genome reduction. However, they lead to very different genome structures. Under increased population size, genomes lose a significant fraction of non-coding sequences, but maintain their coding size, resulting in densely packed genomes (akin to streamlined marine bacteria genomes). By contrast, under increased mutation rate, genomes lose both coding and non-coding sequences (akin to endosymbiotic bacteria genomes). Hence, both factors lead to an overall reduction in genome size, but the coding density of the genome appears to be determined by N × μ. Thus, a broad range of genome size and density can be achieved by different combinations of N and μ. Further analyses suggest that genome size and coding density are determined by the interplay between selection for phenotypic adaptation and selection for robustness.

July 16, 2024
9:20-9:40
Evolutionary dynamics of microRNAs pinpoint innovations in the gene regulatory network of vertebrates
Confirmed Presenter: Felix Langschied, Institute of Cell Biology and Neuroscience, Goethe University
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Katharina Jahn


Authors List:

  • Felix Langschied, Institute of Cell Biology and Neuroscience
  • Matthias S. Leisegang, Institute for Cardiovascular Physiology
  • Ralf P. Brandes, Institute for Cardiovascular Physiology
  • Ingo Ebersberger, Institute of Cell Biology and Neuroscience

Presentation Overview:

The evolution of the regulatory network formed by miRNAs and their target mRNAs remains poorly understood because scalable and accurate frameworks for miRNA ortholog detection are missing. We closed this methodological gap: our tool ncOrtho identifies miRNA orthologs in large collections of unannotated genome assemblies, matching manually curated annotations in sensitivity and precision. With ncOrtho, we have investigated the plasticity of the human miRNA repertoire across 402 phylogenetically diverse vertebrates. This revealed four main bursts of miRNA acquisition, of which the oldest predates the diversification of the vertebrates and the youngest is specific to the Simiiformes. Overall, miRNA loss is rare, which directs attention to 16 miRNA families that are absent in the Eumuroidea (Rodentia). To investigate the impact of these losses on the corresponding gene regulatory networks, we overexpressed Mir-197 and Mir-769 in induced pluripotent stem cells (iPSCs) of human and mouse. Overlapping sets of silenced mRNAs in the two species reveal that miRNA-dependent regulatory networks remain partly intact despite the miRNA losses. The prevalence of target sites specific to either lineage indicates a considerable evolutionary flexibility of the target gene repertoire. Interestingly, human protein-coding genes with a similar history of gene loss as the 16 miRNAs are enriched for transcription factors. This indicates that mouse, and also rat, have substantially modified their regulatory network of gene expression at the transcriptional and post-transcriptional levels compared to other vertebrate model organisms.

July 16, 2024
9:40-10:00
PHALCON: Phylogeny-aware variant calling from large-scale single-cell panel sequencing datasets
Confirmed Presenter: Priya, Department of Computer Science and Engineering, IIT Kanpur
Track: EvolCompGen

Room: 518
Format: Live Stream
Moderator(s): Katharina Jahn


Authors List:

  • Priya, Department of Computer Science and Engineering
  • Sunkara B. V. Chowdary, Department of Computer Science and Engineering
  • Hamim Zafar, Department of Computer Science and Engineering & Department of Biological Sciences and Bioengineering

Presentation Overview:

Single-cell sequencing (SCS) technologies bring cellular resolution in resolving intra-tumor heterogeneity, which can cause drug resistance and relapse in cancer. Nonetheless, SCS methods pose several technical challenges, such as uneven coverage, allelic dropout (ADO), or artifacts subjected to erroneous amplification. Single-cell variant callers have been developed to distinguish the true variants from technical artifacts. However, recently emerging parallel sequencing methods can now sequence up to thousands of cells by targeting only disease-specific genes. Current variant callers are not scalable for such high-throughput datasets and do not effectively address the amplification biases in panel-based sequencing protocols.

To address these, we present a statistical variant caller, PHALCON, which enables scalable mutation detection from large-scale single-cell panel sequencing data by modeling their evolutionary history under a finite-sites model along a clonal phylogeny. PHALCON infers the underlying cellular sub-populations based on genotype likelihoods of candidate sites and reconstructs a clonal phylogeny and the most likely mutation history (loss and recurrence included) using a probabilistic framework that maximizes the likelihood of the observed read counts given the genotypes.

Using numerous simulated datasets across varied experimental settings, we showed that PHALCON outperforms existing state-of-the-art methods in terms of variant calling accuracy (7.29-51.67% improvement), accuracy in inferring the tumor phylogeny (410.43-32931.8% improvement) and runtime (60-70 times faster). Furthermore, we applied PHALCON on real tumor single-cell panel sequencing datasets from triple negative breast cancer patients where PHALCON detected novel somatic mutations in important oncogenes and tumor suppressor genes with high functional impact and orthogonal support in bulk datasets.

July 16, 2024
10:40-11:00
Proceedings Presentation: A machine-learning based alternative to phylogenetic bootstrap
Confirmed Presenter: Tal Pupko, Tel Aviv University, Israel
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Dannie Durand


Authors List:

  • Noa Ecker, Tel Aviv University
  • Tal Pupko, Tel Aviv University
  • Itay Mayrose, Tel Aviv University
  • Yishay Mansour, Tel Aviv University
  • Dorothée Huchon, Tel Aviv University

Presentation Overview:

Currently used methods for estimating branch support in phylogenetic analyses often rely on the classic Felsenstein's bootstrap, parametric tests, or their approximations. As these branch support scores are widely used in phylogenetic analyses, having accurate, fast, and interpretable scores is of high importance.
Here, we employed a data-driven approach to estimate branch support values with a probabilistic interpretation. To this end, we simulated thousands of realistic phylogenetic trees and the corresponding multiple sequence alignments. Each of the obtained alignments was used to infer the phylogeny using state-of-the-art phylogenetic inference software, which was then compared to the true tree. Using these extensive data, we trained machine-learning algorithms to estimate branch support values for each bipartition within the maximum-likelihood trees obtained by each software. Our results demonstrate that our model provides fast and more accurate probability-based branch support values than commonly used procedures.
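
The comparison step in this abstract, checking each bipartition of an inferred tree against the true tree, can be illustrated with a small sketch. This is not the authors' pipeline. It assumes trees are written as nested Python tuples of leaf names, and the function names (leaves, bipartitions, label_splits) are hypothetical. The resulting True/False labels are the kind of target a classifier could be trained on.

```python
from typing import Dict, FrozenSet, Set, Tuple, Union

# Assumed representation: a leaf is a string, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]
Split = FrozenSet[FrozenSet[str]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree: Tree) -> Set[Split]:
    """Collect the non-trivial bipartitions induced by the internal edges."""
    all_leaves = leaves(tree)
    splits: Set[Split] = set()

    def visit(node: Tree) -> None:
        if isinstance(node, str):
            return
        below = leaves(node)
        other = all_leaves - below
        # Keep only splits with at least two leaves on each side.
        if len(below) >= 2 and len(other) >= 2:
            splits.add(frozenset([below, other]))
        for child in node:
            visit(child)

    visit(tree)
    return splits


def label_splits(inferred: Tree, true: Tree) -> Dict[Split, bool]:
    """Label each bipartition of the inferred tree by whether the true tree contains it."""
    true_splits = bipartitions(true)
    return {split: split in true_splits for split in bipartitions(inferred)}


true_tree = ((("A", "B"), "C"), ("D", "E"))
inferred_tree = ((("A", "C"), "B"), ("D", "E"))
labels = label_splits(inferred_tree, true_tree)
print(sum(labels.values()), "of", len(labels), "inferred bipartitions are correct")
# prints: 1 of 2 inferred bipartitions are correct
```

In this toy case the split ABC|DE is shared by both trees, while AC|BDE appears only in the inferred tree and is labeled False.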

July 16, 2024
11:10-11:20
Neutral variation in a protein interaction network limits predictability of protein evolution
Confirmed Presenter: Soham Dibyachintan, Université Laval, Canada
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Dannie Durand


Authors List:

  • Soham Dibyachintan, Université Laval
  • Alexandre Dubé, Université Laval
  • David Bradley, Université Laval
  • Pascale Lemieux, Université Laval
  • Ugo Dionne, Lunenfeld-Tanenbaum Research Institute
  • Christian Landry, Université Laval

Presentation Overview:

The evolutionary fate of a mutation is dependent on its phenotypic effects. In recent years, multiple evolutionary models have been developed that use variation in natural sequences to predict the impact of a mutation in any given protein. However, many proteins display multiple phenotypes, most mediated by specific protein domains. Furthermore, many proteins originate from gene duplication and share most of their evolutionary history. How such factors affect the predictability of evolution is unknown owing to the lack of comprehensive experimental data on such proteins. We combined genome editing and high-throughput phenotypic assays to quantify the impact of all single-amino acid substitutions on the binding of two functionally redundant paralogous Src Homology 3 domains to their cognate interaction partners in yeast. These interaction partners have peptides which satisfy a consensus polyproline motif recognized by the SH3 domains. We observed that the effect of many mutations was not conserved across phenotypes or between the paralogs. A comparison of our experiments with evolutionary models revealed that these models capture only a few of the differences in the effect of mutations between paralogs. Ancestral sequence reconstruction revealed that for mutations whose effects differed between domains, there was no difference between ancestral substitutions and mutations sampled at random. Broadly, our results illustrate that neutral sequence variation over time in the components of a protein interaction network limits our ability to predict protein evolution accurately using existing methods. Our work underscores the importance of using experimental data to inform computational models and improve the prediction of protein evolution.

July 16, 2024
11:20-11:40
Simultaneously Building and Reconciling a Synteny Tree
Confirmed Presenter: Mathieu Gascon, Université de Montréal, Canada
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Dannie Durand


Authors List:

  • Mathieu Gascon, Université de Montréal
  • Mattéo Delabre, Université de Montréal
  • Nadia El-Mabrouk, Université de Montréal

Presentation Overview:

Our lab recently presented Synesth (for SYNteny Evolution in SegmenTal Histories), an extended reconciliation model for synteny trees accounting for fissions, losses, gains, duplications and transfers potentially going through unsampled species. Synesth takes as input a synteny tree and a species tree, and outputs a most parsimonious evolutionary history. As reconciliation is very sensitive to the input trees (a slight modification may lead to a significant difference in the inferred evolutionary scenarios), obtaining accurate trees is essential. This is particularly challenging in the case of our model, which requires a synteny tree as input, while phylogenetic methods on gene sequences instead output sets of gene trees, one for each gene family. If the individual gene trees are "consistent", meaning that they do not represent contradictory phylogenetic information, then a supertree (a tree displaying them all) can be obtained. Such a supertree can be used as an input for Synesth to represent the evolution of the syntenies containing the individual genes. As finding the optimal supertree has been shown to be an NP-hard problem, the solution we proposed in a previous work was to test each possible supertree and retain the one leading to the most parsimonious reconciliation. In this presentation, we explore a new way to solve this problem by simultaneously building and reconciling the optimal supertree, leading to an algorithm that is exponential in the number of gene trees rather than in the total number of genes. We compare this new algorithm to the previous one using simulated datasets.

July 16, 2024
11:40-12:00
ntSynt: multi-genome synteny detection using minimizer graph mappings
Confirmed Presenter: Inanc Birol, Canada's Michael Smith Genome Sciences Centre at BC Cancer, Canada
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Dannie Durand


Authors List:

  • Lauren Coombe, Canada's Michael Smith Genome Sciences Centre at BC Cancer
  • Rene Warren, Canada's Michael Smith Genome Sciences Centre at BC Cancer
  • Parham Kazemi, Canada's Michael Smith Genome Sciences Centre at BC Cancer
  • Johnathan Wong, Canada's Michael Smith Genome Sciences Centre at BC Cancer
  • Inanc Birol, Canada's Michael Smith Genome Sciences Centre at BC Cancer

Presentation Overview:

In recent years, the landscape of reference-grade genome assemblies has seen substantial diversification. With such rich data, there is pressing demand for robust tools for scalable, multi-species comparative genomics analyses, including detecting genome synteny, which informs on the sequence conservation between genomes and contributes crucial insights into species evolution. Here, we introduce ntSynt, a scalable utility for computing large-scale multi-genome synteny blocks using a minimizer graph-based approach. After computing the initial multi-genome synteny blocks using this constructed minimizer graph, the synteny blocks are refined in multiple rounds through indel detection, merging collinear blocks and extending block coordinates using decreasing minimizer window sizes. Through extensive testing utilizing multiple ~3 Gbp genomes, we demonstrate how ntSynt produces synteny blocks with coverages between 79–100% in at most 2h using 34 GB of memory, even for genomes with appreciable (>15%) sequence divergence. In addition, we used ntSynt to compare 11 bee genomes of the genus Andrena from the Earth BioGenome Project, and achieved synteny blocks with high coverage (85% for the smallest genome) in less than 15 minutes, despite these genomes varying in both chromosome number (3–7) and genome size (247 Mbp – 443 Mbp). Compared to existing state-of-the-art methodologies, ntSynt offers enhanced flexibility to diverse input genome sequences and synteny block granularity. We expect the macrosyntenic genome analyses facilitated by ntSynt to enable critical evolutionary insights within and between species across the tree of life. ntSynt is freely available at https://github.com/bcgsc/ntsynt.
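
For readers unfamiliar with minimizers, the sampling idea underlying the approach above can be sketched as follows. This is a generic illustration under stated assumptions, not ntSynt's implementation: k-mers are ordered lexicographically here (real tools typically order them by a hash value), and the function name and the parameters k and w are hypothetical.

```python
from typing import List, Tuple


def minimizers(seq: str, k: int, w: int) -> List[Tuple[int, str]]:
    """Return (position, k-mer) pairs: the smallest k-mer in each window of w consecutive k-mers."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected: List[Tuple[int, str]] = []
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        # Leftmost smallest k-mer in the window (assumed ordering: lexicographic).
        offset = min(range(w), key=lambda j: window[j])
        pick = (start + offset, window[offset])
        # Consecutive windows often share a minimizer; record it once.
        if not selected or selected[-1] != pick:
            selected.append(pick)
    return selected


print(minimizers("TTACGTT", k=3, w=3))
# prints: [(2, 'ACG')]
```

Smaller windows select more minimizers per sequence, which is consistent with the abstract's use of decreasing window sizes to refine block coordinates.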

July 16, 2024
14:20-14:40
Automated clade-level detection of Incomplete lineage sorting
Confirmed Presenter: Maureen Stolzer, Carnegie Mellon University, United States
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Nadia El-Mabrouk


Authors List:

  • Maureen Stolzer, Carnegie Mellon University
  • Yuting Xiao, Carnegie Mellon University
  • Dannie Durand, Carnegie Mellon University

Presentation Overview:

Phylogenetic population modeling, combined with sequencing of large collections of closely related taxa, has enabled unprecedented exploration of population processes in evolutionary and ecological contexts. Incomplete Lineage Sorting (ILS) and introgression can result in gene trees that disagree with the species tree. For example, the history of a gene sampled from three species with phylogeny A|BC may agree with the species tree or have one of two incongruent topologies, B|AC or C|AB. The resulting distribution of gene tree topologies provides a wealth of information for testing alternate hypotheses and estimating population parameters. Despite these advances, quantification of ILS, while excluding incongruence due to introgression and paralogy, remains a challenging problem.
Here, we present an algorithm that extracts gene tree statistics associated with ILS from all species internodes in a single computational procedure, supporting automated, large-scale phylogenomic analyses of entire clades. Characterizing ILS can help to resolve phylogenetic uncertainty and is important for understanding the relative contributions of incomplete lineage sorting, introgression, and convergent evolution to trait evolution and present-day genetic variation. Our method accounts for uncertainty due to gene loss and missing data and screens out incongruence due to distant introgression and paralogy. As such, it can be applied to both multigene families and single-copy orthologs. The algorithm is polynomial in tree size and is thus applicable to very large species trees. We demonstrate our approach through the reanalysis of several phylogenomic datasets discussed in the literature.
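
The three-species example in this abstract (species tree A|BC, with incongruent gene tree topologies B|AC and C|AB) can be made concrete with a short tally of gene tree topologies. This is only a sketch of the counting idea, not the algorithm presented in the talk. It assumes each gene tree is a rooted, fully resolved three-leaf tree written as a nested tuple, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple, Union

# Assumed representation: (outgroup, (x, y)) or ((x, y), outgroup).
Node = Union[str, Tuple[str, str]]
Triplet = Tuple[Node, Node]


def topology(tree: Triplet) -> str:
    """Name a rooted three-leaf tree as 'outgroup|cherry', e.g. 'A|BC'."""
    left, right = tree
    if isinstance(left, str) and isinstance(right, tuple):
        out, cherry = left, right
    elif isinstance(right, str) and isinstance(left, tuple):
        out, cherry = right, left
    else:
        raise ValueError("expected a rooted, resolved three-leaf tree")
    return f"{out}|{''.join(sorted(cherry))}"


def topology_counts(gene_trees: Iterable[Triplet]) -> Counter:
    """Count how many gene trees show each of the three possible topologies."""
    return Counter(topology(tree) for tree in gene_trees)


gene_trees = [
    ("A", ("B", "C")),
    (("C", "B"), "A"),
    ("B", ("A", "C")),
    ("C", ("A", "B")),
    ("A", ("B", "C")),
]
print(topology_counts(gene_trees))
# prints: Counter({'A|BC': 3, 'B|AC': 1, 'C|AB': 1})
```

A tally of this kind is the raw input for testing hypotheses about an internode: the counts of the concordant topology and of the two incongruent ones can then be compared.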

July 16, 2024
14:40-15:00
Sparse Neighbor Joining: rapid phylogenetic inference using a sparse distance matrix
Confirmed Presenter: Semih Kurt, KTH Royal Institute of Technology, Sweden
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Nadia El-Mabrouk


Authors List:

  • Semih Kurt, KTH Royal Institute of Technology
  • Alexandre Bouchard-Cote, The University of British Columbia
  • Jens Lagergren, KTH Royal Institute of Technology

Presentation Overview:

Phylogenetic reconstruction is a fundamental problem in computational biology. The Neighbor Joining (NJ) algorithm offers an efficient distance-based solution to this problem, which often serves as the foundation for more advanced statistical methods. Despite prior efforts to enhance the speed of NJ, the computation of the n^2 entries of the distance matrix, where n is the number of phylogenetic tree leaves, continues to pose a limitation in scaling NJ to larger datasets. In this work, we propose a new algorithm which does not require computing a dense distance matrix. Instead, it dynamically determines a sparse set of at most O(n log n) distance matrix entries to be computed in its basic version, and up to O(n log^2 n) entries in an enhanced version. We show by experiments that this approach reduces the execution time of NJ for large datasets, with a trade-off in accuracy.
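
To show why the dense distance matrix is the bottleneck, here is a sketch of the pair-selection step of standard Neighbor Joining, which reads every entry of the matrix. It illustrates the classical Q-criterion, Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k), and not the sparse algorithm proposed in the talk. The function name is hypothetical and the five-taxon matrix is an illustrative example.

```python
from typing import List, Tuple


def nj_pick_pair(d: List[List[float]]) -> Tuple[int, int, float]:
    """Return (i, j, Q) for the pair of taxa minimizing the Neighbor Joining Q-criterion."""
    n = len(d)
    if n < 2:
        raise ValueError("need at least two taxa")
    # Row sums use all n^2 entries of the dense matrix.
    row_sums = [sum(row) for row in d]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[2]:
                best = (i, j, q)
    return best


distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_pick_pair(distances))
# prints: (0, 1, -50)
```

Note that the pair with the smallest distance (taxa 3 and 4, distance 3) is not the pair chosen: the Q-criterion corrects each distance by how far both taxa are from everything else, which is why the full set of row sums is needed in the dense version.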

July 16, 2024
15:00-15:20
Scalable distance-based phylogeny inference using divide-and-conquer
Confirmed Presenter: Lars Arvestad, Stockholm University, Sweden
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Nadia El-Mabrouk


Authors List:

  • Amy Lee Jalsenius, Stockholm University
  • Lars Arvestad, Stockholm University

Presentation Overview:

Distance-based methods for inferring evolutionary trees are important subroutines in computational biology, sometimes as a first step in a statistically more robust phylogenetic method. The most popular method is Neighbor-Joining, mainly due to its relatively good accuracy. Unfortunately, Neighbor-Joining has cubic time complexity, which limits its applicability on larger datasets. Similar but faster algorithms have been suggested, but the overall time complexity of a Neighbor-Joining computation remains essentially cubic as long as the input is a distance matrix that must be computed. In practice, memory usage is today a limiting factor because distance matrix sizes grow quadratically. These constraints become a bottleneck in studies that rely on distance-based phylogeny estimation. With ever increasing data sizes, a scalable distance-based phylogeny inference method would change how scientists think about evolutionary-based studies.

We present two randomized divide-and-conquer heuristics, dnctree and dnctree-k, that selectively estimate pairwise sequence distances and infer a tree by connecting increasingly large subtrees. The divide-and-conquer approach avoids computing all pairwise distances and thereby saves both time and memory. The time complexity is at worst quadratic, and seems to scale like O(n lg n) in practice. Both algorithms have been implemented and tested, and dnctree-k shows accuracy similar to that of Neighbor-Joining in our experiments. We show that both algorithms scale very well, which is verified in computational experiments. In fact, they are applicable to very large datasets even when implemented in Python.

A Python implementation, dnctree, is available on GitHub (https://github.com/arvestad/dnctree) and PyPI.org.

July 16, 2024
15:20-16:00
Panel: Panel session
Track: EvolCompGen

Room: 518
Format: In Person
Moderator(s): Nadia El-Mabrouk

