Schedule subject to change
All times listed are in BST
Sunday, July 20th
8:30-8:45
Student Council Symposium
Room: 02N

8:45-9:30
Student Council Symposium
Room: 02N

9:30-9:45
Student Council Symposium
Room: 02N

9:45-10:00
Student Council Symposium
Room: 02N

10:00-10:05
Lifting the veil on Challenging Medically Relevant Genes
Room: 02N
Format: In person


Authors List: Show

  • Victor Grentzinger, Laboratory of Human Genetics, GIGA Research Institute, Liège, Belgium, Belgium
  • Leonor Palmeira, Center of Genetics, University Hospital, Liège, Belgium, Belgium
  • Keith Durkin, Laboratory of Human Genetics, GIGA Research Institute, Liège, Belgium, Belgium
  • Maria Artesi, Center of Genetics, University Hospital, Liège, Belgium, Belgium
  • Vincent Bours, Laboratory of Human Genetics, GIGA Research Institute, Liège, Belgium, Belgium

Presentation Overview: Show

While DNA sequencing has never been cheaper, a number of genetic diseases remain difficult to diagnose. Nearly 400 medically relevant genes are still challenging to characterize due to the complex nature of their sequence. This complexity can arise from a variety of factors, such as the existence of pseudogenes, large short tandem repeat regions, or variable number tandem repeat regions. As a result, access to reliable and cost-effective genetic tests is limited. To address this issue, we focused on improving the characterization of the following genes using long-read sequencing: PKD1/PKD2, responsible for Autosomal Dominant Polycystic Kidney Disease (ADPKD), and FLG, involved in Atopic Dermatitis. For the PKD genes, we amplified their sequences by long-range PCR before sequencing the products with Oxford Nanopore sequencing. We were able to retrieve all variants previously confirmed by Sanger sequencing in 34 samples with ADPKD. For FLG, while investigating the 23 publicly available PacBio HiFi datasets of the 1000 Genomes Project, we identified previously undescribed alleles in African samples. To determine whether these variations are population-specific, we analyzed 1111 additional public samples with long-read data. We discovered 5 novel alleles, mostly from Sub-Saharan populations. We also investigated, in our cohort of public data, the MUC1 and SMN1/SMN2 genes, responsible respectively for Autosomal Dominant Tubulointerstitial Kidney Disease and Spinal Muscular Atrophy. Our next goal is to design cost-efficient techniques to improve the sequencing of these challenging medically relevant genes in a clinical setting.

10:05-10:10
AccuRate: A Tool Supporting Genotype–Phenotype Analysis and Causal Mutation Discovery in Soybean
Confirmed Presenter: Alžbeta Rástocká, Palacký University, Olomouc, Slovakia

Room: 02N
Format: In person


Authors List: Show

  • Alžbeta Rástocká, Palacký University, Olomouc, Slovakia
  • Jana Biová, Palacký University, Olomouc, Czechia
  • Mária Škrabišová, Palacký University, Olomouc, Czechia

Presentation Overview: Show

Soybean is one of the world’s most significant crops, serving as an indispensable source of high-quality plant protein and oil for both human and livestock consumption. Advances in soybean research support genomics-assisted breeding, guiding the development of more resilient, nutritious, and high-yielding varieties. Soybean also possesses an extensive collection of genomic and phenotypic data, including a large database of phenotypic traits. This enables the creation of new strategies for analysing genotype-phenotype associations. While association studies are important for identifying genomic loci linked to phenotypic traits, pinpointing causal mutations remains a challenge due to many factors. Building on these resources, this study presents new algorithms for analysing, visualizing, and automatically categorizing quantitative and categorical phenotypes. Given that most functional mutations are biallelic, and that quantitative traits often arise from the combined effects of multiple genes, phenotype binarization provides a practical basis for further analysis. Since many traits exist on a spectrum, various categorization methods are applied to transform them into binary form. This step is essential for calculating an accuracy parameter that quantifies genotype-phenotype correlation and facilitates the identification of causal mutations. The algorithm AccuRate was tested on well-characterized genes influencing protein and oil content in soybean. Results confirmed its ability to identify genotype-phenotype correlations. Additionally, two candidate genes were analysed, and a causal mutation was confirmed in one of them (Glyma.06G205800), linked to flowering and maturation time. AccuRate is a promising tool for uncovering genotype-phenotype relationships in soybean and, after optimizing for high-throughput testing, may be extended to other crops.
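
As a rough illustration of the binarization and accuracy idea described in this abstract, the Python sketch below splits a quantitative phenotype at a threshold and scores how well a biallelic variant separates the two resulting classes. The median split, the function names, the sample values, and the exact accuracy definition (fraction of samples whose allele agrees with the phenotype class, taking the better of the two allele orientations) are assumptions made for illustration and are not taken from AccuRate itself.

```python
from statistics import median


def binarize(values, threshold=None):
    """Split a quantitative phenotype into 0/1 classes (median split by default)."""
    if threshold is None:
        threshold = median(values)
    return [1 if v > threshold else 0 for v in values]


def genotype_phenotype_accuracy(alleles, classes):
    """Fraction of samples whose biallelic genotype (0 = ref, 1 = alt) matches
    the binary phenotype class, using the better of the two orientations."""
    if len(alleles) != len(classes) or not alleles:
        raise ValueError("alleles and classes must be non-empty and equal length")
    agree = sum(a == c for a, c in zip(alleles, classes))
    return max(agree, len(alleles) - agree) / len(alleles)


# Hypothetical protein-content values and genotypes for six accessions.
protein_content = [38.1, 39.0, 41.5, 44.2, 45.0, 46.3]
alleles = [0, 0, 0, 1, 1, 0]
classes = binarize(protein_content)
accuracy = genotype_phenotype_accuracy(alleles, classes)
```

A variant that perfectly tracks the binarized trait would score 1.0 under this definition, while an uninformative one would sit near 0.5.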

10:10-10:15
Early colorectal cancer detection with deep learning on ultra-shallow whole genome sequencing of cell-free DNA
Room: 02N
Format: In person


Authors List: Show

  • Ritchie Yu, McGill University, Canada
  • Jasmin Coulombe-Huntington, McGill University, Canada
  • Yu Xia, McGill University, Canada

Presentation Overview: Show

Early detection of cancer can mitigate adverse patient outcomes by reducing the time to intervention and treatment. Cell-free DNA (cfDNA) circulating in the bloodstream contains signatures of cancer that can be obtained and sequenced through liquid biopsy. Given a large collection of sequencing reads, features can be extracted and used to develop predictive models for patient cancer classification. However, current techniques for early cancer detection rely on tens of millions of sequencing reads, which can increase the cost of diagnosis. In our work, using whole genome sequencing data obtained from the Sequence Read Archive (SRA), we adapted convolutional neural networks to predict colorectal cancer. We found that the number of reads used by the model can be scaled down from approximately 60 million reads to 1 million reads. Our model achieved a classification performance of 0.902 AUC. This result suggests that the blood sample size required for liquid biopsy could be significantly reduced, thereby reducing the cost of diagnosis. Furthermore, through an ablation study, we showed that the fragment end distribution by itself produced a classification performance of 0.904 AUC. Meanwhile, relying only on fragment length distribution or end motif distribution produced 0.771 and 0.790 AUC, respectively. This suggests that fragment end distribution is by far the most predictive of these features for classification. In future work, we intend to incorporate fragment end features into transformer-based models to improve classification performance.
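
To make two of the feature types named in this abstract concrete, the following Python sketch computes an end motif distribution and a fragment length distribution from a list of fragment sequences. The 4-mer motif size, the length cap, the function names, and the toy reads are assumptions for illustration; the abstract does not specify how the features were actually computed, and the fragment end (position) distribution is not covered here.

```python
from collections import Counter
from itertools import product


def end_motif_distribution(fragments, k=4):
    """Relative frequency of each 5' k-mer end motif across fragments.
    Motifs containing non-ACGT characters (e.g. N) are skipped."""
    motifs = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(
        frag[:k].upper()
        for frag in fragments
        if len(frag) >= k and set(frag[:k].upper()) <= set("ACGT")
    )
    total = sum(counts.values())
    return {m: (counts[m] / total if total else 0.0) for m in motifs}


def length_distribution(fragments, max_len=400):
    """Relative frequency of fragment lengths 0..max_len (longer ones are capped)."""
    counts = Counter(min(len(f), max_len) for f in fragments)
    total = len(fragments)
    return [counts[i] / total if total else 0.0 for i in range(max_len + 1)]


# Toy fragments, purely illustrative.
fragments = ["CCCAGTTT", "CCCATGGA", "ACGTNNNN", "NNNNACGT"]
motif_freqs = end_motif_distribution(fragments)
length_freqs = length_distribution(fragments, max_len=10)
```

Vectors like these could then be fed to a classifier, one vector per patient sample.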

10:15-10:30
Room: 02N
Format: In person


10:30-10:45
DNA-DistilBERT: A small language model for non-coding variant effect prediction from human DNA sequences
Confirmed Presenter: Megha Hegde, Kingston University London, United Kingdom

Room: 02N
Format: In person


Authors List: Show

  • Megha Hegde, Kingston University London, United Kingdom
  • Jean-Christophe Nebel, Kingston University London, United Kingdom
  • Farzana Rahman, Kingston University London, United Kingdom

Presentation Overview: Show

Genetic variants have been associated with changes in disease risk. Historically, research has focused on coding variants; however, emerging research shows that non-coding variants also have strong links to disease causality, via transcription and gene regulation.
Next-generation sequencing has exponentially increased genomic data availability, necessitating scalable computational approaches for accurate variant effect prediction. Transformer-based LLMs, such as BERT (Bidirectional Encoder Representations from Transformers), have achieved good results on coding variants; however, results on non-coding variants remain inconsistent. Moreover, the quadratic computational complexity of attention mechanisms with respect to sequence length imposes substantial resource demands, restricting innovation in this area to a few institutions with high-end infrastructure.
Arguably, BERT is the most successful of such architectures, as its bidirectional nature lets it excel at context-aware modelling of genomic sequences. To substantially decrease computational costs, we propose to use DistilBERT, which applies knowledge distillation during pretraining to reduce the number of model parameters. While small language models (SLMs) such as DistilBERT are established in natural language processing, they remain underexplored in genomics.
Experiments show that, when pretrained on human reference genome sequences, and fine-tuned for variant effect prediction, the SLM approach can match state-of-the-art LLMs such as DNABERT-2 in accuracy, while significantly reducing resource requirements. This innovative, energy-efficient approach not only makes variant effect prediction more scalable but also advances equitable research by enabling training on a single GPU, eliminating the need for high-performance computing.
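
For readers unfamiliar with knowledge distillation, the Python sketch below shows the soft-target term that such training typically includes: the student is penalized for diverging from the teacher's temperature-softened output distribution. The function names, the temperature value, and the use of a KL divergence scaled by the squared temperature are assumptions for illustration; the DistilBERT training objective combines this kind of term with further losses, and nothing here is taken from the DNA-DistilBERT code.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as is common practice in knowledge distillation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

The loss is zero when the student reproduces the teacher's logits exactly and grows as the two distributions diverge.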

10:45-11:00
Generative AI for Childhood and Adult Cancer Research
Room: 02N
Format: In person


Authors List: Show

  • Guillermo Prol Castelo, Barcelona Supercomputing Center, Spain
  • Davide Cirillo, Barcelona Supercomputing Center, Spain
  • Alfonso Valencia, Barcelona Supercomputing Center, Spain

Presentation Overview: Show

Cancer is one of the most common causes of death worldwide, and its complexity makes it especially challenging to study. Despite ongoing progress in cancer research, a significant challenge is the scarcity of detailed data on disease subgroups and stages. To overcome this problem, Generative AI techniques and, specifically, the Variational Autoencoder (VAE), have been widely used to handle high-dimensional data. We propose a robust, explainable Synthetic Data Generation (SDG) pipeline based on the VAE using cancer transcriptomics data. Here, two main scenarios are presented, where we use our SDG pipeline to study different cancer types, addressing data scarcity limitations effectively. First, we present the case of Medulloblastoma, a rare, childhood brain tumor traditionally classified into four molecular subgroups, where we provide evidence supporting the existence of an additional subgroup with distinct molecular features. Additionally, we apply explainability techniques to the VAE, uncovering key relationships between gene expression and disease subgroups. Second, we tackle cancer's dynamic nature to link the most similar patients and leverage our SDG pipeline to direct the process of data generation along a trajectory between patients at different stages of the disease. Our pipeline generates stage-separable patients, revealing actionable molecular insights at intermediate reconstructed steps. These studies demonstrate the potential of synthetic data generation in highly specific contexts, shed light on the temporal aspects of cancer, and advance our understanding of the underlying biological mechanisms.
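
One simple way to picture generation "along a trajectory between patients" is interpolation in the latent space of a trained VAE. The Python sketch below produces evenly spaced latent points between two encoded patients; each point would then be passed to the decoder. The linear path, the function name, and the example vectors are assumptions for illustration, since the abstract does not state how the trajectory is constructed.

```python
def latent_trajectory(z_start, z_end, steps=5):
    """Linearly interpolate between two latent vectors, endpoints included."""
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [
        [a + (b - a) * i / (steps - 1) for a, b in zip(z_start, z_end)]
        for i in range(steps)
    ]


# Hypothetical 2-D latent codes for an early-stage and a late-stage patient.
path = latent_trajectory([0.0, 2.0], [4.0, 0.0], steps=5)
```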

11:00-11:05
AutoPeptideML 2: An open source library for democratizing machine learning for peptide bioactivity prediction
Confirmed Presenter: Raúl Fernández-Díaz, IBM Research | UCD Conway Institute, Ireland

Room: 02N
Format: In person


Authors List: Show

  • Raúl Fernández-Díaz, IBM Research | UCD Conway Institute, Ireland
  • Thanh Lam Hoang, IBM Research Dublin, Ireland
  • Vanessa Lopez, IBM Research Dublin, Ireland
  • Denis Shields, University College Dublin, Ireland

Presentation Overview: Show

Peptides are a rapidly growing drug modality with diverse bioactivities and accessible synthesis, particularly for canonical peptides composed of the 20 standard amino acids. However, enhancing their pharmacological properties often requires chemical modifications, increasing synthesis cost and complexity. Consequently, most existing data and predictive models focus on canonical peptides. To accelerate the development of peptide drugs, there is a need for models that generalize from canonical to non-canonical peptides.

We present AutoPeptideML, an open-source, user-friendly machine learning platform designed to bridge this gap. It empowers experimental scientists to build custom predictive models without specialized computational knowledge, enabling active learning workflows that optimize experimental design and reduce sample requirements. AutoPeptideML introduces key innovations: (1) preprocessing pipelines for harmonizing diverse peptide formats (e.g., sequences, SMILES); (2) automated sampling of negative peptides with matched physicochemical properties; (3) robust test set selection with multiple similarity functions (via the Hestia-GOOD framework); (4) flexible model building with multiple representation and algorithm choices; (5) thorough model evaluation for unseen data at multiple similarity levels; and (6) FAIR-compliant, interpretable outputs to support reuse and sharing. A webserver with GUI enhances accessibility and interoperability.

We validated AutoPeptideML on 18 peptide bioactivity datasets and found that automated negative sampling and rigorous evaluation reduce overestimation of model performance, promoting user trust. A follow-up investigation also highlighted the current limitations in extrapolating from canonical to non-canonical peptides using existing representation methods.

AutoPeptideML is a powerful platform for democratizing machine learning in peptide research, facilitating integration with experimental workflows across academia and industry.
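
The idea behind innovation (2), sampling negative peptides whose properties match the positives, can be sketched in a few lines of Python. This example matches on sequence length only, as a stand-in for a fuller set of physicochemical properties; the function name, the one-to-one matching scheme, and the example peptides are hypothetical and do not reflect the AutoPeptideML API.

```python
import random
from collections import defaultdict


def sample_matched_negatives(positives, candidate_pool, seed=0):
    """For each positive peptide, draw one unused candidate of the same length.
    Candidates that are themselves positives are excluded."""
    rng = random.Random(seed)
    positive_set = set(positives)
    by_length = defaultdict(list)
    # Deduplicate and sort first so results are reproducible for a given seed.
    for pep in sorted(set(candidate_pool)):
        if pep not in positive_set:
            by_length[len(pep)].append(pep)
    for bucket in by_length.values():
        rng.shuffle(bucket)
    negatives = []
    for pep in positives:
        bucket = by_length[len(pep)]
        if bucket:  # positives with no remaining length match are skipped
            negatives.append(bucket.pop())
    return negatives
```

Matching the property distributions of positives and negatives keeps a model from learning a trivial shortcut, such as "short peptides are active", instead of the bioactivity itself.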

11:05-11:10
ENQUIRE automatically reconstructs, expands, and drives enrichment analysis of gene and MeSH co-occurrence networks from context-specific biomedical literature
Room: 02N
Format: In person


Authors List: Show

  • Luca Musella, Friedrich-Alexander-Universität Erlangen-Nürnberg and Uniklinikum Erlangen, Erlangen, Germany, Germany
  • Alejandro Afonso Castro, Friedrich-Alexander-Universität Erlangen-Nürnberg and Uniklinikum Erlangen, Erlangen, Germany, Germany
  • Xin Lai, Systems and Network Medicine Lab, Tampere University, Tampere, Finland, Finland
  • Max Widmann, University of Konstanz, Konstanz, Germany, Germany
  • Julio Vera, Friedrich-Alexander-Universität Erlangen-Nürnberg and Uniklinikum Erlangen, Erlangen, Germany, Germany

Presentation Overview: Show

The accelerating growth of scientific literature overwhelms our capacity to manually distil complex phenomena like molecular networks linked to diseases. Moreover, biases in biomedical research and database annotation limit our interpretation of facts and generation of hypotheses. ENQUIRE (Expanding Networks by Querying Unexpectedly Inter-Related Entities) offers a time- and resource-efficient alternative to manual literature curation and database mining. ENQUIRE reconstructs and expands co-occurrence networks of genes and biomedical ontologies from user-selected input corpora and network-inferred PubMed queries. Its modest resource usage and its integration of text mining, automatic querying, and network-based statistics that mitigate literature biases make ENQUIRE unique in its broad-scope applications. We benchmarked and illustrated ENQUIRE’s capabilities in several case scenarios and published the results earlier this year (Musella L. et al., 2025, PLoS Comput Biol). At ISMB/ECCB 2025, we showcase how ENQUIRE can support biomedical researchers, using melanoma resistance to immunotherapy as an example case study. The frameworks enabled by ENQUIRE include gene set reconstruction, pathway enrichment analysis, and knowledge graph annotation, which can ease literature annotation, boost hypothesis formulation, and facilitate the identification of molecular targets for subsequent experimentation.
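
A co-occurrence network of the kind described in this abstract can be pictured as a weighted edge list: two entities are linked when they are mentioned in the same article, and the weight counts how many articles mention both. The Python sketch below builds such an edge list from per-article entity sets. The function name, the `min_count` filter, and the toy corpus are assumptions for illustration; ENQUIRE's own text mining and network statistics are considerably more involved.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(article_entities, min_count=1):
    """Count, for every unordered entity pair, the number of articles
    in which both entities appear; keep pairs seen at least min_count times."""
    counts = Counter()
    for entities in article_entities:
        for a, b in combinations(sorted(set(entities)), 2):
            counts[(a, b)] += 1
    return {edge: n for edge, n in counts.items() if n >= min_count}


# Toy corpus: each set holds the genes and MeSH terms found in one abstract.
corpus = [
    {"BRAF", "PDCD1", "Melanoma"},
    {"BRAF", "Melanoma"},
    {"PDCD1", "CTLA4"},
]
edges = cooccurrence_edges(corpus, min_count=2)
```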

11:10-11:15
Automating Linear Motif Predictions to Map Human Signaling Networks
Room: 02N
Format: In person


Authors List: Show

  • Yitao Sun, McGill University, Canada
  • Yu Xia, McGill University, Canada
  • Jasmin Coulombe-Huntington, McGill University, Canada

Presentation Overview: Show

Short linear motifs (SLiMs) are critical mediators of transient protein-protein interactions (PPIs), yet only 0.2% of human SLiMs are experimentally verified. Their short length (3–11 residues), rapid evolution, and frequent location in intrinsically disordered regions make them difficult to systematically uncover using conventional approaches. We present an automated computational framework for proteome-wide SLiM discovery that integrates structural, evolutionary, and machine learning attributes to overcome current limitations.
Our method combines Gibbs sampling for de novo motif discovery with hidden Markov models (HMMs) that explicitly model insertions and deletions, enabling a more realistic representation of motif variation. To improve specificity, we incorporate four discriminative features: ProtT5-derived motif propensity scores, AlphaFold-based intrinsic disorder (pLDDT), solvent accessibility, and cross-species conservation from multiple sequence alignments. Together, these features enable robust motif characterization even in noisy biological contexts.
Biological relevance is ensured by targeting candidate regions from BioGRID-derived PPIs, clustering motifs via HMM similarity (HH-suite), and integrating known SLiM-binding domains. Our framework identified a novel MAPK1 (ERK2)-mediated phosphorylation motif in RUNX1, exhibiting high feature scores and validated via independent phosphoproteomic data. This site, previously biochemically characterized but not recognized as a SLiM, shows the power of our approach in identifying functional motifs missed by traditional tools.
Our database will address two gaps in current resources (e.g., ELM): probabilistic indel handling and biologically-informed motif classification. This work lays the foundation for systematic reconstruction of motif-mediated signaling networks and advances the discovery of novel regulatory mechanisms and therapeutic targets.
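
The profile-building and scoring step that a Gibbs motif sampler repeats at each iteration can be sketched as follows in Python. The code estimates per-column residue probabilities from aligned motif instances (with pseudocounts) and then finds the best-scoring window in a sequence by log-odds against a uniform background. It is a deliberately reduced illustration: there is no sampling loop and no insertion or deletion handling as in the HMMs described above, and the function names, pseudocount, uniform background, and example sequences are all assumptions.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(instances, pseudocount=1.0):
    """Column-wise residue probabilities from equal-length motif instances.
    Only the 20 standard amino acids are supported."""
    width = len(instances[0])
    if any(len(seq) != width for seq in instances):
        raise ValueError("all motif instances must have the same length")
    profile = []
    for col in range(width):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in instances:
            counts[seq[col]] += 1
        total = sum(counts.values())
        profile.append({aa: c / total for aa, c in counts.items()})
    return profile


def log_odds(window, profile, background=1 / 20):
    """Sum of per-position log(probability / background) for one window."""
    return sum(math.log(col[aa] / background) for aa, col in zip(window, profile))


def best_window(sequence, profile):
    """Start index of the highest-scoring window of profile width."""
    width = len(profile)
    scores = [
        (log_odds(sequence[i:i + width], profile), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scores)[1]
```

In a full sampler, one sequence's motif position would be resampled from these window scores while the profile is rebuilt from the remaining sequences.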

11:15-11:20
TCRBench: A Unified Benchmark for TCR–Antigen Binding Prediction and Clustering
Room: 02N
Format: In person


Authors List: Show

  • Muhammed Hunaid Topiwala, Biodesign Institute, Arizona State University, United States
  • Pengfei Zhang, Biodesign Institute, Arizona State University, United States
  • Heewook Lee, Biodesign Institute, Arizona State University, United States

Presentation Overview: Show

T-cell receptor (TCR) recognition of antigenic peptides presented by major histocompatibility complex (MHC) molecules is central to adaptive immunity, driving pathogen-specific responses and informing therapeutic vaccine development. Computational tasks such as predicting TCR-antigen binding affinity (NetTCR, Montemurro et al., 2021; ImRex, Moris et al., 2021) and clustering TCR sequences by epitope specificity (GLIPH, Glanville et al., 2017; TCRdist, Dash et al., 2017) have emerged as key challenges to decoding immune specificity. While recent models leveraging convolutional neural networks, transformers (e.g., ATM-TCR, Xu et al., 2021), and multimodal embeddings (ERGO, Springer et al., 2020; TCRMatch, Chronister et al., 2021) have significantly advanced performance, fragmented datasets and inconsistent evaluation methods have limited direct model comparisons and generalization. We propose a unified benchmark dataset integrating rigorously curated TCR sequences from human, mouse, and macaque responses to major pathogens (Influenza A, CMV, EBV, SARS-CoV-2) sourced from comprehensive databases such as VDJdb (Shugay et al., 2018) and IEDB (Vita et al., 2019). The benchmark incorporates standardized evaluation splits, structural representations enabled by AlphaFold2 predictions (Jumper et al., 2021), and robust evaluation metrics to ensure fair, reproducible comparisons. By consolidating disparate data and evaluation practices, our benchmark provides clarity on current progress, facilitating future innovation in computational TCR-antigen interaction modeling.

11:20-11:35
Fold first, ask later: structure-informed function prediction in Pseudomonas phages
Room: 02N
Format: In person


Authors List: Show

  • Hannelore Longin, Computational Systems Biology, KU Leuven, Belgium; Laboratory of Gene Technology, KU Leuven, Belgium, Belgium
  • George Bouras, Adelaide Medical School, the University of Adelaide, Australia, Australia
  • Susanna Grigson, Flinders Accelerator for Microbiome Exploration, Flinders University, Australia, Australia
  • Robert Edwards, Flinders Accelerator for Microbiome Exploration, Flinders University, Australia, Australia
  • Hanne Hendrix, Laboratory of Gene Technology, KU Leuven, Belgium, Belgium
  • Rob Lavigne, Laboratory of Gene Technology, KU Leuven, Belgium, Belgium
  • Vera van Noort, Computational Systems Biology, KU Leuven, Belgium, Belgium

Presentation Overview: Show

Phages, the viruses of bacteria, are the most abundant biological entities on Earth. In general, phage genomes are densely coded and contain many open reading frames, yet up to 70% encode proteins of unknown function. Despite clinical, biotechnological and fundamental interests in unravelling these proteins’ functions, phage proteins are absent from recent large-scale structure-based efforts (such as the AlphaFold database).

Here, we investigate the efficacy of structure-based protein annotation for Pseudomonas-infecting phages, comparing different post-processing strategies to obtain function annotations from FoldSeek output. Briefly, from 887 Pseudomonas-infecting phages, we collected every protein annotated as ‘hypothetical/phage protein’ in NCBI that is at least 100 amino acids long. These 38,025 proteins (31% of all proteins) were then clustered into 10,453 groups of homologs. Protein structures were predicted with ColabFold, and structural similarity to the PDB and the AlphaFold database was assessed with FoldSeek. Of all proteins, 59% displayed significant similarity to at least one structure in these databases. We benchmarked various strategies for extracting function from these FoldSeek hits, integrating different information resources, hit selection methods, and structure-based clustering of the hits. The resulting annotations were then compared with the state-of-the-art sequence- and structure-based phage annotation tools Pharokka and Phold.

On average, up to 42% of the phage proteins of unknown function could be annotated using structure-based methods, depending on the post-processing strategies applied. While caution is warranted when transferring protein annotations based on similarity, these methods can significantly speed up research into new antimicrobials and biotechnological applications inspired by nature’s finest bioengineers: phages.
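
One of the simplest hit selection strategies, taking the best-scoring significant hit per query, might look like the Python sketch below. It assumes tab-separated search output with the twelve BLAST-style columns listed in `FOLDSEEK_COLUMNS`, which is our understanding of the tool's default tabular layout; the column list, the E-value cutoff, and the function name are assumptions, and the abstract's benchmarked strategies are not reproduced here.

```python
import csv
import io

# Assumed column order of the tab-separated search output.
FOLDSEEK_COLUMNS = [
    "query", "target", "fident", "alnlen", "mismatch", "gapopen",
    "qstart", "qend", "tstart", "tend", "evalue", "bits",
]


def best_hits(tsv_text, max_evalue=1e-3):
    """Map each query to its highest-bit-score target among hits
    with an E-value at or below max_evalue."""
    best = {}
    reader = csv.DictReader(
        io.StringIO(tsv_text), fieldnames=FOLDSEEK_COLUMNS, delimiter="\t"
    )
    for row in reader:
        evalue, bits = float(row["evalue"]), float(row["bits"])
        if evalue > max_evalue:
            continue
        current = best.get(row["query"])
        if current is None or bits > current[1]:
            best[row["query"]] = (row["target"], bits)
    return {query: target for query, (target, _bits) in best.items()}
```

Function labels would then be transferred from the selected target's annotation, with the caution about similarity-based transfer noted in the abstract.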

11:35-11:50
Exploring capabilities of protein language models for cryptic binding site prediction
Confirmed Presenter: Vít Škrhák, Charles University, Czechia

Room: 02N
Format: In person


Authors List: Show

  • Vít Škrhák, Charles University, Czechia
  • David Hoksza, Charles University, Czechia

Presentation Overview: Show

Identifying protein-ligand binding sites is essential for understanding biological mechanisms and supporting drug discovery. However, accurate prediction remains challenging - particularly in the case of cryptic binding sites (CBSs), which require significant conformational changes to form upon ligand binding. Structure-based prediction methods typically rely on a specific conformation (apo vs. holo), making them less effective for identifying CBSs. A promising alternative is the use of sequence-based approaches, enabled by the emergence of protein language models (pLMs).

In this work, we explored the capabilities of various pLMs for predicting CBSs. As a baseline, we created a simple model trained using transfer learning. We then experimented with several fine-tuning strategies to further improve performance. Specifically, we applied multitask learning - not only to predict whether a residue is part of a CBS, but also to estimate its flexibility. This additional task enhanced the model’s awareness of protein dynamics, which is critical for accurate CBS identification. Our primary data source is the recently published CryptoBench dataset, which contains annotations of cryptic sites, although additional data sources were also considered. The combination of novel fine-tuning strategies and various training data improved performance across all key metrics, including a gain of over 2% in AUC.

To better understand model limitations, we also conducted an analysis of common prediction errors. Finally, we introduced a simple post-processing method designed to refine and smooth the model’s outputs.
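
The abstract does not say which post-processing method was used. As one plausible example of smoothing per-residue outputs, the Python sketch below applies a centered moving average along the sequence, so that isolated high or low scores are pulled toward their neighbors. The window size, the function name, and the choice of a moving average are all assumptions.

```python
def smooth_scores(scores, window=3):
    """Centered moving average over per-residue scores.
    Windows are truncated at the sequence ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed
```

Smoothing along the sequence only approximates spatial proximity; a structure-aware variant would average over residues that are close in 3D instead.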

11:50-11:55
Coarse-grained and Multi-Scale Modeling of Lytic Polysaccharide Monooxygenases: Insights into Family-Specific Dynamics and Protein Frustration
Room: 02N
Format: In person


Authors List: Show

  • Nisha Nandhini Shankar, SASTRA Deemed to be University, India
  • Ragothaman M Yennamalli, SASTRA Deemed to be University, India

Presentation Overview: Show

Lytic polysaccharide monooxygenases (LPMOs) are copper-dependent redox enzymes that catalyze the oxidative cleavage of C1 and/or C4 bonds in recalcitrant polysaccharides, playing a vital role in biomass conversion. The CAZy database classifies LPMOs into eight families (AA9, AA10, AA11, AA13, AA14, AA15, AA16, and AA17). These families are diverse in both structure and catalytic features. This study analyzes the structure, dynamics, and energetic landscapes of LPMO families using FrustratometeR, SignDy, and multiscale modeling approaches. FrustratometeR quantifies configurational and mutational frustration, identifying energetically unfavorable interactions. AA9 exhibited high local frustration in the residue range 100-230, while AA10 showed a more stable profile. SignDy was employed to explore slow collective motions, revealing significant conformational changes in AA9 linked to enzymatic adaptability, with the first six modes indicating notable flexibility. In contrast, AA10 displayed lower mobility in its first three modes, suggesting greater rigidity and substrate specificity. Protein models from AlphaFold2 were used for proteins with missing residues. These models were prepared and subjected to 100 ns all-atom molecular dynamics simulations using the OPLS-AA/L force field. The increase in RMSD over the course of the simulation indicates conformational changes. RMSF and energy analyses revealed flexible regions consistent with mode analysis, with average potential energies stabilizing at -6.25×10^5 kJ/mol. The radius of gyration (Rg) remained stable at around 1.65-1.75 nm. Analysis of the coarse-grained Gō model simulations, run using SMOG for 200 million steps, will provide further insights into the folding and long-range dynamic behavior of these enzymes.
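
For reference, the radius of gyration reported above is the mass-weighted root-mean-square distance of the atoms from their center of mass. The Python sketch below computes it for a single frame of coordinates; the function name and the option to omit masses are illustrative choices, and in practice the value would come from the trajectory analysis tools of the simulation package.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration for a list of (x, y, z) coordinates.
    With masses omitted, every atom is given unit mass."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    center = [
        sum(m * c[d] for m, c in zip(masses, coords)) / total for d in range(3)
    ]
    weighted_sq = sum(
        m * sum((c[d] - center[d]) ** 2 for d in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)
```

A stable Rg over a trajectory, as reported here, indicates that the protein stays similarly compact throughout the simulation.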

11:55-12:00
Identification and structural modeling of the novel TTC33-associated core (TANC) complex involved in DNA damage response
Confirmed Presenter: Małgorzata Drabko, Laboratory of RNA Processing and Decay, Institute of Biochemistry and Biophysics PAS, Poland

Room: 02N
Format: In person


Authors List: Show

  • Małgorzata Drabko, Laboratory of RNA Processing and Decay, Institute of Biochemistry and Biophysics PAS, Poland
  • Rafał Tomecki, Laboratory of RNA Processing and Decay, Institute of Biochemistry and Biophysics PAS, Poland
  • Małgorzata Siek, Laboratory of RNA Processing and Decay, Institute of Biochemistry and Biophysics PAS, Poland
  • Aneta Jurkiewicz, Laboratory of RNA Biology, Institute of Biochemistry and Biophysics PAS, Poland
  • Maria Matykiewicz, Laboratory of RNA Processing and Decay, Institute of Biochemistry and Biophysics PAS, Poland
  • Miłosz Ludwinek, Laboratory of RNA Processing and Decay, Institute of Biochemistry and Biophysics PAS, Poland
  • Kamil Kobyłecki, Laboratory of RNA Processing and Decay, Institute of Biochemistry and Biophysics PAS, Poland
  • Dominik Cysewski, Mass Spectrometry Laboratory, Institute of Biochemistry and Biophysics PAS, Poland
  • Agata Malinowska, Mass Spectrometry Laboratory, Institute of Biochemistry and Biophysics PAS, Poland
  • Magdalena Bakun, Mass Spectrometry Laboratory, Institute of Biochemistry and Biophysics PAS, Poland
  • Krzysztof Goryca, Genomic Core Facility, Centre of New Technologies, University of Warsaw, 02-097, Warsaw, Poland, Poland
  • Łukasz S. Borowski, Laboratory of RNA Biology, Institute of Biochemistry and Biophysics PAS, Poland
  • Roman J. Szczęsny, Laboratory of RNA Biology, Institute of Biochemistry and Biophysics PAS, Poland
  • Rafał Płoski, Department of Medical Genetics, Medical University of Warsaw, Poland
  • Agnieszka Tudek, Laboratory of RNA Processing and Decay, Institute of Biochemistry and Biophysics PAS, Poland

Presentation Overview: Show

Of the ~20,200 human proteins, ~9% remain functionally uncharacterized, highlighting a gap in our understanding of cell physiology. Structural proteins without enzymatic activity are particularly difficult to study. Here, we applied a “function by proximity” approach to TTC33, a nuclear structural tetratricopeptide repeat (TPR) protein conserved in bony vertebrates. Using comparative label-free mass spectrometry, we identified the TTC33-associated network (TAN), which includes WDR61, CCDC97, UNG, PP2A-B55α, PHF5A, and the SF3B subcomplex of U2. At the core of TAN is a novel trimeric complex (TANC) formed by TTC33:WDR61:PHF5A, as supported by co-purification and size exclusion chromatography. Structural predictions from AlphaFold 3 and their experimental validation showed that WDR61 and PHF5A bind opposite sides of TTC33’s TPR4, while TPR1-3 recruit other TAN factors. To expand the structural model, we employed molecular dynamics to identify the most stable amino acid contact pairs between complex subunits. Although TTC33 forms a complex with WDR61 and PHF5A, both of which are involved in RNA metabolism, our RNA-seq assays revealed only a subtle impact on mRNA levels and splicing patterns. In contrast, TTC33 appears more involved in DNA repair through interaction with UNG1/2. TTC33 loss led to increased DNA double-strand breaks, a phenotype previously associated with UNG1/2 knock-down. We showed that TTC33 protein levels are regulated in vivo, and that changes in TTC33 abundance reduced cellular proliferation rate and resistance to hydrogen peroxide. Moreover, the depletion or loss of either TTC33 or CCDC97 induced redistribution of p53-S15P, a marker of DNA damage.

12:05-12:10
Functional Interfaces at Ordered–Disordered Transitions: Conserved Linear Motifs and Flanking Regions in Modular Proteins
Confirmed Presenter: Carla Luciana Padilla Franzotti, Department of Science and Technology, National University of Quilmes, Argentina

Room: 02N
Format: In person


Authors List: Show

  • Carla Luciana Padilla Franzotti, Department of Science and Technology, National University of Quilmes, Argentina
  • Nicolas Palopoli, Department of Science and Technology, National University of Quilmes, Argentina
  • Gustavo Pierdominici-Sottile, Department of Science and Technology, National University of Quilmes, Argentina
  • Miguel Andrade, Computational Biology and Data Mining Research Group, Johannes-Gutenberg University, Mainz, Germany

Presentation Overview: Show

Multidomain proteins integrate ordered domains, structured tandem repeats (STRs), and intrinsically disordered regions (IDRs) to generate modular architectures optimized for dynamic and specific protein-protein interactions. In this study, we analyze the role of short linear motifs (SLiMs) located at the interface between ordered and disordered segments, focusing on their contribution to structural connectivity and interaction regulation.
Two model systems are examined: (1) the large T antigen from simian virus 40 (LTSV40), in which the LxCxE motif—positioned at the junction between a folded domain and an IDR—mediates binding to the retinoblastoma protein (pRb), and (2) the regulatory complex between protein phosphatase 1 delta (PP1δ) and its MYPT1 subunit, where ankyrin repeats (ANKs) are connected to DOC-type docking motifs through an intervening IDR. In both cases, the regions flanking the SLiMs exhibit high sequence conservation and specific biophysical properties, consistent with a modulatory role.
Molecular dynamics simulations demonstrate that these flanking regions promote extended conformations upon complex formation, facilitating physical occlusion of critical interaction interfaces (such as the E2F-binding pocket in pRb) without requiring large-scale allosteric rearrangements. In the PP1-MYPT1 complex, ANK repeats and IDRs exhibit cooperative behavior that contributes to the stabilization of the bound conformation and enhances interaction specificity.
These findings support the existence of a conserved ordered–motif–disordered architectural module recurrently employed in both viral and cellular regulatory systems. This topological arrangement constitutes a potential target for therapeutic intervention in diseases involving aberrant protein-protein interactions mediated by SLiMs at ordered–disordered interfaces.

12:10-12:15
Automating Linear Motif Predictions to Map Human Signaling Networks
Room: 02N
Format: In person


Authors List: Show

  • Yitao Sun, McGill University, Canada
  • Yu Xia, McGill University, Canada
  • Jasmin Coulombe-Huntington, McGill University, Canada

Presentation Overview: Show

Short linear motifs (SLiMs) are critical mediators of transient protein-protein interactions (PPIs), yet only 0.2% of human SLiMs are experimentally verified. Their short length (3–11 residues), rapid evolution, and frequent location in intrinsically disordered regions make them difficult to systematically uncover using conventional approaches. We present an automated computational framework for proteome-wide SLiM discovery that integrates structural, evolutionary, and machine learning attributes to overcome current limitations.
Our method combines Gibbs sampling for de novo motif discovery with hidden Markov models (HMMs) that explicitly model insertions and deletions, enabling a more realistic representation of motif variation. To improve specificity, we incorporate four discriminative features: ProtT5-derived motif propensity scores, AlphaFold-based intrinsic disorder (pLDDT), solvent accessibility, and cross-species conservation from multiple sequence alignments. Together, these features enable robust motif characterization even in noisy biological contexts.
Biological relevance is ensured by targeting candidate regions from BioGRID-derived PPIs, clustering motifs by HMM similarity (HH-suite), and integrating known SLiM-binding domains. Our framework identified a novel MAPK1 (ERK2)-mediated phosphorylation motif in RUNX1, exhibiting high feature scores and validated via independent phosphoproteomic data. This site, previously biochemically characterized but not recognized as a SLiM, shows the power of our approach in identifying functional motifs missed by traditional tools.
Our database will address two gaps in current resources (e.g., ELM): probabilistic indel handling and biologically-informed motif classification. This work lays the foundation for systematic reconstruction of motif-mediated signaling networks and advances the discovery of novel regulatory mechanisms and therapeutic targets.
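The Gibbs sampling step named in the overview above can be illustrated with a minimal sketch. This is not the authors' implementation: it omits the HMM indel modelling and the four discriminative features, and the function names (`build_profile`, `kmer_prob`, `gibbs_motif_search`) and toy sequences are hypothetical. It assumes at least two sequences, each at least `width` residues long and made up only of the 20 standard amino acids.

```python
import random
from collections import Counter

AMINO = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed alphabet)

def build_profile(kmers, pseudocount=1.0):
    """Column-wise residue frequencies with pseudocounts."""
    width = len(kmers[0])
    total = len(kmers) + pseudocount * len(AMINO)
    profile = []
    for col in range(width):
        counts = Counter(k[col] for k in kmers)
        profile.append({a: (counts[a] + pseudocount) / total for a in AMINO})
    return profile

def kmer_prob(kmer, profile):
    """Probability of a k-mer under an ungapped position-specific profile."""
    p = 1.0
    for col, residue in enumerate(kmer):
        p *= profile[col][residue]
    return p

def gibbs_motif_search(seqs, width, n_iter=500, seed=0):
    """Return one sampled motif start position per sequence."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(n_iter):
        i = rng.randrange(len(seqs))  # sequence to resample
        # Build the profile from every sequence except the held-out one.
        others = [s[st:st + width]
                  for j, (s, st) in enumerate(zip(seqs, starts)) if j != i]
        profile = build_profile(others)
        # Sample a new start in proportion to its probability under the profile.
        n_pos = len(seqs[i]) - width + 1
        weights = [kmer_prob(seqs[i][p:p + width], profile) for p in range(n_pos)]
        starts[i] = rng.choices(range(n_pos), weights=weights)[0]
    return starts

# Toy usage: three made-up sequences, motif width 4.
toy = ["ACDEFGHIKLPKRLMN", "MNPQRSTPKRLVWYAC", "GHIKPKRLLMNPQRST"]
toy_starts = gibbs_motif_search(toy, width=4, n_iter=200, seed=1)
```

Because the sampler is stochastic, a practical run would typically use several restarts and keep the highest-scoring alignment; the sketch returns only the final sampled state.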

12:15-13:15
Lunch Break and Poster Session
Room: 02N
Format: In person



13:15-14:15
Invited Presentation: A New Bioinformatics Era: State of Multi-Omics Data Integration
Room: 02N
Format: In person



14:15-14:30
Heterogeneous Graph Attention Network Improves Cancer Multiomics Integration

Room: 02N
Format: In person


Authors List: Show

  • Sina Tabakhi, School of Computer Science, The University of Sheffield, Sheffield, UK, United Kingdom
  • Charlotte Vandermeulen, School of Biosciences, The University of Sheffield, Sheffield, UK, United Kingdom
  • Ian Sudbery, School of Biosciences, The University of Sheffield, Sheffield, UK, United Kingdom
  • Haiping Lu, School of Computer Science, The University of Sheffield, Sheffield, UK, United Kingdom

Presentation Overview: Show

The increase in high-dimensional multiomics data demands advanced integration models to capture the complexity of human diseases. Graph-based deep learning integration models, despite their promise, struggle with small patient cohorts and high-dimensional features, often applying independent feature selection without modeling relationships among omics. Furthermore, conventional graph-based omics models focus on homogeneous graphs, lacking multiple types of nodes and edges to capture diverse structures. We introduce a Heterogeneous Graph ATtention network for omics integration (HeteroGATomics) to improve cancer diagnosis. HeteroGATomics performs joint feature selection through a multi-agent system, creating dedicated networks of feature and patient similarity for each omic modality. These networks are then combined into one heterogeneous graph for learning holistic omic-specific representations and integrating predictions across modalities. Experiments on three cancer multiomics datasets demonstrate HeteroGATomics' superior performance in cancer diagnosis. Moreover, HeteroGATomics enhances interpretability by identifying important biomarkers contributing to the diagnosis outcomes.

14:30-14:35
Multilingual model improves zero-shot prediction of disease effects on proteins
Confirmed Presenter: Ruyi Chen, The University of Queensland, Australia

Room: 02N
Format: In person


Authors List: Show

  • Ruyi Chen, The University of Queensland, Australia
  • Nathan Palpant, The University of Queensland, Australia
  • Gabriel Foley, The University of Queensland, Australia
  • Mikael Boden, The University of Queensland, Australia

Presentation Overview: Show

Models for mutation effect prediction in coding sequences rely on sequence-, structure-, or homology-based features. Here, we introduce a novel method that combines a codon language model with a protein language model, providing a dual representation for evaluating effects of mutations on disease. By capturing contextual dependencies at both the genetic and protein level, our approach achieves a 3% increase in ROC-AUC when classifying disease effects for 137,350 ClinVar missense variants across 13,791 genes, outperforming two single-sequence-based language models. Notably, the codon language model can uniquely differentiate synonymous from nonsense mutations at the genomic level. Our strategy uses information at complementary biological scales (akin to human multilingual models) to enable protein fitness landscape modeling and evolutionary studies, with potential applications in precision medicine, protein engineering, and genomics.
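For readers unfamiliar with the ROC-AUC metric reported above, the following minimal sketch computes it from its pairwise definition: the fraction of (pathogenic, benign) variant pairs in which the pathogenic variant receives the higher score. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for pathogenic, 0 for benign (assumed encoding).
    scores: higher means "more likely pathogenic". Ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy usage: 3 of the 4 (pathogenic, benign) pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

The pairwise loop is quadratic in the number of variants; for a dataset the size of the one described, a rank-based implementation would be the usual choice.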

14:35-14:50
Integrated analysis of bulk and single-nuclei RNA sequencing data of primary and metastatic pediatric Medulloblastoma.
Room: 02N
Format: In person


Authors List: Show

  • Ana Isabel Castillo Orozco, Research Institute of the McGill University Health Center, Canada
  • Geoffroy Danieau, Research Institute of the McGill University Health Center, Canada
  • Livia Garzia, Research Institute of the McGill University Health Center, Canada

Presentation Overview: Show

Medulloblastoma (MB) is a highly aggressive tumor and the most common brain tumor in childhood. MB presents high intertumoral heterogeneity, with at least four molecular subgroups identified (SHH, WNT, Group 3, and Group 4). Metastatic MB, or Leptomeningeal Disease (LMD), is predominantly found in the MB Group 3 type. Although LMD represents a main clinical challenge, its molecular mechanisms remain poorly characterized. Recent research has shown that primary MB and its metastases diverge dramatically. Our work has focused on establishing therapy-naïve Group 3 patient-derived xenograft (PDX) models of primary and metastatic Medulloblastoma to conduct transcriptomic profiling at the bulk and single-nuclei RNAseq levels and identify genetic drivers/pathways that sustain the leptomeningeal disease compartment. Our results show various signaling pathways enriched across LMD models, such as MYC targets, unfolded protein response, and fatty acid metabolism. Using single-sample GSEA (ssGSEA) and deconvolution approaches, we have also identified that our PDX models retain neoplastic subpopulations previously identified in MB single-cell sequencing studies. Similarly, we have identified slight differences in cell subpopulation proportions between primary and leptomeningeal compartments. Our single-nuclei studies have confirmed these results and the differentially expressed genes previously found in bulk RNAseq analyses. These results suggest the presence of cell populations enriched in the metastatic compartment with an aberrant transcription phenotype and metabolic adaptations to survive in the leptomeningeal space. Our recent findings suggest that LMD should be treated differently from primary brain tumors and that the identified metabolic pathways may be potential targets for therapeutics to treat or prevent this devastating disease.

14:50-14:55
Investigating novel transcriptional regulators in symbiotic nodule development of Medicago truncatula
Confirmed Presenter: Sara Eslami, Department of Molecular and Cell Biology, Kosar Bojnord University, Bojnourd, Iran, Iran

Room: 02N
Format: In person


Authors List: Show

  • Sara Eslami, Department of Molecular and Cell Biology, Kosar Bojnord University, Bojnourd, Iran, Iran
  • Mahboobeh Azarakhsh, Department of Molecular and Cell Biology, Kosar Bojnord University, Bojnourd, Iran, Iran

Presentation Overview: Show

Biological nitrogen fixation is a crucial process for sustainable agriculture, allowing leguminous plants to convert atmospheric nitrogen into bioavailable forms through a symbiotic relationship with rhizobia. This interaction results in the formation of specialized root structures called nodules, where nitrogen fixation takes place. A deeper understanding of the molecular mechanisms governing nodule formation is essential for enhancing plant-microbe interactions and improving agricultural productivity.
In this study, we investigate key transcription factors (TFs) involved in the nodulation process of Medicago truncatula, including MtIPD3, MtNSP1, MtNSP2, MtNIN, and MtERNs. Using co-expression analysis (Phytozome database) and interaction network studies (STRING database), we identify novel regulatory elements that potentially play a role in nodule organogenesis. Our findings suggest a strong interaction between IPD3 and splicing factors, implicating its involvement in RNA processing and cell cycle regulation during nodule formation. Additionally, we identify the cytokinin transporter gene ABCG38 as significantly upregulated in nodules, suggesting its role in cytokinin-mediated regulation. Moreover, our analysis indicates that the auxin response factor Medtr2g043250 is a likely transcriptional target of NIN, highlighting a possible cross-talk between auxin and cytokinin signaling in nodulation.
These insights contribute to a deeper understanding of the transcriptional and hormonal regulation of nodule development, offering potential strategies for enhancing biological nitrogen fixation in legumes.

14:55-15:00
Meta-Analysis of Bovine Transcriptome Reveals Key Immune Gene Profiles and Signaling Pathways
Room: 02N
Format: In person


Authors List: Show

  • Vennila Kanchana Devi Marimuthu, SASTRA deemed to be university, India
  • Kishore Matheswaran, SASTRA deemed to be university, India
  • Menaka Thambiraja, SASTRA deemed to be university, India
  • Ragothaman M Yennamalli, SASTRA deemed to be university, India

Presentation Overview: Show

Understanding immune mechanisms in cattle is crucial for improving disease resistance through informed breeding decisions and development. Meta-analysis is a powerful approach for integrating findings from multiple transcriptomic studies, uncovering significant gene expression patterns across experimental conditions and increasing statistical power. In this study, we conducted a meta-analysis of four bovine transcriptomic datasets (GSE45439, GSE62048, GSE125964, and GSE247921) to identify immune-related differentially expressed genes (DEGs) in Bos taurus. These datasets encompassed a range of immune-challenging conditions, including infections caused by Mycobacterium bovis and Mycobacterium avium subsp. paratuberculosis, comparing transcriptomic profiles between diseased and healthy cattle. We implemented a comprehensive transcriptome analysis pipeline involving FastQC, Trimmomatic, Bowtie2, SAMtools, featureCounts, DESeq2, and metaRNASeq, which resulted in the identification of 28 significant DEGs, comprising 12 upregulated and 16 downregulated genes. Comparison with an innate immune gene database revealed five immune-related genes, including IL1A, RGS2, RCAN1, and ZBP1, known to play important regulatory roles in immune responses. KEGG pathway enrichment analysis showed that these genes were involved in four critical immune-related pathways: Necroptosis, Osteoclast Differentiation, Oxytocin Signaling, and cGMP–PKG Signaling. These pathways are associated with various immune functions, including inflammatory cell death, cytokine signaling, immune cell differentiation, and leukocyte trafficking. Overall, this meta-analysis provides a deeper understanding of conserved immune signaling mechanisms in cattle and highlights key genes that could serve as biomarkers for immune competence, disease susceptibility, or vaccine responsiveness. The findings offer valuable insights for future functional studies and applications in bovine immunogenomics.
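As an illustration of how per-study evidence can be pooled in a meta-analysis like the one above, here is a minimal sketch of Fisher's p-value combination, one common technique for this purpose. The overview does not state which combination method was used, so this is an assumption for illustration only, and the function names are hypothetical. For k independent p-values, the statistic -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom under the null hypothesis.

```python
import math

def chi2_sf_even_df(x, df):
    """Chi-square survival function, closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # sf(x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def fisher_combine(pvalues):
    """Combine independent p-values (each in (0, 1]) with Fisher's method."""
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return chi2_sf_even_df(stat, 2 * len(pvalues))

# Toy usage: one gene tested in two independent studies.
combined = fisher_combine([0.05, 0.05])  # about 0.0175
```

Combined p-values computed gene by gene would still need multiple-testing correction before a list of significant DEGs is called.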

15:00-15:05
Post-translational regulation of stemness under DNA damage response contributes to the gingivobuccal oral squamous cell carcinoma relapse and progression
Confirmed Presenter: Sachendra Kumar, Indian Institute of Science, India

Room: 02N
Format: In person


Authors List: Show

  • Sachendra Kumar, Indian Institute of Science, India
  • Annapoorni Rangarajan, Indian Institute of Science, India
  • Debnath Pal, Indian Institute of Science, India

Presentation Overview: Show

Tobacco consumption (smoked and, particularly, smokeless forms) contributes to a high prevalence of gingivobuccal oral squamous cell carcinoma (OSCC-GB) in India. OSCC-GB patients exhibit high rates of locoregional relapse and therapeutic failure, often attributed to the involvement of cancer stem cells (CSCs). This study aims to leverage the generalizability of a machine learning prediction model for ‘Tumor Status’ to conduct a comparative somatic mutation analysis between ‘With Tumor’ (recurred/relapsed/progressed) and ‘Tumor Free’ (disease-free/complete remission) OSCC-GB patients. Our results revealed that support vector machines (SVM) classified the ‘Tumor Status’ classes with a mean accuracy of 89% based on clinical features. Furthermore, RNA-seq-based somatic mutation analysis using the classified groups revealed molecular mechanisms underlying tumor relapse and progression within OSCC-GB subgroups. The identified mutational signature (C>T mutations) linked to DNA damage suggests the role of tobacco-related carcinogens in OSCC-GB subgroups. The analysis of distinct somatic variants, functional impact predictions, protein-protein interactions, and survival highlights the involvement of DNA damage response (DDR)-related genes in the ‘With Tumor’ subgroup. This analysis particularly emphasizes the significant role of the Mitogen-activated protein kinase associated protein 1 (MAPKAP1) gene, a key player in the mTORC2 signaling pathway. The study suggests that loss-of-function in the identified MAPKAP1 somatic variant may promote stemness and elevate the risk of disease relapse and progression in ‘With Tumor’ OSCC-GB under DDR conditions, potentially contributing to higher mortality rates among Indian OSCC-GB patients.

15:05-15:10
Integrative Transcriptomic Profiling Reveals Novel lncRNA–circRNA–miRNA–mRNA Regulatory Networks in a Neurohormonal Model of Cardiac Hypertrophy
Room: 02N
Format: In person


Authors List: Show

  • Sebastián Urquiza-Zurich, Advanced Center for Chronic Diseases - ACCDiS, University of Chile, Chile., Chile
  • Sebastián Leiva-Navarrete, Advanced Center for Chronic Diseases - ACCDiS, University of Chile, Chile., Chile
  • Francisco Sigcho-Garrido, Advanced Center for Chronic Diseases - ACCDiS, University of Chile, Chile., Chile
  • Juan Pablo Silva, Advanced Center for Chronic Diseases - ACCDiS, University of Chile, Chile., Chile
  • Sergio Lavandero, Advanced Center for Chronic Diseases - ACCDiS, University of Chile, Chile., Chile
  • Vinicius Maracaja-Coutinho, Advanced Center for Chronic Diseases - ACCDiS, University of Chile, Chile., Chile

Presentation Overview: Show

Pathological cardiac hypertrophy (PCH) is a complex adaptive response to neurohormonal stimulation and hemodynamic stress, often preceding heart failure. While the contribution of protein-coding genes to this process has been extensively studied, the regulatory roles of non-coding RNAs (ncRNAs), particularly long non-coding RNAs (lncRNAs), circular RNAs (circRNAs) and microRNAs (miRNAs), remain incompletely understood. Here, we performed total and small RNA sequencing in neonatal rat ventricular myocytes (NRVMs) stimulated with norepinephrine (NE), modeling neurohormonal stress-induced hypertrophy. We implemented a de novo transcriptome assembly and ncRNA identification pipeline using coding potential tools such as FEELnc and CPC2. For circRNA and miRNA identification, we used CIRIquant and miRge3.0, respectively. Our analysis revealed widespread transcriptomic remodeling, identifying differentially expressed (DE) mRNAs, miRNAs, lncRNAs, and circRNAs. Notably, 196 lncRNAs were significantly upregulated, including novel transcripts such as MSTRG.1800, located near genes involved in inflammation (Frmd8). Intronic lncRNAs like MSTRG.6214 (Lats2) suggest roles in Hippo signaling. We also detected 18 upregulated circRNAs, including circ_Cdyl and circ_Pfkp, potentially acting as miRNA sponges. Downregulated miRNAs such as miR-708 and miR-511 negatively correlated with these ncRNAs, reinforcing a ceRNA-based regulatory model. To corroborate this, RNA–RNA interaction analysis supported miRNA binding to both lncRNAs and circRNAs; miRDB was used for miRNA–mRNA target identification, and network visualization in Cytoscape revealed modules of co-regulated RNAs linked to hypertrophic and metabolic pathways. Our integrative approach uncovers novel components of the lncRNA–circRNA–miRNA–mRNA axis in a rat model of PCH and provides a systems-level framework for understanding ncRNA regulation in cardiac disease.

15:10-15:55
Panel: Prof. Dame Janet Thornton
Room: 02N
Format: In person



15:55-16:05
Room: 02N
Format: In person

