ISMB/ECCB 2025 agenda
Date | Start Time | End Time | Room | Track | Title | Confirmed Presenter | Authors | Abstract | Format |
---|---|---|---|---|---|---|---|---|---|
2025-07-20 | 08:30:00 | 08:45:00 | 02N | Student Council Symposium | Student Council Symposium | ||||
2025-07-20 | 08:45:00 | 09:30:00 | 02N | Student Council Symposium | Segun Fatumo | ||||
2025-07-20 | 09:30:00 | 09:45:00 | 02N | Student Council Symposium | Nutri-omics: how omics investigation can help designing personalized nutrition research | Mirko Treccani | Federica Bergamo, Pedro Mena, Davide Martorana, Daniele Del Rio, Giovanni Malerba, Valeria Barili, Riccardo Bonadonna, Alessandra Dei Cas, Marco Ventura, Francesca Turroni, Letizia Bresciani, Mirko Treccani, Cristiano Negro, Alice Rosi, Cristina Del Burgo-Gutiérrez, Maria Sole Morandini, Nicola Luigi Bragazzi, Claudia Favari, José Fernando Rinaldi de Alvarenga, Lucia Ghiretti, Cristiana Mignogna | (Poly)phenols (PPs) are a group of bioactive compounds found in plant-based food and widely consumed in the diet. Several studies have reported the beneficial effects of PPs in preventing chronic diseases through a myriad of mechanisms of action. However, the bioavailability and effects of these compounds differ greatly across individuals, causing uneven physiological responses. To understand this inter-individual variability, we present a multi-omics investigation comprising genomics, metagenomics and metabolomics. We recruited 300 healthy individuals and collected biological samples (blood, urine, and faeces), anthropometric measurements, health status and lifestyle/dietary information. After identification by UPLC-IMS-HRMS and quantification by UPLC-QqQ-MS/MS, the large set of phenolic metabolites underwent dimensionality reduction and clustering to identify individuals with similar metabolic profiles (metabotypes), distinguishing high and low PP producers. Then, genomics and metagenomics investigations were performed to gain insights into inter-individual differences and unravel the potential pathophysiological impact of these molecules, with particular regard to cardiometabolic diseases. In detail, genome-wide association studies followed by computational functional analyses on genetic variants, and taxonomic and functional investigations of the gut microbiome were performed, showing hints of associations in genes and microbial species related to PP metabolism, together with previously unreported genetic associations. The genomic data were further investigated in terms of gene networks and computational functional analyses, identifying differentially expressed genes, gene set enrichments, candidate regulatory regions, interacting loci and chromatin states, and associations with metabolic traits and diseases. Overall, we demonstrated the benefits of omics research in nutrition, advancing the field of personalised nutrition and health. | |
2025-07-20 | 09:45:00 | 10:00:00 | 02N | Student Council Symposium | Nocardia Genomes are a Large Reservoir of Diverse Gene Content, Biosynthetic Gene Clusters, and Species-specific Genes | Kiran Kumar Eripogu | Kiran Kumar Eripogu, Wen-Hsiung Li | Nocardia, an opportunistic pathogenic bacterial genus, remains underexplored in terms of biosynthetic potential, gene content, and evolutionary history. By analyzing 263 genomes across 88 species, we found that Nocardia varies greatly in genome size and gene content. It exhibits an open pangenome, with a small core genome (< 900 genes), and high genomic fluidity (0.76), indicating high gene turnover. A large proportion (75%) of its genes are species-specific, indicating its high genomic plasticity and dynamic evolutionary adaptation. Average Nucleotide Identity (ANI) analysis confirmed taxonomic relationships among Nocardia species, with most exhibiting high between-species ANI values (80-85%). N. globerula showed a high ANI of ~84% with Rhodococcus erythropolis, strongly supporting its reclassification under Rhodococcus. The biosynthetic capabilities of the Nocardia genus are striking, with the presence of >8,000 biosynthetic gene clusters (BGCs), dominated by type 1 polyketide synthases, terpenes, and non-ribosomal peptide synthetases. This establishes Nocardia as the Actinomycetota genus that has the largest biosynthetic repertoire. Our study is the first to identify a prodigiosin BGC in Nocardia. Network analysis revealed complex evolutionary connections between Nocardia’s gene cluster families (GCFs) and MIBiG reference BGCs, suggesting evolutionary changes, including gene gains and losses, that may have influenced the genus’s BGC diversity and composition. Synteny analysis uncovered conserved and unique gene arrangements across Nocardia and related genera, mostly with core genes conserved in Actinomycetota. The findings from our study contribute to advancing microbial genomics, evolution, and biotechnology by uncovering the potential of Nocardia to address challenges in infectious diseases and natural product discovery. | |
2025-07-20 | 10:00:00 | 10:05:00 | 02N | Student Council Symposium | Lifting the veil on Challenging Medically Relevant Genes | Victor Grentzinger | Victor Grentzinger, Leonor Palmeira, Keith Durkin, Maria Artesi, Vincent Bours | While the cost of DNA sequencing has never been lower, a number of genetic diseases remain difficult to diagnose. Nearly 400 medically relevant genes are still challenging to characterize due to the complex nature of their sequence. This complexity can arise from a variety of factors, such as the existence of pseudogenes, large short tandem repeat regions, or variable number tandem repeat regions. As a result, access to reliable and cost-effective genetic tests is limited. To resolve this issue, we decided to focus on improving the characterization of the following genes by using long-read sequencing: PKD1/PKD2, responsible for Autosomal Dominant Polycystic Kidney Disease (ADPKD), and FLG, involved in Atopic Dermatitis. For the PKD genes, we amplified their sequence by long-range PCR before sequencing the products by Oxford Nanopore sequencing. We were able to retrieve all variants previously confirmed by Sanger sequencing in 34 samples with ADPKD. For FLG, while investigating the 23 publicly available PacBio HiFi datasets of the 1000 Genomes Project, we identified new undescribed alleles in African samples. To determine if these variations are population specific, we analyzed 1111 additional public samples with long-read data. We discovered 5 novel alleles, mostly from Sub-Saharan populations. We also investigated, in our cohort of public data, the MUC1 and SMN1/SMN2 genes, responsible respectively for Autosomal Dominant Tubulointerstitial Kidney Disease and Spinal Muscular Atrophy. Our next goal is to design cost-efficient techniques to improve the sequencing of these challenging medically relevant genes in a clinical setting. | |
2025-07-20 | 10:05:00 | 10:10:00 | 02N | Student Council Symposium | AccuRate: A Tool Supporting Genotype–Phenotype Analysis and Causal Mutation Discovery in Soybean | Alžbeta Rástocká | Alžbeta Rástocká, Jana Biová, Mária Škrabišová | Soybean is one of the world’s most significant crops, serving as an indispensable source of high-quality plant protein and oil for both human and livestock consumption. Advances in soybean research support genomics-assisted breeding, guiding the development of more resilient, nutritious, and high-yielding varieties. Soybean also possesses an extensive collection of genomic and phenotypic data, including a large database of phenotypic traits. This enables the creation of new strategies for analysing genotype-phenotype associations. While association studies are important for identifying genomic loci linked to phenotypic traits, pinpointing causal mutations remains a challenge due to many factors. Building on these resources, this study presents new algorithms for analysing, visualizing, and automatically categorizing quantitative and categorical phenotypes. Given that most functional mutations are biallelic, and that quantitative traits often arise from the combined effects of multiple genes, phenotype binarization provides a practical basis for further analysis. Since many traits exist on a spectrum, various categorization methods are applied to transform them into binary form. This step is essential for calculating an accuracy parameter that quantifies genotype-phenotype correlation and facilitates the identification of causal mutations. The algorithm AccuRate was tested on well-characterized genes influencing protein and oil content in soybean. Results confirmed its ability to identify genotype-phenotype correlations. Additionally, two candidate genes were analysed, and a causal mutation was confirmed in one of them (Glyma.06G205800), linked to flowering and maturation time. AccuRate is a promising tool for uncovering genotype-phenotype relationships in soybean and, after optimizing for high-throughput testing, may be extended to other crops. | |
2025-07-20 | 10:10:00 | 10:15:00 | 02N | Student Council Symposium | Early colorectal cancer detection with deep learning on ultra-shallow whole genome sequencing of cell-free DNA | Ritchie Yu | Ritchie Yu, Jasmin Coulombe-Huntington, Yu Xia | Early detection of cancer can mitigate adverse patient outcomes by reducing the time to intervention and treatment. Cell-free DNA (cfDNA) circulating in the bloodstream contains signatures of cancer which can be obtained and sequenced through liquid biopsy. Given a large collection of sequencing reads, features can be extracted and used to develop predictive models for patient cancer classification. However, current techniques for early cancer detection rely on tens of millions of sequencing reads, which can increase the cost of diagnosis. In our work, using whole genome sequencing data obtained from the Sequence Read Archive (SRA), we adapted convolutional neural networks to predict colorectal cancer. We found that the number of reads used by the model can be scaled down from approximately 60 million reads to 1 million reads. Our model achieved a classification performance of 0.902 AUC. This result suggests that the blood sample size required for liquid biopsy could be significantly reduced, thereby reducing the cost of diagnosis. Furthermore, through an ablation study, we showed that the fragment end distribution by itself produced a classification performance of 0.904 AUC. Meanwhile, relying only on fragment length distribution and end motif distribution produced 0.771 and 0.790 AUC, respectively. This suggests that fragment end distribution is a much more predictive feature for classification. In future work, we intend to incorporate fragment end features into transformer-based models to improve classification performance. | |
2025-07-20 | 10:15:00 | 10:30:00 | 02N | Student Council Symposium | DNA-DistilBERT: A small language model for non-coding variant effect prediction from human DNA sequences | Megha Hegde | Megha Hegde, Jean-Christophe Nebel, Farzana Rahman | Genetic variants have been associated with changes in disease risk. Historically, research has focused on coding variants; however, emerging research shows that non-coding variants also have strong links to disease causality, via transcription and gene regulation. Next-generation sequencing has exponentially increased genomic data availability, necessitating scalable computational approaches for accurate variant effect prediction. Transformer-based LLMs, such as BERT (Bidirectional Encoder Representations from Transformers), have achieved good results on coding variants; however, results on non-coding variants remain inconsistent. Moreover, the quadratic computational complexity of attention mechanisms with sequence length imposes substantial resource demands, restricting innovation in this area to a few institutions with high-end infrastructure. Arguably, BERT is the most successful of such architectures, as it excels in context-aware modelling of genomic sequences due to its bidirectional nature. However, to substantially decrease computational costs, we propose to use DistilBERT, which uses knowledge distillation during pretraining to reduce the number of model parameters. While small language models (SLMs) such as DistilBERT are established in natural language processing, they remain underexplored in genomics. Experiments show that, when pretrained on human reference genome sequences and fine-tuned for variant effect prediction, the SLM approach can match state-of-the-art LLMs such as DNABERT-2 in accuracy, while significantly reducing resource requirements. This innovative, energy-efficient approach not only makes variant effect prediction more scalable but also advances equitable research by enabling training on a single GPU, eliminating the need for high-performance computing. | |
2025-07-20 | 10:30:00 | 10:45:00 | 02N | Student Council Symposium | Generative AI for Childhood and Adult Cancer Research | Guillermo Prol Castelo | Guillermo Prol Castelo, Davide Cirillo, Alfonso Valencia | Cancer is one of the most common causes of death worldwide, and its complexity makes it especially challenging to study. Despite ongoing progress in cancer research, a significant challenge is the scarcity of detailed data on disease subgroups and stages. To overcome this problem, Generative AI techniques and, specifically, the Variational Autoencoder (VAE), have been widely used to handle high-dimensional data. We propose a robust, explainable Synthetic Data Generation (SDG) pipeline based on the VAE using cancer transcriptomics data. Here, two main scenarios are presented, where we use our SDG pipeline to study different cancer types, addressing data scarcity limitations effectively. First, we present the case of Medulloblastoma, a rare, childhood brain tumor traditionally classified into four molecular subgroups, where we provide evidence supporting the existence of an additional subgroup with distinct molecular features. Additionally, we apply explainability techniques to the VAE, uncovering key relationships between gene expression and disease subgroups. Second, we tackle cancer's dynamic nature to link the most similar patients and leverage our SDG pipeline to direct the process of data generation along a trajectory between patients at different stages of the disease. Our pipeline generates stage-separable patients, revealing actionable molecular insights at intermediate reconstructed steps. These studies demonstrate the potential of synthetic data generation in highly specific contexts, shed light on the temporal aspects of cancer, and advance our understanding of the underlying biological mechanisms. | |
2025-07-20 | 11:00:00 | 11:05:00 | 02N | Student Council Symposium | AutoPeptideML 2: An open source library for democratizing machine learning for peptide bioactivity prediction | Raúl Fernández-Díaz | Raúl Fernández-Díaz, Thanh Lam Hoang, Vanessa Lopez, Denis Shields | Peptides are a rapidly growing drug modality with diverse bioactivities and accessible synthesis, particularly for canonical peptides composed of the 20 standard amino acids. However, enhancing their pharmacological properties often requires chemical modifications, increasing synthesis cost and complexity. Consequently, most existing data and predictive models focus on canonical peptides. To accelerate the development of peptide drugs, there is a need for models that generalize from canonical to non-canonical peptides. We present AutoPeptideML, an open-source, user-friendly machine learning platform designed to bridge this gap. It empowers experimental scientists to build custom predictive models without specialized computational knowledge, enabling active learning workflows that optimize experimental design and reduce sample requirements. AutoPeptideML introduces key innovations: (1) preprocessing pipelines for harmonizing diverse peptide formats (e.g., sequences, SMILES); (2) automated sampling of negative peptides with matched physicochemical properties; (3) robust test set selection with multiple similarity functions (via the Hestia-GOOD framework); (4) flexible model building with multiple representation and algorithm choices; (5) thorough model evaluation for unseen data at multiple similarity levels; and (6) FAIR-compliant, interpretable outputs to support reuse and sharing. A web server with a GUI enhances accessibility and interoperability. We validated AutoPeptideML on 18 peptide bioactivity datasets and found that automated negative sampling and rigorous evaluation reduce overestimation of model performance, promoting user trust. A follow-up investigation also highlighted the current limitations in extrapolating from canonical to non-canonical peptides using existing representation methods. AutoPeptideML is a powerful platform for democratizing machine learning in peptide research, facilitating integration with experimental workflows across academia and industry. | |
2025-07-20 | 11:05:00 | 11:10:00 | 02N | Student Council Symposium | ENQUIRE automatically reconstructs, expands, and drives enrichment analysis of gene and MeSH co-occurrence networks from context-specific biomedical literature | Luca Musella | Luca Musella, Alejandro Afonso Castro, Xin Lai, Max Widmann, Julio Vera | The accelerating growth of scientific literature overwhelms our capacity to manually distil complex phenomena like molecular networks linked to diseases. Moreover, biases in biomedical research and database annotation limit our interpretation of facts and generation of hypotheses. ENQUIRE (Expanding Networks by Querying Unexpectedly Inter-Related Entities) offers a time- and resource-efficient alternative to manual literature curation and database mining. ENQUIRE reconstructs and expands co-occurrence networks of genes and biomedical ontologies from user-selected input corpora and network-inferred PubMed queries. Its modest resource usage and its integration of text mining, automatic querying, and network-based statistics that mitigate literature biases make ENQUIRE unique in its broad-scope applications. We benchmarked and illustrated ENQUIRE's capabilities in several case scenarios and published the results earlier this year (Musella L. et al., 2025, PLoS Comput Biol). At ISMB/ECCB 2025, we showcase how ENQUIRE can support biomedical researchers, using melanoma resistance to immunotherapy as an example case study. The frameworks enabled by ENQUIRE include gene set reconstruction, pathway enrichment analysis, and knowledge graph annotation, which can ease literature annotation, boost hypothesis formulation, and facilitate the identification of molecular targets for subsequent experimentation. | |
2025-07-20 | 11:10:00 | 11:15:00 | 02N | Student Council Symposium | Automating Linear Motif Predictions to Map Human Signaling Networks | Yitao (Eric) Sun | Yitao (Eric) Sun, Yu Xia, Jasmin Coulombe-Huntington | Short linear motifs (SLiMs) are critical mediators of transient protein-protein interactions (PPIs), yet only 0.2% of human SLiMs are experimentally verified. Their short length (3–11 residues), rapid evolution, and frequent location in intrinsically disordered regions make them difficult to systematically uncover using conventional approaches. We present an automated computational framework for proteome-wide SLiM discovery that integrates structural, evolutionary, and machine learning attributes to overcome limitations in current resources (e.g., MEME Suite, ELM). Our method combines Gibbs sampling for de novo motif discovery with hidden Markov models (HMMs) that explicitly model insertions and deletions, enabling a more realistic representation of motif variation. To improve specificity, we incorporate four discriminative features: ProtT5-derived motif propensity scores, AlphaFold-based intrinsic disorder (pLDDT), solvent accessibility, and cross-species conservation from multiple sequence alignments. Together, these features enable robust motif characterization even in noisy biological contexts. Biological relevance is ensured by searching the interactors of the SLiM-binding domain protein through BioGRID PPIs and by motif clustering via HMM similarity (HH-suite). Our framework identified a MAPK1 (ERK2)-mediated phosphorylation motif in RUNX1, which exhibited high feature scores and was validated via independent phosphoproteomic data. This site, previously biochemically characterized but not recognized as a SLiM, shows the power of our approach in identifying functional motifs missed by traditional tools. Our database allows biologists to browse through validated motifs alongside high-quality predictions. This work lays the foundation for systematic reconstruction of motif-mediated signaling networks and advances the discovery of novel regulatory mechanisms and therapeutic targets. | |
2025-07-20 | 11:15:00 | 11:20:00 | 02N | Student Council Symposium | TCRBench: A Unified Benchmark for TCR–Antigen Binding Prediction and Clustering | Muhammed Hunaid Topiwala | Muhammed Hunaid Topiwala, Pengfei Zhang, Heewook Lee | T-cell receptor (TCR) recognition of antigenic peptides presented by major histocompatibility complex (MHC) molecules is central to adaptive immunity, driving pathogen-specific responses and informing therapeutic vaccine development. Computational tasks such as predicting TCR-antigen binding affinity (NetTCR, Montemurro et al., 2021; ImRex, Moris et al., 2021) and clustering TCR sequences by epitope specificity (GLIPH, Glanville et al., 2017; TCRdist, Dash et al., 2017) have emerged as key challenges to decoding immune specificity. While recent models leveraging convolutional neural networks, transformers (e.g., ATM-TCR, Xu et al., 2021), and multimodal embeddings (ERGO, Springer et al., 2020; TCRMatch, Chronister et al., 2021) have significantly advanced performance, fragmented datasets and inconsistent evaluation methods have limited direct model comparisons and generalization. We propose a unified benchmark dataset integrating rigorously curated TCR sequences from human, mouse, and macaque responses to major pathogens (Influenza A, CMV, EBV, SARS-CoV-2) sourced from comprehensive databases such as VDJdb (Shugay et al., 2018) and IEDB (Vita et al., 2019). The benchmark incorporates standardized evaluation splits, structural representations enabled by AlphaFold2 predictions (Jumper et al., 2021), and robust evaluation metrics to ensure fair, reproducible comparisons. By consolidating disparate data and evaluation practices, our benchmark provides clarity on current progress, facilitating future innovation in computational TCR-antigen interaction modeling. | |
2025-07-20 | 11:20:00 | 11:35:00 | 02N | Student Council Symposium | Fold first, ask later: structure-informed function prediction in Pseudomonas phages | Hannelore Longin | Hannelore Longin, George Bouras, Susanna Grigson, Robert Edwards, Hanne Hendrix, Rob Lavigne, Vera van Noort | Phages, the viruses of bacteria, are the most abundant biological entities on earth. In general, phage genomes are densely coded and contain many open reading frames, yet up to 70% encode proteins of unknown function. Despite clinical, biotechnological and fundamental interests in unravelling these proteins’ functions, phage proteins are absent from recent large-scale structure-based efforts (such as the AlphaFold database). Here, we investigate the efficacy of structure-based protein annotation for Pseudomonas-infecting phages, comparing different post-processing strategies to obtain function annotations from FoldSeek output. Briefly, from 887 Pseudomonas-infecting phages, we collected every protein annotated as ‘hypothetical/phage protein’ in NCBI and at least 100 amino acids in length. These 38,025 proteins (31% of all proteins) were then clustered into 10,453 groups of homologs. Protein structures were predicted with ColabFold, and structural similarity to the PDB and the AlphaFold database was assessed with FoldSeek. Of all proteins, 59% displayed significant similarity to at least one structure in these databases. We benchmarked various strategies for extracting function from these FoldSeek hits, integrating different information resources, hit selection methods, and structure-based clustering of the hits. The resulting annotations were then compared with state-of-the-art sequence- and structure-based phage annotation tools Pharokka and Phold. On average, up to 42% of the phage proteins of unknown function could be annotated using structure-based methods, depending on the post-processing strategies applied. While caution is warranted when transferring protein annotations based on similarity, these methods can significantly speed up research into new antimicrobials and biotechnological applications inspired by nature’s finest bioengineers: phages. | |
2025-07-20 | 11:35:00 | 11:50:00 | 02N | Student Council Symposium | Exploring capabilities of protein language models for cryptic binding site prediction | Vít Škrhák | Vít Škrhák, David Hoksza | Identifying protein-ligand binding sites is essential for understanding biological mechanisms and supporting drug discovery. However, accurate prediction remains challenging - particularly in the case of cryptic binding sites (CBSs), which require significant conformational changes to form upon ligand binding. Structure-based prediction methods typically rely on a specific conformation (apo vs. holo), making them less effective for identifying CBSs. A promising alternative is the use of sequence-based approaches, enabled by the emergence of protein language models (pLMs). In this work, we explored the capabilities of various pLMs for predicting CBSs. As a baseline, we created a simple model trained using transfer learning. We then experimented with several fine-tuning strategies to further improve performance. Specifically, we applied multitask learning - not only to predict whether a residue is part of a CBS, but also to estimate its flexibility. This additional task enhanced the model’s awareness of protein dynamics, which is critical for accurate CBS identification. Our primary data source is the recently published CryptoBench dataset, which contains annotations of cryptic sites, although additional data sources were also considered. The combination of novel fine-tuning strategies and various training data improved performance across all key metrics, including a gain of over 2% in AUC. To better understand model limitations, we also conducted an analysis of common prediction errors. Finally, we introduced a simple post-processing method designed to refine and smooth the model’s outputs. | |
2025-07-20 | 11:50:00 | 11:55:00 | 02N | Student Council Symposium | Coarse-grained and Multi-Scale Modeling of Lytic Polysaccharide Monooxygenases: Insights into Family-Specific Dynamics and Protein Frustration | Nisha Nandhini Shankar | Nisha Nandhini Shankar, Ragothaman M Yennamalli | Lytic polysaccharide monooxygenases (LPMOs) are copper-dependent redox enzymes that catalyze the oxidative cleavage of C1 and/or C4 bonds in recalcitrant polysaccharides, playing a vital role in biomass conversion. The CAZy database classifies LPMOs into eight families (AA9, AA10, AA11, AA13, AA14, AA15, AA16, and AA17). These families exhibit diversity in their structure as well as catalytic features. This study focuses on analyzing the structure, dynamics and energetic landscapes of LPMO families using FrustratometeR, SignDy, and multiscale modeling approaches. FrustratometeR quantifies configurational and mutational frustration, identifying energetically unfavorable interactions. AA9 exhibited high local frustration in the residue range of 100-230, while AA10 showed a more stable profile. SignDy was employed to explore slow collective motions, revealing significant conformational changes in AA9 linked to enzymatic adaptability, with the first six modes indicating notable flexibility. In contrast, AA10 displayed lower mobility in its first three modes, suggesting greater rigidity and substrate specificity. Protein models from AlphaFold2 were used for proteins with missing residues. These models were prepared and subjected to 100 ns all-atom molecular dynamics simulations using the OPLS-AA/L force field. The increase in RMSD over the course of the simulation indicates conformational changes. RMSF and energy analyses revealed flexible regions consistent with mode analysis, with average potential energies stabilizing at -6.25×10^5 kJ/mol. The radius of gyration (Rg) remained stable around 1.65-1.75 nm. Analysing the coarse-grained Gō model simulations, run using SMOG for 200 million steps, will provide further insights into the folding and long-range dynamic behavior of these enzymes. | |
2025-07-20 | 11:55:00 | 12:00:00 | 02N | Student Council Symposium | Identification and structural modeling of the novel TTC33-associated core (TANC) complex involved in DNA damage response | Małgorzata Drabko | Małgorzata Drabko, Rafał Tomecki, Małgorzata Siek, Aneta Jurkiewicz, Miłosz Ludwinek, Kamil Kobyłecki, Dominik Cysewski, Agata Malinowska, Magdalena Bakun, Łukasz S. Borowski, Roman J. Szczęsny, Rafał Płoski, Agnieszka Tudek | Of the ~20,200 human proteins, ~9% remain functionally uncharacterized, highlighting a gap in our understanding of cell physiology. Structural proteins without enzymatic activity are particularly difficult to study. Here, we applied a “function by proximity” approach to TTC33, a nuclear structural tetratricopeptide repeat (TPR) protein conserved in bony vertebrates. Using comparative label-free mass spectrometry, we identified the TTC33-associated network (TAN), which includes WDR61, CCDC97, UNG, PP2A-B55α, PHF5A, and the SF3B subcomplex of U2. At the core of TAN is a novel trimeric complex (TANC) formed by TTC33:WDR61:PHF5A, as supported by co-purification and size exclusion chromatography. Structural predictions by AlphaFold 3 and their experimental validation showed that WDR61 and PHF5A bind opposite sides of TTC33’s TPR4, while TPR1-3 recruit other TAN factors. To expand the structural model, we employed molecular dynamics to identify the most stable amino acid contact pairs between complex subunits. Although TTC33 forms a complex with WDR61 and PHF5A, both of which are involved in RNA metabolism, our RNA-seq assays revealed only a subtle impact on mRNA levels and splicing patterns. In contrast, TTC33 appears more involved in DNA repair through interaction with UNG1/2. TTC33 loss led to increased DNA double-strand breaks, a phenotype previously associated with UNG1/2 knock-down. We showed that TTC33 protein levels are regulated in vivo, and that changes to TTC33 abundance reduced cellular proliferation rate and resistance to hydrogen peroxide. Moreover, the depletion or loss of either TTC33 or CCDC97 induced redistribution of p53-S15P, a marker of DNA damage. | |
2025-07-20 | 12:00:00 | 12:05:00 | 02N | Student Council Symposium | Functional Interfaces at Ordered–Disordered Transitions: Conserved Linear Motifs and Flanking Regions in Modular Proteins | Carla Luciana Padilla Franzotti | Carla Luciana Padilla Franzotti, Nicolas Palopoli, Gustavo Pierdominici-Sottile, Miguel Andrade | Multidomain proteins integrate ordered domains, structured tandem repeats (STRs), and intrinsically disordered regions (IDRs) to generate modular architectures optimized for dynamic and specific protein-protein interactions. In this study, we analyze the role of short linear motifs (SLiMs) located at the interface between ordered and disordered segments, focusing on their contribution to structural connectivity and interaction regulation. Two model systems are examined: (1) the large T antigen from simian virus 40 (LTSV40), in which the LxCxE motif—positioned at the junction between a folded domain and an IDR—mediates binding to the retinoblastoma protein (pRb), and (2) the regulatory complex between protein phosphatase 1 delta (PP1δ) and its MYPT1 subunit, where ankyrin repeats (ANKs) are connected to DOC-type docking motifs through an intervening IDR. In both cases, the regions flanking the SLiMs exhibit high sequence conservation and specific biophysical properties, consistent with a modulatory role. Molecular dynamics simulations demonstrate that these flanking regions promote extended conformations upon complex formation, facilitating physical occlusion of critical interaction interfaces (such as the E2F-binding pocket in pRb) without requiring large-scale allosteric rearrangements. In the PP1-MYPT1 complex, ANK repeats and IDRs exhibit cooperative behavior that contributes to the stabilization of the bound conformation and enhances interaction specificity. These findings support the existence of a conserved ordered–motif–disordered architectural module recurrently employed in both viral and cellular regulatory systems. 
This topological arrangement constitutes a potential target for therapeutic intervention in diseases involving aberrant protein-protein interactions mediated by SLiMs at ordered–disordered interfaces. | |
2025-07-20 | 12:05:00 | 12:10:00 | 02N | Student Council Symposium | Automating Linear Motif Predictions to Map Human Signaling Networks | Yitao (Eric) Sun | Yitao (Eric) Sun, Yu Xia, Jasmin Coulombe-Huntington | Short linear motifs (SLiMs) are critical mediators of transient protein-protein interactions (PPIs), yet only 0.2% of human SLiMs are experimentally verified. Their short length (3–11 residues), rapid evolution, and frequent location in intrinsically disordered regions make them difficult to systematically uncover using conventional approaches. We present an automated computational framework for proteome-wide SLiM discovery that integrates structural, evolutionary, and machine learning attributes to overcome limitations in current resources (e.g., MEME Suite, ELM). Our method combines Gibbs sampling for de novo motif discovery with hidden Markov models (HMMs) that explicitly model insertions and deletions, enabling a more realistic representation of motif variation. To improve specificity, we incorporate four discriminative features: ProtT5-derived motif propensity scores, AlphaFold-based intrinsic disorder (pLDDT), solvent accessibility, and cross-species conservation from multiple sequence alignments. Together, these features enable robust motif characterization even in noisy biological contexts. Biological relevance is ensured by searching BioGRID PPIs for interactors of the SLiM-binding domain protein and by clustering motifs via HMM similarity (HH-suite). Our framework validated a MAPK1 (ERK2)-mediated phosphorylation motif in RUNX1, which exhibited high feature scores and was confirmed by independent phosphoproteomic data. This site, previously biochemically characterized but not recognized as a SLiM, shows the power of our approach in identifying functional motifs missed by traditional tools. Our database allows biologists to browse through validated motifs alongside high-quality predictions. 
This work lays the foundation for systematic reconstruction of motif-mediated signaling networks and advances the discovery of novel regulatory mechanisms and therapeutic targets. | |
2025-07-20 | 12:10:00 | 12:25:00 | 02N | Student Council Symposium | Deep Phylogenetic Reconstruction Reveals Key Functional Drivers in the Evolution of B1/B2 Metallo-β-Lactamases | Samuel Davis | Samuel Davis, Pallav Joshi, Ulban Adhikary, Julian Zaugg, Phil Hugenholtz, Marc Morris, Gerhard Schenk, Mikael Boden | Metallo-β-lactamases (MBLs) comprise a diverse family of antibiotic-degrading enzymes. Despite their growing implication in drug-resistant pathogens, no broadly effective clinical inhibitors against MBLs currently exist. Notably, β-lactam-degrading MBLs appear to have emerged twice from within the broader, catalytically diverse MBL-fold protein superfamily, giving rise to two distinct monophyletic groups: B1/B2 and B3 MBLs. Comparative analyses have highlighted distinct structural hallmarks of these subgroups, particularly in metal-coordinating residues. However, the precise evolutionary events underlying their emergence remain unclear due to challenges presented by extensive sequence divergence. Understanding the molecular determinants driving the evolution of β-lactamase activity may inform design of broadly effective inhibitors. We sought to infer the evolutionary features driving the emergence of B1/B2 MBLs via phylogenetics and ancestral reconstruction. To overcome challenges associated with evolutionary analysis at this scale, we developed a phylogenetically aware sequence curation framework centred on iterative profile HMM refinement. This framework was applied over several iterations to construct a comprehensive phylogeny encompassing the B1/B2 MBLs and several other recently diverged clades. The resulting tree represents the most robust hypothesis to date regarding the emergence of B1/B2 MBLs and implies a parsimonious evolutionary history of key features, including variation in active site architecture and insertions and deletions of distinct structural elements. 
Ancestral proteins inferred at key internal nodes were experimentally characterised, revealing distinct activity profiles that reflect underlying evolutionary transitions. These findings give rise to testable hypotheses regarding the molecular basis and evolutionary drivers of functional diversification, as well as potential targets for MBL inhibitor design. | |
2025-07-20 | 12:25:00 | 12:30:00 | 02N | Student Council Symposium | Multilingual model improves zero-shot prediction of disease effects on proteins | Ruyi Chen | Ruyi Chen, Nathan Palpant, Gabriel Foley, Mikael Boden | Models for mutation effect prediction in coding sequences rely on sequence-, structure-, or homology-based features. Here, we introduce a novel method that combines a codon language model with a protein language model, providing a dual representation for evaluating effects of mutations on disease. By capturing contextual dependencies at both the genetic and protein level, our approach achieves a 3% increase in ROC-AUC when classifying disease effects for 137,350 ClinVar missense variants across 13,791 genes, outperforming two single-sequence-based language models. Notably, the codon language model can uniquely differentiate synonymous from nonsense mutations at the genomic level. Our strategy uses information at complementary biological scales (akin to human multilingual models) to enable protein fitness landscape modeling and evolutionary studies, with potential applications in precision medicine, protein engineering, and genomics. | |
2025-07-20 | 12:30:00 | 12:45:00 | 02N | Student Council Symposium | Integrated analysis of bulk and single-nuclei RNA sequencing data of primary and metastatic pediatric Medulloblastoma. | Ana Isabel Castillo Orozco | Ana Isabel Castillo Orozco, Geoffroy Danieau, Livia Garzia | Medulloblastoma (MB) is highly aggressive and the most common brain tumor in childhood. MB presents high intertumoral heterogeneity, with at least four molecular subgroups identified (SHH, WNT, Group 3, and Group 4). Metastatic MB, or Leptomeningeal Disease (LMD), is predominantly found in the MB Group 3 type. Although LMD represents a major clinical challenge, its molecular mechanisms remain poorly characterized. Recent research has shown that primary and metastatic MB diverge dramatically. Our work has focused on establishing therapy-naïve Group 3 patient-derived xenograft (PDX) models of primary and metastatic MB to conduct transcriptomic profiling at the bulk and single-nuclei RNA-seq levels and identify genetic drivers and pathways that sustain the leptomeningeal disease compartment. Our results show various signaling pathways enriched across LMD models, such as MYC targets, unfolded protein response, and fatty acid metabolism. Using single-sample GSEA (ssGSEA) and deconvolution approaches, we have also identified that our PDX models retain neoplastic subpopulations previously identified in MB single-cell sequencing studies. Similarly, we have identified slight differences in cell subpopulation proportions between primary and leptomeningeal compartments. Our single-nuclei studies have confirmed these results and the differentially expressed genes previously found in bulk RNA-seq analyses. These results suggest the presence of cell populations enriched in the metastatic compartment with an aberrant transcription phenotype and metabolic adaptations to survive in the leptomeningeal space. 
Our recent findings suggest that LMD should be treated differently from primary brain tumors and that the identified metabolic pathways may be potential targets for therapeutics to treat or prevent this devastating disease. | |
2025-07-20 | 12:45:00 | 12:50:00 | 02N | Student Council Symposium | Investigating novel transcriptional regulators in symbiotic nodule development of Medicago truncatula | Sara Eslami | Sara Eslami, Mahboobeh Azarakhsh | Biological nitrogen fixation is a crucial process for sustainable agriculture, allowing leguminous plants to convert atmospheric nitrogen into bioavailable forms through a symbiotic relationship with rhizobia. This interaction results in the formation of specialized root structures called nodules, where nitrogen fixation takes place. A deeper understanding of the molecular mechanisms governing nodule formation is essential for enhancing plant-microbe interactions and improving agricultural productivity. In this study, we investigate key transcription factors (TFs) involved in the nodulation process of Medicago truncatula, including MtIPD3, MtNSP1, MtNSP2, MtNIN, and MtERNs. Using co-expression analysis (Phytozome database) and interaction network studies (STRING database), we identify novel regulatory elements that potentially play a role in nodule organogenesis. Our findings suggest a strong interaction between IPD3 and splicing factors, implicating its involvement in RNA processing and cell cycle regulation during nodule formation. Additionally, we identify the cytokinin transporter gene ABCG38 as significantly upregulated in nodules, suggesting its role in cytokinin-mediated regulation. Moreover, our analysis indicates that the auxin response factor Medtr2g043250 is a likely transcriptional target of NIN, highlighting a possible cross-talk between auxin and cytokinin signaling in nodulation. These insights contribute to a deeper understanding of the transcriptional and hormonal regulation of nodule development, offering potential strategies for enhancing biological nitrogen fixation in legumes. | |
2025-07-20 | 12:50:00 | 12:55:00 | 02N | Student Council Symposium | Meta-Analysis of Bovine Transcriptome Reveals Key Immune Gene Profiles and Signaling Pathways | Vennila Kanchana Devi Marimuthu | Vennila Kanchana Devi Marimuthu, Kishore Matheswaran, Menaka Thambiraja, Ragothaman M Yennamalli | Understanding immune mechanisms in cattle is crucial for improving disease resistance through informed breeding decisions and development. Meta-analysis serves as a powerful approach to integrate findings from multiple transcriptomic studies, uncovering significant gene expression patterns across various experimental conditions and increasing statistical power. In this study, we conducted a meta-analysis of four bovine transcriptomic datasets (GSE45439, GSE62048, GSE125964, and GSE247921) to identify immune-related differentially expressed genes (DEGs) in Bos taurus. These datasets encompassed a range of immune-challenging conditions, including infections caused by Mycobacterium bovis and Mycobacterium avium subsp. paratuberculosis, comparing transcriptomic profiles between diseased and healthy cattle. We implemented a comprehensive transcriptome analysis pipeline involving FastQC, Trimmomatic, Bowtie2, SAMtools, FeatureCounts, DESeq2, and MetaRNASeq, which resulted in the identification of 28 significant DEGs, comprising 12 upregulated and 16 downregulated genes. Comparison with an innate immune gene database revealed five immune-related genes, including IL1A, RGS2, RCAN1, and ZBP1, known to play important regulatory roles in immune responses. KEGG pathway enrichment analysis showed that these genes were involved in four critical immune-related pathways: Necroptosis, Osteoclast Differentiation, Oxytocin Signaling, and cGMP–PKG Signaling. These pathways are associated with various immune functions, including inflammatory cell death, cytokine signaling, immune cell differentiation, and leukocyte trafficking. 
Overall, this meta-analysis provides a deeper understanding of conserved immune signaling mechanisms in cattle and highlights key genes that could serve as biomarkers for immune competence, disease susceptibility, or vaccine responsiveness. The findings offer valuable insights for future functional studies and applications in bovine immunogenomics. | |
2025-07-20 | 12:55:00 | 13:00:00 | 02N | Student Council Symposium | Post-translational regulation of stemness under DNA damage response contributes to the gingivobuccal oral squamous cell carcinoma relapse and progression | Sachendra Kumar | Sachendra Kumar, Annapoorni Rangarajan, Debnath Pal | Tobacco consumption (smoking and, particularly, smokeless forms) contributes to a high prevalence of gingivobuccal oral squamous cell carcinoma (OSCC-GB) in India. OSCC-GB patients exhibit high rates of locoregional relapse and therapeutic failure, often attributed to the involvement of cancer stem cells (CSCs). This study aims to leverage the generalizability of a machine learning prediction model for ‘Tumor Status’ to conduct a comparative somatic mutation analysis between ‘With Tumor’ (recurred/relapsed/progressed) and ‘Tumor Free’ (disease-free/complete remission) OSCC-GB patients. Our results revealed that support vector machines (SVM) classified the ‘Tumor Status’ classes with a mean accuracy of 89% based on clinical features. Furthermore, RNA-seq-based somatic mutation analysis using the classified groups revealed molecular mechanisms underlying tumor relapse and progression within OSCC-GB subgroups. The identified mutational signature (C>T mutations) linked to DNA damage suggests the role of tobacco-related carcinogens in OSCC-GB subgroups. The analysis of distinct somatic variants, functional impact predictions, protein-protein interactions, and survival analysis highlights the involvement of DNA damage response (DDR)-related genes in the ‘With Tumor’ subgroup. This analysis particularly emphasizes the significant role of the Mitogen-activated protein kinase associated protein 1 (MAPKAP1) gene, a key player in the mTORC2 signaling pathway. 
The study suggests that loss-of-function in the identified MAPKAP1 somatic variant may promote stemness and elevate the risk of disease relapse and progression in ‘With Tumor’ OSCC-GB under DDR conditions, potentially contributing to higher mortality rates among Indian OSCC-GB patients. | |
2025-07-20 | 14:00:00 | 15:00:00 | 02N | Student Council Symposium | | ||||
2025-07-20 | 15:00:00 | 15:45:00 | 02N | Student Council Symposium | Prof. Dame Janet Thornton | ||||
2025-07-20 | 15:45:00 | 16:00:00 | 02N | Student Council Symposium | Closing remarks |