Return to ISMB/ECCB 2025 Homepage   Click here for the abridged agenda


Schedule for iRNA


Date Start Time End Time Room Track Title Confirmed Presenter Format Authors Abstract
2025-07-23 11:20:00 11:30:00 02N iRNA Introduction to iRNA Michelle Scott, Athma Pai
2025-07-23 11:30:00 12:10:00 02N iRNA Sequential verification of transcription by Integrator and Restrictor Steven West Steven West The decision between productive elongation and premature termination of promoter-proximal RNA polymerase II (RNAPII) is fundamental to metazoan gene regulation. Integrator and Restrictor complexes are implicated in promoter-proximal termination, but why metazoans utilise two complexes and how they are coordinated remains unknown. Here, we show that Integrator and Restrictor act sequentially and nonredundantly to monitor distinct stages of transcription. Integrator predominantly engages with promoter-proximally paused RNAPII to trigger premature termination, which is prevented by cyclin-dependent kinase 7/9 activity. After pause release, RNAPII enters a previously unrecognised “restriction zone” universally imposed by Restrictor. Unproductive RNAPII terminates within this zone, while progression through it is promoted by U1 small nuclear ribonucleoprotein (snRNP), which antagonises Integrator and Restrictor in a U1-70K dependent manner. These findings reveal the principles of a sequential verification mechanism governing the balance between productive and attenuated transcription, rationalising the necessity of Integrator and Restrictor complexes in metazoans.
2025-07-23 12:10:00 12:20:00 02N iRNA CIRI-Deep Enables Single-Cell and Spatial Transcriptomic Analysis of Circular RNAs with Deep Learning Yuan Gao Zihan Zhou, Yuan Gao Circular RNAs (circRNAs) are a crucial yet relatively unexplored class of transcripts known for their tissue- and cell-type-specific expression patterns. Despite the advances in single-cell and spatial transcriptomics, these technologies face difficulties in effectively profiling circRNAs due to inherent limitations in circRNA sequencing efficiency. To address this gap, a deep learning model, CIRI-deep, is presented for comprehensive prediction of circRNA regulation on diverse types of RNA-seq data. CIRI-deep is trained on an extensive dataset of 25 million high-confidence circRNA regulation events and achieves high performance on both test and leave-out data, ensuring its accuracy in inferring differential events from RNA-seq data. It is demonstrated that CIRI-deep and its adapted version enable various circRNA analyses, including cluster- or region-specific circRNA detection, BSJ ratio map visualization, and trans and cis feature importance evaluation. Collectively, CIRI-deep’s adaptability extends to all major types of RNA-seq datasets including single-cell and spatial transcriptomic data, which will undoubtedly broaden the horizons of circRNA research.
2025-07-23 12:20:00 12:40:00 02N iRNA Enhancing circRNA–miRNA Interaction Prediction with Structure-aware Sequence Modeling Juseong Kim Juseong Kim, Sanghun Sel, Giltae Song Circular RNAs (circRNAs) function as key post-transcriptional regulators by interacting with microRNAs (miRNAs) to modulate gene expression. These interactions play a central role in gene regulatory networks and are implicated in various diseases. Accurate prediction of circRNA–miRNA interactions is therefore essential for understanding regulatory mechanisms and advancing therapeutic development. Notably, sequence variability among circRNA isoforms sharing the same back-splice junction can result in distinct miRNA binding profiles, highlighting the importance of isoform-level modeling. However, existing computational methods, including rule-based approaches (e.g., Miranda) and graph-based neural architectures, often fail to incorporate structural information and cannot effectively capture isoform-specific characteristics, thereby limiting their predictive performance. To address these challenges, we propose Thymba, a hybrid deep learning framework for structure-informed prediction of circRNA–miRNA interactions. Thymba combines Mamba modules, self-attention mechanisms, and one-dimensional convolutions to jointly model local sequence motifs and long-range dependencies. Furthermore, it employs a structure-aware pretraining strategy that concurrently optimizes masked language modeling and RNA secondary structure learning, enabling the model to generate representations that encode both sequential and structural contexts. We additionally construct a high-quality isoform-level dataset by integrating AGO-supported interaction data from public repositories and generating hard negative pairs via RNAhybrid-based thermodynamic and alignment filtering. This dataset supports both interaction prediction and binding site prediction tasks. 
Experimental results show that Thymba consistently outperforms existing methods, particularly on isoform-specific benchmarks, and demonstrates strong generalizability to related RNA–RNA interaction tasks such as circRNA–RBP binding prediction.
2025-07-23 12:40:00 13:00:00 02N iRNA Flash talks Multiple 1-minute flash talks advertising iRNA posters
2025-07-23 14:00:00 14:20:00 02N iRNA Predicting relevant snoRNA genes across any eukaryote genome using SnoBIRD Étienne Fafard-Couture Étienne Fafard-Couture, Pierre-Étienne Jacques, Michelle S Scott Small nucleolar RNAs (snoRNAs) are a group of noncoding RNAs identified in all eukaryotes. In human, C/D box snoRNAs are the most prevalent class, displaying crucial functions like regulating ribosome biogenesis and splicing. We have recently reported that less than a third of all annotated snoRNA genes are expressed in human. The remaining two-thirds, named the snoRNA pseudogenes, present features that are incompatible with their expression (e.g., mutations in their boxes). However, current annotations are often incomplete and overlook these snoRNA pseudogenes. To address this, we developed SnoBIRD. Based on DNABERT, SnoBIRD identifies C/D box snoRNA genes from any input sequence and classifies them as expressed or pseudogenes using sequence features (e.g., mutations in boxes). We show that SnoBIRD outperforms its competitor tools on a test set representative of all eukaryote kingdoms using relevant biological signal in the input sequence. By applying SnoBIRD on different genomes, we find that its runtime is adequate on the small Schizosaccharomyces pombe genome, and really outperforms the other tools on the large human genome (<13h compared to >3.5 days). Moreover, we identify with SnoBIRD most of the already annotated snoRNAs in these two species (respectively 19/32 and 358/403), as well as 8 and 22 novel expressed C/D box snoRNAs in their respective genome. Finally, we applied SnoBIRD on the genome of varied eukaryote species and show that it is an efficient and generalizable snoRNA predictor, as it identifies the known C/D box snoRNAs as well as dozens of novel expressed snoRNAs in these species.
2025-07-23 14:20:00 14:40:00 02N iRNA Charting the dynamics of the tRNAome in health and disease with AMaNITA Xanthi Lida Katopodi Xanthi Lida Katopodi, Laia Llovera Nadal, Alexane Ollivier, Leszek Pryszcz, Cornelius Pauli, Daniel Heid, Thomas Muley, Marc Schneider, Laura Klotz, Michael Allgäuer, Michaela Frye, Carsten Müller-Tidow, Oguzhan Begik, Eva Maria Novoa Transfer RNAs (tRNAs) play a pivotal role in decoding genetic information, determining which transcripts are highly and poorly translated at a given moment. Dysregulation of tRNA abundances and their RNA modifications is a well-known feature in cancer cells, which leads to enhanced expression of specific oncogenic transcripts and proteins or, complementary, to the depletion of proteins essential to the proper cell function. A novel protocol named Nano-tRNAseq was recently developed to study tRNA populations using native RNA nanopore sequencing technologies, providing tRNA abundance and modification information from the same individual molecules. To analyze information-rich nanopore native tRNA sequencing datasets, here we have developed AMaNITA (Abundance, Modifications, and Nanopore Intensity Toolbox/Application), a toolkit that facilitates Nano-tRNAseq analysis and provides a simple and user-friendly computational framework for the analysis of Nano-tRNAseq data. AMaNITA performs several steps, including filtering, quality control, batch effect estimation and automated correction, differential tRNA expression, and differential modification analyses, thus providing a start-to-end analysis of the data. Harnessing the data produced by Nano-tRNAseq with AMaNITA, we then examine whether tRNAs can be used to distinguish biological states, tissue of origin, and disease state. We find that our method separately clusters tumor and normal samples and identifies individual tRNA molecules that are dysregulated in cancer, with potential diagnostic and therapeutic applications in the clinic. 
When applied on a lung cancer cohort consisting of 69 matched tumor/normal samples, our method reveals that tRNA information can segregate healthy and tumor samples with high accuracy.
2025-07-23 14:40:00 15:00:00 02N iRNA Identification and characterization of chromatin-associated long non-coding RNAs in human Lina Ma Zhao Li, Zhang Zhang, Lina Ma Chromatin-associated long non-coding RNAs (ca-lncRNAs) play crucial regulatory roles within the nucleus by preferentially binding to chromatin. Despite their importance, systematic identification and functional studies of ca-lncRNAs have been limited. Here, we identified and characterized human ca-lncRNAs genome-wide, utilizing 323,950 lncRNAs from LncBook 2.0 and integrating high-throughput sequencing datasets that assess RNA-chromatin association. We identified 14,138 high-confidence ca-lncRNAs enriched on chromatin across six cell lines, comprising nearly 80% of analyzed chromatin-associated RNAs, highlighting their significant role in chromatin localization. To explore the sequence basis for chromatin localization, we applied the LightGBM machine learning model to identify contributing nucleotide k-mers and derived 12 sequence elements through k-mer assembly and feature ablation. These sequence elements are frequently found within Alu repeats, with more Alu repeats enhancing chromatin localization. Meta-profiling of chromatin-binding sequencing segments further demonstrated that ca-lncRNAs bind to chromatin through Alu repeats. To delve deeper into the molecular mechanisms underlying the binding, we conducted integrative interactome analysis and computational prediction, revealing that Alu repeats primarily tether to chromatin through dsDNA-RNA triplex formation. Finally, to address sample constraints in ca-lncRNA identification, we developed a machine learning model based on sequential feature selection for large-scale prediction. This approach yielded 201,959 predicted ca-lncRNAs, approximately 70% of which are predicted to be preferentially located in the nucleus. 
Collectively, these high-throughput-identified and machine-learning-predicted ca-lncRNAs together form a robust resource for further functional studies.
2025-07-23 15:00:00 15:10:00 02N iRNA Toward a Computational Pipeline for Prokaryotic miRNAs: The Case of Pseudomonas aeruginosa in Lung Disease Laura Veschetti Cristina Cigana, Elisa Lovo, Alessandra Bragonzi, Giovanni Malerba, Laura Veschetti Background: miRNAs are key regulators in eukaryotes, yet little is known about their existence and function in bacteria. Although various noncoding RNAs have been identified in prokaryotes, only a few bacterial miRNAs have been validated. Given the clinical impact of Pseudomonas aeruginosa (PA) in chronic respiratory diseases, we investigated PA-derived miRNAs and their potential interactions with human genes. Motivation: Research has mainly focused on eukaryotic miRNA and the lack of computational tools for prokaryotic miRNA prediction has slowed progress in microbial miRNA research. Our study aims to propose a computational framework for bacterial miRNA prediction, offering an application on PA. Methods: We analyzed 36 RNAseq datasets from clinical PA isolates. Precursor miRNAs were predicted and filtered for structural stability. Mature miRNAs were identified through read mapping. Phylogenetic comparison was performed across organisms, and interactions with human UTRs were predicted. In silico validation across 4 PA reference strains was carried out through genome mapping, expression profiling, and de novo predictions. Results: We identified a mean of 422 precursors and 247 mature miRNAs per sample. Some candidates showed homology with human and were conserved across species. Predicted targets were enriched in immune, metabolic, and signaling pathways. Fifty-six miRNAs scored high in the integrative in silico validation. Experimental confirmation is ongoing. Conclusions: We propose a computational framework for identifying bacterial miRNAs with potential roles in host-pathogen interactions. 
Significance: The knowledge generated through the study advances the characterization of currently under-studied microbial miRNAs, paving the way for therapeutic interventions in chronic respiratory disease.
2025-07-23 15:10:00 15:20:00 02N iRNA Characterisation of the role of SNORD116 in RNA processing during cardiomyocyte differentiation Sofia Kudasheva Sofia Kudasheva, Wilfried Haerty, Terri Holmes, James Smith, Vanda Knitlhoffer Deletions of the SNORD116 small nucleolar RNA cluster result in Prader-Willi syndrome (PWS), a developmental disorder with a complex multisystem phenotype. Emerging clinical data highlight a high incidence of congenital cardiac defects in individuals with PWS, whilst SNORD116 was found to be elevated in a human pluripotent stem cell (hPSC) model of cardiomyopathy. While previous research in neuronal cells has implicated SNORD116 in regulation of RNA processing, its molecular targets and function in the heart remain unclear. To investigate this, we used an hPSC-derived cardiomyocyte model with SNORD116 knockout. We performed Oxford Nanopore long-read sequencing at three differentiation stages to simultaneously detect effects of SNORD116 knockout on alternative splicing, cleavage and polyadenylation (APA), and poly(A) tail length. We identified 40,018 novel isoforms; 174 of which were involved in significant isoform switches between control and SNORD116 knockout. Analysis of functional changes resulting from these switches revealed a developmental stage-dependent shift in 3’UTR usage in knockout cells, characterised by increased distal poly(A) site usage at day 2 and a reversal by day 30. Transcriptome-wide APA analysis confirmed these trends and revealed significant enrichment for predicted SNORD116 binding sites among APA-regulated genes. Notably, genes showing consistent poly(A) tail shortening in SNORD116 KO cells were enriched for ribosomal components, suggesting coordinated regulation of RNA stability and translation. 
These findings highlight a previously unrecognised role for SNORD116 in modulating APA and poly(A) tail length during cardiomyocyte differentiation, with implications for understanding the molecular underpinnings of PWS-associated cardiac phenotypes.
2025-07-23 15:20:00 15:30:00 02N iRNA EpiCRISPR: Improving CRISPR/Cas9 on-target efficiency prediction by multiple epigenetic marks, high-throughput datasets, and flanking sequences Yaron Orenstein Michal Rahimi, Yaron Orenstein CRISPR/Cas9 has transformed gene editing, enabling targeted modification of genomic loci using a 20-nt guide RNA followed by an NGG motif. However, editing efficiency varies due to target sequence, flanking regions, and epigenetic context. Measuring endogenous efficiency experimentally is labor-intensive, prompting the development of predictive models. Prior models were trained on small datasets, limiting generalizability. Leenay et al. recently released a dataset of ~1,600 endogenous efficiency measurements in T cells. We present EpiCRISPR, a neural network trained on this dataset that integrates guide RNA sequence, flanking regions, epigenetic marks, and high-throughput predictions. We found that incorporating downstream flanking sequences improved prediction (Spearman correlation from 0.309 to 0.375). Including epigenetic features—especially open chromatin, H3K4me3, and H3K27ac—boosted performance to 0.496. Adding high-throughput-based predictions further raised correlation to 0.514. Importantly, EpiCRISPR generalized well across cell types and revealed biologically meaningful feature importance via saliency maps. EpiCRISPR is publicly available at github.com/OrensteinLab/EpiCRISPR.
2025-07-23 15:30:00 15:40:00 02N iRNA Enhancing CRISPR/Cas9 Guide RNA Design Using Active Learning Techniques Stefano Roncelli Stefano Roncelli, Gül Sude Demircan, Christian Anthon, Lars Juhl Jensen, Jan Gorodkin CRISPR/Cas systems have significantly advanced genome editing, yet the precise design of guide RNAs (gRNAs) for optimal efficiency and specificity remains a persistent challenge. The CRISPRnet project seeks to enhance model performance and predictive accuracy by generating new data from gRNAs that are strategically selected to enrich existing datasets. To determine which gRNAs should be validated experimentally, we utilize methods for estimating prediction uncertainty. The idea is that the gRNAs for which the efficiency prediction models are most uncertain are the ones that would be most valuable to validate experimentally. A key difficulty in this effort lies in the absence of definitive ground truth for model uncertainty. To address this, we modified the state-of-the-art CRISPRon model, which was trained on 30mer gRNA targets with context sequence and the binding energy between the gRNA spacer and the target DNA, by using deep neural networks to predict the editing efficiency. We implemented two approaches: (1) an ensemble of CRISPRon models trained with nested cross-validation to quantify prediction variance, and (2) an ensemble of modified CRISPRon models, extended with an additional classifier head and a customized loss function for uncertainty estimation. The effectiveness of these methods is evaluated through benchmarking against a curated set of candidate gRNAs, enabling data augmentation based on the recommendations made by the models.
2025-07-23 15:40:00 16:00:00 02N iRNA Single-base tiled screen reveals design principles of PspCas13b-RNA targeting and informs automated screening of potent targets Syed Faraz Ahmed Syed Faraz Ahmed, Mohamed Fareh, Wenxin Hu, Matthew R McKay The advancement of RNA therapeutics hinges on developing precise RNA-editing tools with high specificity and minimal off-target effects. We present a framework for optimizing CRISPR PspCas13b, a programmable RNA nuclease with a 30-nucleotide spacer sequence that offers potentially superior targeting specificity. Through single-base tiled screening and computational analyses, we identified critical design principles governing effective RNA recognition and cleavage in human cells. Our analyses revealed position-specific nucleotide preferences that significantly impact crRNA efficiency. Specifically, guanosine bases at positions 1-2 enhance catalytic activity, while cytosine bases at positions 1-4 and 11-17 dramatically reduce efficiency. This positional weighting system forms the foundation of our algorithm, which predicts highly effective crRNAs with ~90% accuracy. Comprehensive spacer-target mutagenesis analysis, implemented through computational modeling, demonstrated that PspCas13b requires ~26-nucleotide base pairing and tolerates only up to four mismatches to activate its nuclease domains. This computational insight explains PspCas13b's superior specificity compared to other RNA interference tools and predicts an extremely low probability of off-target effects, subsequently validated through proteomic analysis. We developed an open-source, R-based computational tool (https://cas13target.azurewebsites.net/) that implements these design principles to generate optimized crRNAs for any target sequence. The tool scores potential crRNAs based on nucleotide composition and position. Additionally, it performs off-target analysis by assessing sequence complementarity with human transcriptome data. 
This computational approach represents a significant advancement in RNA targeting technology and offers a powerful platform for the development of more effective RNA therapeutics with minimized off-target effects.
2025-07-23 16:40:00 17:20:00 02N iRNA Gene regulation of human cell systems Roser Vento-Tormo The study of human tissues requires a systems biology approach. Their development starts in utero and during adulthood, they change their organization and cell composition. Our team has integrated comprehensive maps of human developing and adult tissues generated by us and others using a combination of single-cell and spatial transcriptomics, chromatin accessibility assays and fluorescent microscopy. We utilise these maps to guide the development and interpretability of in vitro models. To do so, we develop and apply bioinformatic tools that allow us to quantitatively compare both systems and predict changes.
2025-07-23 17:20:00 17:40:00 02N iRNA EdiSetFlow: A robust pipeline for RNA editing detection and differential analysis in bulk RNA-seq Jacob Munro Jacob Munro, Melanie Bahlo, Brendan Ansell Adenosine-to-inosine (A-to-I) RNA editing is a post-transcriptional modification catalyzed by ADAR enzymes that can alter codons, splicing patterns, and RNA secondary structures. This process is essential for neuronal development and immune function, with dysregulation implicated in neurological disorders, cancers, and autoimmune diseases. Despite its biological importance, accurate detection of RNA editing from RNA-seq data remains technically challenging, and robust inference of differential editing between experimental conditions is not straightforward. To address these challenges, we have developed EdiSetFlow, a reproducible and scalable pipeline for transcriptome-wide A-to-I RNA editing analysis from bulk RNA-seq data. EdiSetFlow is implemented in Nextflow. It takes raw FASTQ files as input, performs read trimming and quality filtering, aligns reads to the reference genome, and identifies editing sites with JACUSA. Common genetic variants are excluded based on the gnomAD population database. Identified sites are annotated for gene context and predicted functional consequences, with results summarized in a user-friendly HTML report. The pipeline is designed to efficiently scale to hundreds or thousands of samples, making it suitable for large datasets such as GTEx. An accompanying R package enables advanced analyses, including model fitting, hypothesis testing, false discovery rate control, and visualisations, facilitating reliable statistical comparisons of editing between experimental groups. Applying EdiSetFlow to GTEx brain RNA-seq data, we uncovered distinct RNA editing signatures across brain regions, identifying both known and previously uncharacterized regional editing patterns. 
EdiSetFlow provides researchers with a robust, end-to-end solution to efficiently discover and interpret biologically meaningful RNA editing events in diverse transcriptomic datasets.
2025-07-23 17:40:00 18:00:00 02N iRNA Statistical modeling of single-cell epitranscriptomics enabled trajectory and regulatory inference of RNA methylation Jia Meng As a fundamental mechanism for gene expression regulation, post-transcriptional RNA methylation plays versatile roles in various biological processes and disease mechanisms. Recent advances in single-cell technology have enabled simultaneous profiling of transcriptome-wide RNA methylation in thousands of cells, holding the promise to provide deeper insights into the dynamics, functions, and regulation of RNA methylation. However, it remains a major challenge to determine how to best analyze single-cell epitranscriptomics data. In this study, we developed SigRM, a computational framework for effectively mining single-cell epitranscriptomics datasets with a large cell number, such as those produced by the scDART-seq technique from the SMART-seq2 platform. SigRM not only outperforms state-of-the-art models in RNA methylation site detection on both simulated and real datasets but also provides rigorous quantification metrics of RNA methylation levels. This facilitates various downstream analyses, including trajectory inference and regulatory network reconstruction concerning the dynamics of RNA methylation.
2025-07-24 08:40:00 09:00:00 02N iRNA Prediction and validation of Split Open Reading Frames across cell types Christina Kalk Christina Kalk, Marcel Schulz, Michaela Müller-McNicoll, Vladimir Despic, Mauro Siragusa, Justin Murtagh Background: Split Open Reading frames (Split-ORFs) exist on transcripts containing at least two open reading frames, each of which encodes a part of the same full-length protein. These multiple open reading frames arise from alternatively spliced transcript isoforms. The phenomenon of Split-ORFs has been observed for the SR protein family of splicing factors, where the Split-ORF proteins play important autoregulatory roles. Aims/purpose: The aim of this study was to investigate the translation and expression of Split-ORFs. Methods: We built a pipeline that predicts potential Split-ORFs for a user supplied set of transcripts and determines the regions unique to the potential Split-ORFs. These unique regions are absent from protein coding transcripts. The translation of the predicted Split-ORFs can be validated by finding their unique regions in Ribo-seq or proteomics data. Results: The Split-ORF pipeline was applied to a set of transcripts containing premature termination codons or retained introns. Novel Split-ORF transcripts and their unique regions were predicted and a substantial fraction had significant Ribo-seq coverage in data from different cell types. Additionally, the Split-ORF candidate start sites had a significantly higher probability of being translation initiation sites than background sites as predicted by a deep neural network. Outlook: These results suggest that the occurrence of Split-ORFs is more widespread than previously assumed and that they are expressed across different cell types. This paves the road for further functional investigations of the validated Split-ORF candidates and mechanisms of their biogenesis.
2025-07-24 09:00:00 09:20:00 02N iRNA Bridging the Gap: Recalibrating In-vitro Models for Accurate In-vivo RBP Binding Predictions Ilyes Baali Ilyes Baali, Alexander Sasse, Quaid Morris Accurate identification of RNA-binding protein (RBP) binding sites is essential for understanding post-transcriptional gene regulation. However, current models face two major challenges: the limited availability of in vivo data and the poor generalization of models trained solely on in vitro assays. These limitations hinder our ability to make reliable in vivo predictions and obscure the true regulatory roles of RBPs in cellular contexts. This study aims to understand the root causes of discrepancies between these two assay types. By analyzing data from both assays, we investigate whether differences arise from biological context, experimental artifacts, or model limitations. To address these challenges, we introduce a recalibration model that integrates in vitro and in vivo data to improve prediction accuracy and interpretability. We evaluate model performance across multiple generalization tasks—including chromosome, cell-type, and RBP-wise splits—and find that in-vitro-only models generalize poorly to in-vivo settings. In contrast, the recalibrated model significantly improves performance and even outperforms in-vivo-only models, demonstrating the added value of recalibrated in-vitro data. Feature importance analysis shows that the recalibration model corrects for incomplete binding preferences in in vitro assays and adjusts for assay-specific artifacts, such as G-rich motif enrichment in eCLIP. These findings suggest that many observed differences between assays are driven by technical biases rather than fundamental biological divergence and highlight the importance of accounting for such factors when modeling RBP binding in vivo.
2025-07-24 09:20:00 09:40:00 02N iRNA Multi-Tool Intron Retention Analysis in Autism Adi Gershon Adi Gershon, Saira Jabeen, Asa Ben Hur, Maayan Salton Intron retention (IR) is an alternative splicing event in which introns remain in mature mRNA, altering protein isoforms or triggering transcript decay. Recent evidence highlights IR’s involvement in key biological processes, including neurodevelopment. However, quantifying IR remains difficult due to intronic complexity and ambiguous read mapping. We systematically analyzed IR in autism spectrum disorder (ASD) using three computational tools with distinct strategies: rMATS (junction-based modeling), IRFinder (intron/spliced read ratios), and iDiffIR (log fold-change in intron coverage). Our focus was on six splicing factors (NOVA2, RBFOX1, SRRM2, SART3, U2AF2, WBP4) implicated in syndromic ASD, alongside idiopathic ASD brain tissue. We aimed to identify shared IR events that might reflect underlying splicing dysregulation in ASD. All tools revealed hundreds of significantly altered introns in ASD and splicing factor models, consistently showing increased retention in ASD or mutant conditions. Despite tool-specific differences, we identified 574 genes with significant intron retention in both splicing factor models and ASD brain, enriched for neurodevelopmental pathways and known autism genes. At the event level, 21 introns were detected across multiple splicing factor models and ASD brains, enriched for transcription factor motifs such as TFAP2A and PLAGL2, suggesting shared regulatory mechanisms. Notably, rMATS and IRFinder detected more events and showed pronounced associations with intron length and GC content, whereas iDiffIR displayed greater variability. Our multi-tool approach highlights the complexity of IR detection and underscores the value of integrating complementary strategies to elucidate splicing dysregulation in ASD. 
These findings provide prioritized IR candidates for future functional studies in neurodevelopmental disorders.
2025-07-24 09:40:00 10:00:00 02N iRNA Detection of statistically robust interactions from diverse RNA-DNA ligation data Timothy Warwick Simonida Zehr, Ralf Brandes, Marcel Schulz, Timothy Warwick Background: Chromatin-localized RNAs play key roles in gene regulation and nuclear architecture. Genome-wide RNA-DNA interactions can be mapped using molecular methods like RADICL-seq, GRID-seq, Red-C, and ChAR-seq, which utilize bridging oligonucleotides for RNA-DNA ligation. Despite advancements in these methods, a computational tool for reliably identifying biologically meaningful RNA-DNA interactions is lacking. Approach: Herein, we present RADIAnT, a reads-to-interactions pipeline for analysing RNA-DNA ligation data. These data are often confounded by multiple factors, including nascent transcription and expression differences. To manage these confounders, RADIAnT calls interactions against a dataset-specific, unified background which considers RNA binding site-TSS distance, genomic region bias and relative RNA abundance. Results: By calling interactions against the multifactor background described above, RADIAnT is sensitive enough to detect specific interactions of lowly expressed transcripts, while remaining specific enough to discount false positive interactions of highly abundant RNAs. In addition to calling consistent interactions between different molecular methodologies, RADIAnT outperforms previously proposed methods in the accurate identification of genome-wide Malat1-DNA interactions in murine data, and NEAT1-DNA interactions in human cells, with orthogonal one-to-all data used to classify binding regions in each case. In a further use case, RADIAnT was utilized to identify dynamic chromatin-associated RNAs in the physiologically- and pathologically-relevant process of endothelial-to-mesenchymal transition. 
Conclusion: RADIAnT represents a reproducible, generalisable approach for analysis of RNA-DNA ligation data, and provides users with statistically stratified RNA-DNA interactions which can be probed for biological function.
2025-07-24 11:20:00 11:50:00 02N iRNA Building the future of RNA tools Blake Sweeney Blake Sweeney From the epitranscriptome and 3D structure prediction to large language models, RNA science is experiencing a transformative shift. Recent advances in RNA 3D structure prediction and RNA-focused language models represent early milestones in what's possible. The explosion in data availability and computational power will fundamentally change how we approach RNA research. This computational revolution will be shaped by the tools we build today. This talk serves as an introduction to our special section and panel discussion on frontiers in RNA tool development, and will outline the key themes that our following speakers and panel will explore in detail. In the panel, we aim to tackle questions like: What are the highest-impact tools missing from our current toolkit? What problems can machine learning solve, and what limitations does it face in RNA science? How can these limitations be overcome? What would it take to make sophisticated RNA analysis accessible to every researcher? We encourage anyone interested in RNA research or seeking new computational frontiers to attend this section and contribute to the following panel discussion.
2025-07-24 11:50:00 12:00:00 02N iRNA Sci-ModoM: a quantitative database of transcriptome-wide high-throughput RNA modification sites promoting cross-disciplinary collaborative research Etienne Boileau Etienne Boileau, Harald Wilhelmi, Anne Busch, Andrea Cappannini, Andreas Hildebrand, Janusz M Bujnicki, Christoph Dieterich We recently presented Sci-ModoM [1], the first next-generation RNome database offering a one-stop source for RNA modifications originating from state-of-the-art high-resolution detection methods. Sci-ModoM provides quantitative measurements per site and dataset, enabling researchers, including non-experts, to assess the confidence level of the reported modifications across datasets. Currently, users can Search and Compare over seven million modifications across 162 datasets, Browse or download datasets, and retrieve metadata; and these figures keep growing as data is continuously added. Sci-ModoM addresses critical challenges that are foundational to open science such as the need for standardized nomenclatures, common standards and guidelines for data sharing. It promotes data reuse, as it relies solely on the authors' published results; data are accessible in a human-readable, interoperable format, developed in consultation with the community [2]. In this talk, we will present Sci-ModoM in the context of a broader pan-European roadmap to (i) facilitate access to and sharing of high-throughput transcriptome-wide RNA modification data, and (ii) to promote data-driven sustainability in the development of reliable methods to map and identify RNA modifications. Our current work aims to expand the different RNA types (mRNA, non-coding RNA, tRNA, rRNA) in Sci-ModoM, to further establish FAIR data treatment, and to improve guidelines for data analysis and exchange, under the umbrella of the Human RNome project [3]. [1] Etienne Boileau, Harald Wilhelmi, Anne Busch, Andrea Cappannini, Andreas Hildebrand, Janusz M. Bujnicki, Christoph Dieterich. 
Sci-ModoM: a quantitative database of transcriptome-wide high-throughput RNA modification sites. Nucleic Acids Research, 2024, gkae972. [2] https://dieterich-lab.github.io/euf-specs [3] https://humanrnomeproject.org
2025-07-24 12:00:00 12:10:00 02N iRNA RNAtranslator: A Generative Language Model for Protein-Conditional RNA Design A. Ercument Cicek Sobhan Shukueian Tabrizi, Sina Barazandeh, Helya Hashemi Aghdam, A. Ercument Cicek Protein-RNA interactions are essential in gene regulation, splicing, RNA stability, and translation, making RNA a promising therapeutic agent for targeting proteins, including those considered undruggable. However, designing RNA sequences that selectively bind to proteins remains a significant challenge due to the vast sequence space and limitations of current experimental and computational methods. Traditional approaches rely on in vitro selection techniques or computational models that require post-generation optimization, restricting their applicability to well-characterized proteins. We introduce RNAtranslator, a generative language model that, for the first time, formulates protein-conditional RNA design as a sequence-to-sequence natural language translation problem. By learning a joint representation of RNA and protein interactions from large-scale datasets, RNAtranslator directly generates binding RNA sequences for any given protein target without the need for additional optimization. Our results demonstrate that RNAtranslator produces RNA sequences with natural-like properties, high novelty, and enhanced binding affinity compared to existing methods. This approach enables efficient RNA design for a wide range of proteins, paving the way for new RNA-based therapeutics and synthetic biology applications. The model and code are released at github.com/ciceklab/RNAtranslator.
2025-07-24 12:10:00 12:20:00 02N iRNA miRXplain: transformer-driven explainable microRNA target prediction leveraging isomiR interactions Giulia Cantini Ranjan Kumar Maji, Giulia Cantini, Hui Cheng, Annalisa Marsico, Marcel Schulz microRNAs (miRNAs) are short (~22 nt) RNA sequences that are key regulators of transcript expression. miRNAs bind to target mRNA sites to repress genes. isomiRs, generated by alternate processing of miRNA hairpins during biogenesis, exhibit variations that shift the seed position relative to their canonical forms. This results in the selection of a different target transcript repertoire compared to the canonical miRNA, diversifying miRNA regulation. However, the mRNA configurations that enable miRNA target selection are still undetermined. isomiR targets, together with canonical miRNA targets, have not been studied due to the lack of high-throughput experiments that capture the exact miRNAs bound to their targets. Deep learning (DL) approaches have neither used such datasets nor investigated isomiR target interactions. To address this gap, we developed a new transformer model, miRXplain, that predicts miRNA target interactions using miRNA and target sequences from CLIP-L chimeras. We analyzed CLIP-L experiments, which tether exact miRNA variations to their mRNA target site. We annotated these interactions and revealed nucleotide biases at the 5’ end of the target region. We addressed these biases and constructed miRNA and interacting site pairs to learn how isomiRs differ from their canonical forms in target interaction. miRXplain outperformed the other benchmarked models and performed on par with TEC-miTarget, while training ~2 times faster per epoch. Model attention weights revealed distinct importance of nucleotide positions for canonical and isomiR types. miRXplain can contribute to the discovery of isomiR targeting rules to enhance our understanding of miRNA biology. Code availability: https://github.com/marsico-lab/miRXplain.
2025-07-24 12:20:00 12:30:00 02N iRNA Designing functional RNA sequences using conditional diffusion models Cho Joohyun Cho Joohyun, Daniil Melnichenko, Jongmin Lim, Dongsup Kim, Young-suk Lee The function of RNA is largely determined by its networks of protein-RNA interactions. A key challenge in RNA engineering is designing the sequence in a manner that controls its interacting partners. Towards this effort, we built an RNA sequence generator using conditional diffusion models that automatically designs sequences based on the structure of a given RNA-binding protein. The RNA generator is a single unified deep-learning framework of 64 million parameters and is trained on high-quality structure data of 1,190 distinct protein-RNA complexes. The model’s cross-attention mechanism suggests that it learns the evolutionary homology of protein-RNA interactions. When benchmarking on RoseTTAFoldNA’s training and test dataset, we find that our model generates RNA sequences with AlphaFold3-confidence scores comparable to the bound RNA sequence. In all, these results call for experimental confirmation from a complementary source of protein-RNA interaction, and expand the possibility of automatically designing functional RNAs for biomedical applications.
2025-07-24 12:30:00 13:00:00 02N iRNA Panel: The future of RNA tools
2025-07-24 14:00:00 14:40:00 02N iRNA Decoding RNA language in plants Yiliang Ding RNA structure plays an important role in the post-transcriptional regulations of gene expression. Using in vivo RNA structure profiling methods, we have determined the functional roles of RNA structure in diverse biological processes such as mRNA processing (splicing and polyadenylation), translation and RNA degradation in plants. We also developed a new method to reveal the existence of tertiary RNA G-quadruplex structures in eukaryotes and uncovered that RNA G-quadruplex structure serves as a molecular marker to facilitate plant adaptation to the cold during evolution. Additionally, we have developed the single-molecule RNA structure profiling method and revealed the functional importance of RNA structure in long noncoding RNAs. Recently, we established a powerful RNA foundation model, PlantRNA-FM, that facilitates the explorations of functional RNA structure motifs across transcriptomes.
2025-07-24 14:40:00 15:00:00 02N iRNA EnsembleDesign: Messenger RNA Design Minimizing Ensemble Free Energy via Probabilistic Lattice Parsing Liang Huang Ning Dai, Tianshuo Zhou, Wei Yu Tang, David Mathews, Liang Huang The task of designing optimized messenger RNA (mRNA) sequences has received much attention in recent years thanks to breakthroughs in mRNA vaccines during the COVID-19 pandemic. Because most previous work aimed to minimize the minimum free energy (MFE) of the mRNA in order to improve stability and protein expression, which only considers one particular structure per mRNA sequence, millions of alternative conformations in equilibrium are neglected. More importantly, we prefer an mRNA to populate multiple stable structures and be flexible among them during translation when the ribosome unwinds it. Therefore, we consider a new objective to minimize the ensemble free energy of an mRNA, which includes all possible structures in its Boltzmann ensemble. However, this new problem is much harder to solve than the original MFE optimization. To address the increased complexity of this problem, we introduce EnsembleDesign, a novel algorithm that employs continuous relaxation to optimize the expected ensemble free energy over a distribution of candidate sequences. EnsembleDesign extends both the lattice representation of the design space and the dynamic programming algorithm from LinearDesign to their probabilistic counterparts. Our algorithm consistently outperforms LinearDesign in terms of ensemble free energy, especially on long sequences. Interestingly, as byproducts, our designs also enjoy lower average unpaired probabilities (AUP, which correlates with degradation) and flatter Boltzmann ensembles (more flexibility between conformations). Our code is available on: https://github.com/LinearFold/EnsembleDesign.
2025-07-24 15:00:00 15:20:00 02N iRNA Machine learning-guided isoform quantification in bulk and single-cell RNA-seq using joint short- and long-read modeling Hamed Najafabadi Michael Apostolides, Jichen Wang, Ali Saberi, Benedict Choi, Hani Goodarzi, Hamed Najafabadi Accurate quantification of transcript isoforms is crucial for understanding gene regulation, functional diversity, and cellular behavior. Existing RNA sequencing methods have important limitations: short-read (SR) sequencing provides high depth but struggles with isoform deconvolution, especially in single-cell data where substantial positional biases are common; long-read (LR) sequencing offers isoform resolution but suffers from lower depth, higher noise, and technical biases. To address these challenges, we introduce Multi-Platform Aggregation and Quantification of Transcripts (MPAQT), a generative model that combines the complementary strengths of multiple RNA-seq platforms to achieve state-of-the-art isoform-resolved quantification. MPAQT explicitly models platform-specific biases, including positional biases in short-read single-cell data and sequence-dependent biases in LR data. We show that MPAQT enables state-of-the-art gene- and isoform-level quantification both in SR-only single-cell data and in bulk datasets integrating SR and LR reads. By applying MPAQT to an in vitro model of human embryonic stem cell differentiation into cortical neurons, followed by machine learning-based modeling of transcript abundances, we show that untranslated regions (UTRs) are major determinants of isoform proportion and exon usage. This effect is mediated through isoform-specific sequence features embedded in UTRs, which interact with RNA-binding proteins that modulate mRNA stability. We further demonstrate that machine learning-based predictions can be fed back into MPAQT to resolve ambiguities in read-to-isoform assignment, resulting in more accurate abundance estimates. 
These findings highlight MPAQT’s potential to enhance our understanding of transcriptomic complexity across platforms and cell types, while bridging statistical quantification with machine learning models of isoform regulation.
2025-07-24 15:20:00 15:40:00 02N iRNA Long-read RNA sequencing unveils a novel cryptic exon in MNAT1 along with its full-length transcript structure in TDP-43 proteinopathy Yoshihisa Tanaka Yoshihisa Tanaka, Naohiro Sunamura, Rei Kajitani, Marie Ikeguchi, Ryo Kunimoto Understanding the role of transcript isoforms is crucial for dissecting disease mechanisms. TAR DNA binding protein-43 (TDP-43) is a key regulator of RNA splicing, and its dysfunction in neurons is a hallmark of some neurodegenerative diseases such as amyotrophic lateral sclerosis (ALS) and frontotemporal degeneration (FTD). Specifically, TDP-43 maintains proper splicing by preventing the aberrant inclusion of cryptic exons into mRNA, thereby preserving normal transcript isoforms. Although TDP-43-dependent cryptic exons have been implicated in disease pathogenesis, an approach to investigate how cryptic exons disrupt transcript isoforms has yet to be established. To address this, we developed IsoRefiner, a novel method for identifying full-length transcript structures using long-read RNA-seq. Our results show that IsoRefiner outperforms existing long-read analysis tools. Leveraging this method, we conducted long-read RNA-seq, guided by prior short-read RNA-seq, to comprehensively resolve the full-length structures of aberrant transcripts caused by TDP-43 depletion in human induced pluripotent stem cell (iPSC)-derived motor neurons. This led to the discovery of a novel TDP-43-dependent cryptic exon in the MNAT1 gene, along with its full-length transcript structure. Furthermore, we confirmed the presence of the MNAT1 cryptic exon in tissues derived from patients with ALS and FTD. Our findings deepen understanding of TDP-43 proteinopathy, and our approach provides a powerful framework for investigating splicing mechanisms across diverse cellular and disease contexts.
2025-07-24 15:40:00 15:50:00 02N iRNA Transcriptome Universal Single-isoform COntrol (TUSCO): A Framework for Evaluating Transcriptome Quality Tianyuan Liu Tianyuan Liu, Adam Frankish, Ana Conesa, Alejandro Paniagua, Fabian Jetzinger Long-read sequencing (LRS) platforms, such as Oxford Nanopore (ONT) and Pacific Biosciences (PacBio), enable comprehensive transcriptome analysis but face challenges such as sequencing errors, sample quality variability, and library preparation biases. Current benchmarking approaches address these issues insufficiently: BUSCO assesses transcriptome completeness using conserved single-copy orthologs but can misinterpret alternative splicing as gene duplications, while spike-ins (SIRVs, ERCCs) oversimplify real-sample complexity, neglecting RNA degradation and RNA extraction artifacts, thus inflating performance metrics. To overcome these limitations, we introduce the Transcriptome Universal Single-isoform COntrol (TUSCO), a curated internal reference set of conserved genes lacking alternative isoforms. TUSCO evaluates precision by identifying transcripts deviating from reference annotations and assesses sensitivity by verifying detection completeness in human and mouse samples. Our validation demonstrates that TUSCO provides accurate and reliable benchmarking without external controls, significantly improving quality control standards for transcriptome reconstruction using LRS.
2025-07-24 15:50:00 16:00:00 02N iRNA Concluding remarks and poster prizes Maayan Salton
