Student Council Symposium

Attention Presenters - please review the Speaker Information Page available here
Schedule subject to change
All times listed are in CEST
Sunday, July 23rd
9:00-9:10
Introduction and Welcome words
Room: Pasteur Lounge
Format: Live from venue

9:10-9:55
Invited Presentation: Keynote 1: Multimodal data integration for rare genetic diseases
Room: Pasteur Lounge
Format: Live from venue

  • Anais Baudot


TBD

9:55-10:15
Coffee Break
Room: Pasteur Lounge
Format: Live from venue

10:15-10:30
Oral Talk 1: Computational resources for understanding the binding affinity of membrane protein-protein complexes and mutants
Room: Pasteur Lounge
Format: Live from venue

  • Fathima Karuvanthodikayil


Background:
Membrane proteins (MPs) mostly function as complexes, and the interactions between the proteins are dictated by their strength of binding, or binding affinity. Due to their intricate structure, however, the binding affinity of membrane proteins is less explored than that of globular proteins. Mutations in these complexes affect their binding affinity, impair critical functions, and may lead to disease. Although experimental affinity data in the literature are increasing, they are dispersed, necessitating their compilation into a comprehensive database for further analysis. In addition, experimentally determining the affinity of these complexes is expensive and time-consuming, making it infeasible for large-scale applications. Therefore, there is high demand for accurate computational approaches to determine the affinity of these complexes.

Results:
We developed MPAD (Membrane Protein complex binding Affinity Database), the first database dedicated to experimental binding affinities of membrane protein-protein complexes and their mutants. It also provides sequence, structure, and functional information, membrane-specific features, experimental conditions, and literature information. The current version of MPAD contains 5376 entries: 1705 wild-type and 3671 mutant entries. MPAD has an easy-to-use interface, with options to build search queries and to display, sort, download, and upload data.
Using this database, we have developed the first ML-based method, MPA-Pred, for predicting the affinity of novel MP complexes. Our method showed a correlation and MAE of 0.83 and 0.91 kcal/mol, respectively, using the jackknife test on a set of 114 complexes. Classification of complexes based on membrane protein type and function improved the performance of the method.

Conclusion:
MPAD is the first database for the binding affinity of membrane protein-protein complexes. The database can be used to understand the factors influencing the binding affinity in MPs as well as the impact of mutations on binding affinity. We have also developed a multiple regression-based method to predict the affinity of novel MP complexes. Thus, we anticipate that these resources will contribute to an in-depth understanding of MP complexes, with potential applications in drug design and in further analyses.
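
The jackknife evaluation mentioned in the Results can be illustrated with a minimal sketch. It is not the MPA-Pred implementation: it assumes a single hypothetical feature and an ordinary least-squares line instead of the multiple regression described in the abstract, and all function names are illustrative. Each complex is predicted from a model fitted on all the other complexes, and the predictions are then scored by Pearson correlation and mean absolute error (MAE).

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x for a single feature."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def jackknife_predictions(xs, ys):
    """Predict each sample from a model trained on all other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]  # leave sample i out
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        preds.append(a + b * xs[i])
    return preds


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


def mean_absolute_error(u, v):
    """Mean absolute difference between predictions and observations."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)
```

With real data, `xs` would hold a descriptor of each complex and `ys` the experimental affinities in kcal/mol; both are placeholders here.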

10:30-10:45
Oral Talk 2: Development and Application of the MultiSEp R Package to Identify Multiple Myeloma Achilles' Heels for Drug Discovery
Room: Pasteur Lounge
Format: Live from venue

  • Adeline McKie


Background: Almost all Multiple Myeloma patients relapse and ultimately succumb to therapy-resistant disease; there is an urgent need for more effective treatment. Achilles' heel relationships arise when the status of one gene exposes a cell's vulnerability to perturbation of a second gene, such as chemical inhibition, providing opportunities for precision oncology. While there is a significant focus on genetic approaches, transcriptome data has advantages for the investigation of gene dependency relationships and remains relatively underexplored. Available multiomics resources for this purpose include the Cancer Dependency Map (DepMap) and the Multiple Myeloma Research Foundation (MMRF) CoMMpass study.
Description: We developed MultiSEp for integrative discovery of candidate gene dependency relationships in multiomics data. Clustering of samples by expression of one gene allows partitioning of another gene’s CRISPR scores, mutations or gene expression to investigate signatures of synthetic lethality. MultiSEp performed well in benchmarking against other methods, and we predicted multiple myeloma gene dependency relationships at genome scale (27,288 genes, 372,303,828 candidate interactions) with CoMMpass data (n=859 patients). Following multiple filtering steps, we derived a high-confidence predicted synthetic lethal network (8,695 edges, 5,059 genes; SynLethNet), including characteristic mutually exclusive loss patterns (binomial q<0.05). We predicted the population coverage achieved by drugging SynLethNet genes; for example, inhibiting a hub is predicted to achieve therapeutic cell killing if any neighbouring gene is mutated in the cancer cells. Our analysis only utilised deleterious mutations predicted by the variant effect prediction tools VARITY (score >=0.99), SnpEff and SnpSift (annotated ‘high-impact’ mutations). Of ten hubs with predicted therapeutic coverage >5%, two achieved 58% coverage and at least one is an attractive candidate drug target (coverage=14%). Functional annotation of SynLethNet revealed many genes involved in the ubiquitin-proteasome system, which is dysregulated in Multiple Myeloma and a target of current front-line therapy. Predictions were validated with the Cancer Dependency Map and the Cancer Therapeutics Response database. The MultiSEp R package will soon be available from CRAN and has passed 63 unit tests.
Conclusions: We present the MultiSEp R package, demonstrated with a case study in multiple myeloma where we predict candidate drug targets and provide mechanistic insights to advance precision oncology.
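
The population-coverage idea in the Description can be sketched as follows. This is an illustrative Python reading of the stated rule (a hub covers a patient if any neighbouring gene carries a deleterious mutation), not code from the MultiSEp R package; the function name, gene names and data layout are assumptions.

```python
def hub_coverage(neighbours, patient_mutations):
    """Fraction of patients with a deleterious mutation in at least one
    network neighbour of the hub gene.

    neighbours: iterable of gene names adjacent to the hub (hypothetical).
    patient_mutations: dict mapping patient ID -> set of mutated genes,
        assumed to be pre-filtered to deleterious mutations.
    """
    neighbours = set(neighbours)
    if not patient_mutations:
        return 0.0
    covered = sum(
        1 for genes in patient_mutations.values() if neighbours & set(genes)
    )
    return covered / len(patient_mutations)


# Toy cohort: two of four patients carry a mutation in a neighbour of the hub.
patients = {
    "P1": {"GENE_A"},
    "P2": {"GENE_B", "GENE_C"},
    "P3": set(),
    "P4": {"GENE_D"},
}
coverage = hub_coverage({"GENE_A", "GENE_B"}, patients)
```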

10:45-11:00
Oral Talk 3: Exploring Endogenous Peptides for Development of Safer Opioid Analgesics
Room: Pasteur Lounge
Format: Live-stream

  • Md. Hossain


Background: Opioid analgesics are widely used for pain management, but their efficacy is often accompanied by various adverse effects, such as tolerance, dependence, addiction, and overdose. The mu-opioid receptor (MOP) is the primary target for opioid analgesics, but its activation can also lead to unwanted effects through β-arrestin recruitment and receptor desensitization. Biased agonism of MOP presents a promising approach to developing safer and more effective opioids. Recently, peptide-based opioids have garnered attention, with over 25 endogenous peptides discovered that bind to MOP to regulate pain and other bodily activities.
Description: This study aimed to evaluate the potential efficacy of 80 G protein-coupled receptor (GPCR) endogenous peptides from human and animal sources as clinical therapeutics using a computational approach. The preliminary analysis involved screening the peptides through molecular docking, and the five best candidates (Big Endothelin, Glucagon, Kisspeptin, Leptin, and Galanin) were selected for further investigation. Molecular dynamics simulations under POPC membrane-embedded conditions were conducted to evaluate their binding affinities and interaction patterns. The results suggested that Glucagon has the most promising potential as a clinical therapeutic. Glucagon bound properly within the binding pocket and showed significant interactions with active site residues, including TRP318, THR216, and CYS217, and with other binding pocket residues such as TRP226, LYS209, ASP216, and GLU310. Moreover, the analysis of the therapeutic complex formation indicated the dominance of hydrogen bonds (62%) and hydrophobic interactions (28%), which promote peptide-protein complex formation. Furthermore, the structure-activity relationship (SAR) analysis revealed that positively charged residues played a critical role in peptide-protein interaction.
Conclusion: This computational study suggests that Glucagon has the potential to serve as a safer and more effective alternative to traditional opioids for pain management. The peptide exhibited strong binding with the MOP binding pocket and interactions with active site residues, indicating promising therapeutic efficacy. These findings lay the foundation for further investigations into Glucagon's molecular mechanisms and the development of novel analgesic drugs with reduced side effects.

11:00-11:15
Oral Talk 4: Transformer model based prediction of protein Gene Ontology terms in the CAFA5 competition
Room: Pasteur Lounge
Format: Live-stream

  • Zong Ming Chua
  • Adarsh Rajesh


Accurate prediction of protein Gene Ontology (GO) terms is crucial for understanding biological processes, protein function, and various disease mechanisms. Current methods for producing new GO annotations are time-intensive and often miss important aspects of protein function. Furthermore, many proteins from less studied organisms have no known function. Methods of GO prediction have recently seen exciting new developments based on large transformer-based language models that allow unprecedented accuracy in predicting the functions of previously little understood proteins. In this context, the fifth Critical Assessment of Functional Annotation (CAFA5) competition provides a platform for benchmarking such developments.

We propose an innovative ensemble learning approach that integrates pre-trained state-of-the-art Transformer-based deep learning models to produce GO predictions superior to those of any individual model. We utilize and integrate pre-trained embeddings such as those from ProtTrans (Rost lab), ESM2 (Meta), and ProteinBERT (Linial lab) to produce GO prediction models that significantly outperform most other models currently submitted on the live leaderboard of CAFA5. Our study exemplifies the significant improvement in protein GO term prediction that can be obtained from an ensemble of diverse pre-trained protein feature models incorporating both sequence and structural information. Improvements in protein GO term prediction have potential implications for accelerating biological research on little studied proteins and enhancing our understanding of disease pathogenesis by pointing the way towards a fuller understanding of biological systems. Future directions include further optimization of the ensemble model to include more accurate predictors of diverse and novel protein sequences, and the extension of such ensemble learning approaches to other open problems in biological modeling such as protein-protein interactions or protein structure folding.

11:15-11:30
Oral Talk 5: ATLAS-AML: An Automated Bioinformatics Pipeline for Target Characterization in Acute Myeloid Leukemia
Room: Pasteur Lounge
Format: Live-stream

  • Suraj Bansal, Princess Margaret Cancer Centre, Canada
  • Andy Zeng, Princess Margaret Cancer Centre, Canada
  • Amanda Mitchell, Princess Margaret Cancer Centre, Canada
  • John Dick, Princess Margaret Cancer Centre, Canada


Acute myeloid leukemia (AML) is an aggressively heterogeneous disease with poor survival outcomes. In AML, adverse genomic profiles and leukemia stem cell (LSC)-enriched cellular hierarchies are often linked to chemoresistance and frequently observed at relapse. Although single-cell and bulk transcriptomics have reshaped our understanding of hematopoiesis and AML, subsequent analyses pose technical barriers for tailoring interpretation for scientists’ ongoing drug experiments. To bridge computational and experimental drug development in AML, we introduce ATLAS-AML, an automated bioinformatics pipeline for transcriptomic meta-analysis of genes and gene signatures in AML. The ATLAS-AML pipeline is available as a containerized web application that experimental scientists can readily employ without bioinformatics expertise. ATLAS-AML integrates publicly available single-cell and bulk RNA-sequencing datasets with preconfigured pipelines to streamline five concomitant levels of visualization for queries: expression across normal and leukemic hematopoietic hierarchies, enrichment in functionally-validated LSC+ fractions, and correlation to disease relapse, clinical characteristics, and overall patient survival. For example, ATLAS-AML showed that DNMT3B, a well-documented transcriptomic hallmark of AML, was overexpressed in primitive AML cell types and associated with LSC+ fractions, disease relapse, FLT3-ITD mutations, GATA2-MECOM alterations, and worse patient outcomes. Moreover, ATLAS-AML constitutes a powerful framework for accelerating target discovery. Reinterrogating our datasets using differential expression, we identified 282 novel target candidates. Querying each target in ATLAS-AML, we shortlisted candidates that were associated with disease-propagating LSCs and biologically distinct subsets of AML patients. 
For example, ATLAS-AML suggested that CNST, a trans-Golgi network receptor for targeting connexins to the plasma membrane, is enriched in malignant LSC populations and associated with adverse cytogenetic alterations, lending support to CNST’s therapeutic viability in AML. Altogether, ATLAS-AML enables scientists to leverage insights from single-cell and bulk transcriptomics to inform preclinical studies towards risk-tailored treatments in AML.

11:30-11:35
Flash Talk Poster 1: Impact of oral anti-diabetic drugs on gut-derived extracellular vesicles: Proteomic Signature
Room: Pasteur Lounge
Format: Live from venue

  • Estefania Torrejón, Nova Medical School, Portugal
  • Akiko Teshima, Nova Medical School, Portugal
  • Ana Sofia Carvalho, Nova Medical School, Portugal
  • Hans Christian Beck, Centre for Clinical Proteomics, Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, Denmark
  • Rune Matthiesen, Nova Medical School, Portugal
  • Paula Macedo, APDP, Portugal
  • Rita Machado de Oliveira, Nova Medical School, Portugal


Background: Extracellular vesicles (EVs) mediate inter-organ communication in type 2 diabetes (T2D) pathogenesis. The protein content of gut-derived EVs (GDE) reflects metabolic state, and administering prediabetic GDE induces a diabetogenic phenotype in healthy mice. Analysis of the GDE proteomic profile showed an upregulation of acyl-CoA thioesterases and downregulation of rate-limiting enzymes for glycolysis. To unveil the relevance of oral antidiabetic drugs, we hypothesize that the metabolic actions of metformin and pioglitazone depend on the GDE proteomic cargo.
Description: Two groups of mice were fed either a normal chow diet (NCD) or a high-fat diet (HFD), then treated with metformin or pioglitazone. GDE were isolated and characterized by nanoparticle tracking analysis. Proteins were extracted and analyzed by nano-LC-MS/MS. Statistical analysis was performed using the limma R package. After treatment, both drugs improved glucose intolerance and liver steatosis compared to prediabetic animals. We identified 159 proteins differentially expressed between HFD and HFD+metformin and 180 between HFD and HFD+pioglitazone. Together with the principal component analysis among groups, these results indicate that both drugs alter the GDE protein composition to resemble that of NCD.
Conclusion: Metformin and pioglitazone modify GDE-mediated interorgan crosstalk, which plays a role in the progression of dysmetabolism. Modulating this mechanism may have therapeutic implications.

11:35-12:00
Break
Room: Pasteur Lounge
Format: Live from venue

12:00-12:15
Oral Talk 6: Differential Production of Abnormal Isoforms in Brain Cells Contributes to Neuronal Death and Parkinson's Disease: Gender-Specific Differences
Room: Pasteur Lounge
Format: Live-stream

  • Waqar Hanif


Parkinson’s disease is the second most common neuropathological disorder, with a considerable and complex effect on public health. It accounts for about 8.5 million cases worldwide and is highly prevalent among men, with poor prognosis. Neuronal cell death and axonal injury are hallmarks of various neurodegenerative disorders, including Parkinson’s disease. This study aims to identify novel therapeutic targets against male-specific Parkinson’s disease by exploring isoform variants of dysregulated apoptotic neuron genes among men. An RNA-seq pipeline was applied to sequencing reads (collected from NCBI GEO) to identify differentially expressed genes between men and women with Parkinson’s disease, followed by functional enrichment analysis of dysregulated genes using enrichR. Subsequently, isoform switching analysis of dysregulated neuronal genes was performed with IsoformSwitchAnalyzeR. Finally, binding interactions between wild-type and variant structures of abnormal transcripts and Death Receptor 4 were analyzed using HDOCK. As a result, 453 differentially expressed genes were obtained, including 121 upregulated and 332 downregulated genes. Aberrant isoform variants were identified for two under-expressed apoptosis-regulating neuron genes, HSPB1 and CASP7, and one over-expressed apoptosis-modulating neuron gene, TNFSF10, in men compared to women. Alternatively spliced isoforms elucidated the considerable role of post-transcriptional modification in altered structural and functional characteristics, leading to a loss of apoptotic activity that may play a neurodegenerative role. Furthermore, dysregulated neuronal genes significantly affected mechanisms such as tumor necrosis factor receptor binding and cysteine-type endopeptidase activity, indicating neurodegeneration in PD males. Additionally, molecular docking analysis revealed strong binding interactions between wild-type TNFSF10 and Death Receptor 4, leading to increased apoptosis in neuronal cells, whereas the TNFSF10 isoform variant revealed a loss of neuroprotection due to weak interaction between ENST00000420541 (generated protein) and Death Receptor 4. Based on the neurodegenerative role of the wild-type and variant isoforms, TNFSF10 has been proposed as a potential diagnostic and prognostic biomarker that can be targeted therapeutically to ensure early neuroprotection against Parkinson’s disease in men.

12:15-12:30
Oral Talk 7: Decoding the functional roles of intronic microRNA Hsa-Mir-2355 and its host gene KLF7 in the immunobiology of cervical cancer
Room: Pasteur Lounge
Format: Live-stream

  • Nure Sharaf Nower Samia, Bangladesh


MicroRNAs (miRNAs) are short, noncoding RNAs involved in post-transcriptional gene regulation. Evidence of their roles in carcinogenesis and cancer progression is expanding gradually, implying their crucial involvement in diagnosis and therapy. How miRNA dysregulation may affect cervical cancer development, which is highly prevalent in women with poor prognosis, remains largely unclear. The purpose of this study was to reassess the studies that focused on miRNA expression in cervical cancer and determine whether intronic microRNAs have a role in the tumorigenic pathway by altering their host gene targets. Using available gene expression data from the GEO dataset GSE145372, we identified the dysregulated intronic miRs and corresponding host genes in cervical cancer patients. Among these, we selected the intronic miRNA Hsa-Mir-2355 and its transcription factor-encoding host gene KLF7, both differentially expressed in these patients. Our findings indicate that miRNAs play a significant role in the invasion and metastasis of cervical cancer by affecting specific signalling pathways. In silico modelling revealed that Hsa-Mir-2355 regulates KLF7 target genes ATG12, KRAS, PRKAR1A, and REL, which can otherwise influence cervical cancer prognosis through control of angiogenesis, apoptosis, and metastasis. Furthermore, we identified REL as a survival gene in cervical cancer, linked to patient survival and CD4+ recruitment, suggesting that it could be employed as a checkpoint for cervical cancer treatment.

12:30-12:35
Flash Talk Poster 2: Changes in the Structure of Microbial Communities of the rhizosphere of Peruvian fruit trees (Annona cherimola Mill. and Pouteria lucuma) across soil depth using PacBio HiFi sequencing
Room: Pasteur Lounge
Format: Live-stream

  • Richard Estrada, INIA, Peru
  • Angie Porras, INIA, Peru


The characterization of soil microbiological structure at different depths is essential to understand the impact of microorganisms on nutrient availability, soil fertility, plant growth and stress tolerance, as well as to identify bacteria with bioremediation capacity. In this study, the microbiological structure was analyzed at three depths (3 cm, 12 cm and 30 cm) for two fruit trees native to the inter-Andean valleys of South America: Annona cherimola Mill. (cherimoya) and Pouteria lucuma (lucuma). These fruit trees not only have nutritional benefits, but also offer a combination of nutrients essential for a balanced diet, as lucuma is a source of vitamins and minerals, while cherimoya provides vitamin C and antioxidants. We used a high-throughput PacBio HiFi long-read sequencing approach to explore the composition, diversity, and functions of bacterial communities in the rhizosphere soil of Annona cherimola Mill. and Pouteria lucuma at different soil depths. Significant differences between the soil depths were observed in the alpha diversity indices, evaluated by Shannon's index (p=0.0114) and observed features (p=0.0105). The family-level relative abundance analysis of Pouteria lucuma revealed that Acidobacteriaceae and Thermoanaerobaculaceae were predominant in the shallower soils, whereas the genera Thiobacter and Pirellula exhibited a significantly higher abundance in the soils at a depth of 30 cm. We found significant changes in beta diversity due to depth gradient and plant type. We also found similar functional diversity profiles among microbial communities, as well as an influence of edaphic factors on the structure of the microbial communities. This study will support future research aimed at understanding the impact of microorganisms in different deep layers of the soil, as well as their influence on crop growth and quality.
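
The two alpha diversity measures named in the abstract can be written out in a few lines. This is a generic sketch, not the authors' pipeline: it assumes per-sample feature counts and uses the natural logarithm for Shannon's index, which is one common convention (some tools use base 2 instead).

```python
from math import log


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over features with count > 0."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def observed_features(counts):
    """Number of features detected (count > 0) in the sample."""
    return sum(1 for c in counts if c > 0)


# Hypothetical counts for one soil sample: four equally abundant taxa
# give the maximum Shannon index for four features, ln(4).
even_sample = [10, 10, 10, 10]
h_even = shannon_index(even_sample)
```

Per-sample values such as `h_even` would then be compared across depth groups with a suitable statistical test, which is outside this sketch.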

12:35-12:40
Flash Talk Poster 3: Comprehensive annotation of miRNAs and lncRNAs in domesticated cotton species
Room: Pasteur Lounge
Format: Live-stream

  • Vivek AT, National Institute of Plant Genome Research, India
  • Ajeet Singh, National Institute of Plant Genome Research, India
  • Shailesh Kumar, National Institute of Plant Genome Research, India


Allotetraploid cotton plants, specifically Gossypium hirsutum and Gossypium barbadense, are widely grown for their natural and renewable textile fibers. Despite extensive research on non-coding RNAs in domesticated cotton species, systematic identification and annotation of long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) expressed in different tissues, developmental stages, and biological contexts remains limited. This limits our understanding of their functions and impedes future cotton research. To fill this void, we present a high-confidence set of lncRNAs and miRNAs from G. hirsutum and G. barbadense derived from large-scale RNA-seq and small RNA-seq datasets. This information is incorporated into CoNCRAtlas, a user-friendly database that provides comprehensive annotations of lncRNAs and miRNAs based on the systematic integration of extensive annotations. We anticipate that this comprehensive resource will accelerate evolutionary and functional studies of non-coding RNAs, providing critical insights for future cotton breeding programs. The CoNCRAtlas database is free and open to the public, and it can be accessed at http://www.nipgr.ac.in/CoNCRAtlas/.

12:40-12:45
Flash Talk Poster 4: mICKEY: Memory-Efficient Deep Learning for Personalized Biomarker Discovery and Cancer Origin Prediction from DNA Methylation Data
Room: Pasteur Lounge
Format: Live-stream

  • Pakanan Tussanapirom, Triam Udom Suksa School, Thailand
  • Kasidech Aewsrisakul, Triam Udom Suksa School, Thailand
  • Natthawadee Leephatarakit, Hatyaiwittayalai, Thailand
  • Kobchai Duangrattanalert, Chulalongkorn University, Thailand
  • Chanati Jantrachotechatchawan, Mahidol University, Thailand


Cancer claims over 10 million lives annually, and treatment depends on accurate identification of tissue origin. Traditional diagnostics have long wait times and invasive biopsies pose significant health risks. To tackle these challenges and pave the way for early detection, we developed mICKEY, a deep-learning pipeline for cancer prediction using CpG methylation data from solid and liquid biopsies. We use deep learning, instead of standard statistics, for its capacity to model complex non-linear biological data. Our model employs variational inference to boost consistency and masks to handle low-quality data. We use dense layers with regularization for CpG site selection. Fewer CpG sites allow healthcare facilities to measure and analyze with affordable multiplex techniques instead of high-throughput screening (>450k probes) that requires memory-consuming models. Lastly, mICKEY has a self-attention layer to encode interaction and pinpoint vital CpG sites in each sample. Our model, using fewer than 100 CpG sites, achieves robust sensitivity and specificity of over 95% on 18 cancer origins across demographic groups and sample types on solid biopsy DNA from holdout TCGA and independent GEO datasets. To make our model more suitable for early detection via non-invasive methods, mICKEY couples metric learning with domain adaptation to deal with data limitation and yields over 85% in sensitivity and specificity. Sample-specific attention maps reveal known biomarkers, validating its potential for future personalized treatment. Overall, mICKEY offers a practical solution for early detection of cancer together with a promising future use in personalized therapy.

12:45-12:50
Flash Talk Poster 5: Most frequently harboured missense variants of hACE2 across different populations exhibit varying patterns of binding interaction with spike glycoproteins of emerging SARS-CoV-2 of different lineages
Room: Pasteur Lounge
Format: Live-stream

  • Rubaiat Ahmed, University of Dhaka, Bangladesh
  • Anika Tahsin, University of Dhaka, Bangladesh
  • Piyash Bhattacharjee, University of Dhaka, Bangladesh
  • Maisha Adiba, University of Dhaka, Bangladesh
  • Abdullah Al Saba, University of Dhaka, Bangladesh
  • Tahirah Yasmin, University of Dhaka, Bangladesh
  • Sajib Chakraborty, University of Dhaka, Bangladesh
  • A.K.M. Mahbub Hasan, University of Dhaka, Bangladesh
  • A.H.M. Nurun Nabi, University of Dhaka, Bangladesh


Background: Since the emergence of SARS-CoV-2 in 2019, the virus accumulated various mutations, resulting in numerous variants. According to the mutations acquired, the variants are classified into lineages and differ greatly in infectivity and transmissibility. The world saw prominent surges in the rate of infection as newer variants emerged. However, not all populations suffered equally, which suggests a possible role of host genetic factors.
Description: We investigated the effect of the lineage-defining mutations of the SARS-CoV-2 variants: Mu, Delta, Delta Plus (AY.1), Omicron sub-variants BA.1, BA.2, BA.4, BA.5, and BA.2.12.1 on the strength of binding of the spike glycoprotein receptor-binding domain (RBD) with the human angiotensin-converting enzyme 2 (hACE2) missense variants prevalent in major populations (E37K in Africans, F40L in Latin Americans, D355N in non-Finnish Europeans, and P84T in South Asians) via molecular docking and molecular dynamics (MD) simulation. The results demonstrated variable strength of binding and showed altered interaction patterns in different hACE2-RBD complexes.
Conclusion: Both the hACE2 missense variants and the spike RBD mutations affect the binding energy and pattern of interaction between the two proteins. In vitro studies are warranted to confirm these findings, which may enable early prediction of the transmissibility risk of newly emerging variants across different populations in the future.

12:50-12:55
Flash Talk Poster 6: Gene subset signatures for complexity reduction of technical and functional comparisons of whole transcriptomes
Room: Pasteur Lounge
Format: Live-stream

  • Shruti Gupta, School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
  • Shandar Ahmad, School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India


Background: Gene expression profiling has been widely used to investigate cellular and disease contexts. While large-scale transcriptome data are often searched using the phenotype metadata of each experiment, there is also interest in the reverse inference of phenotype from a newly generated transcriptome profile, referred to as “content”. Studies have been done on drug repositioning, identifying novel drug-disease connections or inferring drug-drug relationships from expression profiles, highlighting the significance of profile comparisons and content-based queries. In 2017, Subramanian et al. showed that far fewer landmark genes (L1000 genes) carry enough information about whole transcriptome profiles, and that the latter can be predicted using a deep learning model. However, it was unclear whether the selected L1000 set is the only combination that can do the job and whether such sets have the same power to segregate biologically relevant expression profiles.

Description: In this work, we develop a database of global expression profiles from legacy microarray and RNA-seq experiments, including those from single cells. Using these data sets, we assess the performance of L1000 and other similar-sized gene sets by creating random and systematically selected samples. We evaluate the ability of these subsets to reproduce the profile-profile comparisons that would result from the whole-transcriptome gene expression profile. We also investigate the pathways, gene ontologies and functional features that are critical for selecting a good subset able to reproduce the original comparisons. Finally, profiles derived from selected subsets are applied to extract biologically similar samples instead of simple expression profile signatures of each sample in the dataset. A framework for large-scale comparison of such data sets is also being developed.

Conclusions: Our results suggest that many different gene subsets are as powerful as L1000 in reproducing profile-profile similarities between transcriptome data sets. Certain genes often form part of the best-performing gene subsets, highlighting their criticality for determining whole transcriptome patterns. The choice of these genes and subsets will aid in understanding the key regulatory factors in gene expression. These findings will also help in the large-scale imputation of gene expression profiles collected on smaller platforms and speed up the process of content comparison between them.

12:55-13:00
Flash Talk Poster 7: Virus-host interactions in a municipal landfill include non-specific viruses, hyper-targeting, and interviral conflicts
Room: Pasteur Lounge
Format: Live from venue

  • Nikhil George, University of Waterloo, Canada
  • Laura Hug, University of Waterloo, Canada

13:00-14:10
Lunch | Poster Session & Networking
Room: Pasteur Lounge
Format: Live from venue


TBD

14:10-14:55
Invited Presentation: Keynote 2: Decoding the language of life
Room: Pasteur Lounge
Format: Live from venue

  • Burkhard Rost


Presentation Overview: Show

Background: Colorectal cancer (CRC) is the third most diagnosed neoplasm and the second leading cause of cancer-related deaths. Its development is associated with gains and/or losses of genetic material, leading to the emergence of major driver genes with higher mutational frequency. Additionally, there are other genes with mutations that have weak tumor-promoting effects, known as mini-drivers, which could exacerbate oncogenesis when they occur together. The aim of our study was to use computational analysis to explore how the frequency and incidence of mutations in potential mini-driver genes relate to survival and prognosis in colorectal cancer.
Description: We retrieved data from three sources of CRC samples using the cBioPortal platform and analyzed the mutational frequency to exclude genes with driver features and those mutated in less than 5% of the original cohort. We also observed that the mutational profile of these mini-driver candidates is associated with variations in expression levels. The candidate genes obtained were subjected to Kaplan–Meier curve analysis, comparing mutated and wild-type samples for each gene using a p-value threshold of 0.01. After gene filtering by mutational frequency, we obtained 159 genes, of which 60 were associated with a high accumulation of total somatic mutations, with log2(fold change) > 2 and p-values < 10^-5. In addition, these genes were enriched in oncogenic pathways such as epithelial-mesenchymal transition, hsa-miR-218-5p downregulation, and extracellular matrix organization. Our analysis identified five genes with possible implications as mini-drivers: DOCK3, FN1, PAPPA2, DNAH11, and FBN2. Furthermore, we evaluated a combined classification in which CRC patients with at least one mutation in any of these genes were separated from the main cohort, obtaining a p-value < 0.001 in the evaluation of CRC prognosis.
Conclusions: Our study suggests that the identification and incorporation of mini-driver genes in addition to known driver genes could enhance the accuracy of prognostic biomarkers for CRC.
Publication link: https://peerj.com/articles/15410/
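
The survival comparison described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the abstract does not say which test accompanied the Kaplan–Meier curves, so the log-rank test is used here as a common choice, and the patient data, the function names `kaplan_meier` and `logrank_p`, and the mutation calls are toy values invented for the example. Only the five gene names and the "at least one mutation in any gene" grouping come from the abstract.

```python
import math

import numpy as np


def kaplan_meier(times, events):
    """Return event times and the Kaplan-Meier survival estimate at each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    survival, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)
        deaths = np.sum((times == t) & events)
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)


def logrank_p(times, events, in_group):
    """Two-group log-rank test p-value (chi-square, 1 degree of freedom)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    in_group = np.asarray(in_group, dtype=bool)
    obs_minus_exp, variance = 0.0, 0.0
    for t in np.unique(times[events]):
        at_risk = times >= t
        n = at_risk.sum()
        n_g = (at_risk & in_group).sum()
        died = (times == t) & events
        d = died.sum()
        d_g = (died & in_group).sum()
        obs_minus_exp += d_g - d * n_g / n
        if n > 1:
            variance += d * (n_g / n) * (1 - n_g / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0  # no information to separate the groups
    stat = obs_minus_exp ** 2 / variance
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


genes = ["DOCK3", "FN1", "PAPPA2", "DNAH11", "FBN2"]
# Toy mutation calls (rows: patients, columns: genes); purely illustrative.
mutated = np.zeros((10, len(genes)), dtype=bool)
for patient in range(5):
    mutated[patient, patient] = True
any_mutated = mutated.any(axis=1)  # combined classification: >= 1 mutated gene

# Toy follow-up times (arbitrary units); every patient has an observed event.
times = np.array([1, 2, 3, 4, 5, 10, 11, 12, 13, 14], dtype=float)
events = np.ones(10, dtype=bool)

p_value = logrank_p(times, events, any_mutated)
print(f"log-rank p-value, mutated vs. wild-type: {p_value:.4f}")
```

In a real analysis, `times`, `events` and `mutated` would come from the clinical and mutation tables of the cohort, and the per-gene comparison would call `logrank_p` once per column of `mutated`.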

14:55-15:10
Oral Talk 8: Heterogeneous Domain Adaptation for Species-Agnostic Transfer Learning
Room: Pasteur Lounge
Format: Live from venue

  • Youngjun Park, Universitätsmedizin Göttingen, Germany
  • Anne-Christin Hauschild, Universitätsmedizin Göttingen, Germany


Presentation Overview: Show

Cellular communication plays a crucial role in controlling and regulating many biological processes, such as cell development and tissue functionality, and diseases, such as cancer progression. The advent of single-cell transcriptomics has enabled the study of cellular communication, and several computational tools have been developed for inferring ligand-receptor interactions.
As single-cell transcriptomics has become cheaper, more widespread and more accessible, the availability of large-scale studies (i.e. cell atlases) of increasing complexity (different conditions, different subjects and time series studies) poses new challenges in cellular communication analysis, such as i) new biological questions to answer, e.g. identifying changes in crosstalk across distinct contexts, ii) the increased computational demand and iii) the visualization and interpretation of results. Therefore, there is a need for a generalizable and scalable workflow to perform and support the interpretation of cellular communication analysis from large-scale single-cell RNA data in a user-friendly, efficient, and effective way.
DESCRIPTION
We propose CClens, a bioinformatics pipeline that first quantifies and characterizes cell-cell communication in each distinct context at both the inter- and intracellular level. Then, it enables the identification of alterations in the communication patterns across distinct contexts, exploiting ad-hoc statistical methods to work with any multi-condition scenario, including datasets where only information about the experimental condition is available (i.e. multi-condition scenario) or where it is coupled with patients' IDs (i.e. multi-patient scenario). To handle the increased computational burden, we use advanced data structures from the bigmemory R package and the possibility of integrating C++ code through the Rcpp package, achieving efficient in-memory computation and exploiting shared-memory parallelism. Lastly, an R/shiny interface offers multiple functionalities (e.g. filtering options, advanced visualization tools) to inspect, summarize and interpret complex cell-cell communication data in a user-friendly, accessible (no-code) and flexible way.
CONCLUSIONS
Single-cell transcriptomics is a young and fast-evolving research area with a huge impact on the advancement of biological data analysis, and cellular communication analysis represents an unprecedented opportunity to characterize biological systems. We believe that CClens will facilitate the analysis and interpretation of cell-cell communication, making it a valuable tool to gain new insights into the biological processes that govern a multicellular system or different experimental conditions.

15:10-15:25
Oral Talk 9: CClens: a cellular communication workflow for large-scale single-cell RNA sequencing data
Room: Pasteur Lounge
Format: Live from venue

  • Giulia Cesaro


Presentation Overview: Show

Background: Long-range interactions between regulatory elements and promoters are key in gene transcriptional control; however, their study requires large amounts of starting material, which is compatible neither with clinical scenarios nor with the study of rare cell populations.

Description: We have developed low input capture Hi-C (liCHi-C) as a cost-effective, flexible method to map and robustly compare promoter interactomes at high resolution. As proof of its broad applicability, we apply liCHi-C to study the normal and malignant human hematopoietic hierarchy in clinical samples. We demonstrate that the dynamic promoter architecture identifies developmental trajectories and orchestrates transcriptional transitions during cell-state commitment. Moreover, liCHi-C enables the identification of disease-relevant cell types, genes and pathways potentially deregulated by non-coding alterations at distal regulatory elements. Finally, we show that liCHi-C can be harnessed to uncover genome-wide structural variants, resolve their breakpoints and infer their pathogenic effects.

Conclusion: Collectively, our optimized liCHi-C method expands the study of 3D chromatin organization to unique, low-abundance cell populations, and offers an opportunity to uncover factors and regulatory networks involved in disease pathogenesis.

15:25-15:40
Oral Talk 10: Low input capture Hi-C: a method to decipher the molecular mechanisms underlying non-coding alterations
Room: Pasteur Lounge
Format: Live from venue

  • Laureano Tomás-Daza


Presentation Overview: Show

15:40-15:55
Oral Talk 11: P53 orchestrates spatio-temporal epigenome rewiring to transcriptionally prevent malignant transformation
Room: Pasteur Lounge
Format: Live from venue

  • Monica Cabrera-Pasadas, BSC-IJC, Spain


Presentation Overview: Show

Background: Cell stress and DNA damage activate the p53 transcription factor, triggering transcriptional activation of a myriad of target genes to ultimately facilitate distinct cellular outcomes, including cell cycle arrest, senescence, or apoptosis, among others. However, the molecular mechanisms underlying p53-related gene transcription regulation are not completely understood. This gap in knowledge is critical since p53 is one of the most frequently mutated genes in blood malignancies, leading to the loss of one of the first barriers that prevent malignant transformation.

Description: In this project we addressed this gap in knowledge. p53 preferentially binds enhancers. Since enhancers control transcription of target genes through physical proximity with their promoters, a proximity determined by the 3D folding of the genome within the nucleus, we aimed to study the role of spatio-temporal genome architecture in the p53 response. Specifically, we deciphered how p53 activation orchestrates the 3D epigenetic landscape to ultimately control gene transcription and block cancer development.

To study the dynamic crosstalk between spatiotemporal genome architecture, epigenetics and transcription triggered by p53, we have modelled p53 activation over time and performed a multiomics integration of Hi-C, Promoter Capture Hi-C (PCHi-C), ChIP-seq and RNA-seq data. Note that PCHi-C is the method that we previously developed to associate distal enhancers with their target genes (Javierre et al., Cell, 2016).

We demonstrated that p53 drives dramatic changes in genome architecture, including A/B compartments, Topologically Associating Domains (TADs) and DNA loops, within minutes of its activation. These changes accompanied a reconfiguration of the epigenetic landscape to ultimately trigger the p53-related transcriptional response. Then, we defined a set of functional p53 binding sites, most of them at enhancers, as previously reported. p53 activation drove the formation of new loops between p53-bound enhancers and gene promoters. However, in some cases the 3D chromatin topology was pre-established. In both cases, DNA loops allowed the propagation of the activating p53 effect from distal enhancers to promoters, ultimately leading to upregulation of gene transcription. Specifically, we associated these functional p53 binding sites at enhancers with a set of 331 distal target genes, which in most cases were not the closest gene in the linear genome (mean distance between p53-bound enhancers and target gene promoters of 153 kb). Among these, we not only identified examples of previously known p53 target genes (e.g., TP53INP1, PLK2) but also identified potentially new direct target genes and pathways distally controlled by p53.

Conclusion: Collectively, our results demonstrate that p53 activation dramatically reshapes the promoter-enhancer interactome landscape to ultimately control the transcriptional response. In addition, this study provides the first set of genes and pathways distally controlled by p53 and suggests candidate non-coding regions (i.e., p53-bound functional enhancers and linked promoters) that can be mutated or epimutated in blood malignancies, leading to an aberrant p53 response in a wild-type p53 condition.
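
The two summary statistics quoted in this abstract (the mean enhancer-to-promoter distance and how often the looped target is not the closest gene) can be illustrated with a small sketch. Everything in it is hypothetical: the coordinates, the loop list, the single-chromosome simplification, and the function names `nearest_promoter` and `loop_summary` are assumptions for illustration and do not reproduce the authors' data or analysis.

```python
import numpy as np


def nearest_promoter(enhancer_pos, promoter_pos):
    """Index of the promoter closest to each enhancer on the linear genome."""
    enh = np.asarray(enhancer_pos)[:, None]
    prom = np.asarray(promoter_pos)[None, :]
    return np.argmin(np.abs(enh - prom), axis=1)


def loop_summary(enhancer_pos, promoter_pos, loops):
    """Mean loop span and fraction of loops that skip the nearest promoter.

    loops is a list of (enhancer index, promoter index) pairs, e.g. derived
    from promoter capture Hi-C interactions on a single chromosome.
    """
    enh = np.asarray(enhancer_pos)
    prom = np.asarray(promoter_pos)
    e_idx, p_idx = np.asarray(loops).T
    distances = np.abs(enh[e_idx] - prom[p_idx])
    skips_nearest = nearest_promoter(enh, prom)[e_idx] != p_idx
    return float(distances.mean()), float(skips_nearest.mean())


# Toy coordinates in base pairs (assumption: all on one chromosome).
enhancers = [100_000, 500_000, 900_000]
promoters = [120_000, 350_000, 700_000, 1_000_000]
loops = [(0, 1), (1, 2), (2, 3)]  # enhancer i contacts promoter j

mean_distance, fraction_not_nearest = loop_summary(enhancers, promoters, loops)
print(f"mean enhancer-promoter distance: {mean_distance / 1000:.1f} kb")
print(f"fraction of targets that are not the nearest gene: {fraction_not_nearest:.2f}")
```

With real data the same calculation would be run per chromosome, since distances between elements on different chromosomes are not defined on the linear genome.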

15:55-16:25
Exploring posters and discussion with presenters
Room: Pasteur Lounge
Format: Live from venue

16:25-16:35
Introducing ISCB Student Council Activities
Room: Pasteur Lounge
Format: Live from venue

16:35-17:35
Panel: Round Table Discussion: Exploring the Potential of AI in Revolutionizing Bioinformatics Research: Opportunities and Challenges
Room: Pasteur Lounge
Format: Live from venue

17:35-17:55
Announcements of the Prize winners & Closing remarks
Room: Pasteur Lounge
Format: Live from venue

17:55-18:00
All on stage for picture/photo of the event
Room: Pasteur Lounge
Format: Live from venue