The SciFinder tool lets you search Titles, Authors, and Abstracts of talks and panels. Enter your search term below and your results will be shown at the bottom of the page. You can also click on a track to see all the talks given in that track on that day.

July 12, 2024
July 13, 2024
July 14, 2024
July 15, 2024
July 16, 2024

Results

July 15, 2024
10:40-11:20
Invited Presentation: 50 Years of Protein Structures & Structural Bioinformatics
Confirmed Presenter: Janet M Thornton, EMBL-EBI, UK
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Rafael Najmanovich


Authors List:

  • Janet M Thornton, EMBL-EBI
  • Roman A Laskowski, EMBL-EBI
  • Sameer Velankar, EMBL-EBI

Presentation Overview:

The last 50 years have seen a revolution in our understanding of proteins and how they work in 3D. This has been enabled by the development of many new technologies: protein production, crystallisation with robots, synchrotrons that collect very high-resolution data, structure determination by NMR, and the more recent developments in cryo-electron microscopy and tomography. These experimental developments have been matched by the development of sophisticated computational tools and databases, running on powerful computers, that help in determining structures and in curating, analysing, comparing and predicting them.

In this talk I will focus on our collective progress in understanding more about these molecules of life, from the handful of structures determined in 1974 to our current knowledge of the complex world of proteins. I will conclude by describing some of our own recent work on exploring enzyme catalysis.

I will highlight:
· Our current knowledge of the universe of protein structures
· The development of tools for annotating structures
· wwPDB & PDBe; EMDB & EMPIAR, AFDB
· Protein structure prediction & AI
· Computational Enzymology
· The impact & the future?

July 15, 2024
11:20-11:40
Democratizing Protein Language Models with Parameter-Efficient Fine-Tuning
Confirmed Presenter: Samuel Sledzieski, Massachusetts Institute of Technology, United States
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Rafael Najmanovich


Authors List:

  • Samuel Sledzieski, Massachusetts Institute of Technology
  • Meghana Kshirsagar, AI for Good Research Lab
  • Minkyung Baek, Seoul National University
  • Rahul Dodhia, AI for Good Research Lab
  • Juan Lavista Ferres, AI for Good Research Lab
  • Bonnie Berger, Massachusetts Institute of Technology

Presentation Overview:

Proteomics has been revolutionized by large protein language models (PLMs), which learn unsupervised representations from large corpora of sequences. These models are typically fine-tuned in a supervised setting to adapt the model to specific downstream tasks. However, the computational and memory footprint of fine-tuning large PLMs presents a barrier for many research groups with limited computational resources. Natural language processing has seen a similar explosion in the size of models, where these challenges have been addressed by methods for parameter-efficient fine-tuning (PEFT). In this work, we introduce this paradigm to proteomics by leveraging the parameter-efficient method LoRA and training new models for two important tasks: predicting protein-protein interactions (PPIs) and predicting the symmetry of homooligomer quaternary structures. We show that these approaches are competitive with traditional fine-tuning while requiring reduced memory and substantially fewer parameters. We additionally show that for the PPI prediction task, training only the classification head also remains competitive with full fine-tuning, using five orders of magnitude fewer parameters, and that each of these methods outperforms state-of-the-art PPI prediction methods with substantially reduced compute. We further perform a comprehensive evaluation of the hyperparameter space, demonstrate that PEFT of PLMs is robust to variations in these hyperparameters, and elucidate where best practices for PEFT in proteomics differ from those in natural language processing. All our model adaptation and evaluation code is available open-source at https://github.com/microsoft/peft_proteomics. Thus, we provide a blueprint to democratize the power of protein language model adaptation to groups with limited computational resources.
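
To make the parameter savings concrete, the following minimal sketch counts trainable parameters for a single weight matrix under full fine-tuning and under LoRA. LoRA freezes a d × k weight matrix and trains only a low-rank update B·A, with B of shape d × r and A of shape r × k. The layer size and rank used here (1280 and 8) are illustrative assumptions, not values reported in the abstract.

```python
def full_finetune_params(d: int, k: int) -> int:
    """Trainable parameters when the whole d x k weight matrix is updated."""
    return d * k


def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA update B @ A.

    B has shape (d, r) and A has shape (r, k); the original matrix stays frozen.
    """
    return r * (d + k)


# Hypothetical layer size and rank, chosen only for illustration.
d, k, r = 1280, 1280, 8
full = full_finetune_params(d, k)
lora = lora_params(d, k, r)
reduction = full / lora
print(f"full: {full:,}  LoRA: {lora:,}  reduction: {reduction:.0f}x")
```

The saving grows as the rank shrinks relative to the layer width, which is why a small rank is usually the first hyperparameter to explore.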

July 15, 2024
11:40-12:00
EquiPNAS: improved protein–nucleic acid binding site prediction using protein-language-model-informed equivariant deep graph neural networks
Confirmed Presenter: Debswapna Bhattacharya, Virginia Tech, United States
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Rafael Najmanovich


Authors List:

  • Rahmatullah Roche, Virginia Tech
  • Bernard Moussad, Virginia Tech
  • Md Hossain Shuvo, Virginia Tech
  • Sumit Tarafder, Virginia Tech
  • Debswapna Bhattacharya, Virginia Tech

Presentation Overview:

Protein language models (pLMs) trained on a large corpus of protein sequences have shown unprecedented scalability and broad generalizability in a wide range of predictive modeling tasks, but their power has not yet been harnessed for predicting protein–nucleic acid binding sites, critical for characterizing the interactions between proteins and nucleic acids. Here, we present EquiPNAS, a new pLM-informed E(3) equivariant deep graph neural network framework for improved protein–nucleic acid binding site prediction. By combining the strengths of pLM and symmetry-aware deep graph learning, EquiPNAS consistently outperforms the state-of-the-art methods for both protein–DNA and protein–RNA binding site prediction on multiple datasets across a diverse set of predictive modeling scenarios ranging from using experimental input to AlphaFold2 predictions. Our ablation study reveals that the pLM embeddings used in EquiPNAS are sufficiently powerful to dramatically reduce the dependence on the availability of evolutionary information without compromising on accuracy, and that the symmetry-aware nature of the E(3) equivariant graph-based neural architecture offers remarkable robustness and performance resilience. EquiPNAS is freely available at https://github.com/Bhattacharya-Lab/EquiPNAS.
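
As a small illustration of the symmetry that E(3)-aware architectures build on, the sketch below checks that pairwise distances between points do not change when the whole structure is rotated and translated. It is not EquiPNAS code: the coordinates are hypothetical, and only a rotation about the z axis plus a translation is tested (E(3) also includes reflections).

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return distances between all pairs of 3D points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def rotate_z_then_translate(coords, angle, shift):
    """Apply a rotation about the z axis followed by a translation."""
    c, s = math.cos(angle), math.sin(angle)
    moved = []
    for x, y, z in coords:
        moved.append((c * x - s * y + shift[0],
                      s * x + c * y + shift[1],
                      z + shift[2]))
    return moved


# Hypothetical residue coordinates (in angstroms), for illustration only.
residues = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.1, 3.5, 0.4), (2.0, 6.0, 1.5)]
transformed = rotate_z_then_translate(residues, angle=1.1, shift=(10.0, -4.0, 2.5))

before = pairwise_distances(residues)
after = pairwise_distances(transformed)
unchanged = all(math.isclose(p, q, abs_tol=1e-9) for p, q in zip(before, after))
print("distances unchanged:", unchanged)
```

Features built from such distances give the same prediction regardless of how the input structure happens to be oriented.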

July 15, 2024
12:00-12:20
Accurate High-throughput Cryptic Binding Site Prediction Using Protein Language Model
Confirmed Presenter: Shuo Zhang, The City University of New York, United States
Track: 3DSIG

Room: 520a
Format: Live Stream
Moderator(s): Rafael Najmanovich


Authors List:

  • Shuo Zhang, The City University of New York
  • Lei Xie, The City University of New York

Presentation Overview:

Identification of cryptic binding sites of proteins is an important but challenging task for understanding the function of proteins and screening potential drugs for proteins currently considered undruggable. Existing methods usually require 3D protein structures from resource-intensive molecular dynamics (MD) simulations or are too slow to be adopted in high-throughput compound screening. To tackle these limitations, we propose LaMPSite, which takes only protein sequences and ligand molecular graphs as input for cryptic binding site prediction. Without any 3D coordinate information of proteins, our proposed model is not only 100 to 1000 times faster than baseline methods that require 3D protein structures from time-consuming MD simulations or generative binding complex structures, but also more accurate. Given its efficiency and accuracy, LaMPSite is a promising candidate for application to drug discovery.

July 15, 2024
14:20-14:40
Contrastive learning in protein language space predicts interactions between drugs and protein targets
Confirmed Presenter: Rohit Singh, Duke University, United States
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Douglas Pires


Authors List:

  • Rohit Singh, Duke University
  • Samuel Sledzieski, Massachusetts Institute of Technology
  • Bryan Bryson, Massachusetts Institute of Technology
  • Lenore Cowen, Tufts University
  • Bonnie Berger, Massachusetts Institute of Technology

Presentation Overview:

Experimental screening of potential drug molecules against protein targets is a key bottleneck in the drug discovery pipeline. Fast and accurate computational prediction of drug-target interactions (DTIs) could significantly accelerate this process. However, current sequence-based DTI prediction methods struggle to achieve broad generalization and high specificity while remaining computationally efficient. We develop ConPLex, a deep learning model that successfully leverages the advances in pretrained protein language models ("PLex") and employs a protein-anchored contrastive coembedding ("Con") to outperform state-of-the-art approaches. ConPLex makes predictions of binding based on the distance between learned representations, achieving high accuracy, broad adaptivity to unseen data, and specificity against decoy compounds. Experimental validation yielded a 63% hit rate, including four hits with subnanomolar affinity and a novel strongly-binding EPHB1 inhibitor (KD = 1.3 nM). ConPLex is extremely fast, capable of making 100 million predictions per day on a single GPU, enabling predictions at the scale of massive compound libraries and the human proteome. The contrastive approach and the shared embedding space also provide interpretability, allowing visualization of drug-target relationships and functional characterization of cell-surface proteins. ConPLex has the potential to efficiently guide and prioritize candidates for experimental screening, unlocking significant value in the drug discovery process.

Availability: https://conplex.csail.mit.edu/

Source code: https://github.com/samsledje/ConPLex

Paper: Singh, Sledzieski, Bryson, Cowen, & Berger. PNAS, 120(24) (2023).
https://www.pnas.org/doi/full/10.1073/pnas.2220778120
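
The idea of predicting binding from closeness in a shared embedding space can be sketched as follows. This is an illustrative toy, not the ConPLex implementation: the embeddings are made-up three-dimensional vectors, the compound names are hypothetical, and cosine similarity is assumed as the closeness measure. The last line simply converts the throughput quoted in the abstract into predictions per second.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_compounds(target_embedding, compound_embeddings):
    """Sort compounds from most to least similar to the target embedding."""
    scored = [(name, cosine_similarity(target_embedding, emb))
              for name, emb in compound_embeddings.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical embeddings in a shared space, for illustration only.
target = [0.9, 0.1, 0.3]
compounds = {
    "compound_a": [0.8, 0.2, 0.4],
    "compound_b": [-0.5, 0.9, 0.1],
    "compound_c": [0.1, -0.7, 0.6],
}
ranking = rank_compounds(target, compounds)
for name, score in ranking:
    print(f"{name}: {score:.3f}")

# 100 million predictions per day, expressed per second.
predictions_per_second = 100_000_000 / (24 * 60 * 60)
print(f"{predictions_per_second:.0f} predictions per second")
```

Because scoring reduces to one vector comparison per pair, target and compound embeddings can be computed once and reused across an entire library.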

July 15, 2024
14:40-15:00
NRGDock: An open-source software for ultra-massive high-throughput virtual screening
Confirmed Presenter: Thomas Descoteaux, Université de Montréal, Canada
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Douglas Pires


Authors List:

  • Thomas Descoteaux, Université de Montréal
  • Olivier Mailhot, Université de Montréal
  • Rafael Najmanovich, Université de Montréal

Presentation Overview:

Here we present NRGDock, easy-to-use Python-based docking software that requires less than 0.5 CPU second per molecule. With this speed, a modern laptop can dock 1 000 000 molecules in 24 hours. Its scoring function is based on that of FlexAID, and it uses an exhaustive search procedure. NRGDock has been benchmarked against the widely used DUD-E benchmarking dataset and obtained median enrichment factors similar to those of AutoDock Vina and Glide. Furthermore, NRGDock performs well on protein structures generated by AlphaFold, where residue positioning may not be modelled precisely. To validate the performance of NRGDock in high-throughput virtual screening, testing was conducted on 102 DUD-E targets against 48.3 million compounds from the Enamine Real Diversity Subset (ERDS), for a total of 4.9 billion docking simulations. A clear separation in scores was observed, with true binders getting significantly better scores than the ERDS molecules. Lastly, we screened the protein kinase PIM-1, associated with triple-negative breast cancer, and the related kinases PIM-2 and PIM-3 against the ERDS library. We show that dissimilar top-scoring compounds, unique to each related target, can be identified.
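
The throughput figures in this abstract can be checked with a few lines of arithmetic. The sketch below takes the quoted upper bound of 0.5 CPU seconds per molecule and assumes perfect parallel scaling across cores, which is a simplification of mine rather than a claim from the abstract.

```python
SECONDS_PER_DAY = 24 * 60 * 60

cpu_seconds_per_molecule = 0.5   # upper bound quoted in the abstract
molecules = 1_000_000

# Total CPU time for the one-million-molecule screen.
total_cpu_seconds = molecules * cpu_seconds_per_molecule
cpu_hours = total_cpu_seconds / 3600

# Cores needed to finish in 24 hours, assuming ideal parallel scaling.
cores_needed = total_cpu_seconds / SECONDS_PER_DAY

# Size of the large-scale validation screen.
targets = 102
library_size = 48_300_000
total_dockings = targets * library_size

print(f"CPU hours: {cpu_hours:.1f}")
print(f"cores for 24 h: {cores_needed:.2f}")
print(f"total dockings: {total_dockings:,}")
```

At the upper bound, the one-million-molecule screen needs about 139 CPU hours, so finishing in a day takes roughly six cores in parallel, and 102 targets against 48.3 million compounds gives about 4.93 billion dockings, consistent with the 4.9 billion stated.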

July 15, 2024
15:00-15:20
Proceedings Presentation: Enhancing Generalizability and Performance in Drug-Target Interaction Identification by Integrating Pharmacophore and Pre-trained Models
Confirmed Presenter: Zuolong Zhang, Henan University, China
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Douglas Pires


Authors List:

  • Zuolong Zhang, Henan University
  • Gang Luo, Nanchang University
  • Shengbo Chen, Henan University
  • Xin He, Henan University
  • Dazhi Long, Ji'an Third People's Hospital

Presentation Overview:

In drug discovery, it is crucial to assess the drug-target binding affinity. Although molecular docking is widely used, computational efficiency limits its application in large-scale virtual screening. Deep learning-based methods learn virtual scoring functions from labeled datasets and can quickly predict affinity. However, there are three limitations. First, existing methods only consider the atom-bond graph or one-dimensional sequence representations of compounds, ignoring the information about functional groups (pharmacophores) with specific biological activities. Second, relying on limited labeled datasets fails to learn comprehensive embedding representations of compounds and proteins, resulting in poor generalization performance in complex scenarios. Third, existing feature fusion methods cannot adequately capture contextual interaction information. Therefore, we propose a novel drug-target binding affinity prediction method named HeteroDTA. Specifically, a multi-view compound feature extraction module is constructed to model the atom-bond graph and pharmacophore graph. The residue concat graph and protein sequence are also utilized to model protein structure and function. Moreover, to enhance the generalization capability and reduce the dependence on task-specific labeled data, pre-trained models are utilized to initialize the atomic features of the compounds and the embedding representations of the protein sequence. A context-aware nonlinear feature fusion method is also proposed to learn interaction patterns between compounds and proteins. Experimental results on public benchmark datasets show that HeteroDTA significantly outperforms existing methods. In addition, HeteroDTA shows excellent generalization performance in cold-start experiments and superiority in the representation learning ability of drug-target pairs. Finally, the effectiveness of HeteroDTA is demonstrated in a real-world drug discovery study.

July 15, 2024
15:20-15:40
DOCKGROUND: a new release of the long-standing resource for studying protein recognition
Confirmed Presenter: Petras Kundrotas, The University of Kansas, United States
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Douglas Pires


Authors List:

  • Petras Kundrotas, The University of Kansas
  • Keeley Collins, The University of Kansas
  • Matthew Copeland, The University of Kansas
  • Ian Kothoff, The University of Kansas
  • Amar Singh, The University of Kansas
  • Marc Lensink, University of Lille
  • Ilay Vakser, The University of Kansas

Presentation Overview:

Artificial intelligence (AI) has transformed the field of computational structural biology. Modeled structures of globular proteins are now accurate enough for computer-aided drug design. Structural prediction of protein-protein (PP) complexes (protein docking) has also been significantly advanced. Still, there is a constant need for re-training of the complex network models on newer data. Technical progress rapidly accelerates accumulation of data by various experimental techniques. Thus, static datasets quickly become obsolete. So far, significant efforts in generating reliable, up-to-date datasets have focused on individual macromolecules, while their assemblies have attracted less attention, mainly due to the complexity of the task. Here, we present a full revamp of our well-established DOCKGROUND resource for studying protein recognition (http://dockground.compbio.ku.edu). The resource contains comprehensive sets of data needed for the development and testing of protein docking techniques, including AI-based methods: bound and unbound (experimentally determined and simulated) structures of PP complexes, model-model complexes, docking decoys of experimentally determined and modeled proteins, and templates for comparative protein docking. The core dataset of bound PP structures, from which other sets are derived, is automatically updated on a weekly basis. We also implemented a new DOCKGROUND interactive interface that allows generating custom non-redundant datasets using various parameters and provides structure visualization. The DOCKGROUND resource also incorporates the docking model quality assessment tool CAPRI-Q, which utilizes CAPRI criteria and other quality metrics such as DockQ, TM-score and l-DDT.

July 15, 2024
15:40-16:00
On finding the right match – a structural perspective
Confirmed Presenter: Marian Novotny, Charles University, Faculty of Science
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Douglas Pires


Authors List:

  • Christos Feidakis, Charles University
  • Radoslav Krivak, Charles University
  • Vit Skrhak, Charles University
  • David Hoksza, Charles University
  • Marian Novotny, Charles University

Presentation Overview:

Proteins can assume a number of 3D structural conformations during their lifetime and many of them can undergo a substantial conformational change that might be crucial for their function, e.g., during ligand binding.
Many machine learning methods that use 3D structural information are trained on just a single structure of the protein. The single structure, however, does not have to represent the protein fully and it can even be misleading. For example, a ligand-binding site prediction method may be trained on a conformation that is already binding a ligand (holo structure), whereas the prediction makes more sense for a conformation without a bound ligand (apo structure).
To help avoid potential biases in building datasets, we have developed a tool called AHoJ (www.apoholo.cz) to identify apo-holo structure pairs for user-defined binding sites and post-translational modifications. We have also developed AHoJ-DB (www.apoholo.cz/db), a database of apo-holo structure pairs for biologically relevant ligands as defined in the BioLiP2 database. Both services have easy-to-use interfaces and provide metrics of the similarity of binding sites between apo and holo structures, which can be used for further downstream analysis or development of derived datasets. An analysis of AHoJ-DB shows that apo structures are not available for more than 50% of the experimentally described binding sites. We used AHoJ-DB to build CryptoBench, a dataset of cryptic binding sites, which consists of 1437 apo structures and is the most extensive collection of its kind to date.

July 15, 2024
16:40-17:00
Explaining Conformational Diversity in Protein Families through Molecular Motion
Confirmed Presenter: Valentin Lombard, Sorbonne University, France
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Alexander Monzon


Authors List:

  • Valentin Lombard, Sorbonne University
  • Sergei Grudinin, Université Grenoble Alpes
  • Elodie Laine, Sorbonne Université

Presentation Overview:

Proteins play a central role in biological processes, and understanding their conformational variability is crucial for unraveling their functional mechanisms. Recent advancements in high-throughput technologies have enhanced our knowledge of protein structures, yet predicting their multiple conformational states and motions remains challenging. This study introduces Dimensionality Analysis for protein Conformational Exploration (DANCE) for a systematic and comprehensive description of the conformational variability of protein families. DANCE accommodates both experimental and predicted structures. It is suitable for analyzing anything from single proteins to superfamilies. Employing it, we clustered all experimentally resolved protein structures available in the Protein Data Bank into conformational collections and characterized them as sets of linear motions. The resource facilitates access to and exploitation of the multiple states adopted by a protein and its homologs. Beyond descriptive analysis, we assessed classical dimensionality reduction techniques for sampling unseen states on a representative benchmark. This work improves our understanding of how proteins deform to perform their functions and opens ways to a standardized evaluation of methods designed to sample and generate protein conformations.
In brief, the main contributions of our work are the following:
1. A pipeline was constructed for systematic analysis of protein conformational variability.
2. Datasets of protein ensembles and extracted linear motions have been made publicly accessible.
3. The ability of classical manifold learning methods, including PCA and kPCA, to capture the diversity of protein conformational states was evaluated.

July 15, 2024
16:40-17:00
Pathway of transition for HIV-1 envelope trimer from prefusion-closed to CD4-bound open through an occluded-intermediate state
Confirmed Presenter: Myungjin Lee, National Institutes of Health, United States
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Alexander Monzon


Authors List:

  • Myungjin Lee, National Institutes of Health
  • Maolin Lu, University of Texas at Tyler Health Science Center
  • Baoshan Zhang, National Institutes of Health
  • Tongqing Zhou, National Institutes of Health
  • Revansiddha Katte, University of Texas at Tyler Health Science Center
  • Yang Han, University of Texas at Tyler Health Science Center
  • Reda Rawi, National Institutes of Health
  • Peter D. Kwong, Columbia University Vagelos College of Physicians and Surgeons

Presentation Overview:

HIV entry into host cells is initiated by the engagement of the gp120 subunit of the HIV-1 envelope (Env) trimer with the cellular receptor CD4. This interaction induces substantial structural changes in the HIV-1 Env trimer. Although static structural information exists for both the prefusion-closed and the CD4-bound prefusion open trimer, the complete transition pathway between these static states (such as transition structures) remains uncharacterized. In this study, we investigated the transition of a fully and site-specifically glycosylated HIV-1 Env trimer between prefusion-closed and CD4-bound open conformations using a special molecular dynamics simulation technique – collective MD simulation (coMD). Here, we identified a transition intermediate – the occluded intermediate state. Previously reported antibodies Ab1303, Ab1573, b12, and DH851.3 recognized this intermediate. Additionally, we validated the result experimentally by single-molecule Förster resonance energy transfer analysis, confirming that each of these four antibodies induces and stabilizes this distinct intermediate state of Env on the virus, replacing the CD4-bound open state. Overall, our findings using coMD simulation delineate a transition pathway between prefusion-closed and CD4-bound open conformations, unveiling the occluded intermediate as a prevalent intermediate state.

July 15, 2024
17:00-17:20
Analysis and prediction of RuBisCO kinetics using deep learning
Confirmed Presenter: Aleksey Porollo, Cincinnati Children's Hospital Medical Center, Cincinnati
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Alexander Monzon


Authors List:

  • Om Jadhav, College of Engineering and Applied Sciences
  • Tatyana Belenkaya, College of Medicine
  • Marat Khodoun, Cincinnati Children's Hospital Medical Center
  • Aleksey Porollo, Cincinnati Children's Hospital Medical Center

Presentation Overview:

This study focuses on enhancing the efficiency of the Calvin cycle by targeting the kinetic parameters of its key enzyme, Ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO). RuBisCO's slow catalytic rate (Kcat) and its specificity for CO₂ over O₂ (Sc/o) substantially limit photosynthetic efficiency, particularly under high CO₂ levels and light intensities. To address this, we analyzed 175 RuBisCO complexes with experimentally measured kinetic parameters using the protein language model ProtT5 for sequence embeddings. These embeddings were then processed through various machine learning models (Ridge regression, LASSO regression, SVM, and Random Forest regression) to predict Kcat and Sc/o. The Ridge regression models performed best under leave-one-out cross-validation, achieving a Pearson correlation coefficient of 0.611 and R² of 0.359 for Kcat, and a Pearson correlation coefficient of 0.814 and R² of 0.663 for Sc/o. Further, we applied these models to predict kinetic parameters for 56,379 non-annotated RuBisCO sequences. Top-performing sequences from both experimentally annotated and predicted datasets underwent in silico mutagenesis using a genetic algorithm. This mutagenesis targeted either any sequence position or specifically those lining the active site cavity, excluding the catalytic sites. Conducted over 10 iterations in 5 independent runs with 5000 mutants each, this approach yielded a maximum predicted Kcat of 12 s⁻¹ and 10 s⁻¹ from full sequence and cavity-targeted mutagenesis, respectively, a 2-fold improvement over natural enzymes. Our results highlight the potential of using computational tools and genetic algorithms for the rational design of RuBisCO, aiming to improve photosynthetic efficiency and agricultural productivity while contributing to climate change mitigation and renewable energy development.
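
The evaluation protocol described here (ridge regression scored by leave-one-out cross-validation, reporting both Pearson correlation and R²) can be sketched in a few lines. The example below is a simplified stand-in: it uses a single synthetic feature instead of ProtT5 embeddings, made-up data points, and an assumed regularization strength of 1.0. It also shows why the two metrics are reported separately: R² computed on held-out predictions need not equal the squared Pearson correlation, and it can never exceed it.

```python
import math


def ridge_fit_1d(xs, ys, lam):
    """Closed-form ridge fit y ~ a + b*x with an unpenalized intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / (sxx + lam)
    return my - b * mx, b


def leave_one_out_predictions(xs, ys, lam):
    """Predict each point from a model trained on all the other points."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = ridge_fit_1d(train_x, train_y, lam)
        preds.append(a + b * xs[i])
    return preds


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Synthetic data for illustration; not RuBisCO measurements.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [2.1, 2.9, 3.2, 4.8, 4.1, 6.0, 5.7, 7.4]

preds = leave_one_out_predictions(x, y, lam=1.0)
r = pearson_r(y, preds)
r2 = r_squared(y, preds)
print(f"Pearson r = {r:.3f}, r squared = {r * r:.3f}, R2 = {r2:.3f}")
```

With only 175 annotated complexes, leave-one-out is a natural choice because every model is trained on all but one of the available measurements.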

July 15, 2024
17:20-17:40
Understanding and predicting ligand efficacy in the mu-opioid receptor through quantitative dynamical analysis of complex structures
Confirmed Presenter: Gabriel Galdino, University of Montreal, Canada
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Alexander Monzon


Authors List:

  • Gabriel Galdino, University of Montreal
  • Olivier Mailhot, University of California
  • Rafael Najmanovich, University of Montreal

Presentation Overview:

GPCRs are a family of membrane proteins that regulate many biological processes and are attractive targets for drug development, representing approximately 1/3 of global marketed drugs. We docked a set of ligands with known Emax for GTP-gammaS binding to a crystal structure of the active Mu (MOR) and Kappa (KOR) Opioid Receptors. Using a coarse-grained approach, we applied normal mode analysis to calculate Dynamical Signatures of different ligand/GPCR complexes, identifying local and global changes in flexibility of different residues upon ligand binding. We used LASSO multiple linear regression to determine crucial residues in contact with the set of ligands and to obtain predictors of the efficacy of new drug candidates as agonists, antagonists, or partial agonists.
We obtained a ROC AUC > 0.85 when analysing the performance of the model as a binary classifier. By analyzing the coefficients of these predictors, we identified positions of high importance to receptor activation, such as L85 (Ballesteros-Weinstein position 1.47), where mutations are reported to affect morphine response in MOR, and positions with no known mutations reported, such as K305 (6.58) for MOR. Our study provides insights into the dynamics and structural features of ligand binding to GPCRs and represents a new tool for predicting the efficacy of new drug candidates that can be coupled to high-throughput screening.
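
For readers unfamiliar with the metric, ROC AUC can be computed directly as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below uses hypothetical labels and scores; how the authors binarized ligand efficacy is not stated in the abstract, so the class assignment here is purely an assumption.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positives and negatives.

    Counts the fraction of (positive, negative) pairs in which the positive
    has the higher score; ties count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = one efficacy class, 0 = the other.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.92, 0.81, 0.66, 0.40, 0.55, 0.35, 0.20, 0.10]
auc = roc_auc(labels, scores)
print(f"ROC AUC = {auc}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.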

July 15, 2024
17:40-18:00
Dynamic network analysis of protein structural change
Confirmed Presenter: Aydin Wells, University of Notre Dame, United States
Track: 3DSIG

Room: 520a
Format: In Person
Moderator(s): Alexander Monzon


Authors List:

  • Aydin Wells, University of Notre Dame
  • Siyu Yang, University of Notre Dame
  • Khalique Newaz, University of Hamburg
  • Tijana Milenkovic, University of Notre Dame

Presentation Overview:

A protein’s sequence folds into a 3D structure, which directs what other proteins it may interact with to carry out cellular function. Hence, analyses of protein structures are critical for understanding protein functions. Because functions of many proteins remain unknown, computational approaches for linking proteins’ structures to functions are necessary.

Our lab previously used network-based methods to model protein structures as protein structure networks (PSNs). Graph-based analyses of these PSNs proved to be superior to state-of-the-art sequence and non-network-based 3D structural approaches in the task of protein structure classification (PSC). However, traditional PSN approaches (including ours) modeled whole, native protein 3D structures as static PSNs that overlook the protein folding dynamics. To overcome this, we recently proposed a dynamic PSN idea. Unfortunately, there is a lack of data on 3D sub-structural configurations (or intermediates) of a protein as it undergoes folding to attain its native structure. So, we had to resort to modeling native structures of proteins as dynamic PSNs. Nonetheless, even this yielded significant improvements in the PSC task over modeling the native structures as static PSNs.
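
A static PSN of the kind described above can be sketched as a residue contact graph. The abstract does not say how edges are defined, so this example assumes one common convention: residues are nodes, and an edge joins two residues whose representative atoms lie within a distance cutoff (8 Å here, an assumed value). The coordinates are hypothetical, and the dynamic PSN construction is not modeled.

```python
import math


def build_psn(coords, cutoff=8.0):
    """Build an undirected residue contact network.

    Nodes are residue indices; an edge joins two residues whose
    coordinates lie within `cutoff` angstroms of each other.
    """
    edges = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                edges.add((i, j))
    return edges


def degrees(n_nodes, edges):
    """Number of contacts per residue."""
    deg = [0] * n_nodes
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


# Hypothetical residue coordinates (in angstroms), for illustration only.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (3.8, 5.0, 0.0),
]
edges = build_psn(coords)
deg = degrees(len(coords), edges)
print(sorted(edges))
print(deg)
```

Graph features such as node degree can then feed a classifier, and building one such network per structural snapshot would be one way to approximate a dynamic view.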

Most recently, as an even better proxy for studying protein folding dynamics than our recent PSC study, we have identified sufficiently large experimental data capturing how the structure of a protein changes dynamically before vs. after the protein is bound to a ligand. We aim to examine how well dynamic PSN analyses of these data can explain seven different types of protein structural change observed in the data.