Analysis and prediction of RuBisCO kinetics using deep learning
Confirmed Presenter: Aleksey Porollo, Cincinnati Children's Hospital Medical Center, Cincinnati
Track: 3DSIG
Room: 520a
Format: In Person
Moderator(s): Alexander Monzon
Authors List:
- Om Jadhav, College of Engineering and Applied Sciences
- Tatyana Belenkaya, College of Medicine
- Marat Khodoun, Cincinnati Children's Hospital Medical Center
- Aleksey Porollo, Cincinnati Children's Hospital Medical Center
Presentation Overview:
This study focuses on enhancing the efficiency of the Calvin cycle by targeting the kinetic parameters of its key enzyme, ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO). RuBisCO's slow catalytic rate (kcat) and its specificity for CO₂ over O₂ (Sc/o) substantially limit photosynthetic efficiency, particularly under high CO₂ levels and light intensities. To address this, we analyzed 175 RuBisCO complexes with experimentally measured kinetic parameters, using the protein language model ProtT5 to generate sequence embeddings. These embeddings were then passed to several machine learning models (Ridge regression, LASSO regression, SVM, and Random Forest regression) to predict kcat and Sc/o. Under leave-one-out cross-validation, the Ridge regression models performed best, achieving a Pearson correlation coefficient of 0.611 and R² of 0.359 for kcat, and a Pearson correlation coefficient of 0.814 and R² of 0.663 for Sc/o. We then applied these models to predict kinetic parameters for 56,379 non-annotated RuBisCO sequences. Top-performing sequences from both the experimentally annotated and the predicted datasets underwent in silico mutagenesis using a genetic algorithm. This mutagenesis targeted either any sequence position or only those positions lining the active site cavity, excluding the catalytic sites. Conducted over 10 iterations in 5 independent runs with 5000 mutants each, this approach yielded maximum predicted kcat values of 12 s⁻¹ and 10 s⁻¹ from full-sequence and cavity-targeted mutagenesis, respectively, a 2-fold improvement over natural enzymes. Our results highlight the potential of computational tools and genetic algorithms for the rational design of RuBisCO, with the aim of improving photosynthetic efficiency and agricultural productivity while contributing to climate change mitigation and renewable energy development.
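The in silico mutagenesis loop described in the overview can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scoring function `toy_score` is a hypothetical stand-in for a trained predictor (the study used Ridge regression on ProtT5 embeddings), the loop uses only point mutation and selection with no crossover, and the seed sequence, frozen positions, and all parameter names are assumptions made for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def toy_score(seq):
    """Hypothetical stand-in for a trained predictor; NOT the authors' model.

    It simply counts alanines so the example runs without any ML dependency.
    """
    return float(seq.count("A"))


def mutate(seq, allowed, rng):
    """Substitute one randomly chosen allowed position with a different residue."""
    pos = rng.choice(allowed)
    new_aa = rng.choice([aa for aa in AMINO_ACIDS if aa != seq[pos]])
    return seq[:pos] + new_aa + seq[pos + 1:]


def evolve(seed_seq, score, positions=None, frozen=(),
           iterations=10, n_mutants=5000, n_keep=50, rng=None):
    """Mutation-plus-selection loop (no crossover, for brevity)."""
    rng = rng or random.Random(0)
    frozen_set = set(frozen)
    candidates = range(len(seed_seq)) if positions is None else positions
    allowed = [p for p in candidates if p not in frozen_set]
    if not allowed:
        raise ValueError("no mutable positions left after freezing")
    parents = [seed_seq]
    for _ in range(iterations):
        mutants = [mutate(rng.choice(parents), allowed, rng)
                   for _ in range(n_mutants)]
        # Parents compete with their mutants, so the best score never drops.
        pool = set(parents) | set(mutants)
        # Sort by score, then by sequence, so ties break deterministically.
        parents = sorted(pool, key=lambda s: (-score(s), s))[:n_keep]
    return parents[0]


seed = "MSPQTETKASVGFKAGVKDY"  # arbitrary illustrative sequence
frozen = (9, 13)  # hypothetical catalytic positions, kept unchanged
best = evolve(seed, toy_score, frozen=frozen, n_mutants=200)
print(best, toy_score(best))
```

Passing `positions=[...]` restricts mutation to a chosen subset, which mirrors the cavity-targeted variant, while the default mutates any position outside `frozen`. Repeating the call with different `random.Random` seeds would correspond to independent runs.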