A multiscale functional map of somatic mutations in cancer integrating protein structure and network topology
Confirmed Presenter: Yingying Zhang, Cornell University, United States
Track: 3DSIG
Room: 520a
Format: In Person
Moderator(s): Rafael Najmanovich
Authors List:
- Yingying Zhang, Cornell University
- Alden Leung, Cornell University
- Jin Joo Kang, Cornell University
- Yu Sun, Cornell University
- Guanxi Wu, Cornell University
- Le Li, Cornell University
- Jiayang Sun, Cornell University
- Lily Cheng, Cornell University
- Tian Qiu, Cornell University
- Junke Zhang, Cornell University
- Shayne Wierbowski, Cornell University
- James Booth, Cornell University
- Haiyuan Yu, Cornell University
Presentation Overview:
A major goal of cancer biology is to understand the mechanisms underlying tumorigenesis driven by somatically acquired mutations. Two distinct types of computational methodologies have emerged: one analyzes the clustering of mutations within protein sequences and 3D structures, while the other characterizes mutations by leveraging the topology of protein-protein interaction networks. Their insights are largely non-overlapping and their strengths complementary. Here, we established NetFlow3D, a unified, end-to-end, 3D structurally informed protein interaction network propagation framework that systematically maps the multiscale mechanistic effects of somatic mutations in cancer. NetFlow3D builds on the Human Protein Structurome, a comprehensive repository we compiled that incorporates the 3D structures of every human protein as well as the binding interfaces of all known human protein interactions. NetFlow3D leverages the Structurome to integrate information across the atomic, residue, protein and network levels. It first conducts 3D clustering of mutations at the atomic and residue levels on protein structures to identify potential driver mutations. It then anisotropically propagates their impacts across the protein interaction network, with propagation guided by the specific 3D structural interfaces involved, to identify significantly interconnected network “modules”, thereby uncovering key biological processes underlying disease etiology. Applied to 1,038,899 somatic protein-altering mutations in 9,946 TCGA tumors across 33 cancer types, NetFlow3D identified 12,378 significant 3D clusters throughout the Human Protein Structurome, of which ~54% would not have been found using only experimentally determined structures. It then identified 28 significantly interconnected modules that encompass ~8-fold more proteins than the modules found by standard network analyses.
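The idea of anisotropic, interface-guided propagation can be illustrated with a toy example. The sketch below is not the NetFlow3D implementation; it is a generic random walk with restart over directed, weighted edges, under the assumption that an edge leaving a protein carries more weight when the corresponding binding interface contains a mutation cluster. The function name `propagate`, the protein labels, the weights and the `restart` parameter are all hypothetical.

```python
def propagate(heat, edges, restart=0.5, iters=200):
    """Random walk with restart over directed, weighted edges.

    heat:  dict protein -> initial score (assumed here to sum to 1)
    edges: dict (source, target) -> non-negative weight; weights may be
           asymmetric, which is what makes the propagation direction-dependent
    """
    nodes = sorted(heat)

    # Total outgoing weight per protein, used to normalise its outflow.
    out_total = {u: 0.0 for u in nodes}
    for (u, _v), w in edges.items():
        out_total[u] += w

    current = dict(heat)
    for _ in range(iters):
        # A fraction `restart` of the score always returns to the source.
        nxt = {u: restart * heat[u] for u in nodes}
        for (u, v), w in edges.items():
            if out_total[u] > 0:
                nxt[v] += (1 - restart) * current[u] * w / out_total[u]
        # Proteins without outgoing edges keep their own score.
        for u in nodes:
            if out_total[u] == 0:
                nxt[u] += (1 - restart) * current[u]
        current = nxt
    return current


# Hypothetical toy network: protein A carries a mutation cluster on its
# interface with B, so the A -> B edge is weighted more heavily than A -> C.
initial_heat = {"A": 1.0, "B": 0.0, "C": 0.0}
interface_edges = {
    ("A", "B"): 3.0,  # assumed: interface hit by a 3D mutation cluster
    ("A", "C"): 1.0,  # assumed: unaffected interface
    ("B", "A"): 1.0,
    ("C", "A"): 1.0,
}
scores = propagate(initial_heat, interface_edges)
```

In this toy setting, B ends up with a higher score than C even though both are direct neighbours of A, because the flow out of A is biased toward the interface assumed to carry the mutation cluster. A symmetric, unweighted propagation would score B and C identically.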