Session 1: Tuesday, December 1, 2020 at 1:50 PM EST
- Gabriele Pozatti
- Wensi Zhu
- John Lamb
- Claudio Bassot
- Petras Kundrotas
- Arne Elofsson
Short Abstract: In the last decade, the accuracy of de novo protein structure prediction for individual proteins has improved substantially through the use of co-evolution and deep learning, which harvest the information in large multiple sequence alignments. This information can, in principle, also be used to extract information about protein-protein interactions, but success has so far been limited to a handful of proteins. Most earlier studies have not used the latest improvements in contact-based prediction, in which deep learning is used to predict the distances between residue pairs. Here, we first show that, using one of the best residue-residue contact prediction methods (trRosetta), it is in many cases possible to simultaneously predict the structure of two proteins and their interaction. The average performance is comparable to that of alternative docking methods, either template-based or based on shape complementarity. However, the results are complementary, making it possible to combine the three methods to accurately predict the structure of X% of all protein-protein pairs in a common benchmark set. In contrast to what is observed for the prediction accuracy of single proteins, the prediction of protein-protein pairs is not strongly dependent on the size of the multiple sequence alignments. We also find that the current method produces artefacts when there is homology between the interacting proteins. This bottleneck affects approximately one-third of the protein pairs in our benchmark set. Finally, we introduce a novel scoring function, PconsDock, that can be used to evaluate the quality of a protein-protein pair.
- Q. Aguirre-Plans, University Pompeu Fabra, Spain
- A. Molina Martinez de los Reyes, Barcelona Supercomputing Center, Spain
- R. Molina-Fernandez, University Pompeu Fabra, Spain
- A. Meseguer, University Pompeu Fabra, Spain
- N. Fernandez-Fuentes, Universitat de Vic-Universitat Central de Catalunya, Spain
- B. Oliva, University Pompeu Fabra, Spain
Short Abstract: In this poster we present the four methods that were used for the different predictions during the CASP14 competition. 1. SPServer [1] was used to assess the quality of the models deposited in CASP14. The program is based on statistical potentials [1]. The server is available at http://sbi.upf.edu/spserver/. 2. RADI [2] is a computational approach developed to predict residue contacts from multiple sequence alignments using direct-coupling analysis (DCA), testing different alphabets that group chemically equivalent residues [2]. The program is available at https://github.com/structuralbioinformatics. 3. radiMod predicts protein structure ab initio by combining restraints derived from RADI with the prediction of secondary structure and the alignment with local templates of super-secondary structure motifs (sMotifs) [3], using MODELLER [4]. The program is available at https://github.com/structuralbioinformatics. 4. The VD2OCK method [5] predicts the quaternary structure of proteins using data-driven docking. The program is available at http://www.bioinsilico.org/VD2OCK and http://galaxy.interactomix.com/tool_runner?tool_id=interactomix_vd2ock
References
1. Aloy, P. & Oliva, B. Splitting statistical potentials into meaningful scoring functions: testing the prediction of near-native structures from decoy conformations. BMC Struct Biol 9, 71, doi:10.1186/1472-6807-9-71 (2009).
2. Anton, B. et al. RADI (Reduced Alphabet Direct Information): Improving execution time for direct-coupling analysis. bioRxiv, 406603, doi:10.1101/406603 (2018).
3. Bonet, J. et al. ArchDB 2014: structural classification of loops in proteins. Nucleic Acids Res 42, D315-319, doi:10.1093/nar/gkt1189 (2014).
4. Webb, B. & Sali, A. Comparative Protein Structure Modeling Using MODELLER. Curr Protoc Bioinformatics 54, 5.6.1-5.6.37, doi:10.1002/cpbi.3 (2016).
5. Segura et al. VORFFIP-driven dock: V-D2OCK, a fast and accurate protein docking strategy. PLoS One 10(3), e0118107, doi:10.1371/journal.pone.0118107 (2015).
Video not uploaded
- Gabriel Studer, University of Basel, Switzerland
- Christine Rempfer, University of Basel, Switzerland
- Andrew M Waterhouse, University of Basel, Switzerland
- Rafal Gumienny, University of Basel, Switzerland
- Juergen Haas, University of Basel, Switzerland
- Torsten Schwede, University of Basel, Switzerland
Short Abstract: Assigning reliable estimates of overall, as well as per-residue qualities in 3D protein structure models is crucial to determine their utility and potential applications. Single model methods are capable of assessing individual models. In contrast, consensus methods exploit the variability of model ensembles for their predictions. QMEANDisCo is a composite score for single model quality estimation. It employs single model scores suitable for assessing individual models, extended with a consensus component by additionally leveraging information from experimentally determined protein structures that are homologous to the model being assessed. By using the found homologues directly, QMEANDisCo avoids the requirement of an ensemble of models as input. QMEANDisCo is available as a webserver at https://swissmodel.expasy.org/qmean. The source code can be downloaded from https://git.scicore.unibas.ch/schwede/QMEAN.
Video not uploaded
- Jian Liu, University of Missouri, United States of America
- Jie Hou, St. Louis University, United States of America
- Tianqi Wu, University of Missouri, United States of America
- Zhiye Guo, University of Missouri, United States of America
- J. Cheng, University of Missouri, United States of America
Short Abstract: The main improvement of our CASP14 MULTICOM human tertiary structure predictor over our CASP13 human predictor [1] is the extensive use of deep learning-based inter-residue distance prediction in template-free (ab initio) tertiary structure prediction and model quality assessment. Methods: The input for the MULTICOM predictor includes all the CASP14 server models plus some extra ab initio models built from distance maps predicted by DeepDist [2], with deeper alignments generated from larger updated sequence databases if necessary. Redundant models with high similarity from the same server predictor were filtered out. Five automated quality assessment (QA) methods (MULTICOM-CLUSTER, MULTICOM-CONSTRUCT, MULTICOM-HYBRID, MULTICOM-DEEP, MULTICOM-DIST) that integrate a number of single-model and multi-model QA scores and inter-residue distance features were applied to evaluate the quality of the models (for more details, see our CASP14 quality assessment abstracts). The consensus of these QA predictions, together with human inspection, was used to select the top five models. Each top-ranked model was combined with other similar models to generate a combined model. If the combined model was similar to the original model, it was used as one of the final top models. Otherwise, the top-ranked model was refined by 3Drefine [3] or ModRefiner [4] to generate the final models. If a target was predicted to have multiple domains, the same protocol was applied to each domain separately to generate five top models for each domain. The top five models of all the domains were joined together to form five full-length models for the target.
References
1. Hou, J., Wu, T., Cao, R., & Cheng, J. Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13. Proteins: Structure, Function, and Bioinformatics, 87(12), 1165-1178, 2019.
2. Wu, T., Guo, Z., Hou, J., and Cheng, J. DeepDist: real-value inter-residue distance prediction with deep residual convolutional network. bioRxiv, 2020.
3. Bhattacharya, D., Nowotny, J., Cao, R., & Cheng, J. 3Drefine: an interactive web server for efficient protein structure refinement. Nucleic Acids Research, 44(W1), W406-W409, 2016.
4. Xu, D. and Zhang, Y. Improving the Physical Realism and Structural Accuracy of Protein Models by a Two-step Atomic-level Energy Minimization. Biophysical Journal, vol 101, 2525-2534, 2011.
- Tianqi Wu, University of Missouri, United States of America
- Jian Liu, University of Missouri, United States of America
- Zhiye Guo, University of Missouri, United States of America
- Jie Hou, St. Louis University, United States of America
- J. Cheng, University of Missouri, United States of America
Short Abstract: In CASP14, we tested our MULTICOM system, an integrated protein structure prediction system that combines template-based and ab initio protein folding approaches, on five tertiary structure prediction servers (MULTICOM-DIST, MULTICOM-HYBRID, MULTICOM-DEEP, MULTICOM-CLUSTER, and MULTICOM-CONSTRUCT). Improvements have been made to both the template-based and ab initio predictors of the MULTICOM system since CASP13: (1) new ab initio modeling methods empowered by deep learning-based protein inter-residue distance prediction; (2) a light version of the MULTICOM system that enables faster searching for templates of a target and achieves results comparable to those of the version used in CASP13; (3) consensus model ranking methods that leverage distance information for assessing model quality.
Video not uploaded
- Zhiye Guo, University of Missouri, United States of America
- Tianqi Wu, University of Missouri, United States of America
- Jian Liu, University of Missouri, United States of America
- Jie Hou, St. Louis University, United States of America
- J. Cheng, University of Missouri, United States of America
Short Abstract: In CASP14, we tested several residue-residue distance predictors and one contact predictor based on different deep learning models trained on residue-residue co-evolution features and several other sequence and structural features. Methods: We use four sets of features with deep neural networks. Three of the four feature sets are mostly coevolution-based features, i.e. covariance matrix (COV), pseudolikelihood maximization matrix (PLM), and precision matrix (PRE), calculated from multiple sequence alignments. The fourth set contains non-coevolution sequence-based features (OTHER) for cases in which multiple sequence alignments are shallow. The OTHER feature set includes the sequence profile generated by PSI-BLAST, solvent accessibility from PSIPRED, and so on. Different input feature sets have different input sizes. The dimensions of COV, PLM, PRE and OTHER are L*L*483, L*L*482, L*L*484 and L*L*47, respectively (L: sequence length). The inputs are fed to an instance normalization layer, followed by one convolutional layer and one Maxout layer. The output of the Maxout layer is fed into 20 residual blocks. Each residual block contains three RCIN blocks (each composed of an instance normalization layer, a row normalization layer, and a column normalization layer), five convolutional layers with 64 filters and kernel sizes of 1*1, 3*3, 7*1, 1*7, and 1*1, respectively, one squeeze-and-excitation block, and one dropout layer with a dropout rate of 0.2. After the last residual block, a convolutional layer followed by an instance normalization layer is applied, and the softmax activation function is used to predict the distance distribution. Even though all seven distance predictors share similar network architectures, they differ in the distance interval labels used to train them, how the prediction output is produced, and how the input multiple sequence alignments (MSAs) are generated. The distance intervals of MULTICOM-CONSTRUCT are 0 to 4 Å, 4 to 6 Å, 6 to 8 Å, …, 18 to 20 Å and > 20 Å. We discretize inter-residue distance into 42 bins for MULTICOM-DIST, i.e. dividing 2 to 22 Å into 40 bins with bin size 0.5 Å, plus a 0 - 2 Å interval and a > 22 Å interval. The distance range (0 to 20 Å) of MULTICOM-AI is binned into 37 equally spaced intervals of 0.5 Å, plus one > 20 Å interval. MULTICOM-HYBRID shares a similar segmentation strategy with MULTICOM-DIST, but it starts with an interval of 0 - 3.5 Å and its last interval is set to > 19 Å. All predictions are converted into the official intervals of CASP14. The predictions of MULTICOM-DEEP and MULTICOM are averaged from the outputs of all the other servers above. Unlike the multi-interval distance predictors, MULTICOM-CLUSTER is a binary contact map predictor. Furthermore, different alignment methods are used by the predictors to generate input MSAs, including DeepMSA, our in-house tool DeepAln, and one approach that uses an HHblits search against the BFD database. Availability: The source code is available at https://github.com/multicom-toolbox/deepdist
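The 42-bin discretization described for MULTICOM-DIST can be illustrated with a short sketch. This is not taken from the authors' code: the function name `distance_to_bin`, the constant `EDGES`, and the choice of half-open intervals (a distance exactly on an edge goes to the upper bin) are assumptions made for illustration only.

```python
import bisect

# Bin edges for the scheme described in the abstract: one 0-2 Å bin,
# forty 0.5 Å bins covering 2-22 Å, and one final bin beyond 22 Å.
# EDGES holds the 41 boundaries 2.0, 2.5, ..., 22.0 (41 edges -> 42 bins).
EDGES = [2.0 + 0.5 * i for i in range(41)]


def distance_to_bin(distance):
    """Map an inter-residue distance in Å to a bin index in 0..41.

    Assumption (not from the source): intervals are half-open, [low, high),
    so a distance equal to an edge is placed in the upper bin.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return bisect.bisect_right(EDGES, distance)


if __name__ == "__main__":
    for d in (1.0, 2.0, 7.9, 21.9, 30.0):
        print(f"{d:5.1f} Å -> bin {distance_to_bin(d)}")
```

The same pattern would apply to the other servers by changing the edge list, for example a first edge at 3.5 Å and a last edge at 19 Å for MULTICOM-HYBRID.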
Video not uploaded
- Mindaugas Margelevicius, Vilnius University, Lithuania
Short Abstract: An initial version of a versatile protocol ROPIUS0 is presented. ROPIUS0 involves homology modeling, modeling using deep learning, and estimating the accuracy of structural models. For homology modeling, a template selection and alignment adjustment algorithm TSA3 is presented. The connecting part of the ROPIUS0 protocol is a module for predicting the distributions of distances between residues of a protein sequence. Some examples of predicted distance maps and protein modeling results are shown.
- Xavier Robin, University of Basel, Switzerland
- Juergen Haas, University of Basel, Switzerland
- Rafal Gumienny, University of Basel, Switzerland
- Anna Smolinski, University of Basel, Switzerland
- Flavio Ackermann, University of Basel, Switzerland
- Gerardo Tauriello, University of Basel, Switzerland
- Torsten Schwede, University of Basel, Switzerland
Short Abstract: Today, many structure prediction workflows are fully automated starting from a protein amino acid sequence. Consequently, continuous automated benchmarking is key to sustain the high-paced development of emerging methods. The Continuous Automated Model EvaluatiOn (CAMEO, https://www.cameo3d.org) is a weekly assessment of the prediction performance of protein structure prediction servers (3D) as well as quality estimation servers (QE). Each Saturday, 20 protein sequences are selected from the pre-release of amino acid sequences that is part of the PDB release cycle, and submitted to the participating CAMEO 3D servers. Models returned by public CAMEO 3D servers within 30 hours are then submitted to registered CAMEO QE servers. CAMEO supports the developers of prediction servers by rapidly assessing new developments anonymously and monitoring the performance of their public productive servers continuously. CAMEO allows life scientists to better understand which public modeling server is the most suited for their specific use case. CAMEO stimulates the prediction communities in discussing new areas of interest and new scores. With the latest improvements in single-chain protein structure modeling, benchmarking needs will significantly evolve in the coming years. To that aim, CAMEO is introducing a new 3D category, “Structures & Complexes”, which will eventually replace the current single-chain 3D category (https://beta.cameo3d.org/). Currently in beta version, this category submits ligands (InChI codes), modified residues (as non-canonical sequences), nucleic acids and heteromeric protein complexes to the registered participants on an opt-in basis. We hence invite the community to discuss these new developments, target selection and scoring schemes and emerging methods to best reflect those recent scientific developments (help-cameo3d@unibas.ch).
Video not uploaded
- Toshiyuki Oda, Independent Software Developer
Short Abstract: SurfStamp was developed to provide an intuitive understanding of properties of proteins, especially with molecular surface models. The basic ability to draw residue information on three-dimensional objects is very useful. Several updates have been made since the initial release to promote the usefulness of this approach. In this poster, I will show you some examples created with this software that I think will be interesting for you.
Video not uploaded
- Xiao Chen, University of Missouri, United States of America
- Jian Liu, University of Missouri, United States of America
- Jie Hou, St. Louis University, United States of America
- Tianqi Wu, University of Missouri, United States of America
- Zhiye Guo, University of Missouri, United States of America
- J. Cheng, University of Missouri, United States of America
Short Abstract: In CASP14, we tested our MULTICOM system on the protein quality assessment (QA) task. We applied advanced deep learning methods and inter-residue distance/contact information to improve prediction performance. Our methods fall into two categories: multi-model methods (MULTICOM-CLUSTER, MULTICOM-CONSTRUCT, MULTICOM-AI, MULTICOM-HYBRID) and single-model methods (MULTICOM-DEEP, MULTICOM-DIST). The main difference between our six methods lies in the input features and training strategies they use. We generated a set of base features. The base features comprise energy scores from single-model QA methods (i.e., SBROD [3], OPUS_PSP [4], RF_CB_SRS_OD [5], Rwplus [6], DeepQA [7], ProQ2 [8], ProQ3 [9], Dope [10] and Voronota [11]), multi-model QA features (i.e., APOLLO [12], Pcons [13], and ModFOLDclust2 [14]), and contact features [15]. Among the six QA methods, MULTICOM-CLUSTER and MULTICOM-CONSTRUCT used all the base features as input to predict the final quality score, while the other methods apply feature selection or add novel features to improve the quality assessment. For example, MULTICOM-AI added a correlation feature [16] as a new input feature. MULTICOM-HYBRID, MULTICOM-DEEP, and MULTICOM-DIST incorporated novel distance-based features from the predicted inter-residue distance map, including the GIST descriptor [17], Oriented FAST and Rotated BRIEF (ORB) [18], PHASH [19], PSNR & SSIM [20], and Pearson correlation. Two ensemble methods were applied to these models. Given a particular set of input features and training data, we first trained ten deep neural networks for each method using 10-fold cross-validation. MULTICOM-AI and MULTICOM-CONSTRUCT average the ten networks' predictions to predict the final quality score of the input protein structure. For MULTICOM-CLUSTER, MULTICOM-HYBRID, MULTICOM-DEEP, and MULTICOM-DIST, we applied a second training stage after the 10-fold cross-validation: we combined the 10-fold CV predicted values with the initial features and built a new deep learning model on top of these new features to produce the final quality score. All six models were evaluated on 23 released CASP14 targets for both stage 1 and stage 2 models. MULTICOM-AI achieved the lowest first-ranked GDT-TS loss and the highest PCC on stage 1 models. On stage 2, MULTICOM-CONSTRUCT attained the best loss performance.
References
1. Hou, J., Wu, T., Cao, R. & Cheng, J. Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13. Proteins: Structure, Function, and Bioinformatics 87, 1165-1178 (2019).
2. Chen, X., Akhter, N., Guo, Z., Wu, T., Hou, J., Shehu, A. & Cheng, J. Deep Ranking in Template-free Protein Structure Prediction. The 11th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB) (2020).
3. Karasikov, M., Pages, G. & Grudinin, S. Smooth orientation-dependent scoring function for coarse-grained protein quality assessment. Bioinformatics 35, 2801-2808 (2019).
4. Lu, M., Dousis, A. D. & Ma, J. OPUS-PSP: an orientation-dependent statistical all-atom potential derived from side-chain packing. Journal of molecular biology 376, 288-301 (2008).
5. Rykunov, D. & Fiser, A. Effects of amino acid composition, finite size of proteins, and sparse statistics on distance-dependent statistical pair potentials. Proteins: Structure, Function, and Bioinformatics 67, 559-568 (2007).
6. Zhang, J. & Zhang, Y. A novel side-chain orientation dependent potential derived from random-walk reference state for protein fold selection and structure prediction. PloS one 5, e15386 (2010).
7. Cao, R., Bhattacharya, D., Hou, J. & Cheng, J. DeepQA: improving the estimation of single protein model quality with deep belief networks. BMC bioinformatics 17, 495 (2016).
8. Uziela, K. & Wallner, B. ProQ2: Estimation of Model Accuracy Implemented in Rosetta. Bioinformatics, btv767 (2016).
9. Uziela, K., Shu, N., Wallner, B. & Elofsson, A. ProQ3: Improved model quality assessments using Rosetta energy terms. Scientific reports 6, 33509 (2016).
10. Shen, M. y. & Sali, A. Statistical potential for assessment and prediction of protein structures. Protein science 15, 2507-2524 (2006).
11. Olechnovič, K. & Venclovas, C. Voronota: A fast and reliable tool for computing the vertices of the Voronoi diagram of atomic balls. Journal of computational chemistry 35, 672-681 (2014).
12. Wang, Z., Eickholt, J. & Cheng, J. APOLLO: a quality assessment service for single and multiple protein models. Bioinformatics 27, 1715-1716 (2011).
13. Wallner, B. & Elofsson, A. Identification of correct regions in protein models using structural, alignment, and consensus information. Protein Science 15, 900-913 (2006).
14. McGuffin, L. J. & Roche, D. B. Rapid model quality assessment for protein structure predictions using the comparison of multiple models without structural alignments. Bioinformatics 26, 182-188 (2009).
15. Adhikari, B., Hou, J. & Cheng, J. DNCON2: improved protein contact prediction using two-level deep convolutional neural networks. Bioinformatics 34(9), 1466-1472 (2018).
16. Chen, X., Akhter, N., Guo, Z., Wu, T., Hou, J., Shehu, A. & Cheng, J. Deep Ranking in Template-free Protein Structure Prediction. In Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, pp. 1-10 (2020, September).
17. Oliva, A. & Torralba, A. Modeling the shape of the scene: A holistic representation of the spatial envelope. International journal of computer vision 42, 145-175 (2001).
18. Rublee, E., Rabaud, V., Konolige, K. & Bradski, G. in 2011 International conference on computer vision. 2564-2571 (IEEE).
19. Kozat, S. S., Venkatesan, R. & Mihcak, M. K. in 2004 International Conference on Image Processing, 2004. ICIP'04. 3443-3446 (IEEE).
20. Hore, A. & Ziou, D. in 2010 20th international conference on pattern recognition. 2366-2369 (IEEE).
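The stage 1 ensembling and the two evaluation measures named in the abstract (first-ranked GDT-TS loss and PCC) can be sketched as follows. This is a minimal illustration under assumptions, not the MULTICOM implementation: the function names and data layout are hypothetical, and the loss is taken here as the true GDT-TS of the best model in the pool minus the true GDT-TS of the model ranked first by the predictor.

```python
import math


def average_predictions(per_network_scores):
    """Average predicted quality scores over networks.

    per_network_scores: one list per network, each holding one score per model.
    """
    n_networks = len(per_network_scores)
    return [sum(column) / n_networks for column in zip(*per_network_scores)]


def first_ranked_loss(predicted, true_gdt_ts):
    """True GDT-TS of the best model minus that of the top-predicted model."""
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true_gdt_ts) - true_gdt_ts[top]


def pearson(x, y):
    """Pearson correlation coefficient (PCC) between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    # Two hypothetical networks scoring three models of one target.
    predicted = average_predictions([[0.8, 0.5, 0.6], [0.6, 0.7, 0.4]])
    true_scores = [60, 75, 50]  # made-up GDT-TS values for illustration
    print("loss:", first_ranked_loss(predicted, true_scores))
    print("PCC:", pearson(predicted, true_scores))
```

A lower loss means the predictor's top pick is closer to the best available model, while PCC measures how well predicted and true scores agree across the whole pool.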
- Kliment Olechnovic, Vilnius University, Lithuania
Short Abstract: VoroMQA is a method for quality assessment of protein structural models. It uses Voronoi tessellation-derived interatomic and solvent contact areas by employing the idea of a knowledge-based statistical potential. The VoroMQA scoring function produces quality scores at atomic, residue and full structure levels. In addition, VoroMQA can directly assess protein-protein interaction interfaces. We also present VoroMQA-dark, a neural network-based protein structure quality assessment method that estimates local CAD-scores from pre-convoluted VoroMQA-derived features. The VoroMQA software is freely available both as a standalone application and as a web server from bioinformatics.ibt.lt/wtsam/voromqa.
Video not uploaded
- Ilia Igashov, CNRS, France
- Nikita Pavlichenko, Moscow Institute of Physics and Technology, Russia
- Sergei Grudinin, CNRS, France
Short Abstract: Processing information on 3D objects requires methods stable to rigid-body transformations, in particular rotations, of the input data. In image processing tasks, convolutional neural networks achieve this property using rotation-equivariant operations. However, contrary to images, graphs generally have irregular topology. This makes it challenging to define a rotation-equivariant convolution operation on these structures. In this work we propose a graph convolutional network that processes 3D models of proteins represented as molecular graphs. In a protein molecule, individual amino acids have common topological elements. This allows us to unambiguously associate each amino acid with a local coordinate system and construct rotation-equivariant spherical filters that operate on angular information between graph nodes. Within the framework of the protein model quality assessment problem, we demonstrate that the proposed spherical convolution method significantly improves the quality of model assessment compared to the standard message-passing approach. It is also comparable to the state-of-the-art methods, as we demonstrate on Critical Assessment of Structure Prediction (CASP) benchmarks. The proposed approach operates only on geometric features of protein 3D models. This makes it universal and applicable to any other geometric-learning task where the graph structure allows constructing local coordinate systems.
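The idea of attaching a local coordinate system to each amino acid can be sketched with plain vector arithmetic. The example below is an illustration under assumptions, not the authors' spherical convolution: it builds an orthonormal frame from the backbone N, CA and C atoms by Gram-Schmidt (the choice of these three atoms is an assumption), and shows that a neighbour's coordinates expressed in that frame do not change when the whole structure is rotated. All function names are hypothetical.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scale(a, s):
    return tuple(x * s for x in a)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    return scale(a, 1.0 / math.sqrt(dot(a, a)))


def local_frame(n_atom, ca_atom, c_atom):
    """Orthonormal frame centred on CA, built by Gram-Schmidt on CA->C and CA->N."""
    e1 = normalize(sub(c_atom, ca_atom))
    u = sub(n_atom, ca_atom)
    e2 = normalize(sub(u, scale(e1, dot(u, e1))))  # remove the e1 component
    e3 = cross(e1, e2)
    return e1, e2, e3


def to_local(point, origin, frame):
    """Coordinates of `point` in the frame attached to `origin`."""
    d = sub(point, origin)
    return tuple(dot(d, axis) for axis in frame)


def rotate_z(point, angle):
    """Rotate a point about the global z axis (used only to test invariance)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = point
    return (c * x - s * y, s * x + c * y, z)


if __name__ == "__main__":
    n, ca, c = (-0.6, 1.3, 0.2), (0.0, 0.0, 0.0), (1.5, 0.1, 0.0)
    neighbour = (3.0, 2.0, -1.0)
    print(to_local(neighbour, ca, local_frame(n, ca, c)))
    rn, rca, rc, rnb = (rotate_z(p, 0.7) for p in (n, ca, c, neighbour))
    print(to_local(rnb, rca, local_frame(rn, rca, rc)))  # same up to rounding
```

Because the frame rotates together with the molecule, any angular feature computed from these local coordinates is unaffected by the global orientation of the input model.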
Video not uploaded
- Ilia Igashov, CNRS, France
- Kliment Olechnovic, Vilnius University, Lithuania
- Maria Kadukova, CNRS, France
- Ceslovas Venclovas, Vilnius University, Lithuania
- Elodie Laine, Sorbonne University, France
- Sergei Grudinin, CNRS, France
Short Abstract: Effective use of evolutionary information has recently led to tremendous progress in computational prediction of three-dimensional (3D) structures of proteins and their complexes. Despite the progress, the accuracy of predicted structures tends to vary considerably from case to case. Since the utility of computational models depends on their accuracy, reliable estimates of deviation between predicted and native structures are of utmost importance. For the first time we present a deep convolutional neural network (CNN) constructed on a Voronoi tessellation of 3D molecular structures. Despite the irregular data domain, our data representation allows us to efficiently introduce both convolution and pooling operations into the network. We trained our model, called VoroCNN, to predict local qualities of 3D protein folds. The prediction results are competitive with the state of the art and superior to the previous 3D CNN architectures built for the same task. We also discuss practical applications of VoroCNN, for example, in the recognition of protein binding interfaces.
Video not uploaded
- Dennis Della Corte, Brigham Young University
- Wendy Billings, Brigham Young University
- Connor Morris, Brigham Young University
Short Abstract: The DellaCorte lab is among the 5 top contributing groups at CASP, with ~700 models submitted for regular, refinement, assisted, and contact/distance targets. Here, we share our preliminary analysis, which shows top performance in refinement with molecular dynamics-based methods, good contact predictions with our ProSPr network, and decent model quality for a ProSPr/trRosetta-based structure prediction pipeline. We show a novel quality assessment metric that has Pearson r > 0.9 correlation and apply it to hard-to-rank models from the Baker group’s refinement submissions. We also provide a roadmap for future ProSPr releases and current optimization efforts. Finally, we extend an invitation to learn about BYU, the students at the DellaCorte lab, possible collaborations, and graduate student recruitment. A video recording and a poster of the presentation slides are available.
- Naozumi Hiranuma, University of Washington, United States of America
- Minkyung Baek, University of Washington, United States of America
- Hahnbeom Park, University of Washington, United States of America
- Ivan Anishchenko, University of Washington, United States of America
- David Baker, University of Washington, United States of America
Short Abstract: Deep learning (DL) has been successfully used in numerous methods that aim to estimate the accuracy of modeled protein structures. Recently, we developed a novel deep learning framework (DeepAccNet) that estimates per-residue accuracy (Cβ local distance difference test; Cβ l-DDT) and residue-residue distance signed error (histogram of error; estogram) of modeled protein structures. In this CASP, we applied DeepAccNet and a variant of DeepAccNet (named DeepAccNet-MSA) to the EMA category. The predictions of DeepAccNet were submitted for the "BAKER-experimental" group, while those of DeepAccNet-MSA were submitted for the "BAKER-ROSETTASERVER" group.
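For readers unfamiliar with the per-residue local distance difference test, the sketch below computes a simplified score from one representative atom per residue. It is an illustration under assumptions and is not DeepAccNet or the reference l-DDT tool: the 15 Å inclusion radius and the 0.5, 1, 2 and 4 Å tolerance thresholds follow the commonly used l-DDT definition, and the function name `per_residue_lddt` is hypothetical.

```python
import math

# Tolerance thresholds (Å) commonly used by the l-DDT score.
THRESHOLDS = (0.5, 1.0, 2.0, 4.0)


def per_residue_lddt(model, reference, radius=15.0):
    """Simplified per-residue l-DDT from one atom (e.g. C-beta) per residue.

    model, reference: lists of (x, y, z) coordinates in the same residue order.
    For each residue, every partner within `radius` in the reference is checked:
    the score is the fraction of (partner, threshold) combinations for which the
    model distance deviates from the reference distance by less than the threshold.
    """
    n = len(reference)
    scores = []
    for i in range(n):
        preserved, total = 0, 0
        for j in range(n):
            if i == j:
                continue
            d_ref = math.dist(reference[i], reference[j])
            if d_ref >= radius:
                continue  # only local neighbourhoods contribute
            d_model = math.dist(model[i], model[j])
            total += len(THRESHOLDS)
            preserved += sum(abs(d_model - d_ref) < t for t in THRESHOLDS)
        scores.append(preserved / total if total else 0.0)
    return scores


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (10.6, 0.0, 0.0)]  # last residue displaced
    print(per_residue_lddt(model, reference))
```

The signed quantity `d_model - d_ref` computed inside the loop is the kind of per-pair distance error that an estogram summarizes as a histogram, although the binning used by DeepAccNet is not given in the abstract.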
Video not uploaded
- Dingyan Wang, Chinese Academy of Sciences, China
- Denghiu Liu, Huawei Technologies, China
- Zhimeng Xu, Huawei Technologies, China
- Wenjun He, Huawei Technologies, China
- Chi Xu, Huawei Technologies, China
- Jianzhong He, Huawei Technologies, China
- Xinyuan Lin, Huawei Technologies, China
- Lei Zhang, Huawei Technologies, China
- Xiaopeng Zhang, Huawei Technologies, China
- Lingxi Xie, Huawei Technologies, China
- Qi Tian, Huawei Technologies, China
- Hualiang Jiang, Chinese Academy of Sciences, China
- Xi Cheng, Chinese Academy of Sciences, China
- Nan Qiao, Huawei Technologies, China
- Mingyue Zheng, Chinese Academy of Sciences, China
Short Abstract: Residue-residue contact information is essential in the protein structure prediction task. A comprehensive protein structure prediction algorithm was developed by integrating contact-driven modeling, template-based modeling, and protein model assessment methods. For contact-driven modeling, the prediction of residue-residue distances and orientations was formulated as a data-driven, dense prediction problem, and a deep CNN model equipped with attention modules was used for this purpose. Then, trRosetta was used to generate a set of candidate protein structures, which were re-ranked using a 3D CNN model for quality assessment.
Video not uploaded
- Jianfeng Sun, Technical University of Munich, Germany
- Dimitrij Frishman, Technical University of Munich, Germany
Short Abstract: DeepHelicon is specialized for predicting inter-helical residue contacts in transmembrane proteins in CASP14. It only takes as input a protein sequence in FASTA format. Residues located in the transmembrane regions are detected by the TMHMM2.0 algorithm. Accurate prediction of amino acid residue contacts is an important prerequisite for generating high-quality 3D models of transmembrane (TM) proteins. While a large number of compositional, evolutionary, and structural properties of proteins can be used to train contact prediction methods, recent research suggests that coevolution between residues provides the strongest indication of their spatial proximity. We have developed a deep learning approach, DeepHelicon, to predict inter-helical residue contacts in TM proteins by considering only coevolutionary features. DeepHelicon comprises a two-stage supervised learning process by residual neural networks for a gradual refinement of contact maps, followed by variance reduction by an ensemble of models. We present a benchmark study of 12 contact predictors and conclude that DeepHelicon together with the two other state-of-the-art methods DeepMetaPSICOV and Membrain2 outperforms the 10 remaining algorithms on all datasets and at all settings. On a set of 44 TM proteins with an average length of 388 residues DeepHelicon achieves the best performance among all benchmarked methods in predicting the top L/5 and L/2 inter-helical contacts, with the mean precision of 87.42% and 77.84%, respectively. On a set of 57 relatively small TM proteins with an average length of 298 residues DeepHelicon ranks second best after DeepMetaPSICOV. DeepHelicon produces the most accurate predictions for large proteins with more than 10 transmembrane helices. 
Coevolutionary features alone allow inter-helical residue contacts to be predicted with an accuracy sufficient for generating acceptable 3D models for up to 30% of proteins using a fully automated modeling method such as CONFOLD2. The multiple sequence alignments (MSAs) of transmembrane proteins were generated using HHblits. Our work is detailed in the poster and the video presentation. The standalone DeepHelicon software is available at https://github.com/2003100127/deephelicon.
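The top L/5 and L/2 precision figures quoted above follow a standard evaluation recipe that can be sketched in a few lines. This is a generic illustration, not DeepHelicon's evaluation script: the function name, the dictionary-based input format, and the use of integer division for L/k are assumptions, and the restriction to inter-helical residue pairs is assumed to have been applied to the inputs beforehand.

```python
def top_l_precision(scores, true_contacts, length, k):
    """Precision of the top L/k predicted contacts.

    scores: dict mapping residue pairs (i, j) to predicted contact probability.
    true_contacts: set of residue pairs that are in contact in the native structure.
    length: sequence length L; k: divisor (5 for top L/5, 2 for top L/2).
    """
    n_top = max(1, length // k)
    ranked = sorted(scores, key=scores.get, reverse=True)[:n_top]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


if __name__ == "__main__":
    # Made-up predictions for a hypothetical 10-residue example.
    scores = {(0, 5): 0.9, (1, 6): 0.8, (2, 7): 0.3, (3, 8): 0.7}
    native = {(0, 5), (3, 8)}
    print("top L/5:", top_l_precision(scores, native, length=10, k=5))
    print("top L/2:", top_l_precision(scores, native, length=10, k=2))
```

Reporting both cut-offs shows whether a predictor stays accurate as more of its lower-confidence predictions are included.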