Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

Posters

View Posters By Category

Session A: (July 7 and July 8)
Session B: (July 9 and July 10)
B-500: The Therapeutic Antibody Profiler (TAP): Computational Developability Guidelines from Therapeutic Properties
COSI: 3DSIG
  • Matthew Raybould, University of Oxford, United Kingdom
  • Claire Marks, University of Oxford, United Kingdom
  • Charlotte Deane, University of Oxford, United Kingdom
  • Alan Lewis, GlaxoSmithKline Research and Development, United Kingdom
  • Bojana Popovic, MedImmune, United Kingdom
  • Bruck Taddese, MedImmune, United Kingdom
  • Alexander Bujotzek, Roche Innovation Center Munich, Germany
  • Guy Georges, Roche Innovation Center Munich, Germany
  • Jiye Shi, UCB Pharma, United Kingdom

Short Abstract: In silico prediction of developability problems (including aggregation, immunogenicity, and poor expression) in therapeutic monoclonal antibody design remains a fundamental challenge. Here, we describe the Therapeutic Antibody Profiler (TAP) - a set of guidelines based on computationally calculated characteristics of clinical-stage therapeutic (CST) antibody sequences that can automatically identify candidates with increased potential for developability issues. In building TAP, we computed a multitude of metrics (some dependent only on the antibody sequence, others on a homology model structure) across a large set of CSTs and ~20,000 structurally diverse antibodies from human Next Generation Sequencing data. We identified five key developability-linked characteristics, four of which require a homology model, where our set of CSTs avoid extreme values. These properties were used in our developability guidelines. We have validated TAP on both a test set of CSTs and two example mAb drug discovery projects, with experimental data on their developability profiles, and find that we are able to computationally predict which sequences had developability issues. TAP is freely available at http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/TAP.php
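The guideline idea in this abstract, flagging candidates whose computed properties sit at the extremes of the CST distribution, can be illustrated with a minimal Python sketch. The metric names, the toy reference values, the two-sided 5% tail, and the red/amber/green labels are all assumptions made for illustration; none of them is taken from the TAP implementation.

```python
def flag_metric(value, reference, tail=0.05):
    """Compare one candidate value with a reference distribution.

    "red"   : outside the range seen in the reference set
    "amber" : inside the range but beyond the outer `tail` cut on either side
    "green" : everything else
    """
    ref = sorted(reference)
    if value < ref[0] or value > ref[-1]:
        return "red"
    k = int(tail * len(ref))  # number of reference values in each tail
    if value < ref[k] or value > ref[-1 - k]:
        return "amber"
    return "green"


def profile_candidate(candidate, reference_sets, tail=0.05):
    """Flag every metric of a candidate against its reference distribution."""
    return {
        name: flag_metric(candidate[name], reference_sets[name], tail)
        for name in reference_sets
    }


# Toy reference distributions (hypothetical numbers, not real CST data).
reference_sets = {
    "cdr_length": list(range(40, 60)),                           # 40..59
    "surface_hydrophobicity": [x * 10.0 for x in range(1, 21)],  # 10.0..200.0
}
candidate = {"cdr_length": 48, "surface_hydrophobicity": 230.0}
flags = profile_candidate(candidate, reference_sets)
print(flags)
```

In this sketch a candidate is only compared with values already seen in the reference set, so the flags say nothing about why an extreme value might be problematic; they simply mark where a closer look is warranted.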

B-501: Interplay between folding and binding modulates protein sequences, structures, functions and regulation
COSI: 3DSIG
  • Laszlo Dobson, Institute of Enzymology, RCNS, Hungarian Academy of Sciences, Hungary
  • Erzsébet Fichó, Institute of Enzymology, RCNS, Hungarian Academy of Sciences, Hungary
  • Gábor Tusnády, Institute of Enzymology, RCNS, Hungarian Academy of Sciences, Hungary
  • Zsuzsanna Dosztányi, Eötvös Loránd University, Hungary
  • István Simon, Institute of Enzymology, RCNS, Hungarian Academy of Sciences, Hungary
  • Bálint Mészáros, Eötvös Loránd University, Hungary

Short Abstract: Intrinsically Disordered Proteins (IDPs) mediate highly diversified and crucial functions in living cells. While lacking a stable structure, a remarkable proportion of IDPs are capable of folding via interactions with partners (most commonly proteins). Assuming the classic binary classification of proteins, each partner in an interaction can be either ordered or disordered. Thus, there are three possible scenarios of the interplay between binding and folding: autonomous folding and independent binding (where interacting partners are ordered), coupled folding and binding (where an IDP binds ordered partners) and mutual synergistic folding (involving exclusively IDPs). Recent advances in database development enabled us to identify a large number of bound structures from all three classes, opening ways to assess the nature of these interactions through the sequence-structure-function paradigm. Since folding and binding share a similar biophysical background, these interactions can be described by the same approach, showing how the formation of structure is mirrored at different levels (sequence, structure, function). The structural understanding of coupled binding and folding recently led to novel ways of pharmaceutical modulation through the development of small molecules. While there are no current treatments targeting mutual synergistic folding, exploration of these previously unknown mechanisms may lead to new biomedical methods.

B-502: Structural insights into the characterization of binding sites in EGFR kinase mutants
COSI: 3DSIG
  • Zheng Zhao, School of Medicine, University of Virginia, United States
  • Lei Xie, The City University of New York, United States
  • Philip E. Bourne, School of Medicine, University of Virginia, United States

Short Abstract: Over the past two decades the EGFR kinase domain has been identified as an important target to treat non-small cell lung cancer (NSCLC). Currently, three generations of EGFR kinase-targeted small-molecule drugs have been approved. They normally have a near-immediate response at the start of treatment, which leads to a substantial survival benefit for patients. However, long-term treatment results in acquired drug resistance and further vulnerability to NSCLC. Therefore, novel EGFR kinase inhibitors that specifically overcome acquired mutations are urgently needed. To this end, we carried out comprehensive studies on different EGFR kinase mutants using a structural systems pharmacology strategy [1]. The structureome-based analyses show that both wild-type and mutated structures adopt multiple distinct conformational states as observed by molecular dynamics, including conformational states not observed in existing structures. A function-site interaction fingerprint-based analysis [2] shows remarkable conformational flexibility, such that the structure accommodates diverse types of ligands with different types of binding modes. These results provide us with insights into designing a new generation of EGFR kinase inhibitors for combating acquired drug-resistant mutations. More specifically, a multi-conformation-based drug design strategy is needed. [1] Z Zhao, C Martin, R Fan, PE Bourne, L Xie. BMC Bioinformatics 2016, 17(1), 90; [2] Z Zhao, L Xie, L Xie, PE Bourne. J. Med. Chem. 2016, 59(9), 4326–4341

B-503: Ordering the Disordered Proteins Based on Electrostatic Properties
COSI: 3DSIG
  • Shula Shazman, The Open University of Israel, Israel

Short Abstract: Intrinsically disordered proteins (IDPs) are key components of regulatory networks that dictate various aspects of cellular decision-making. They are over-represented in major disease pathways, and are considered novel, albeit currently difficult, drug targets. In this study we investigated the role of intrinsically disordered regions (IDRs) in the regulation of transcription factors’ function. The results of this study suggest that IDRs are involved in DNA binding via electrostatics. Therefore, electrostatics should be taken into account to a greater extent when designing drugs for IDP targets.

B-504: Backbone Brackets and Arginine Tweezers delineate Class I and Class II aminoacyl tRNA synthetases
COSI: 3DSIG
  • Florian Kaiser, University of Applied Sciences Mittweida, Germany
  • Sebastian Bittrich, University of Applied Sciences Mittweida, Germany
  • Sebastian Salentin, Biotechnology Center (BIOTEC), TU Dresden, Germany
  • Christoph Leberecht, University of Applied Sciences Mittweida, Germany
  • V. Joachim Haupt, Biotechnology Center (BIOTEC), TU Dresden, Germany
  • Sarah Krautwurst, University of Applied Sciences Mittweida, Germany
  • Michael Schroeder, Biotechnology Center (BIOTEC), TU Dresden, Germany
  • Dirk Labudde, University of Applied Sciences Mittweida, Germany

Short Abstract: Aminoacyl tRNA synthetases (aaRS) ligate amino acids to their corresponding tRNA molecule, and understanding the origin of aaRS can explain how the genetic code was established. Sequence analyses revealed that aaRS enzymes can be divided into two complementary classes which differ significantly at the sequence and structural level. We identified Backbone Brackets and Arginine Tweezers as the most compact ATP-binding motifs characteristic of each class. This oppositional implementation of enzyme substrate binding shows how nature realized the binding of the same ligand species with completely different mechanisms. A structural rearrangement of the Backbone Brackets observed upon ATP binding indicates a general mechanism of all Class I structures. We demonstrate that sequence or even structure analysis for conserved residues may miss important functional aspects. The study shows how structural bioinformatics can be applied to link evolution and genetic coding. Backbone Brackets and Arginine Tweezers were traced back to the ancient Protozymes of aaRS, which were presumably encoded bidirectionally on opposite strands of the same gene. Both structural motifs can be observed in contemporary structures and it seems that the time of their addition, indicated by their placement in the ancient aaRS, coincides with the evolutionary trace of aaRS.

B-505: Structure-Function Relationships in Protein-Protein Complexes
COSI: 3DSIG
  • Petras Kundrotas, The University of Kansas, United States
  • Saveliy Belkin, The University of Kansas, United States
  • Anna Hadarovich, United Institute of Informatics Problems, National Academy of Sciences, Belarus
  • Alexander Tuzikov, United Institute of Informatics Problems, National Academy of Sciences, Belarus
  • Ilya Vakser, The University of Kansas, United States

Short Abstract: Structural and functional characterization of protein-protein interactions is important for understanding molecular mechanisms in a living cell. We present a systematic, quantitative analysis of structure-function relationships for proteins and protein-protein complexes, based on a set of 4950 representative protein-protein structures from our DOCKGROUND resource (http://dockground.compbio.ku.edu). Structural similarity of individual proteins, protein complexes, and protein-protein interfaces was quantified by TM-scores and the interaction RMSD. To quantify the functional similarity, we used the previously developed GO-score, based on the similarity of the Gene Ontology (GO) annotations. We calculated the GO-scores for the individual proteins and the protein complexes separately, based on the complete sets of GO terms, and on the GO terms related to the function of the complex only. Proteins and their interfaces were clustered based on structural and functional similarity, and the functional/structural variability of the clusters was analyzed. The function of structurally similar interfaces was determined to be more variable than that of the structurally similar full proteins. We also showed that adding a non-structural component, based on the GO-score, to the structure-based scoring function in template-based docking increases the probability of having a near-native structure of the protein-protein complex among the top-ranked docking models.

B-506: iCn3D 2.0: A Web based 3D Macromolecular Viewer enabling Sequence/Structure analysis through Structural Annotations
COSI: 3DSIG
  • Jiyao Wang, NCBI/NLM/NIH, United States
  • Philippe Youkharibache, NCI/NIH, United States
  • Dachuan Zhang, NIH, United States
  • Christopher Lanczycki, NIH, United States
  • Aron Marchler-Bauer, NIH, United States
  • Lewis Geer, NIH, United States
  • Renata Geer, NIH, United States
  • Minghong Ward, NIH, United States
  • Shennan Lu, NIH, United States
  • Gabriele Marchler, NIH, United States
  • Yanli Wang, NIH, United States
  • Tom Madej, NIH, United States
  • Steve Bryant, NIH, United States
  • Phan Lon, NIH, United States

Short Abstract: With the widespread availability of powerful hardware and mobile computing platforms, the visualization of biomolecular 3D structure is no longer restricted to stand-alone 3D viewer applications, as it can now be achieved via web-based technologies, such as WebGL. iCn3D (I see in 3D) is a novel molecular structure viewer employing these technologies using the JavaScript libraries Three.js and jQuery. iCn3D is designed to provide tight integration between 1D molecular sequence and 3D molecular structure displays. The data linked to either sequence or structure can be meshed together and synchronized between 1D and 3D displays, enabling new and cross-disciplinary data analysis. The latest version, iCn3D 2.0, integrates data and annotations from a variety of databases, including NCBI’s Molecular Modeling Database (MMDB), which is based on the Protein Data Bank (PDB) and has been annotated with structural domains, secondary structure elements, binding sites, ligands, DNA, RNA and protein interactions, the Conserved Domains Database (CDD), and variant data from dbSNP and ClinVar that can provide insights into the impact of mutations on protein function and disease. In addition, sequence-based annotations such as NGS-called SNPs can be introduced as custom sequence tracks. Annotations can be visualized simultaneously in 1D (sequence) and 3D (structure).

B-508: Predicting Loop Conformational Ensembles
COSI: 3DSIG
  • Claire Marks, University of Oxford, United Kingdom
  • Jiye Shi, UCB Pharma, United Kingdom
  • Charlotte Deane, University of Oxford, United Kingdom

Short Abstract: Protein function often relies on the ability of the protein to exist in a number of different stable conformations. Accurate prediction of these states would therefore be useful, providing an ensemble of structures that represent the diversity of the target instead of just a single, static model. We have investigated whether current algorithms are capable of this, in the context of loop structure prediction. We obtained two sets of targets, one containing loops with several experimentally-observed conformations and a set containing loops with only one conformation, and assessed the ability of four algorithms to generate and select decoys that are close to any, or all, of the known structures. We found that conformationally diverse loops are modelled significantly less accurately compared to loops with one known conformation. In fact, for most of these diverse loops, the decoys made were not similar to any of the native conformations. Our results imply that the idea of multiple native conformations being present in the decoy ensemble is incorrect, indicating that the prediction of conformation ensembles is impossible using current techniques, and that novel algorithms need to be designed with this specific goal in mind.

B-509: Where the context-free grammar meets the contact map: a probabilistic model of protein sequences aware of contacts between amino acids
COSI: 3DSIG
  • Witold Dyrka, Wroclaw University of Science and Technology, Poland
  • Francois Coste, Univ Rennes, Inria, CNRS, IRISA, France
  • Juliette Talibart, Univ Rennes, Inria, CNRS, IRISA, France

Short Abstract: Learning the language of protein sequences, which captures non-local interactions between amino acids close in the spatial structure, is a long-standing bioinformatics challenge that requires at least context-free grammars. However, the complex character of protein interactions impedes unsupervised learning of context-free grammars. Using structural information to constrain the syntactic trees proved effective in learning probabilistic natural and RNA languages. In this work, we establish a framework for learning probabilistic context-free grammars for protein sequences from syntactic trees partially constrained using amino acid contacts obtained from wet experiments or computational predictions, whose reliability has substantially increased recently. Within the framework, we implement the maximum-likelihood and contrastive estimators of parameters for simple yet practical grammars. Tested on samples of protein motifs, grammars developed within the framework showed improved precision in recognition and higher fidelity to protein structures. The framework is applicable to other biomolecular languages and beyond, wherever knowledge of non-local dependencies is available.

B-510: RCSB PROTEIN DATA BANK: SUSTAINING A LIVING DIGITAL DATA RESOURCE THAT ENABLES BREAKTHROUGHS IN SCIENTIFIC RESEARCH AND BIOMEDICAL EDUCATION
COSI: 3DSIG
  • Stephen Burley, RCSB PDB, UCSD, Rutgers University, United States

Short Abstract: The Protein Data Bank (PDB) was established as the first open-access digital data resource in biology and medicine. PDB currently houses ~139,000 atomic-level 3D biomolecular structures determined experimentally. It is managed by the Worldwide Protein Data Bank. RCSB PDB operates the US PDB data center, and makes data available at no charge and without limitations. Studies of website usage, bibliometrics, and economics demonstrate the powerful impact of PDB data on basic and applied research, clinical medicine, education, and the economy. During 2016, >591 million structure data files were downloaded by Data Consumers worldwide. RCSB PDB processed >5,300 new atomic-level biomolecular structures plus experimental data and metadata submitted by Data Depositors in the Americas and Oceania. More than 1 million RCSB.org users were served with data integrated with ~40 external resources providing rich structural views of fundamental biology, biomedicine, and energy sciences. More than 400 bioinformatics resources utilize PDB data. PDB data contribute to patent applications, drug discovery and development, publication of scientific studies, innovations that can lead to new product development and company formation, and STEM education. Support: NSF, NIH, DOE.

B-511: Prediction of PPI inhibition sites on protein models
COSI: 3DSIG
  • Saveliy Belkin, The University of Kansas, United States
  • Petras Kundrotas, The University of Kansas, United States
  • Ilya Vakser, The University of Kansas, United States

Short Abstract: Development of computational tools utilizing structural properties of proteins and pharmacological characteristics of small molecules has led to significant accomplishments in computer-aided drug discovery. However, in recent years, progress has been negatively affected by the limitation of druggable target space. A promising yet understudied avenue in drug development is modulation of protein-protein interactions (PPI). A protein-protein interface typically has a larger buried surface area and flatter topology than a traditional ligand-binding site. Binding to the protein-protein interface is also often accompanied by a change in the surface geometry. Thus, inhibition of PPI requires the development of new tools that address these critical aspects. In our earlier study, we demonstrated that the protein-bound conformation of a PPI target is a good starting point for prediction of the inhibition site on experimentally determined protein structures. However, due to the limitations of the experimental techniques, most structures in the proteome have to be models, often of relatively low accuracy. In this study, we report the extension of the previous work to protein models. The results show that our methodology can reliably find PPI inhibition sites on protein models, even in cases of relatively low structural accuracy.

B-512: Modeling and Conformational Analysis of Cyclotides, a Class of Macrocyclic Disulfide Bonded Plant Peptides
COSI: 3DSIG
  • Neha Kalmankar, The Institute of Trans-Disciplinary Health Sciences and Technology and National Centre for Biological Sciences, India
  • P. Balaram, National Centre for Biological Sciences & Indian Institute of Science, India
  • Radhika Venkatesan, National Centre for Biological Sciences, Bangalore, India
  • R. Sowdhamini, National Centre for Biological Sciences, Bangalore, India

Short Abstract: Cyclotides are a novel class of disulfide-rich macrocyclic peptides (26-37 residues), formed by cyclization of gene-encoded, linear precursors in specific plant species. In addition to the circular backbone, they have a cyclic cystine knot arrangement formed by a conserved six-cysteine framework (Cys I-IV, II-V, III-VI). They are divided into Moebius and Bracelet subfamilies, based on the presence or absence of a cis-proline in loop 5, respectively. We have analyzed 38 X-ray/NMR structures for structure and sequence signatures. Using peptidomics and transcriptomics sequences from the Clitoria ternatea plant, and based on the size of the interlinking loops between two adjacent Cys residues, 'cliotides' may be characterized into two subclasses: 33 sequences of the '3-4-4-1-4-{4-8}' motif (predominantly Moebius) and 36 sequences of the '3-4-6/7-1-4-{4-7}' motif (all Bracelet), where the numbers represent the lengths of the intervening peptide segments. Thus far only one crystal structure of a cyclotide (Moebius) has been reported, while several NMR models are available for the Bracelet conformation. We have used an in-house random conformation generation algorithm (RANMOD), which builds Ramachandran-allowed conformations for peptides by assigning stereochemically accessible local conformations at each residue, in conjunction with the MODIP disulfide modelling algorithm to generate linear precursors with three disulfide bridges, followed by an energy minimization routine to form cyclizable conformations.
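The loop-length motifs quoted in this abstract can be made concrete with a small Python sketch that counts the residues between adjacent cysteines of a head-to-tail cyclic peptide. The function names are hypothetical, the motif tables are a direct transcription of the two patterns given above, and the example input is the kalata B1 sequence as it is commonly written in the literature, used here only as an illustration and not taken from the abstract.

```python
def cysteine_loop_lengths(sequence):
    """Return the six inter-cysteine loop lengths of a cyclic peptide.

    The peptide is given as a linear string; the last loop wraps around
    from the final Cys back to the first Cys through the cyclized ends.
    """
    seq = sequence.upper()
    cys = [i for i, aa in enumerate(seq) if aa == "C"]
    if len(cys) != 6:
        raise ValueError("expected exactly six cysteines, found %d" % len(cys))
    # Loops 1-5: residues strictly between consecutive cysteines.
    loops = [b - a - 1 for a, b in zip(cys, cys[1:])]
    # Loop 6: residues after the last Cys plus residues before the first Cys.
    loops.append((len(seq) - 1 - cys[-1]) + cys[0])
    return loops


def matches_motif(loops, motif):
    """Check loop lengths against a motif given as (min, max) per loop."""
    return len(loops) == len(motif) and all(
        low <= n <= high for n, (low, high) in zip(loops, motif)
    )


# '3-4-4-1-4-{4-8}' and '3-4-6/7-1-4-{4-7}' written as per-loop ranges.
MOTIF_A = [(3, 3), (4, 4), (4, 4), (1, 1), (4, 4), (4, 8)]
MOTIF_B = [(3, 3), (4, 4), (6, 7), (1, 1), (4, 4), (4, 7)]

kalata_b1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"  # illustrative input only
loops = cysteine_loop_lengths(kalata_b1)
print(loops, matches_motif(loops, MOTIF_A), matches_motif(loops, MOTIF_B))
```

Because the backbone is circular, the result does not depend on where the linear string is cut, as long as the cut does not fall between the first and last listed residue of a different permutation; any rotation of the same sequence gives the same six loop lengths in rotated order only if the cut stays inside loop 6.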

B-513: Sequence Variability at Protein-Protein Interfaces
COSI: 3DSIG
  • Devlina Chakravarty, University of Kansas, United States
  • Petras Kundrotas, The University of Kansas, United States
  • Ilya Vakser, The University of Kansas, United States

Short Abstract: High-throughput experiments produce large amounts of data on protein-protein interactions. However, these data suffer from a high rate of false positives. Sequence homology is widely used for determining interacting protein pairs. The ultimate goal of this study is to use structural motifs, combined with the sequence similarity of the protein surfaces, to predict biological complexes. As a first step, we report the results on the sequence variability in structurally similar biological interfaces. The interfaces were extracted from PDB biounits of binary protein-protein complexes and clustered by structural similarity assessed by the TM-score. A TM-score cut-off of 0.5 yielded 19,958 unique interface structures. We investigated sequence variability in each structural cluster by performing sub-clustering at a 30% sequence identity cut-off. 64% of the structural clusters (at TM-score 0.5) comprised more than one sequence sub-cluster, suggesting the presence of distantly related or different sequences in structurally similar interfaces. This correlates with the observation that evolutionarily favorable protein folds are repeatedly utilized for various functions. Consensus sequences from multiple sequence alignments were designated as representative for each sequence sub-cluster. The result of this investigation is a library of structurally unique interfaces and associated representative sequences for the prediction and characterization of protein complexes.

B-514: Improving the prediction of loops and drug binding in GPCR structure models
COSI: 3DSIG
  • Bhumika Arora, Indian Institute of Technology, Monash University, and IITB-Monash Research Academy, India
  • K.V. Venkatesh, Indian Institute of Technology Bombay, India
  • Denise Wootten, Monash University, Australia
  • Patrick Sexton, Monash University, Australia

Short Abstract: G protein-coupled receptors (GPCRs) form the largest group of potential drug targets, and knowledge of their three-dimensional structure is therefore important for rational drug design. Homology modeling serves as a common approach for modeling transmembrane helical cores of GPCRs; however, these models have varying degrees of inaccuracy that result from the quality of the template used. We have explored the extent to which inaccuracies inherent in homology models of the transmembrane helical cores of GPCRs can impact loop prediction. We found that loop prediction in GPCR models is much more difficult than loop reconstruction in crystal structures owing to the imprecise positioning of loop anchors. Therefore, minimizing the errors in loop anchors is likely to be critical for optimal GPCR structure prediction. To address this, we have developed a ligand directed modeling (LDM) method comprising geometric protein sampling and ligand docking, and evaluated its capacity to refine GPCR models built across a range of templates with varying degrees of sequence similarity to the target. The LDM reduced the errors in loop anchor positions and improved the prediction of ligand binding poses, resulting in much better performance of these models in virtual library screenings.

B-515: Comprehensive structural survey of HIV-1-neutralizing antibodies targeting Env trimer suggests vaccine templates
COSI: 3DSIG
  • Jing Zhou, National Institutes of Health, United States
  • Reda Rawi, National Institutes of Health, United States
  • Chen-Hsiang Shen, National Institutes of Health, United States
  • Zizhang Sheng, Columbia University, United States
  • Anthony West, Caltech, United States
  • Baoshan Zhang, National Institutes of Health, United States
  • Robert Bailer, National Institutes of Health, United States
  • Nicole Doria-Rose, National Institutes of Health, United States
  • Mark Louder, National Institutes of Health, United States
  • Krisha McKee, National Institutes of Health, United States
  • John Mascola, National Institutes of Health, United States
  • Pamela Bjorkman, Caltech, United States
  • Lawrence Shapiro, Columbia University, United States
  • Peter Kwong, National Institutes of Health, United States
  • Gwo-Yu Chuang, National Institutes of Health, United States

Short Abstract: HIV-1 broadly neutralizing antibodies (bNAbs) have shown great potential as therapeutic drugs and as templates for vaccine design. Structural analyses of the epitope and paratope features of these bNAbs could potentially reveal essential information for vaccine design. Here, we performed a comprehensive structural survey on all HIV-1 envelope-antibody complexes in the Protein Data Bank. Based on ontogeny and recognition, those HIV-1 antibodies segregated into 20 antibody classes. We calculated B cell ontogeny, paratope and epitope features for the representative antibody from each class and measured its neutralization on a 208-isolates panel. We found that neutralizing breadth is negatively correlated with epitope glycan component (r = -0.66), and epitope protein surface area is negatively correlated with its glycan epitope surface area (r = -0.84). Our findings also suggested that the CD4-binding site antibody IOMA could be a promising candidate for lineage-based vaccine design, due to its low somatic hypermutation (SHM) level, short CDR H3 length and relatively high neutralizing breadth (~50%). On the other hand, fusion peptide antibody VRC34.01, which had few epitope segments, low glycan content on the epitope, and high epitope-conformational variability, might be a promising candidate for epitope-based vaccine design.

B-516: Computational Analysis Highlights Key Molecular Interactions and Conformational Flexibility of a New Epitope on the Malaria Circumsporozoite Protein and Paves the Way for Vaccine Design
COSI: 3DSIG
  • S. Katie Farney, National Institute of Allergy and Infectious Diseases, National Institutes of Health, United States
  • Neville Kisalu, National Institute of Allergy and Infectious Diseases, National Institutes of Health, United States
  • Azza Idris, National Institute of Allergy and Infectious Diseases, National Institutes of Health, United States
  • Barbara Flynn, National Institute of Allergy and Infectious Diseases, National Institutes of Health, United States
  • Joseph Francica, National Institute of Allergy and Infectious Diseases, National Institutes of Health, United States
  • Baoshan Zhang, National Institutes of Health, United States
  • Marie Pancera, Fred Hutchinson Cancer Research Center, United States
  • Robert Seder, National Institute of Allergy and Infectious Diseases, National Institutes of Health, United States
  • Peter Kwong, National Institutes of Health, United States
  • Gwo-Yu Chuang, National Institutes of Health, United States

Short Abstract: Development of an effective vaccine or antibodies for the prevention and elimination of malaria is urgently needed and has application for use in malaria-endemic regions and for travelers, military personnel, and elimination campaigns. Previously, we isolated potent human monoclonal antibodies against the Plasmodium falciparum circumsporozoite protein (PfCSP) from a human subject immunized with an attenuated whole-sporozoite vaccine. Two of these antibodies, CIS43 and CIS42, conferred high-level protection in vitro. CIS43, however, additionally showed sterile protection in two different mouse models of malaria infection. To understand why CIS43 demonstrated in vivo efficacy while CIS42 did not, we performed multiple 500 ns molecular dynamics (MD) simulations of the CIS43 and CIS42 antigen-binding fragment crystal structures in complex with the junctional epitope (peptide 21), and additional free peptide 21 simulations. Compared with CIS42, CIS43 maintained more initial crystal contacts with peptide 21, and the critical epitope residues had lower flexibility over the course of the simulation. Principal component analysis quantified flexibility of the epitope and showed that neither the CIS43-bound nor the CIS42-bound peptide conformation was predominant. Overall, our findings revealed the importance of antibody mode-of-recognition and epitope conformation for eliciting parasite-neutralizing antibodies. These findings will have implications for next-generation, PfCSP-based malaria vaccines.

B-517: Metalloproteome landscape from the amino acid covariance perspective
COSI: 3DSIG
  • Frazier Baker, University of Cincinnati, United States
  • Nicholas Maltbie, University of Cincinnati, United States
  • Joseph Hirschfeld, University of Cincinnati, United States
  • Alexey Porollo, Cincinnati Children's Hospital Medical Center, United States

Short Abstract: Metal binding proteins are estimated to constitute at least one third of the proteome in any living organism. There is a great need for developing a reliable sequence-based annotation method for metal binding sites. We approached this problem using amino acid covariance analysis. 6090 non-redundant metal binding proteins were retrieved from the BioLiP database. A wide set of cumulative features derived from the top co-varying residues for a given site were evaluated. The best performing feature to discriminate metal binding from non-binding sites was found to be the individual conservation score (Shannon entropy). For metal specificity, the correlation-based metric appears the most informative to discriminate one metal versus others, as well as to achieve their pairwise distinctions. When discerning one type of metal from the other five types, metals can be discriminated in the following descending order of signal strength: Zn > Cu > Ca > Mg > Mn > Fe. In pairwise comparisons, Ca vs Mg appears to be the hardest metal pair to discern. Our study strongly suggests the possibility of developing an accurate sequence-based method for the annotation of metal binding sites and their specificity.
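The conservation score named in this abstract, Shannon entropy, can be illustrated per alignment column with a short Python sketch. The function names and the toy alignment are hypothetical, gaps are simply ignored, and the entropy is reported in bits; how the authors normalize or weight the score is not stated in the abstract and is not assumed here.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column; gaps are ignored."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # H = -sum(p * log2(p)) over the residue types observed in the column.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(sequences):
    """Entropy of every column of an equal-length multiple sequence alignment."""
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*sequences)]


# Toy alignment (made-up sequences): a fully conserved first column,
# a mostly conserved second column, and two more variable columns.
toy_alignment = ["HCDE", "HCAE", "HCGK", "HSTK"]
entropies = alignment_entropies(toy_alignment)
print(entropies)
```

A low entropy marks a conserved position, so a site-level discriminator along these lines would rank candidate positions by ascending entropy.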

B-518: The Impact of Conformational Entropy on the Accuracy of the Molecular Docking Software FlexAID in Binding Mode Prediction
COSI: 3DSIG
  • Louis-Philippe Morency, University of Montreal, Canada
  • Rafael Najmanovich, University of Montreal, Canada

Short Abstract: Here we introduce the latest version of Flexible Artificial Intelligence Docking (FlexAID) that allows its scoring function to consider the conformational entropy of ligands in complex with their biological targets. We present the impact of FlexAID’s newest feature on its accuracy in binding mode prediction using three increasingly complex scenarios: the Astex Diverse Set, the Astex Non Native Set and the HAP2 dataset. We show that FlexAID outperforms other open-source molecular docking methods when molecular flexibility is crucial. The improved accuracy of FlexAID on complex cases, the addition of novel features, i.e. the conformational entropy, its accessibility and its easy-to-use graphical user interface suggest that FlexAID is in an interesting position to tackle biologically challenging and pharmacologically relevant situations currently ignored by other methods. FlexAID is available as source code, as a command-line pre-compiled executable (available at http://biophys.umontreal.ca/nrg for Windows, macOS & Linux) or through the NRGsuite, a PyMOL integrated user interface allowing the user to use FlexAID in an intuitive manner with real time visualization. Both the NRGsuite and FlexAID are distributed as open-source software.

B-519: In silico structural characterization and molecular docking for human TAS1R2 and TAS1R3 receptors – A link between taste and Autism Disorder
COSI: 3DSIG
  • Samille Gonçalves, Universidade Estadual de Feira de Santana, Brazil
  • Raquel Benevides, Universidade Estadual de Feira de Santana, Brazil
  • Gesivaldo Santos, Universidade Estadual do Sudoeste da Bahia, Brazil
  • Aristóteles Góes-Neto, Universidade Federal de Minas Gerais, Brazil
  • Bruno Andrade, Universidade Estadual do Sudoeste da Bahia, Brazil

Short Abstract: Autism is a rare psychiatric disorder characterized by imbalanced intellectual development, which impairs the ability to socialize and, in some cases, motor coordination. This condition occurs due to genetic alterations that affect the normal development of the central nervous system. Autism has a range of genetic bases, one of which is related to the G-protein–coupled taste receptors TAS1R2 and TAS1R3. According to some authors, genetic polymorphism in these molecules is responsible for different levels of autism. In this work, we aimed to construct 3D structures of the human TAS1R2 and TAS1R3 receptors, based on their normal gene sequences, and to perform molecular docking with different taste molecules (e.g., glucose, sucrose, and cyclamate) in order to describe all amino acid interactions. For 3D construction, we used the Swiss-Model Workspace, followed by an AMBER 14 energy minimization with 5000 cycles of steepest descent and 5000 cycles of conjugate gradient to adjust the protein structures. Both structures were validated using the QMEAN, ANOLEA and Procheck programs. Docking results were obtained with Autodock Vina, and 2D ligand interaction maps were constructed using Accelrys Discovery Studio 2.5.

B-520: The Elastic Network Contact Model applied to RNA: enhanced accuracy for conformational space exploration
COSI: 3DSIG
  • Olivier Mailhot, University of Montreal, Canada
  • Vincent Frappier, MIT, Canada
  • Francois Major, University of Montreal, Canada
  • Rafael Najmanovich, University of Montreal, Canada

Short Abstract: The use of Normal Mode Analysis (NMA) methods to study both protein and nucleic acid dynamics is well established. However, the most widely used coarse-grained methods are based on backbone geometry alone and do not take into account the chemical nature of the residues. The Elastic Network Contact Model (ENCoM) is a coarse-grained NMA method that includes a pairwise atom-type non-bonded interaction term, which makes it sensitive to the sequence of the studied molecule. We adapted ENCoM to simulate the dynamics of both pure ribonucleic acid (RNA) molecules and RNA-protein complexes. For pure RNA, ENCoM outperforms the most commonly used coarse-grained model on RNA, the Anisotropic Network Model (ANM), in the prediction of B-factors, in the prediction of conformational change as measured by overlap (a measure of effective prediction of structural transitions) and in the prediction of structural variance from NMR ensembles. ENCoM is also computationally faster than ANM. These benchmarks were derived from the set of all RNA structures available from the Protein Data Bank (PDB) and contain more total cases than other studies applying NMA to RNA. We thus established ENCoM as an attractive tool for fast and accurate prediction of the conformational space of RNA molecules.
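
The overlap measure mentioned above is not defined in the abstract. A common definition in the NMA literature, assumed here, is the absolute cosine of the angle between a normal mode and the observed displacement between two conformations. The sketch below uses flat coordinate lists and a hypothetical function name.

```python
import math


def overlap(mode, displacement):
    """Absolute cosine between a normal mode and a displacement vector.

    Both arguments are flat sequences of equal length (3N coordinates).
    A value of 1 means the mode points exactly along the observed
    conformational change; 0 means it is orthogonal to it.
    """
    if len(mode) != len(displacement):
        raise ValueError("mode and displacement must have the same length")
    dot = sum(m * d for m, d in zip(mode, displacement))
    norm_mode = math.sqrt(sum(m * m for m in mode))
    norm_disp = math.sqrt(sum(d * d for d in displacement))
    if norm_mode == 0.0 or norm_disp == 0.0:
        raise ValueError("zero-length vector has no defined overlap")
    return abs(dot) / (norm_mode * norm_disp)
```

Because of the absolute value, a mode and its negation give the same overlap, which matches the fact that the sign of a normal mode is arbitrary.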

B-521: Pharmacophore-based virtual screening for Hypoxanthine-guanine-phosphoribosyl-transferase (HGPRT) inhibitors: a key enzyme from Leishmania sp
COSI: 3DSIG
  • Liliane Araujo, Universidade Estadual do Sudoeste da Bahia, Brazil
  • Wagner Soares, Universidade Estadual do Sudoeste da Bahia, Brazil
  • Tarcisio Melo, Universidade Estadual do Sudoeste da Bahia, Brazil
  • Bruno Andrade, Universidade Estadual do Sudoeste da Bahia, Brazil

Short Abstract: Hypoxanthine-guanine-phosphoribosyltransferase (HGPRT) initiates the metabolism of toxic purine bases in Leishmania species. The active site of HGPRT presents one guanine monophosphate (GMP) in both chains A and B. To evaluate the pharmacophore characteristics of known HGPRT drugs, we used the PharmaGist program. The resulting mol2 pharmacophore file was then submitted to the ZINCPharmer server in order to select up to 100 molecules sharing the pharmacophore of the known drugs for docking studies. Autodock Vina was used for docking calculations between the 3D structure of HGPRT (1PZM) and the selected ligands, and Accelrys Discovery Studio 2.5 was used to generate 2D amino acid interaction maps. About 60 chemical structures were selected in ZINCPharmer. After molecular docking calculations, six structures presented affinity energies below -8.5 kcal/mol. However, all ligands showed interactions with active residues of HGPRT. The next step of this study is to generate analogues of each of the best-affinity compounds, as well as to perform free-energy calculations using molecular dynamics approaches. These chemical constituents may become future drug candidates against Leishmania species.

B-522: In silico Prediction of HLA-associated Drug Hypersensitivity
COSI: 3DSIG
  • Shashank Jariwala, University of Michigan, United States
  • Xin-Qiu Yao, Georgia State University, United States
  • Barry J. Grant, University of California San Diego, United States

Short Abstract: Adverse drug reactions are a leading cause of morbidity and mortality with annual costs of over $100 billion in the US alone. A growing number of adverse reactions, termed idiosyncratic drug hypersensitivities, are observed to be immune system mediated with genetic associations to specific human leukocyte antigen (HLA) alleles. However, little is known about the underlying mechanisms of the majority of these associations, which critically hinders preventative action in clinical settings. Previous work has demonstrated that the antiviral drug abacavir can bind to HLA-B*57:01 and alter its specificity for self-peptides presented to T-cells. A critical barrier to examining the generality of this model is our inability to experimentally test all possible HLA-drug combinations. Here we describe a combined approach that uses homology modelling, molecular docking and molecular dynamics (MD) simulations to predict HLA-drug interactions. Binding free energy calculations are further applied to improve the HLA-drug interaction rankings. The predictive power of our approach is tested on a set of drugs with known HLA-linked hypersensitivity: abacavir with B*57:01, gout prophylactic allopurinol with B*58:01, and antiepileptic carbamazepine with B*15:02. Our studies represent a first step toward a preclinical screening process that aims to identify drugs with a high risk of causing drug hypersensitivity.

B-523: Understanding structural relationships of membrane proteins via topology alignments
COSI: 3DSIG
  • Edoardo Sarti, NINDS (NIH), United States
  • Lucy Forrest, NINDS (NIH), United States

Short Abstract: Protein structural classification can be very helpful in identifying similar mechanisms in structurally related proteins, but has proved to be very challenging for membrane proteins. Usually, the most accurate way to obtain detailed 3D information on possible structural relationships is considered to be performing structure alignments between pairs of proteins and then assessing their quality with a structure similarity estimator such as the TM-score. However, structure alignment algorithms tend to find false positive relationships when two protein structures are too dissimilar, and they often overlook structure-related features that may be relevant for similarity and mechanism. We propose a method for efficiently comparing transmembrane domains of membrane proteins where secondary structure elements found to penetrate the membrane are described by a number of features relating to their internal structure. These topological representations are then aligned with an efficient SVM/HMM method which also derives an absolute measure of structure similarity. We show that the method can discriminate between structurally related and unrelated proteins over a set of manually curated structure alignments and over all the structure alignments contained in EncoMPASS (encompass.ninds.nih.gov), a recently released database of membrane proteins focused on the analysis of their structure and symmetry.

B-524: The breadth of HIV broadly neutralizing antibodies depends on how they engage key epitope sites
COSI: 3DSIG
  • Hongjun Bai, WRAIR, Henry M. Jackson Foundation for the Advancement of Military Medicine, United States
  • Merlin Robb, WRAIR, Henry M. Jackson Foundation for the Advancement of Military Medicine, United States
  • Nelson Michael, U.S. Military HIV Research Program, WRAIR, United States
  • Morgane Rolland, WRAIR, Henry M. Jackson Foundation for the Advancement of Military Medicine, United States

Short Abstract: Better characterizing the relationship between HIV-1 Env diversity and the breadth of broadly neutralizing antibodies (bnAbs) could reveal key knowledge for the development of effective HIV-1 vaccines. We proposed and tested several methods to quantitatively define the diversity of HIV-1 epitopes. Our results highlighted that epitopes of bnAbs with broader neutralization spectra were not necessarily more conserved based on standard sequence diversity measurements. We found that the diversity of the top-nine epitope sites explained half of the difference in neutralization breadth across bnAbs (Spearman’s Rho = -0.74, p = 6e-7). These results illustrated how the broadest antibodies target their epitopes: they focused on the most conserved sites, thereby achieving cross-reactivity with heterologous Env proteins. These findings support vaccine strategies focusing on conserved elements of the virus. The views expressed are those of the authors and should not be construed to represent the positions of the U.S. Army or the Department of Defense.

B-525: OSPREY 3.0: Open-Source Protein Redesign for You, with Powerful New Features
COSI: 3DSIG
  • Jeffrey W. Martin, Department of Computer Science, Duke University, United States
  • Anna U. Lowegard, Program in Computational Biology and Bioinformatics, Duke University, United States
  • Marcel S. Frenkel, Duke University, United States
  • Mark A. Hallen, Toyota Technological Institute at Chicago, United States
  • Adegoke Ojewole, Program in Computational Biology and Bioinformatics, Duke University, United States
  • Jonathan D. Jou, Duke University, United States
  • Siyu Wang, Program in Computational Biology and Bioinformatics, Duke University, United States
  • Graham T. Holt, Program in Computational Biology and Bioinformatics, Duke University, United States
  • Bruce R. Donald, Duke University, United States

Short Abstract: Computational protein design (CPD) holds great promise as a novel and ever more important tool in drug development and enzyme design. The Donald lab has shown that CPD can be applied to develop new drugs, change the specificity of enzymes, enhance the potency and breadth of antibodies, and predict resistance mutations to new drugs. We present OSPREY 3.0, a new and greatly improved release of the OSPREY protein design software. OSPREY 3.0 features a convenient new Python interface, which greatly improves its ease of use. It is over two orders of magnitude faster than previous versions of OSPREY when running the same algorithms on the same hardware. Moreover, OSPREY 3.0 includes several new algorithms, which introduce substantial speedups as well as improved biophysical modeling. It also includes GPU support, which provides an additional speedup of over an order of magnitude. Like previous versions of OSPREY, OSPREY 3.0 offers a unique package of advantages over other design software, including provable design algorithms that account for continuous flexibility during design and model conformational entropy. Finally, we show here empirically that OSPREY 3.0 accurately predicts the effect of mutations on protein-protein binding. OSPREY 3.0 is available at https://www2.cs.duke.edu/donaldlab/osprey.php as open-source software.

B-526: Structural Classification of Proteins in the post-Structural Genomics era
COSI: 3DSIG
  • John-Marc Chandonia, Berkeley National Lab, United States
  • Steven E. Brenner, University of California, Berkeley, United States

Short Abstract: SCOPe (Structural Classification of Proteins – extended, http://scop.berkeley.edu) is a database of relationships between protein structures that extends the Structural Classification of Proteins (SCOP) database. SCOPe 2.07, a major stable update, was released in March 2018. SCOPe continues high quality manual classification of new superfamilies, a key feature of SCOP. As public investment in Structural Genomics has waned, the novelty of newly solved protein structures has fallen to a 20-year low, with only 17 structures each month (~2% of the 800 characterized) representing the first structure from a Pfam family. About half of these newly structurally characterized Pfam families classified to date in SCOPe represent a new fold or superfamily. Thus, ongoing expert manual curation of protein structure classifications such as SCOPe is feasible when abetted by automated methods, and continues to yield new discoveries. An unfortunate consequence of the rate of sequencing outpacing the rate of structural characterization of protein families is that the fraction of large families with a known structure peaked 10 years ago, and is more than 10% lower today than it was at its peak. This makes interpretation of sequence variation much more challenging than would be the case had investment in Structural Genomics continued.

B-527: A convolutional autoencoder approach for mining features in cellular electron cryo-tomograms and weakly supervised coarse segmentation
COSI: 3DSIG
  • Xiangrui Zeng, Carnegie Mellon University, United States
  • Miguel Ricardo Leung, University of Oxford, United Kingdom
  • Tzviya Zeev-Ben-Mordehai, Utrecht University, Netherlands
  • Min Xu, Carnegie Mellon University, United States

Short Abstract: Cellular electron cryo-tomography enables the 3D visualization of cellular organization in the near-native state and at submolecular resolution. However, the contents of cellular tomograms are often complex, making it difficult to automatically isolate different in situ cellular components. In this paper, we propose a convolutional autoencoder-based unsupervised approach to provide a coarse grouping of 3D small subvolumes extracted from tomograms. We demonstrate that the autoencoder can be used for efficient and coarse characterization of features of macromolecular complexes and surfaces, such as membranes. In addition, the autoencoder can be used to detect non-cellular features related to sample preparation and data collection, such as carbon edges from the grid and tomogram boundaries. The autoencoder is also able to detect patterns that may indicate spatial interactions between cellular components. Furthermore, we demonstrate that our autoencoder can be used for weakly supervised semantic segmentation of cellular components, requiring a very small amount of manual annotation.

B-528: Symmetry and biological assemblies in the Protein Data Bank
COSI: 3DSIG
  • Jose M Duarte, RCSB PDB, UC San Diego, United States
  • Dmytro Guzenko, RCSB PDB, UC San Diego, United States
  • Yana Valasatava, RCSB PDB, UC San Diego, United States
  • Aleix Lafita, EMBL-EBI, United Kingdom
  • Spencer Bliven, Zurich University of Applied Science, Switzerland
  • Stephen Burley, RCSB PDB, UCSD, Rutgers University, United States

Short Abstract: Defining a biological assembly from a protein crystal is a challenging problem both experimentally and computationally. Due to these difficulties, biological assembly annotations in the Protein Data Bank may contain some ambiguities and misannotations. We have devised a computational method (EPPIC v3, eppic-web.org) that automates the enumeration of all assemblies in a crystal and predicts the most likely biological assembly. The method is complementary to author annotations and other existing computational prediction methods and thus can be used in combination to provide a better view of biological assemblies. Hosted at RCSB PDB, the service provides current and state-of-the-art analyses and visualizations to help crystallographers guide experiments and understand difficult crystals. Additionally, new and more comprehensive symmetry data for biological assemblies, based on improved symmetry detection algorithms, are available both via rcsb.org and RCSB PDB’s REST API (rest.rcsb.org). Global and local symmetry information is provided for all biological assemblies and asymmetric units, contributing to a better representation of biological assemblies. RCSB PDB will continue to improve symmetry information and biological assembly annotations and predictions and to integrate more data via services provided at rcsb.org. Support: NSF, NIH, and DOE

B-529: Structural Dynamics of DPP-4 and its influence on the projection of inhibitors
COSI: 3DSIG
  • Simone Queiroz, Federal University of ABC, Brazil
  • Kathia Honorio, University of Sao Paulo, Brazil
  • Ana Scott, Federal University of ABC and University of Pittsburgh, United States

Short Abstract: Dipeptidyl peptidase-4 (DPP-4) is a promising target to treat type II diabetes mellitus. Therefore, it is important to understand the structural aspects of this enzyme and its interaction with drug candidates. This study involved molecular dynamics simulations, normal mode analysis, binding site detection and analysis of molecular interactions to understand the protein dynamics. We identified some DPP-4 functional motions contributing to the exposure of the binding sites and twist movements revealing how the two enzyme chains are interconnected in their ligands. We investigated the influence of ligand binding on the protein flexibility and functional motions. Understanding the enzyme structure, its motions and the regions of its binding sites will serve as the basis for future drug design studies.

B-530: Interactive Exploration, Data Mining, and Visualization of 3D Macromolecular Structures
COSI: 3DSIG
  • Shih-Cheng Huang, UC San Diego, San Diego Supercomputer Center, United States
  • Yue Yu, UC San Diego, San Diego Supercomputer Center, United States
  • Peter Rose, UC San Diego, San Diego Supercomputer Center, United States

Short Abstract: Advances in Structural Bioinformatics are driven by the fast growth in experimental 3D structures and integration with even larger sets of sequence and protein function data. At the same time, the field of Data Science has created new technologies for reengineering legacy software pipelines to make them scalable, easy to use, reproducible, reusable, and sharable. Here, we describe the MMTF-Spark/PySpark project that combines three key components to create such an infrastructure: 1. Interactive Jupyter notebooks to run ad-hoc analyses, data mining, machine learning, and visualization of 3D structure and sequence datasets, 2. A scalable compute infrastructure to run these analyses interactively across large datasets, e.g., the entire PDB, using previously developed efficient data representations and the Apache Spark framework for distributed parallel computing, 3. A library of methods for data mining and analysis of 3D structure and sequence data, capitalizing on the rich data analytics, visualization, and machine/deep learning tools available in the Python ecosystem. A key advantage of this environment is interactivity, which enables iterative exploration. By combining documentation, data sets, analysis code, results, and interactive visualizations in Jupyter notebooks, the steps of an interactive session can be captured, reproduced, and shared.

B-531: Virtual screening with protein family-specific models using deep neural networks and transfer learning
COSI: 3DSIG
  • Fergus Imrie, University of Oxford, United Kingdom
  • Anthony Bradley, University of Oxford, United Kingdom
  • Mihaela van der Schaar, University of Oxford, United Kingdom
  • Charlotte Deane, University of Oxford, United Kingdom

Short Abstract: Discriminating active from inactive molecules for a given target is a central problem of computer-aided drug discovery. Recent research has shown the potential of applying machine learning techniques to virtual screening. However, previous work has not utilised advances in computer vision and has adopted a one-size-fits-all approach. We present an improved framework for protein-ligand scoring using a modern convolutional neural network architecture and apply transfer learning to produce an ensemble of protein family-specific models. Furthermore, we provide guidelines on the minimum requirements for family-specific models and the expected effect of additional data. Our approach substantially outperforms recent benchmarks on the DUD-E data set. Using a clustered cross-validation, we achieve state-of-the-art performance with an average AUC ROC of 0.91 and 0.5% ROC enrichment factor of 77. This represents an improvement in early enrichment of more than 75% over the machine learning benchmark and around 5x the AutoDock Vina scoring function. We observe similar improvements in performance on an independent test set constructed from the ChEMBL database. We also describe a visualisation method for interpreting model predictions that could be used to guide molecule optimisation or gain better understanding of a protein’s active site.
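
The ROC enrichment factor quoted above can be illustrated with a short sketch. The abstract does not spell out the formula, so the common definition is assumed here: the fraction of actives recovered at a given false positive rate, divided by that rate. Under this definition the largest attainable value at 0.5% is 1 / 0.005 = 200. The function name, the tie handling, and the toy scores are illustrative assumptions.

```python
import math


def roc_enrichment(active_scores, decoy_scores, fpr):
    """ROC enrichment at a given false positive rate (higher score = better).

    Counts the actives that score strictly above the first decoy that
    would exceed the allowed number of false positives, converts that
    count to a true positive rate, and divides by `fpr`.
    """
    if not 0.0 < fpr <= 1.0:
        raise ValueError("fpr must be in (0, 1]")
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")
    ranked_decoys = sorted(decoy_scores, reverse=True)
    # Number of decoys we may accept; small tolerance guards float round-off.
    allowed = int(math.floor(fpr * len(ranked_decoys) + 1e-9))
    if allowed >= len(ranked_decoys):
        threshold = -math.inf
    else:
        threshold = ranked_decoys[allowed]
    hits = sum(1 for s in active_scores if s > threshold)
    tpr = hits / len(active_scores)
    return tpr / fpr


# Hypothetical scores for illustration only.
toy_actives = [0.95, 0.8, 0.6, 0.1]
toy_decoys = [0.9, 0.5, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05, 0.01]
toy_ef = roc_enrichment(toy_actives, toy_decoys, fpr=0.1)
```

In the toy example one decoy out of ten is allowed, three of the four actives score above the next decoy, and the enrichment is 0.75 / 0.1.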

B-532: Mapping β-Turn Geometry and Its Side-Chain Determinants
COSI: 3DSIG
  • Nicholas Newell, Newell, United States

Short Abstract: β-turns constitute more than 20% of all residues in proteins and play crucial roles in structure and function. They are commonly classified by dihedral angles into a small set of types that provides only a low-resolution picture of turn backbone geometries, and more than a quarter of turns remain unclassified. Furthermore, the systematic treatment of side-chains in β-turns has been limited to the tabulation of the propensities of single-position motifs, supplemented by structural examples, and the interactions between β-turns and the structure in their N- and C-terminal neighborhoods have not been systematically characterized. In this work, a two-stage, least-squares, Cartesian-space clustering algorithm is applied, first to generate a fine-scale partitioning and 3D conformational mapping of the backbone distribution of all β-turns, and then to map, for each backbone geometry, the distributions of side-chain/rotamer structures for all motifs involving one, two, or three amino acids in the turns and their immediate N- and C-terminal neighborhoods. This analysis updates and expands the existing picture of β-turns by providing a comprehensive, unified, high-resolution treatment of the backbone and side-chain structure of all β-turns, and it should prove useful in protein design, structure prediction, and in assessing the structural consequences of disease-associated mutations.

B-533: IN-SILICO PROTEIN STABILIZATION WITH HOTMUSIC: INSIGHT FROM A PROTEOMIC PERSPECTIVE
COSI: 3DSIG
  • Fabrizio Pucci, Université Libre de Bruxelles, Belgium
  • Martin Schwersensky, Université Libre de Bruxelles, Belgium
  • Marianne Rooman, Université Libre de Bruxelles, Belgium

Short Abstract: The ability to rationally modify proteins to increase their thermal stability is one of the main goals of protein design, with interesting applications in a wide range of biotechnological processes. Here we present HoTMuSiC, a newly developed bioinformatics tool that takes as input the three-dimensional structure of the protein and, when available, its melting temperature, and predicts rapidly and accurately the impact of amino acid substitutions on this temperature. The method is fast enough to scan the entire sequence of a target protein and propose the most stabilizing mutations. After an explanation of the key ingredients used in the construction of HoTMuSiC, such as the statistical potentials, we will present some preliminary data regarding its application at the proteomic scale. This information has been obtained by applying HoTMuSiC to the part of the proteome of different organisms with known or modeled structures. The analysis of the robustness upon mutation of extremophile proteomes and its relation to evolutionary pressure can give us important information about the structural mechanisms used by proteins to modulate their thermoresistance. Understanding these mechanisms could be of utmost importance in protein engineering applications and could shed light on the adaptation of extremophiles to their environment.

B-534: High throughput analysis of allostery through propagation of rigidity
COSI: 3DSIG
  • Adnan Sljoka, Kwansei Gakuin University, Japan

Short Abstract: Allostery can be viewed as the effect of binding at one site of a protein on a second, often significantly distant, functional site, enabling regulation of protein function. In spite of its importance, the molecular mechanisms that give rise to allostery are still poorly understood. We have recently developed the rigidity-transmission allostery (RTA) algorithm, an extremely fast computational method based on mathematical algorithms in rigidity theory. The RTA algorithm provides a mechanical interpretation of allosteric signaling and is designed to predict whether a mechanical perturbation of rigidity (mimicking ligand binding) at one site of the protein can propagate across the protein structure and in turn cause a change in conformational degrees of freedom at a second, distant site, resulting in allosteric transmission. In this talk, we will illustrate our method, the identification of novel allosteric sites, and a detailed mapping of allosteric pathways, which are experimentally validated with NMR studies on three different classes of proteins: GPCRs [Nature Communications 2018], fluoroacetate dehalogenase [Science 2017], eukaryotic translation initiation factor eIF4E, and others. The RTA method is computationally very efficient (it takes minutes of computational time on a standard PC) and can scan many unknown sites for allosteric communication, identifying potential new allosteric sites.

B-535: SCALOP: sequence-based antibody canonical form structure prediction
COSI: 3DSIG
  • Wing Ki Wong, University of Oxford, United Kingdom
  • Alexander Bujotzek, Roche Innovation Center Munich, Germany
  • Guy Georges, Roche Innovation Center Munich, Germany
  • Francesca Ros, Roche Innovation Center Munich, Germany
  • Alan Lewis, GlaxoSmithKline Research and Development, United Kingdom
  • Bojana Popovic, MedImmune, United Kingdom
  • Bruck Taddese, MedImmune, United Kingdom
  • Jiye Shi, UCB Pharma, United Kingdom
  • Jinwoo Leem, University of Oxford, United Kingdom
  • Charlotte Deane, University of Oxford, United Kingdom

Short Abstract: Antibodies are proteins of the immune system which bind specifically to target antigens. The complementarity-determining regions (CDRs) of an antibody constitute the majority of its binding site, defining its specificity and affinity. A limited set of backbone conformations, known as the “canonical forms”, have been defined for five of the six CDRs. These are often used as a proxy for the binding site shape. The definitions of the canonical forms have been updated many times since their introduction in 1987. However, each update has been a static snapshot of the data. In order to annotate the antibody repertoire data now becoming available from next generation sequencing experiments, a fast, accurate and freely available solution is needed. We present SCALOP, a sequence-based method to predict CDR canonical forms, which uses an auto-updating database to capture the latest cluster information. Its accuracy is comparable to that of a standard structural predictor, FREAD, but it is 800 times faster. By back-dating the database of CDR structures, we show how the number and size of canonical clusters have increased and how this increase in database size improves prediction coverage while retaining consistently high precision.

B-536: Clustering and classification of active and inactive protein kinase structures
COSI: 3DSIG
  • Vivek Modi, Fox Chase Cancer Center, United States
  • Roland Dunbrack, Fox Chase Cancer Center, United States

Short Abstract: The active site of a protein kinase consists of several conserved residues, including the catalytic Asp residue in the HRD motif and the DFG motif at the activation loop N-terminus. Unlike the HRD motif, the DFG motif exhibits a unique conformation in the active state but displays flexibility across different inactive forms. To classify kinase structures, we have clustered the DFG motif conformations based on the backbone dihedral angles of the sequence XDF, where X is the residue before the DFG motif, and the position and conformation of the DFG Phe side-chain, utilizing a density-based clustering algorithm (DBSCAN). We have identified 8 distinct conformations that comprise 92% of kinase structures, and label them based on their Ramachandran regions (A (alpha), B (beta), L (left), E (epsilon)) and the Phe rotamer (gminus, gplus, trans). Active kinases with bound ATP exist exclusively in a BLAgminus conformation (55.4% of structures), known as DFGin, while Type II inhibitors bind solely to BBAgminus (5.2%), known as DFGout. The most common inactive conformations are BLBgplus (9.5%) and ABAgminus (9.4%), which place the Phe side-chain under the C-helix in a DFGin conformation. We believe the new classification and nomenclature will benefit understanding of conformational dynamics and inhibitor binding in the protein kinase family.
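
Density-based clustering of dihedral angles, as described above, needs a distance that respects angular periodicity (179° and -179° are neighbours). The sketch below pairs such a distance with a minimal DBSCAN. The metric, the `eps` and `min_pts` values, and the toy angles are assumptions for illustration; the abstract does not state the authors' choices, and the Phe side-chain position is not modelled here.

```python
import math


def dihedral_distance(a, b):
    """Mean of 2 * (1 - cos(delta)) over paired dihedral angles in degrees.

    Periodic by construction: angles differing by 360 degrees are identical.
    """
    terms = [2.0 * (1.0 - math.cos(math.radians(x - y))) for x, y in zip(a, b)]
    return sum(terms) / len(terms)


def dbscan(points, eps, min_pts, dist):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # A point counts as its own neighbour (distance 0).
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: expand the cluster
                queue.extend(reachable)
    return labels


# Hypothetical (phi, psi) pairs in degrees, for illustration only.
toy_angles = [
    (-60, -45), (-62, -43), (-58, -47),   # one tight group
    (179, 150), (-179, 152), (178, 148),  # a group straddling +/-180
    (60, 60),                             # isolated point
]
toy_labels = dbscan(toy_angles, eps=0.05, min_pts=2, dist=dihedral_distance)
```

With a plain Euclidean distance the second group would be split at the ±180° boundary; the cosine-based distance keeps it together.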

B-537: FoldX accurate biomolecular binding prediction using PADA1 (Protein Assisted DNA Assembly v1)
COSI: 3DSIG
  • Leandro Radusky, CRG, Spain
  • Javier Delgado, CRG, Spain
  • Hector Climente-González, CRG, Spain
  • Luis Serrano, CRG, Spain

Short Abstract: In this work we present PADA1, a generic algorithm that accurately models structural complexes and predicts the interaction regions of resolved protein structures. PADA1 relies on a library of protein and interacting biomolecular fragment pairs obtained from training sets of deposited complexes. It includes a fast statistical force field computed from atom-atom distances, to evaluate and filter the 3D docking models. Using published validation sets we predicted the binding regions with an RMSD of <1.8 Å per residue in >95% of the cases. We show that the quality of the docked templates is compatible with the FoldX protein design tool suite, which identifies the crystallized DNA/RNA/protein molecule sequence as the most energetically favorable in 80% of the cases. We highlighted the biological potential of PADA1 by reconstituting conformational changes upon protein mutagenesis for a variety of protein-DNA/RNA/protein complexes, and by predicting binding regions and protein/nucleotide sequences in proteins crystallized without partner. These results open up new perspectives for the engineering of biomolecular interfaces. So far, only the DNA version of the algorithm has been published; here we present the updated version, which is valid for any kind of biomolecular complex.

B-538: Pseudo-Symmetry in 7 Transmembrane Helix (7TMH) Proteins: Intragenic Duplication of Protodomains with Evolutionary Balance of Structural Constraints and Functional Divergence
COSI: 3DSIG
  • Philippe Youkharibache, NCI/NIH, United States
  • Alexander Tran, California State University Northridge, United States
  • Ravinder Abrol, California State University Northridge, United States

Short Abstract: 7-Transmembrane-helix (7TMH) proteins cannot be grouped under a monolithic fold. A parallel structure-based analysis of sequence and functional evolution on folds sharing that magic number of 7 transmembrane (7TM) helices has revealed an evolutionary principle showing evidence of a duplication pattern of a 3/4-transmembrane helix (3/4-TMH) protodomain. This results in 7TMH proteins being made up of either two 4-TMH protodomains related by a two-fold symmetry, where one TM helix is lost, or two 3-TMH protodomains related by a two-fold symmetry, where an extra transmembrane helix can be present. The independent evolution of the two 3/4-TMH protodomains within a specific superfamily’s 7TMH protein appears to be guided by functional and structural constraints, which leads to either pseudo-symmetric folds of functionally-obligatory oligomeric 7TMH super-families like nicotinamide riboside transporter protein PnuC or pseudo-symmetric folds in other 7TMH super-families like G protein coupled receptors (GPCRs). This study also provides a surprising evolutionary link between GPCRs and ligand-gated ion channels. The sequence and structural protodomain analysis of different 7TMH super-families provides a unifying theme of their evolutionary process, where the intragenic duplication of protodomains is guided by varying degrees of functional divergence and structural constraints.

B-539: Analysis of sequence and structure data to understand nanobody architectures and antigen interactions
COSI: 3DSIG
  • Laura Mitchell, University of Cambridge, United Kingdom
  • Lucy Colwell, University of Cambridge, United Kingdom

Short Abstract: Nanobodies (Nbs) are a class of single domain antibody derived from the immune systems of camelid species. They achieve binding affinities and specificities to target antigens comparable to those of classical antibodies (Abs), despite being ten times smaller (~15 kDa) and having only three variable loops. This raises the question of how these binding affinities and specificities are achieved in such a compact molecule. In this poster I present key insights from an analysis of 156 Nb-antigen co-crystal structures, and draw comparisons with a set of 156 classical Ab-antigen co-crystal structures. We find that Nbs display greater structural diversity across all three loops, even where the underlying sequence variability is equivalent to, or lower than, that of Abs. Class-averaged properties of antigen-contacting residues (the ‘paratope’) show that Nb paratopes are more variable than those of Abs. This is true for both the distribution of paratope residues across the domain and the types of residues used at interfaces. Notably, we find that Nbs deviate from the ‘loops = paratope’ assumption that holds for Abs, an insight which has implications for Nb selection, modeling and engineering strategies.

B-540: SeRenDIP: remastered alignment profiles for fast and accurate predictions of PPI interface positions
COSI: 3DSIG
  • K. Anton Feenstra, Vrije Universiteit Amsterdam, Netherlands
  • Qingzhen Hou, Université Libre de Bruxelles, Belgium
  • Paul De Geest, University of Amsterdam, Belgium
  • Christian Griffioen, Vrije Universiteit Amsterdam, Netherlands
  • Sanne Abeln, Vrije Universiteit Amsterdam, Netherlands
  • Jaap Heringa, Vrije Universiteit Amsterdam, Netherlands

Short Abstract: Interpretation of protein sequences is a bottleneck in biomedical research, because structure and other experimental data are scarce. Prediction of protein interaction sites from sequence may be a viable substitute. We present a practical, fast and accurate method using our random-forest-based (RF) interface predictor. A novel approach to generating sequence profiles for calculating input features for the RF predictor, by re-mastering the alignment of the homologs of the query sequence, makes profile generation four-fold faster, and even more for the longest jobs. Using the features generated from the ‘Remastered’ profiles, we trained RF models for heteromeric and homomeric protein interfaces, and for the combined training sets. Interestingly, the prediction performance is shown to be overall similar or higher, compared to the previous method. The ‘Remastered’ method is fast enough to allow the practical implementation of a webserver, which was unfeasible using the previous approach. For heteromeric interactions, the ‘Remastered’ RF-hetero predictor scores best (AUC-ROC 0.668). ‘Remastered’ RF-combined also performs well here (AUC-ROC 0.655), and even better for homomeric interactions (AUC-ROC 0.732), making it the best all-round choice. The method is fast and only requires one sequence as input, and may therefore be of interest to many biomedical researchers in academia and industry.

B-541: Investigating the molecular determinants of ebolavirus pathogenicity
COSI: 3DSIG
  • Henry Martell, The University of Kent, United Kingdom
  • Morena Pappalardo, The University of Kent, United Kingdom
  • Stuart Masterson, The University of Kent, United Kingdom
  • Franca Fraternali, King's College London, United Kingdom
  • Martin Michaelis, The University of Kent, United Kingdom
  • Mark Wass, The University of Kent, United Kingdom

Short Abstract: The West Africa Ebola virus outbreak killed thousands of people. Using sequencing data combined with detailed structural analysis and experimental data, we compared the genomes of Reston virus, the only Ebolavirus species that is not pathogenic in humans, to those of the other four Ebolavirus species. Here we present a significant update of this analysis, using nearly 1500 Ebolavirus genome sequences, compared to 196 in our original analysis. The number of specificity determining positions (SDPs) that are differentially conserved between the two groups and that may act as molecular determinants of pathogenicity reduces to 165 from 180. The large overlap of SDPs between the two datasets (73%) demonstrates the robustness of our approach and its ability to obtain reliable results with a limited number of genome sequences. The updated analysis gives greater confidence that the SDPs present in the protein VP24 are likely to impair binding to human karyopherin alpha proteins and prevent inhibition of interferon signaling in response to infection.

B-542: Mimicking Intermolecular Interactions of Tight Protein–Protein Complexes for Small‐Molecule Antagonists
COSI: 3DSIG
  • David Xu, Indiana University School of Informatics and Computing, United States
  • Khuchtumur Bum‐erdene, Indiana University School of Medicine, United States
  • Yubing Si, Indiana University School of Medicine, United States
  • Samy Meroueh, Indiana University School of Medicine, United States

Short Abstract: Tight protein–protein interactions (Kd<100 nM) that occur over a large binding interface (>1000 Å²) are highly challenging to disrupt with small molecules. Historically, the design of small molecules to inhibit protein–protein interactions has focused on mimicking the position of interface protein ligand side chains. Here, we explore mimicry of the pairwise intermolecular interactions of the native protein ligand with residues of the protein receptor to enrich commercial libraries for small‐molecule inhibitors of tight protein–protein interactions. We use the high‐affinity interaction (Kd=1 nM) between the urokinase receptor (uPAR) and its ligand urokinase (uPA) to test our methods. We introduce three methods for rank‐ordering small molecules docked to uPAR: 1) a new fingerprint approach that represents uPA′s pairwise interaction energies with uPAR residues; 2) a pharmacophore approach to identify small molecules that mimic the position of uPA interface residues; and 3) a combined fingerprint and pharmacophore approach. Our work led to small molecules with novel chemotypes that inhibited a tight uPAR⋅uPA protein–protein interaction with single‐digit micromolar IC50 values. This work suggests that mimicking the binding profile of the native ligand and the position of interface residues can be an effective strategy to enrich commercial libraries for small‐molecule inhibitors of tight protein–protein interactions.

B-543: Modelling the effect of a single amino-acid variant on protein structure: How successful are we without adjusting the backbone?
COSI: 3DSIG
  • Tarun Khanna, Imperial College London, United Kingdom
  • Sirawit Ittisoponpisan, Imperial College London, United Kingdom
  • Eman Alhuzimi, Imperial College London, United Kingdom
  • Suhail A Islam, Imperial College London, United Kingdom
  • Alessia David, Imperial College London, United Kingdom
  • Michael J.E. Sternberg, Imperial College London, United Kingdom

Short Abstract: We present a rule-based approach, 3DVar, which predicts the structural effect of a single amino-acid variation (SAV) on both experimental and predicted structures. 3DVar models the side-chains around the variant with a fixed backbone. We modelled 1,971 disease-causing and 2,140 neutral SAVs on 606 human PDBs and obtained true positive (TP) and false positive (FP) rates of 41% and 11%, with the sequence-based SIFT obtaining 87% and 45%. On Phyre2-predicted structures, the TP rate is only slightly lower and the FP rate slightly higher. Structure cannot account for many disease-causing mutations, so we expect a lower TP rate than SIFT; but our approach can augment sequence-based methods and provide a structural explanation for a disease-associated SAV. To assess our assumption of a fixed protein backbone, we generated a dataset of 15,182 pairs of PDB structures (all species) that differ by a single amino acid. We calculated the local Cα RMSD. 78% had RMSD < 0.5 Å, which indicates either that the RMSD is within our observed difference between identical structures or that there is only limited conformational change. However, 5% had RMSD > 2 Å. Thus, the fixed-backbone assumption is generally valid, but for some variants backbone changes need to be modelled.
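
The local Cα RMSD comparison mentioned in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the 3DVar implementation: it assumes the two structures are already superposed and residue-aligned, and the function names and the window of 5 residues on either side of the variant are hypothetical choices, since the abstract does not define "local".

```python
import math

def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumes the structures are already superposed; no fitting is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

def local_ca_rmsd(coords_a, coords_b, center, window=5):
    """RMSD restricted to residues within `window` positions of `center`.

    `window=5` is an illustrative assumption, not a value from the abstract.
    """
    lo = max(0, center - window)
    hi = min(len(coords_a), center + window + 1)
    return ca_rmsd(coords_a[lo:hi], coords_b[lo:hi])
```

A pair of structures would then count as having limited backbone change when `local_ca_rmsd` around the variant position falls below the 0.5 Å threshold quoted in the abstract.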

B-544: Deep learning approaches to predict protein-protein binding sites
COSI: 3DSIG
  • Eli Draizen, University of Virginia; NCBI, United States
  • Alexander Goncearenco, National Center for Biotechnology Information, United States
  • Cameron Mura, University of Virginia, United States
  • Anna Panchenko, National Institutes of Health, United States
  • Philip E. Bourne, School of Medicine, University of Virginia, United States

Short Abstract: Protein-protein interactions (PPI) mediate biological functions and are crucial to understanding biological pathways. Several methods exist to predict binding sites involved in PPI from sequence and/or structure that exploit common features of binding residues such as hydrophobicity and evolutionary conservation, but they either have too low accuracy or require significant homology to known PPIs. Deep learning and convolutional neural networks (CNNs) have had great success in 3D image segmentation. PPI binding site prediction may benefit from this method due to its speed and ability to generalize, while preserving the 3D nature of the data. Here, we present a 3D-CNN to predict binding sites on proteins using 3D protein structural information and eight atomic physicochemical features including hydrophobicity, charge, and hydrogen bonding. As a proof of principle, the model was first trained on an example of hollow spheres. Success on this model has led to further exploration of the ability of a 3D-CNN to predict binding sites within specific protein families. This new method has the potential to reveal previously unknown binding sites and further expand the knowledge of how proteins interact.

B-545: Systematic Analysis of Symmetry and Pseudo-Symmetry in Membrane Protein Structures
COSI: 3DSIG
  • Antoniya Aleksandrova, NINDS - National Institutes of Health, United States
  • Lucy Forrest, NINDS (NIH), United States

Short Abstract: Available membrane protein structures have revealed an abundance of symmetry and pseudo-symmetry, which are observed not only in the formation of multi-subunit assemblies, but also in the repetition of internal structural elements. There are many known examples of the functional significance of these symmetries. In this context, a systematic study of symmetry should provide a framework for a broader understanding of the mechanistic principles and evolutionary development of membrane proteins. However, existing analyses lack the detail and breadth required for such a systematic study. Therefore, we aim to quantify both the extent and diversity of symmetry relationships in known structures of membrane proteins. To achieve this task, we combine the output of two programs for symmetry detection, namely SymD and CE-Symm, each of which has certain limitations. By leveraging the complementarity of these programs and taking into consideration the restrictions that the lipid bilayer places on protein structures, we improve both the sensitivity of symmetry detection and the coverage of the symmetric units. This analysis provides a valuable foundation for addressing a wide range of questions relating to the function and evolution of these important proteins. Therefore, we have incorporated this data into an online database called EncoMPASS (encompass.ninds.nih.gov).

B-546: Understanding structural space of intra protein domain-domain interfaces
COSI: 3DSIG
  • Rivi Verma, Indian Institute of Science Education and Research, Mohali, India
  • Shashi B. Pandit, Indian Institute of Science Education and Research, Mohali, India

Short Abstract: The orientation of domains, or of their interfaces, in multi-domain proteins has been suggested to play a significant role in protein function. Hence, accurate prediction of domain-domain interfaces (DDI) is essential for reliable tertiary structure prediction of multi-domain proteins. Most structure prediction methods rely on homologous template/s to model domain geometry. We evaluated the reliability of interface prediction by analyzing structural conservation of DDIs among domains related at family/superfamily levels of structural relatedness using iAlign. The comparison of DDIs of equivalent domains from multiple structures of a multi-domain protein showed that domain interfaces for a given protein are structurally conserved (mean (sd) RMSD is 1.1 (2.6) Å). Importantly, structural variation is observed between ligand unbound and bound structures. Next, we compared DDIs of domains that are related at most to the level of family (1320 pairs) or homologous superfamily (44285 pairs) as defined in CATH. Using IS-score as a measure of interface similarity, the mean (sd) values for family and superfamily related domains are 0.69 (0.15) and 0.35 (0.18) respectively. Furthermore, comparison of DDIs formed by completely unrelated domains shows a mean IS-score of 0.30. This suggests that the structure space of DDIs is degenerate. Importantly, domain relatedness from template/s needs to be assessed before modeling DDIs.

B-547: MAINMAST: De novo Main-chain Modeling for EM maps Using Tree-graph optimization.
COSI: 3DSIG
  • Genki Terashi, Purdue University, United States
  • Daisuke Kihara, Purdue University, United States

Short Abstract: An increasing number of protein structures are determined by cryo-electron microscopy (cryo-EM) at near atomic resolution. However, tracing the main-chains and building full-atom models from EM maps of ~4-5 Å is still a non-trivial and demanding task. Here, we introduce a novel de novo structure modeling method MAINMAST (MAINchain Model trAcing from Spanning Tree) that builds an entire three-dimensional model of a protein from a near-atomic resolution EM map. The method directly traces the main-chain and identifies Cα atom positions as tree-graph structures in the EM map. The method has substantial advantages over the existing methods: i) MAINMAST directly traces main-chain models from an EM density map without using known protein structures; ii) The procedure is fully automated and no manual setting is required; iii) MAINMAST can estimate a confidence score that indicates accuracy of structure regions. We tested MAINMAST on 40 simulated density maps at 5 Å resolution and 30 experimentally determined maps at ~4-5 Å resolution and showed that MAINMAST performed significantly better than existing software. This work is in press in Nature Communications (2018).
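
To give a feel for the spanning-tree idea named in the MAINMAST acronym, the sketch below builds a minimum spanning tree over a set of 3D points with Prim's algorithm. It is only an illustration under simplifying assumptions: the points stand in for high-density positions in a map, edges are weighted by plain Euclidean distance, and the function name is hypothetical. It does not reproduce MAINMAST's actual point selection, tree refinement, or scoring.

```python
import math

def minimum_spanning_tree(points):
    """Return MST edges as (parent_index, child_index, length) via Prim's algorithm.

    `points` is a list of (x, y, z) tuples; runs in O(n^2) time.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known connection to the tree
    best_parent = [-1] * n       # tree node providing that connection
    best_dist[0] = 0.0           # start the tree from point 0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u, best_dist[u]))
        # Update attachment costs of the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges
```

In such a tree, a long path through the points is a natural candidate for a chain trace, which is the intuition behind tracing a main-chain from a spanning tree.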

B-549: iCFN: an efficient exact algorithm for multistate protein design
COSI: 3DSIG
  • Mostafa Karimi, Texas A&M University, United States
  • Yang Shen, Texas A&M University, United States

Short Abstract: Motivation: Multistate computational protein design (CPD) simultaneously considers positive and negative objectives corresponding to various protein states (e.g. oligomerization) and substates (e.g. conformation). Exact algorithms can guarantee the optimal solutions and thus enable a direct test of mechanistic hypotheses behind models. However, efficient exact algorithms are lacking for multistate CPD. Methods and results: We have developed an efficient exact algorithm called interconnected cost function networks (iCFN) for a generic formulation of multistate CPD. iCFN treats each substate design as a weighted constraint satisfaction problem (WCSP) modeled through a cost function network; and it solves the coupled WCSPs using novel bounds and a depth-first branch-and-bound search over a hierarchical tree of solutions. When iCFN is applied to specificity design of a T-cell receptor, a problem of unprecedented size to exact methods, it drastically reduces search space and running time to make the problem tractable. Moreover, iCFN generates experimentally-agreeing receptor designs with improved accuracy compared to state-of-the-art methods, highlights the importance of modeling backbone flexibility in protein design, and reveals molecular mechanisms underlying binding specificity. Significance: To our best knowledge, this is the first exact algorithm that makes large-scale multistate CPD problems computationally tractable.
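
As a toy illustration of the building block named in the abstract above, the sketch below solves a single weighted constraint satisfaction problem by depth-first branch-and-bound. It is not iCFN: it handles one cost function network rather than coupled substates, uses a deliberately simple lower bound, and assumes non-negative pairwise costs. The function name and data layout are hypothetical.

```python
def solve_wcsp(unary, pairwise):
    """Exact minimum-cost assignment by depth-first branch-and-bound.

    unary[i][a]          : cost of variable i taking value a
    pairwise[(i, j)][a][b]: cost of (i=a, j=b) for i < j, assumed non-negative
    Returns (assignment, cost).
    """
    n = len(unary)
    # suffix_min[i]: sum of the cheapest unary costs of variables i..n-1.
    # Because pairwise costs are assumed >= 0, this never overestimates
    # the cost still to be paid, so pruning with it keeps the search exact.
    suffix_min = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix_min[i] = suffix_min[i + 1] + min(unary[i])

    best = {"cost": float("inf"), "assignment": None}

    def extend(i, assignment, cost):
        if cost + suffix_min[i] >= best["cost"]:
            return  # bound: this branch cannot beat the incumbent
        if i == n:
            best["cost"] = cost
            best["assignment"] = list(assignment)
            return
        for a, u in enumerate(unary[i]):
            step = u
            for j in range(i):
                table = pairwise.get((j, i))
                if table is not None:
                    step += table[assignment[j]][a]
            assignment.append(a)
            extend(i + 1, assignment, cost + step)
            assignment.pop()

    extend(0, [], 0.0)
    return best["assignment"], best["cost"]
```

In a protein design setting the variables would correspond to residue positions and the values to amino acid or rotamer choices, with costs taken from an energy function.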

B-552: Development of a genome annotation system based on protein three-dimensional structures
COSI: 3DSIG
  • Matsuyuki Shirota, Tohoku University, Japan

Short Abstract: The information from three-dimensional (3D) structures of proteins plays an important role in estimating the functional impact of missense variants in personal genome sequences. However, the availability and variety of 3D structures vary between proteins and change with time as the Protein Data Bank is updated weekly, which makes the application of structural information to genome annotation difficult for non-specialists in structural biology. In this study, a genome-wide annotation system for missense variants was developed. All of the protein structures in PDB were aligned with the GRCh37 and GRCh38 human reference genomes based on the consensus coding sequence gene annotations and protein sequence alignments. For each residue in a 3D structure, structural features, such as secondary structure, solvent accessibility, interaction with other proteins and distance to small molecules, were calculated and the features from different structures for the same residue were integrated to allow the interpretation of the amino acid change based on all the available structures in PDB. These annotations can be updated weekly, which makes the insights obtained from new protein structures available to a broad range of scientific fields. This genome annotation system will be useful for elucidating the impact of missense variants for future personalized health care.

B-553: Network approach integrates 3D structural and sequence data to improve protein structural comparison
COSI: 3DSIG
  • Khalique Newaz, University of Notre Dame, United States
  • Fazle Faisal, University of Notre Dame, United States
  • Julie Chaney, University of Notre Dame, United States
  • Jun Li, University of Notre Dame, United States
  • Scott Emrich, University of Notre Dame, United States
  • Patricia Clark, University of Notre Dame, United States
  • Tijana Milenkovic, University of Notre Dame, United States

Short Abstract: Proteins are key macromolecules of life, and thus understanding their function is important. However, doing so experimentally is resource-consuming. Hence, computational prediction of protein function can help. Since proteins with similar structures often have similar functions, computational approaches have been proposed for capturing proteins’ structural and thus functional similarity. Traditionally, such approaches were sequence-based. Since amino acids that are distant in the sequence can be close in the 3-dimensional (3D) structure, 3D structural approaches can complement sequence approaches. Traditional 3D structural approaches compare “raw” protein structural information. In contrast, 3D structures can first be modeled as protein structure networks (PSNs). Then, “processed” PSN-based information can be used to compare proteins. We developed a novel approach, GRAFENE, to use integrative sequence and PSN-based structural information to compare proteins. In extensive evaluation on PSNs corresponding to protein domains from CATH and SCOP databases against existing state-of-the-art approaches (e.g., DaliLite, TM-align, and GR-Align), GRAFENE was both more accurate (in identifying as similar those PSNs that belong to the same CATH/SCOP class) and faster. Hence, GRAFENE is expected to impact future research on protein structural comparison and thus protein function prediction.

B-554: Learning from the ligand: improving binding affinity prediction using molecular descriptors
COSI: 3DSIG
  • Fergus Boyles, University of Oxford, United Kingdom
  • Charlotte Deane, University of Oxford, United Kingdom
  • Garrett Morris, University of Oxford, United Kingdom

Short Abstract: Scoring functions for structure-based virtual screening use a 3D structure of the protein-ligand complex to predict the ligand's binding affinity. Using computed molecular descriptors of the ligand as input features for a random forest regression, we demonstrate that a purely ligand-based approach to binding affinity prediction is effective on the PDBbind 2016 benchmark. We also show combining molecular descriptors with the features used by structure-based machine-learning scoring functions such as RF-Score and NNScore 2.0 results in greater predictive performance. Finally, we explore the effect of test set composition when assessing model performance. Previous publications have demonstrated that machine learning scoring functions such as RF-Score often have worse predictive performance when applied to unseen protein targets, and we observe similar behaviour in our models. However, we find that adding molecular descriptors to structural models leads to improved performance even in this challenging scenario.

B-555: Modeling protein structures with graph convolutional networks
COSI: 3DSIG
  • Alex Fout, Colorado State University, United States
  • Jonathon Byrd, Colorado State University, United States
  • Basir Shariat, Colorado State University, United States
  • Asa Ben-Hur, Colorado State University, United States

Short Abstract: Deep learning and convolutional neural networks are having tremendous impact in machine learning and are being applied in computational biology as well. We present our recent work on prediction of protein interfaces using graph convolutional networks on the basis of protein 3D structure (Fout et al. NIPS, 2017). In this work, a protein structure is represented as a graph where each residue is a node, and nodes are connected by edges labeled by their spatial proximity in the structure. Standard convolution operators used in computer vision are not applicable in this scenario, and are replaced by appropriate graph convolution operators. By convolving over the neighborhood of a node, we are able to stack multiple layers of convolution and learn effective latent representations that integrate information across the three-dimensional structure of a protein of interest. An architecture that combines the learned features across pairs of proteins is then used to classify pairs of residues as part of an interface or not. In our experiments, several graph convolution operators yielded accuracy that is better than the state-of-the-art SVM method in this task. Preliminary results on related problems that require representations of protein structures will be presented as well.
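
The graph representation and neighborhood convolution described in the abstract above can be sketched in a few lines. This is a simplified illustration, not one of the operators from Fout et al.: the distance cutoff, the scalar weights, the mean aggregation, and all function names are assumptions made for the example, and no learning is performed.

```python
import math

def residue_graph(coords, cutoff=8.0):
    """Neighbor lists for residues whose coordinates lie within `cutoff`.

    `coords` holds one (x, y, z) point per residue; `cutoff` (in the same
    units) is an illustrative assumption.
    """
    n = len(coords)
    neighbors = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors

def graph_conv_layer(features, neighbors, w_self=1.0, w_neigh=1.0):
    """One mean-aggregation graph convolution with ReLU activation.

    features[i] is a list of floats for residue i. Each output combines the
    residue's own features with the average of its neighbors' features.
    """
    out = []
    for i, own in enumerate(features):
        nbrs = neighbors[i]
        updated = []
        for k, value in enumerate(own):
            mean_nbr = sum(features[j][k] for j in nbrs) / len(nbrs) if nbrs else 0.0
            updated.append(max(0.0, w_self * value + w_neigh * mean_nbr))
        out.append(updated)
    return out
```

Applying `graph_conv_layer` repeatedly lets each residue's representation absorb information from progressively larger spatial neighborhoods, which is the stacking effect the abstract refers to.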

B-556: Application of Residue-Level Functional Site Predictions to Gauge the Utility of Protein Structural Models
COSI: 3DSIG
  • Joshua Toth, Geisinger Commonwealth School of Medicine, United States
  • Paul Depietro, Geisinger Commonwealth School of Medicine, United States
  • Jürgen Haas, University of Basel, Switzerland
  • William McLaughlin, Geisinger Commonwealth School of Medicine, United States

Short Abstract: The Continuous Automated Model Evaluation (CAMEO) platform assesses protein structure model quality for a host of modeling techniques. Here, we describe a method to assess the local accuracies of different modeling techniques according to their abilities to provide models that contain functional sites consistent with the experimental structures. To assess the functional site consistencies we employ the functional site prediction program “FEATURE”. Its ability to vary the probabilities of the predictions prompted us to calculate the area under the receiver operating characteristic curve (AUC) for approximately 120 types of functional sites. By rank ordering the average AUCs and documenting the significant differences between modeling servers, we provide an objective way to assess the utility of models generated by each modeling server. Also, we estimate the bias of the models by finding functional site residues present in the template structures but absent in the corresponding experimental structures. Overall, we infer that the assessment method described here, which is referred to as ResiRole, provides an objective way to assess a model’s utility for use in functional predictions. Financial support was provided in part by the NIGMS [grant number 5U01 GM093324-02].
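
For readers unfamiliar with the AUC measure used above, the following sketch computes it from prediction scores and binary labels using the rank-based (Mann-Whitney) formulation. It is a generic illustration, not part of ResiRole or FEATURE, and the function name is hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from scores and binary labels.

    Equals the probability that a randomly chosen positive is scored above a
    randomly chosen negative, counting ties as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why averaging AUCs over site types gives a comparable score per modeling server.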

B-557: Structural characterization of a Moniliophthora roreri Cyclophilin and the use of virtual screening approach for searching new inhibitor ligands
COSI: 3DSIG
  • Fernanda Rangel, Universidade Estadual de Santa Cruz, Brazil
  • Carlos Priminho Pirovani, Universidade Estadual de Santa Cruz, Brazil
  • Bruno Andrade, Universidade Estadual do Sudoeste da Bahia, Brazil

Short Abstract: Moniliophthora roreri is the causal agent of Moniliasis, a disease that affects the fruit of the cocoa tree (Theobroma cacao). Cyclophilins are a highly conserved family of proteins that belong to the group of immunophilins and have two major characteristics: they show peptidyl-prolyl cis-trans isomerase activity, which plays a crucial role in protein folding, and they are molecular targets of the immunosuppressive drug Ciclosporin. The aim of this work was to perform the structural characterization of M. roreri Cyclophilin and to describe the mechanism of action of Ciclosporin in its active site using a molecular docking approach. For 3D model construction we used the Swiss-Model Workspace, followed by energy minimization with AMBER 14 (5000 cycles of steepest descent and 5000 cycles of conjugate gradient) to adjust the protein structure. The protein structure was validated using the QMEAN, ANOLEA and Procheck programs. Docking results were obtained with Autodock Vina, and a 2D ligand interaction map was constructed using Accelrys Discovery Studio 2.5. Furthermore, we used ligand-based and target-based virtual screening, accessing the ZINC (http://zinck.docking.org) and ChEMBL (https://www.ebi.ac.uk/chembl/) databases to search for possible new ligands that can form complexes with this protein and become new options for Cyclophilin inhibitors.

B-558: 3D Molecular Visualization using Virtual Reality Technology
COSI: 3DSIG
  • Hiromu Sato, Tohoku University, Japan
  • Hafumi Nishi, Tohoku University, Japan
  • Kengo Kinoshita, Tohoku University, Japan

Short Abstract: Three-dimensional structures of biomolecules are essential for their biochemical and cellular roles in a living organism. Manual observation of 3D structures by molecular visualization software gives us a deeper insight into local and global configurations as well as the biological functions of molecules. However, manipulating molecular models with a computer mouse and keyboard is not necessarily easy nor intuitive for all users. We have developed eF-Site VR, a new software program for the visualization of molecular surface data using virtual reality technology. Surface data is directly loaded from eF-Site, a database of electrostatic surface of protein functional sites (http://service.pdbj.org/eFsite/). Using Oculus Touch controllers, users can literally grab and rotate molecules with their hands as they can with molecular models in the real world. In addition to hydrophobicity and electrostatic potentials, local flexibility (B-factor) is also represented on models as surface fluctuation, which is implemented by soft bodies. The software is freely available from the Tools page of eF-site.

B-560: A novel methodology on distributed representations of proteins using their interacting ligands
COSI: 3DSIG
  • Hakime Öztürk, Boğaziçi University, Turkey
  • Elif Ozkirimli, Bogazici University, Turkey
  • Arzucan Ozgur, Bogazici University, Turkey

Short Abstract: Motivation: The effective representation of proteins is a crucial task that directly affects the performance of many bioinformatics problems. Related proteins usually bind to similar ligands. Chemical characteristics of ligands are known to capture the functional and mechanistic properties of proteins suggesting that a ligand based approach can be utilized in protein representation.

B-657: Mol*: Creating a common library for web molecular graphics and analysis tools
COSI: 3DSIG
  • Alexander Rose, RCSB PDB, UCSD, SDSC, United States
  • David Sehnal, EMBL-EBI, CEITEC, Masaryk University Brno, Czech Republic
  • Jaroslav Koča, CEITEC, Masaryk University Brno, Czech Republic
  • Sameer Velankar, EMBL-EBI, United Kingdom
  • Stephen Burley, RCSB PDB, UCSD, Rutgers University, United States

Short Abstract: Integrative/hybrid experimental methods for determining three-dimensional structures of biomolecules provide the means for studying large molecular complexes. These structures typically consist of multiple components depicted using models of varying resolution and length scale (e.g., all atom representations, gaussian shapes). Web-based visualization and analyses of macromolecular structures and associated data represent a critical step in enabling access to and gaining knowledge from these data. Embracing advances in browser technology provides the means for creating scalable molecular graphics and analysis tools with near-instant data access. We present herein Mol* (/’mol-star/, github.com/mol-star), an open-source software development project that will provide a common library for macromolecular visualization and analysis to facilitate building tools and services for the scientific community. Examples include showing experimental/validation related data; displaying annotations like SCOP, PFAM, UniProt; or visualizing results from structural bioinformatics or computer aided drug discovery approaches. Mol* aims for interoperability with existing and future solutions by supporting standard file formats and defining open domain specific languages for representing and manipulating macromolecular structure data (e.g., molql.org/). Examples from PDB (pdb.org) and PDB-Dev (pdb-dev.org) will be presented. Support: RCSB PDB (NSF, NIH, DOE) and PDBe (EMBL-EBI, Wellcome Trust, BBSRC, EU, MRC).

B-660: Molecular Modeling and Experimental Studies of A Founder Mutation Causing Homocystinuria in Qatar - Structural Basis for Development of Novel Therapies
COSI: 3DSIG
  • Navaneethakrishnan Krishnamoorthy, Sidra Medical and Research Center, Qatar
  • Hesham Ismail, Qatar University, Qatar
  • Gheyath Nasrallah, Qatar University, Qatar

Short Abstract: Homocystinuria is a metabolic disorder that leads to multiple disorders of the central nervous system and cardiovascular system. R336C in the protein cystathionine β-synthase (CBS) is one of the mutations that cause homocystinuria. In particular, it is a founder mutation that has a severe effect in the Qatari population, where homocystinuria is the most prevalent inherited monogenic disease. The molecular mechanism of the disease is unclear. Here, we used molecular modeling and experimental studies to characterize the effect of R336C and to understand the structure-function relationships. Molecular modeling of wildtype and R336C showed that the mutated residue is adjacent to the catalytic core. Molecular dynamics simulation suggests that the mutation induces conformational changes, reduces structural stability and impacts the protein surface, and thus could influence the substrate binding sites and activity of CBS. Additionally, we identified potential channels available for substrate entry to and/or exit from the catalytic core. Furthermore, we used experimental models of yeast and cell culture (HEK293T and HepG2) to assess the effect of the mutation on the protein’s expression, stability and activity. Altogether, the results show a deleterious effect of the mutant and its impact on the structure-function relationships. This study can provide a basis for the development of novel therapies to treat homocystinuria.