Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide


Accepted Posters

If you need assistance please contact submissions@iscb.org and provide your poster title or submission ID.


Track: 3Dsig

Session A-001: MOLE 2.5 - Tool for Detection of Ligand Pathways within Biomacromolecules
COSI: 3Dsig
  • Karel Berka, Palacky University Olomouc, Czech Republic
  • David Sehnal, CEITEC - Central European Institute of Technology, Masaryk University Brno, Czech Republic
  • Václav Bazgier, Palacky University Olomouc, Czech Republic
  • Lukáš Pravda, CEITEC - Central European Institute of Technology, Czech Republic
  • Radka Svobodová Vařeková, Central European Institute of Technology, Masaryk University, Czech Republic
  • Michal Otyepka, Department of Physical Chemistry and Center for Biomolecules and Complex Molecular Systems, Faculty of Science, Palacky University at Olomouc, Czech Republic
  • Jaroslav Koča, CEITEC - Central European Institute of Technology, Masaryk University Brno, Czech Republic

Short Abstract: MOLE 2.5 (http://beta.mole.upol.cz) offers the quickest automatic and user-assisted detection of ligand pathways (channels, tunnels and pores) inside biomacromolecular structures, including analysis of the physicochemical properties of the amino acids along each pathway. Additionally, interactive visualization of the pathways is provided online by LiteMol (http://litemol.org).

Session A-002: Sensitive and efficient topology-independent structural alignment
COSI: 3Dsig
  • Antonín Pavelka, University of California, San Diego, United States
  • Andreas Prlić, University of California, San Diego, United States
  • Peter Rose, University of California, San Diego, United States

Short Abstract: A new topology-independent algorithm for protein structural alignment is presented. The accuracy of the algorithm is compared with the topology-independent CLICK and the sequence-based rigid jFatCat on randomly paired CATH nonredundant S20 domains. The new algorithm always reveals a larger or equal number of alignments with TM-score above a fixed but arbitrary threshold, and 2.5 times more alignments than CLICK at the threshold of 0.5.
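
As a reminder of what the TM-score threshold above measures, the following minimal Python sketch evaluates the standard TM-score formula for one fixed superposition. It is not the authors' alignment algorithm: the function names are hypothetical, the input is assumed to be a list of distances (in Å) between already aligned and superposed residue pairs, and the search over superpositions that a real TM-score calculation performs is omitted.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    if target_length <= 15:
        # Assumption of this sketch: very short chains are not handled.
        raise ValueError("this sketch assumes target_length > 15")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score for ONE given superposition.

    distances: distances between aligned residue pairs after superposition.
    target_length: number of residues in the reference (target) structure.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # Made-up example: 100-residue target, 80 aligned pairs at 2 angstroms.
    print(tm_score([2.0] * 80, 100))
```

Because each aligned pair contributes at most 1 and the sum is divided by the target length, the score lies between 0 and 1, and unaligned residues lower it.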

Session A-003: Association between protein disorder and immunogenicity in trypanosomatids
COSI: 3Dsig
  • Jeronimo Conceição Ruiz, Fundação Oswaldo Cruz, Brazil
  • João Paulo Linhares Velloso, Fundação Oswaldo Cruz, Brazil
  • Paul Anderson Souza Guimarães, Fundação Oswaldo Cruz, Brazil
  • Henrique Toledo, Fundação Oswaldo Cruz, Brazil
  • Daniela Resende, Fundação Oswaldo Cruz, Brazil

Short Abstract: Despite considerable scientific effort, developing vaccines for neglected diseases remains a great challenge. This situation has led to a large expansion of scientific data on neglected diseases in recent years. In the search for new antigens, integrating biological data from multiple analytical approaches could be a valuable strategy for finding novel vaccine candidates. In this context, the presence of disordered protein regions near immunogenic epitopes could be valuable for optimizing the translation of biological information and data into diagnostics and therapeutics. The main question addressed by this work is: Is there a statistical correlation between the proximity of immunogenic regions and disordered regions that could be used to better predict new targets for vaccine development? To test this hypothesis, we first downloaded all experimentally validated epitope information for Trypanosoma cruzi (ID:353153), Leishmania spp. (ID:5658) and Trypanosoma brucei (ID:5691) from IEDB, including T cell, B cell and MHC ligand assays. Disordered regions were predicted using the approach described by Ruy et al., 2014, and the biological data integration was performed using a MySQL database. According to odds ratio analysis, results obtained for Leishmania spp. suggest that B cell epitopes might be 6.028 times more likely to occur in predicted disordered regions than non-immunogenic epitopes. The Leishmania T cell epitope and MHC ligand data and the data for the other two trypanosomatids are still being analysed.
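
For readers unfamiliar with odds ratio analysis, the sketch below shows how such a ratio could be computed from a 2x2 contingency table (epitope versus non-epitope, disordered versus ordered). The counts are invented for illustration and are not the study's data; the function name and the Woolf (log-based) confidence interval are assumptions of this sketch, not necessarily the authors' procedure.

```python
from math import exp, log, sqrt


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% confidence interval for a 2x2 table.

    a: epitope regions in disordered regions
    b: epitope regions in ordered regions
    c: non-epitope regions in disordered regions
    d: non-epitope regions in ordered regions
    All four counts are assumed to be non-zero.
    """
    ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio), Woolf's method.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(ratio) - z * se)
    upper = exp(log(ratio) + z * se)
    return ratio, (lower, upper)


if __name__ == "__main__":
    # Hypothetical counts, chosen only so that the arithmetic is easy to follow.
    ratio, (lower, upper) = odds_ratio(30, 10, 20, 40)
    print(ratio, lower, upper)  # the odds ratio is (30*40)/(10*20) = 6.0
```

An odds ratio above 1 whose confidence interval excludes 1 would indicate enrichment of epitopes in disordered regions.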

Session A-004: Conservation of coevolving protein interfaces bridges prokaryote–eukaryote homologies in the twilight zone
COSI: 3Dsig
  • Juan Rodriguez-Rivas

Short Abstract: Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.

Session A-005: Cytochrome P450 structure anatomy – recognition and analysis of secondary structure elements
COSI: 3Dsig
  • Adam Midlik, National Centre for Biomolecular Research and Central European Institute of Technology, Faculty of Science, Masaryk University, Kamenice 5, Brno, Czech Republic
  • Radka Svobodová Vařeková, National Centre for Biomolecular Research and Central European Institute of Technology, Faculty of Science, Masaryk University, Kamenice 5, Brno, Czech Republic
  • Veronika Navrátilová, Department of Physical Chemistry, Regional Centre of Advanced Technologies and Materials, Faculty of Science, Palacký University, 17. listopadu 1192/12, Olomouc, Czech Republic
  • Karel Berka, Department of Physical Chemistry, Regional Centre of Advanced Technologies and Materials, Faculty of Science, Palacký University, 17. listopadu 1192/12, Olomouc, Czech Republic
  • Jaroslav Koča, National Centre for Biomolecular Research and Central European Institute of Technology, Faculty of Science, Masaryk University, Kamenice 5, Brno, Czech Republic

Short Abstract: Structural bioinformatics currently faces enormous growth in the number of available protein structures. Moreover, structurally similar proteins form protein families, and these families represent rich datasets. In order to understand their molecular function we must focus on the key structural regions (binding sites, channels, allosteric sites, etc.), which involve both conserved and variable regions. The number and position of secondary structure elements (helices and sheets) are usually highly conserved within each protein family, so they can serve as fixed reference points for identifying these key regions. Cytochromes P450 are proteins responsible for the degradation of drugs and other xenobiotic substances in the organism, and thus understanding their function is crucial in drug design. They are an example of a protein family where the secondary structure elements are traditionally annotated with fixed names, facilitating description and comparison of their structures in the literature. In the last few years, the number of resolved cytochrome P450 structures has grown markedly, and more than 700 structures originating from more than 50 different organisms are currently available. This conserved annotation motivated us to analyze secondary structure elements across the whole cytochrome P450 family. First, we developed an algorithm for automated annotation of secondary structure elements based on a template protein annotation. Second, we described the general anatomy and variability of cytochromes P450 based on this annotation. Specifically, we report features of their secondary structure elements such as frequency of occurrence, typical length, amino acid composition, and relation to the source organism.

Session A-006: CATH-based protein structure and function analyses to understand the implication of alternative splicing
COSI: 3Dsig
  • Su Datt Lam
  • Christine Orengo

Short Abstract: Alternative splicing (AS) has been suggested as one of the major processes to expand the diversity of proteomes in multicellular organisms. We used domain structure information from CATH to examine the effects of splicing for a set of developmental splice isoforms in human and fly generated by mutually exclusive exon events.

Session A-007: “How well do you fit your partner? - Continually Assessing Protein-Protein Interface Quality in Structural Models with CAMEO”
COSI: 3Dsig
  • Juergen Haas, SIB Swiss Institute of Bioinformatics & Biozentrum, University of Basel, Klingelbergstr 50-70, 4056 Basel, Switzerland
  • Dario Behringer, SIB Swiss Institute of Bioinformatics & Biozentrum, University of Basel, Klingelbergstr 50-70, 4056 Basel, Switzerland
  • Rafal Gumienny, SIB Swiss Institute of Bioinformatics & Biozentrum, University of Basel, Klingelbergstr 50-70, 4056 Basel, Switzerland
  • Alessandro Barbato, SIB Swiss Institute of Bioinformatics & Biozentrum, University of Basel, Klingelbergstr 50-70, 4056 Basel, Switzerland
  • Steven Roth, SIB Swiss Institute of Bioinformatics & Biozentrum, University of Basel, Klingelbergstr 50-70, 4056 Basel, Switzerland
  • Torsten Schwede, SIB Swiss Institute of Bioinformatics & Biozentrum, University of Basel, Klingelbergstr 50-70, 4056 Basel, Switzerland

Short Abstract: Continuously monitoring tools for structure, structure quality and residue-residue contact prediction allows users to retrospectively select the best tool for a given scientific question. The Continuous Automated Model EvaluatiOn (CAMEO) platform has been running for over five years and has added innovative measures developed by the community and the CAMEO team. New categories requested by the community have been included over the years, the latest being "residue-residue contact prediction". Here, we present the latest progress on structural similarity of protein-protein interfaces and superposition-free model confidence assessment. Several methods assessing structural similarity of protein-protein interfaces have been developed in recent years (MMAlign by S. Mukherjee, QS-score by M. Bertoni), which led us to start adding support for interface analyses in homomers. We focussed on adding distance metrics developed in the context of protein-protein docking that are not restricted to binary interactions, as decomposing the comparison of assemblies into binary interactions can result in a factorial number of comparisons, and missing interfaces (e.g. when comparing a dimer to a tetramer) remain unaccounted for. Apart from new scores and categories, we also added a common subset selection to compare a range of servers on a common target set, modernized the web interface and introduced speed improvements.

Session A-008: Network approach integrates 3D structural and sequence data to improve protein structural comparison
COSI: 3Dsig
  • Khalique Newaz, University of Notre Dame, United States
  • Fazle Faisal, University of Notre Dame, United States
  • Julie Chaney, University of Notre Dame, United States
  • Jun Li, University of Notre Dame
  • Scott Emrich, University of Notre Dame
  • Patricia Clark, University of Notre Dame
  • Tijana Milenkovic, University of Notre Dame

Short Abstract: Proteins are macromolecules that keep us alive. Understanding protein function experimentally is expensive and time-consuming. Consequently, computational prediction of protein function has received attention. In this context, protein structural comparison (PC) aims to quantify similarity between proteins with respect to their structural patterns, in order to predict functions of unannotated proteins based on functions of annotated proteins that they are structurally similar to. Initial PC approaches were based on sequence patterns. Since amino acids that are distant in the sequence can be close in the 3-dimensional (3D) structure, 3D contact approaches can complement sequence approaches. Traditional 3D contact approaches study 3D structures directly. Instead, 3D structures can be modeled as protein structure networks (PSNs). Then, network approaches can compare proteins by comparing their PSNs. Network approaches may improve upon traditional 3D contact approaches. We cannot use existing PSN approaches to test this, because: 1) They rely on naive measures (patterns) of network topology. They cannot integrate 2) multiple PSN measures or 3) PSN data with sequence data, although this could help because the different data types capture complementary biological knowledge. We address this by: 1) exploiting well-established graphlet (aka “network motif”) measures via a new network approach, which allows for 2) integrating multiple PSN measures, and 3) combining the complementary PSN data and sequence data. We compare both synthetic networks and real-world PSNs more accurately and faster than existing network, 3D contact, or sequence approaches.
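
To make the idea of a protein structure network and graphlet counting concrete, here is a small Python sketch. It is illustrative only and not the authors' method: it assumes one 3D point per residue (for example a C-alpha atom), links residues closer than a user-chosen distance cutoff, and counts only the two connected 3-node graphlets (open path and triangle), whereas graphlet approaches typically use larger graphlets as well. All function names are hypothetical.

```python
from itertools import combinations
from math import dist


def contact_network(coords, cutoff):
    """Build an undirected residue contact network as an adjacency dict.

    coords: list of (x, y, z) tuples, one per residue.
    cutoff: distance threshold below which two residues are linked
            (the value is an assumption the caller must choose).
    """
    adjacency = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) <= cutoff:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def count_3node_graphlets(adjacency):
    """Count open 3-node paths and triangles in the network."""
    paths = 0
    triangle_hits = 0
    for neighbours in adjacency.values():
        # Every pair of neighbours of a node forms a 3-node subgraph
        # with that node in the middle.
        for a, b in combinations(sorted(neighbours), 2):
            if b in adjacency[a]:
                triangle_hits += 1  # each triangle is seen from all 3 nodes
            else:
                paths += 1  # an open path has exactly one middle node
    return {"path": paths, "triangle": triangle_hits // 3}


if __name__ == "__main__":
    # Toy coordinates: four points on a line, one unit apart.
    toy = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
    print(count_3node_graphlets(contact_network(toy, cutoff=1.5)))
```

Two proteins could then be compared through their graphlet count vectors; how such vectors are normalised and combined with sequence data is beyond this sketch.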

Session A-009: Exploiting simulated data for protein stability predictions with Gaussian processes
COSI: 3Dsig
  • Emmi Jokinen, Aalto University, Finland
  • Markus Heinonen, Aalto University, Finland
  • Harri Lähdesmäki, Aalto University, Finland

Short Abstract: Protein stability is a key property that is important in many fields. In many applications, it is desirable to improve protein stability by introducing mutations, or amino acid substitutions, that alter the protein structure and thus its properties. Conversely, mutations that are intended to alter some other property of the protein can also affect its stability. Therefore, accurate prediction of changes in stability upon mutation facilitates efficient protein design. The stability of protein variants can be measured experimentally, but only limited data are publicly available. Stabilities can also be simulated with software such as Rosetta, which does not require any experimental stability measurements, as its predictions are computed directly from the proteins’ 3D structures. However, simulated data may contain systematic biases and are less accurate than experimental data. To exploit both experimental and simulated data, we have developed a novel method that integrates multiple data sources within the Gaussian process framework, which is a powerful tool for learning nonlinear functions and their uncertainties. We model protein variants as amino acid graphs and assess the mutation effects with graph kernels. The model parameters are optimised with robust Bayesian optimisation. The noisy simulated data are calibrated against the experimental data to remove their biases. By integrating a large amount of in silico simulated and calibrated data, we need a significantly smaller amount of experimental stability measurements to achieve more accurate protein stability predictions.

Session A-010: The Impact of Conformational Entropy on the Accuracy of the Molecular Docking Software FlexAID in Binding Mode Prediction
COSI: 3Dsig
  • Louis-Philippe Morency

Short Abstract: Here we show the newest implementation of Flexible Artificial Intelligence Docking (FlexAID), which allows its scoring function to consider the conformational entropy of ligand-biomolecule complexes. The higher accuracy of FlexAID on complex cases, the addition of novel features (i.e. conformational entropy), its accessibility and its easy-to-use graphical user interface place FlexAID in an interesting position to tackle biologically and pharmacologically relevant situations currently ignored by other methods. FlexAID is available as a pre-compiled command-line executable (available at http://bcb.med.usherbrooke.ca/flexaid for Windows, macOS & Linux) or through the NRGsuite, a PyMOL-integrated user interface that lets the user run FlexAID in an intuitive manner with real-time visualization. Both the NRGsuite and FlexAID are distributed as open-source software.

Session A-011: EncoMPASS: an Encyclopedia of Membrane Proteins Analyzed by Structure and Symmetry
COSI: 3Dsig
  • Edoardo Sarti, NINDS, NIH, United States
  • Antoniya Aleksandrova, NINDS, NIH, United States
  • Lucy Forrest, NINDS, NIH, United States

Short Abstract: Protein structure determination, active site detection, and protein sequence alignment techniques all exploit information about proteins of known structure and their structural relations. For membrane proteins, however, such a resource is not yet available, as existing databases do not offer tools for highlighting structural similarities. To address this issue, we created EncoMPASS (Encyclopedia of Membrane Proteins Analyzed by Structure and Symmetry), an online, completely automated database for relating integral proteins of known structure from the points of view of their sequence, structure, and symmetries. For each X-ray structure deposited in the PDB whose resolution is <3.5 Å, EncoMPASS provides the predicted orientation in the membrane, structural analyses based on an exhaustive set of sequence and structure alignments of each of its chains with other chains having the same topology, and a complete analysis of all quaternary and internal symmetries. The database is updated monthly and its underlying source code is freely available. Thanks to these characteristics, EncoMPASS can be used for organizing resources for protein structure determination, benchmarking sequence alignment tools and inferring membrane protein functionalities via comparative studies.

Session A-012: An exploration of the structural interactome of Rac1
COSI: 3Dsig
  • Marijne Schijns, I2BC, CEA, CNRS, University Paris-Saclay, CEA-Saclay, Gif-sur-Yvette; Faculty of Science, Utrecht University, Utrecht, Netherlands
  • Jessica Andreani, I2BC, CEA, CNRS, University Paris-Saclay, CEA-Saclay, Gif-sur-Yvette, France
  • Raphael Guerois, I2BC, CEA, CNRS, University Paris-Saclay, CEA-Saclay, Gif-sur-Yvette, France

Short Abstract: Rac1 is a member of the small GTPase superfamily. Members of this family are characterized by their interaction with effector protein partners when in the GTP-bound state. Rac1 has been implicated in regulation of cell motility and the actin network through a large number of effector proteins. To better understand the role of Rac1, and the effect of mutations in Rac1, models of Rac1 in contact with its partners are essential. Since many complexes containing Rac1, or one of its close homologs, have been resolved, we hypothesize that this vast amount of information could provide insights into general binding modes of Rac1. A structural approach to organization of this data is expected to benefit homology modeling for Rac1-containing complexes and the evaluation of docking results involving Rac1 when direct homologous complexes are not available.
For this research, we consider a large set of binding partners of Rac1. Using available structural and evolutionary data for both Rac1 and its partners, existing structures are retrieved to serve as a benchmark, which is then analyzed for Rac1 binding motifs. This results in a better understanding of Rac1 interactions and provides a basis for research into its structural interactome. Additionally, it may give more insight into the methodology for structure modeling and docking of proteins with extensive interaction networks.

Session A-013: Building biomolecular modelling communities through BioExcel Interest Groups
COSI: 3Dsig
  • Vera Matser, EMBL-EBI, United Kingdom
  • Rossen Apostolov, KTH Royal Institute of Technology, Sweden
  • Adam Carter, EPCC, The University of Edinburgh, United Kingdom
  • Bert de Groot, Max Planck Gesellschaft, Germany
  • Ian Harrow, Ian Harrow Consulting (IHC), United Kingdom
  • Adam Hospital, Institute for Research in Biomedicine (IRB), Spain
  • Emiliano Ippoliti, Forschungszentrum Juelich, Germany
  • Adrien Melquiond, Universiteit Utrecht, Netherlands
  • Stian Soiland-Reyes, The University of Manchester, United Kingdom
  • Mikael Trellet, Universiteit Utrecht, Netherlands

Short Abstract: BioExcel is an EU-funded Centre of Excellence for computational biomolecular research. Despite the importance and increasing adoption of biomolecular simulation techniques, the user communities in Europe are fragmented, possibly excepting communities built around specific codes (e.g. HADDOCK, GROMACS). Support and maintenance of codes is done by small to mid-sized groups, and there is a noticeable lack of overall interactions to ensure coordinated approaches, knowledge transfer and structured initiatives. The BioExcel vision is to become a focal point for advances in software development, knowledge exchange and support, as well as a networking facilitator for the wider community of computational biomolecular researchers. BioExcel addressed this by creating user-led Interest Groups (IGs). The current IGs are: Entry Level Users, Integrative Modelling, Free energy, Hybrid Methods, Workflows, Industry IG, Training IG. The aims of the IGs are (1) to share BioExcel expertise with members of the groups through workshops, forums and training, (2) to learn from IG members, either through informal feedback on training needs and software improvement or in a more formalised way, where IG members join the BioExcel Scientific Advisory Board, and (3) to enhance the interactions between IG members, including between industry and academia or training professionals. In this poster we present the different Interest Groups, the approaches that each IG has taken to facilitate community building (e.g. newsletter, webinar, online discussions) and the metrics that will be used to measure success (e.g. number of active members, workshop attendance). To increase interconnections, all IGs are involved in a BioExcel community forum event.

Session A-014: On the turning away
COSI: 3Dsig
  • Alexandre G. de Brevern

Session A-015: Structural analysis of T-cell cross-reactivity
COSI: 3Dsig
  • Kamilla Kjærgaard Jensen, Technical University of Denmark, Denmark
  • Morten Nielsen, Technical University of Denmark, Denmark
  • Paolo Marcatili, Technical University of Denmark, Denmark

Short Abstract: T-cell receptors (TCRs) found on the surface of T-cells play an important role in regulating the adaptive immune response. It is known that the molecular interaction between TCRs and the peptide-MHC (major histocompatibility complex) can usually discriminate between self and non-self. Each single TCR has the potential to recognize many different peptides, and this cross-reactivity is known to play a role in autoimmunity or protection against infections and cancer. In this project, we evaluate the effect of peptide mutations on TCR recognition, using a dataset composed of experimentally obtained TCR-pMHC binding values. The dataset was created using two different MHC molecules presenting a wild-type (WT) peptide and each single-mutant variant of this WT peptide. Both MHC molecules were modeled using a modeling approach developed in-house, and all single-mutant variants of the WT peptide were created by mutating each position in the peptide to each of the other 19 amino acids. For each mutation, we calculate the change in energy between the wild type and the mutated pMHC. The change in energy was then used to determine whether the mutated peptide had a more favorable or unfavorable energy compared to the wild type. Using our pMHC model we show that the glycine is important for TCR recognition, not because it is indirectly in contact with the TCR, but because it ensures the right conformation of the WT peptide.
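
The enumeration of single-mutant variants described above can be sketched in a few lines of Python. The peptide below is an arbitrary placeholder and not the wild-type peptide used in the study, the function name is hypothetical, and the energy calculation itself (done by the authors with their in-house pMHC models) is not reproduced.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_mutants(peptide):
    """Return every variant that differs from the peptide at exactly one position."""
    variants = []
    for position, wild_type_residue in enumerate(peptide):
        for residue in AMINO_ACIDS:
            if residue != wild_type_residue:
                variants.append(peptide[:position] + residue + peptide[position + 1:])
    return variants


if __name__ == "__main__":
    placeholder_peptide = "ACDEFGHIK"  # arbitrary 9-mer, for illustration only
    variants = single_mutants(placeholder_peptide)
    print(len(variants))  # 9 positions x 19 substitutions = 171
```

For a peptide of length n this yields 19n variants, each of which would then be scored against the wild type.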

Session A-016: Understanding the Molecular Consequences of Genomic Variation Associated with Drug Resistance in Mycobacterium tuberculosis
COSI: 3Dsig
  • Nicholas Furnham

Short Abstract: Resistance to treatments for M. tuberculosis is of serious concern due to the large number of associated deaths in the developing and developed world. Large scale genomic studies have revealed mutations that are associated with drug resistance. The outstanding question is what are the molecular consequences of these variations. Using a range of analyses including in silico predictive tools to estimate the enthalpic effects of point mutations reveals the molecular mechanisms behind mutations that would otherwise be considered inconsequential for introducing therapeutic failure. Ultimately these insights can be used to aid the development of future drugs and, through their integration into predictive tools, in pathogen surveillance.

Session A-017: Predicting Structural B-cell Epitopes exploiting Cognate Antibody Information
COSI: 3Dsig
  • Martin Closter Jespersen, Technical University of Denmark, Denmark
  • Morten Nielsen, Technical University of Denmark, Denmark
  • Paolo Marcatili, Technical University of Denmark, Denmark

Short Abstract: B-cells are considered an essential part of the adaptive immune system, as they are able to provide long-term protection against pathogens using their extremely specific receptors, called “antibodies”. Antibodies recognise their molecular target, called the “antigen”, through the interaction of their binding site (paratope) with a specific region of the antigen (epitope). Being able to computationally predict these B-cell epitopes can reduce the time and cost of vaccine design and therapeutic antibody development, and increase our understanding of the immune system. However, current B-cell epitope prediction tools have an overall low accuracy and don’t use any information on the cognate antibody. In this study, we use the antibodies to guide the B-cell epitope predictions, thereby predicting antibody-specific epitopes with increased accuracy. The method is trained on global geometric and physico-chemical features of the interacting surfaces (epitope/paratope) derived from 3D structures of antibodies and antigens. From our analyses, given an antibody, the sets of features we have derived can discriminate the correct epitope pairing from non-epitope patches on the same antigen. Preliminary studies show that the shape (described by its principal components), the number of available hydrogen bond donors and acceptors, and the hydrophobicity of the generated patches are the most important features for predicting the correct paratope-epitope pairing. We trained a simple Random Forest model using these variables to predict the paratope/epitope pairing on a dataset of 326 non-redundant antibody-antigen pairs and of negative (i.e. non-epitope) patches of similar size generated from the same antigens using a Monte Carlo approach, and we obtained an AUC of 0.661. Also, when identifying an antigen’s cognate antibody from a pool, our tool ranks the true antibody paratope among the top 20% of predictions.
We are now improving the model by adding more sophisticated machine-learning methods and features, such as a geometric description of the patches using Zernike coefficients.

Session A-018: Observation selection bias in contact prediction and its implications for structural bioinformatics
COSI: 3Dsig
  • Gabriele Orlando, VUB, Belgium
  • Daniele Raimondi, Interuniversity Institute of Bioinformatics Brussels, Belgium
  • Wim Vranken, Vrije Universiteit Brussel, Belgium

Short Abstract: Next Generation Sequencing is dramatically increasing the number of known protein sequences, with related experimentally determined protein structures lagging behind. Structural bioinformatics is attempting to close this gap by developing approaches that predict structure-level characteristics for uncharacterized protein sequences, with most of the developed methods relying heavily on evolutionary information collected from homologous sequences. Here we show that there is a substantial observational selection bias in this approach: the predictions are validated on proteins with known structures from the PDB, but for exactly those proteins significantly more homologs are available than for less studied sequences randomly extracted from UniProt. Structural bioinformatics methods that were developed this way are thus likely to have over-estimated performance; we demonstrate this for two contact prediction methods, whose performance drops by up to 60% when a more realistic amount of evolutionary information is taken into account. We provide a bias-free dataset for the validation of contact prediction methods, called NOUMENON.

Session A-019: Investigating the molecular determinants of ebolavirus pathogenicity
COSI: 3Dsig
  • Mark Wass, University of Kent, United Kingdom

Short Abstract: The West Africa Ebola virus outbreak killed thousands of people. Using sequencing data combined with detailed structural analysis and experimental data, we compare Ebolavirus genomes to identify potential molecular determinants of Ebolavirus pathogenicity. We identify specificity determining positions (SDPs) that may act as molecular determinants of pathogenicity. Of 189 SDPs, protein structural analysis revealed eight that were likely to alter protein structure or function. SDPs present in VP24 are likely to impair binding to human karyopherin alpha proteins and prevent inhibition of interferon signaling in response to infection. Secondly, structural analysis of the mutations present in Ebola during rodent adaptation experiments suggested that fewer than five mutations are required to introduce pathogenicity in a new host species. Mutations in VP24 are critical to adaptation. As only a few mutations are needed for adaptation and only a few SDPs distinguish Reston virus VP24 from other Ebolaviruses, it is possible that human pathogenic Reston viruses may emerge.

Session A-020: Unveiling the inhibition mechanism of HIF-2α:ARNT dimerization by protein dynamics
COSI: 3Dsig
  • Stefano Motta, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
  • Claudia Minici, Dept. of Immunology, Transplantation, and Infectious Diseases, DIBIT Fondazione San Raffaele, Milan, Italy
  • Laura Bonati, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
  • Alessandro Pandini, Department of Computer Science, Brunel University, London, United Kingdom

Short Abstract: The hypoxia-inducible factor (HIF) mediates the cellular response to low oxygen stress through dimerization with the Aryl hydrocarbon Receptor Nuclear Translocator (ARNT) and DNA binding. Since HIF is critically important for the sustained growth and metastasis of solid tumours in humans, its inhibition by means of small molecules has been investigated. Here we shed light on the inhibition mechanism at a molecular level by comparing the dynamics and energetics of the HIF-2α:ARNT dimer in the unbound and inhibitor-bound forms.
Using a combination of surface conservation analysis, Molecular Dynamics (MD) simulations, binding free energy and dynamic network analysis, we compared the two HIF-2α:ARNT forms. The dimerization interfaces and two PAS-A loops were detected as functionally important regions. Moreover, MD analysis revealed a perturbed PAS-B:PAS-B interface and a rigidification of one HIF-2α PAS-B strand in the bound simulation. This local perturbation induced by the inhibitor seems to affect long-range interactions by decoupling the dynamics of ARNT and HIF-2α PAS-A domains and reducing the dimer stability. The residue characterization of this allosteric inhibitory mechanism will facilitate future dynamics-based drug design.

Session A-021: Zipping and assembly with limited sets of constraints
COSI: 3Dsig
  • Maryana Wånggren, Chalmers University of Technology, Sweden
  • Martin Billeter, University of Gothenburg, Sweden
  • Graham Kemp, Chalmers University of Technology, Sweden

Short Abstract: We aim to improve the speed and accuracy with which three-dimensional models of protein structures can be generated from easily and rapidly obtainable nuclear magnetic resonance (NMR) data. We use a zipping and assembly dynamic programming approach to search for protein conformations that are consistent with given distance and angle constraints. Our approach benefits from having both high level and low level descriptions of conformational features and constraints, and the possibility to infer new constraints from those that are given.

Introduction

NMR experiments provide a variety of restraints that can be used when constructing a model structure. Some experiments are relatively straightforward and are routinely performed when studying a new protein. Residual dipolar couplings (RDCs) are the easiest data to obtain for large proteins and give restraints related to torsion angles and also to the orientation (in a fixed coordinate system) of selected bonds, often the N-H bonds of the backbone. Other experiments require alternative isotope labelling or multidimensional NMR, which make the experiments very time-consuming and more expensive. One is often faced with limited and sometimes insufficient information for determining a well-resolved 3D structure. In addition, the type of data available for different proteins may vary: ranges for torsion angles, distance approximations, relative orientation of different molecular parts etc. We want to build accurate model structures quickly (a few minutes), using only data that are easy to obtain.

Methods

A protein modelling program has been implemented [1] that uses the zipping and assembly method in which longer fragments are constructed from pairs of shorter ones [2]. The built fragments must be free from steric clashes, and be compatible with given constraints (see below). Distance constraints can be propagated: new constraints lower in the zipping and assembly data structure can guide the conformational search towards feasible solutions.

Results

We are currently testing our method with a range of proteins by generating ensembles of structures based on only secondary structure information and disulphide bridges. The resulting models are compared with experimentally determined structures from the Protein Data Bank. As an illustration, Figure 1(A) shows distance constraints for human beta-defensin 6 [3] mapped onto the zipping and assembly data structure. An ensemble of models that are compatible with these constraints is shown in Figure 1(B).

FIGURE 1. (A) Distance constraints for human beta-defensin 6 mapped onto cells in the zipping and assembly data structure. Cells labelled “S” (gold) indicate three pairs of residues that form disulphide bonds: (6,33), (13,27), (17,34). Cells labelled “A” (blue) indicate antiparallel bridges identified by HN-HN NOEs: (12,34), (14,32), (22,35), (25,33). Another distance constraint between positions 6 and 17 (green) can be inferred from these. Additional distance constraints between other pairs of residues (unlabelled cyan and yellow cells) can be inferred from the eight constraints listed above. An alpha-helix, derived from chemical shift data, gives distance constraints between residues i and i+4 (magenta). (B) Alpha-carbon traces of 50 models of human beta-defensin 6. The structures of the core are in good agreement and the unconstrained terminal regions are dynamic.

Acknowledgements

This work is supported by a Project Research Grant from VR (621-2011-6171).

References

1. Wånggren, M., Billeter, M. and Kemp, G.J.L. Proc 12th Intl Workshop on Constraint-Based Methods for Bioinformatics, pp 99-113 (2016).
2. Hockenmaier J., Joshi, A.K., Dill, K.A. Proteins: Structure, Function, and Bioinformatics 66, 1-15 (2007).
3. De Paula, V. S., et al. J Mol Biol 425, 4479-4495 (2013).
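
As an illustration of how a new distance constraint might be inferred from given ones (for example, the constraint between positions 6 and 17 mentioned in the Figure 1 caption), the following minimal Python sketch tightens upper distance bounds with the triangle inequality. The abstract does not state which inference rule the authors use, so the triangle-inequality rule, the function name tighten_upper_bounds and all distance values are assumptions chosen for illustration only.

```python
import math
from itertools import product


def tighten_upper_bounds(upper):
    """Tighten upper distance bounds with the triangle inequality.

    upper maps (residue_i, residue_j) pairs to an upper bound in angstroms.
    Returns a dict with a bound for every ordered pair of residues seen.
    """
    nodes = sorted({r for pair in upper for r in pair})
    d = {(i, j): (0.0 if i == j else math.inf) for i in nodes for j in nodes}
    for (i, j), bound in upper.items():
        d[i, j] = d[j, i] = min(d[i, j], bound)
    # Floyd-Warshall order: the intermediate residue k is the outermost loop.
    for k, i, j in product(nodes, repeat=3):
        if d[i, k] + d[k, j] < d[i, j]:
            d[i, j] = d[i, k] + d[k, j]
    return d


# Hypothetical alpha-carbon upper bounds (angstroms), not values from the poster.
given = {
    (6, 33): 6.5,   # assumed bound for a disulphide-bonded pair
    (17, 34): 6.5,  # assumed bound for a disulphide-bonded pair
    (33, 34): 3.8,  # assumed bound for sequence neighbours
}
bounds = tighten_upper_bounds(given)
print(f"inferred upper bound for (6, 17): {bounds[6, 17]:.1f} A")
```

Under these assumed values, the two disulphide constraints and the sequence adjacency of residues 33 and 34 together imply an upper bound on the distance between residues 6 and 17, even though no constraint for that pair was given directly.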

Session A-022: Binding of cationic porphyrins to hemoglobin and cytochrome C by the method of molecular docking
COSI: 3Dsig
  • Aram Gyulkhandanyan

Session A-023: From Mutations to Mechanisms and Dysfunction via Computation and Mining of Protein Energy Landscapes
COSI: 3Dsig
  • Amarda Shehu, George Mason University, United States
  • Tatiana Maximova, George Mason University, United States
  • Wanli Qiao, George Mason University, United States
  • Erion Plaku, The Catholic University of America, United States
  • Carla Mattos, Northeastern University, United States
  • Buyong Ma, NIH, United States
  • Ruth Nussinov, NIH, United States

Short Abstract: The energy landscape underscores the inherent nature of proteins as dynamic systems interconverting between structures with varying energies. Recently, we have developed a method that feasibly reconstructs landscapes. Here we demonstrate that the availability of landscapes of wildtype and diseased variants opens the way for data mining techniques to harness quantitative information embedded in landscapes to summarize mechanisms via which mutations alter dynamics and function.

Session A-024: Determining Allosteric Hot Spots in Hsp70 using Perturbation Response Scanning
COSI: 3Dsig
  • David Penkler

Session A-025: Self-consistency test reveals systematic bias in programs for prediction destabilization upon mutation
COSI: 3Dsig
  • Dmitry Ivankov, Centre for Genomic Regulation (CRG), Spain

Short Abstract: Computational prediction of the effect of amino acid substitutions on protein stability is utilized by researchers in many fields. Specifically, researchers may be interested in exploring the effect of combinations of different mutations, such as when considering the maintenance of protein stability in the course of accumulation of substitutions in evolution. Such programs, while relatively inaccurate, are not known to provide systematically biased results. We explored the suitability of using two of the most popular algorithms, FoldX and I-Mutant, for predicting the effect of substitutions on structure. We devised a self-consistency test that queries the reciprocity of the prediction of amino acid substitutions. Unbiased algorithms should, on average, predict that the effect of an amino acid substitution is equal but opposite to that of the reverse substitution. To test this, we applied FoldX and I-Mutant to crystal structures differing from each other by only a few substitutions. We found that both of the tested algorithms have an inherent bias, whereby in many instances the effects of the forward and the reverse substitution were predicted to be substantially different in magnitude. The systematic bias for single mutants was ~0.6 kcal/mol for both programs. FoldX probably displays this bias because it does not change the backbone of the structure, while I-Mutant is influenced by the content of the training set, which likely includes many more damaging mutations than benign ones. Authors seeking to use the algorithms for prediction of the effect of amino acid substitutions should be aware of the inherent bias described here.
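
The reciprocity idea behind the self-consistency test can be sketched in a few lines of Python. For an unbiased predictor, the predicted stability change of a forward substitution and of its reverse should sum to roughly zero on average. The function name, the choice of the mean forward-plus-reverse sum as the bias measure, and the numbers below are assumptions for illustration; the abstract does not give the exact statistic used, and no FoldX or I-Mutant output is reproduced here.

```python
from statistics import mean


def self_consistency_bias(pairs):
    """Mean of (forward + reverse) predicted stability changes, in kcal/mol.

    pairs is a list of (ddg_forward, ddg_reverse) tuples for reciprocal
    substitutions. A value near zero is what an unbiased predictor would give.
    """
    return mean(forward + reverse for forward, reverse in pairs)


# Hypothetical predictions for three reciprocal substitution pairs.
biased_pairs = [(1.5, -0.5), (2.0, -1.0), (0.5, 0.5)]
unbiased_pairs = [(1.0, -1.0), (0.5, -0.5)]

print("biased predictor:  ", self_consistency_bias(biased_pairs))
print("unbiased predictor:", self_consistency_bias(unbiased_pairs))
```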

Session A-026: SPRINT: Ultrafast protein-protein interaction prediction of the entire human interactome
COSI: 3Dsig
  • Yiwei Li, University of Western Ontario, London, Canada
  • Lucian Ilie, University of Western Ontario, London, Canada

Short Abstract: We present SPRINT (Scoring PRotein INTeractions), a new sequence-based algorithm and tool for predicting protein-protein interactions (PPI). We comprehensively compare SPRINT with state-of-the-art programs on seven most reliable human PPI datasets and show that it is more accurate while running five orders of magnitude faster.

Session A-027: MIB: Metal Ion-Binding Site Prediction and Docking Server
COSI: 3Dsig
  • Yu-Feng Lin, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 30050 Taiwan, Taiwan
  • Chih-Wen Cheng, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 30050 Taiwan, Taiwan
  • Chung-Shiuan Shih, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 30050 Taiwan, Taiwan
  • Jenn-Kang Hwang, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 30050 Taiwan, Taiwan
  • Chin-Sheng Yu, Department of Information Engineering and Computer Science, Feng Chia University, Taichung 40724, Taiwan, Taiwan
  • Chih-Hao Lu, Graduate Institute of Basic Medical Science, China Medical University, Taichung 40402, Taiwan, Taiwan

Short Abstract: The structure of a protein determines its biological function(s) and its interactions with other factors; binding regions tend to be conserved in sequence and structure, and the interacting residues involved are usually close together in 3D space. The Protein Data Bank currently contains more than 110,000 protein structures, approximately one-third of which contain metal ions. Identifying and characterizing metal ion–binding sites is thus essential for investigating a protein’s function(s) and interactions. However, experimental approaches are time-consuming and costly. The web server reported here was built to predict metal ion–binding residues and to generate the predicted metal ion–bound 3D structure. Binding templates have been constructed from the metal ion–binding residues of twelve types of metal ions. The templates include residues within 3.5 Å of the metal ion, and the fragment transformation method was used for structural comparison between query proteins and templates without any data training. Through the adjustment of scoring functions, which are based on the similarity of structure and binding residues, binding-residue prediction is supported for twelve kinds of metal ions (Ca2+, Cu2+, Fe3+, Mg2+, Mn2+, Zn2+, Cd2+, Fe2+, Ni2+, Hg2+, Co2+ and Cu+). MIB also provides metal ion docking after prediction. The MIB server is available at http://bioinfo.cmu.edu.tw/MIB/ .
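
The template definition (residues within 3.5 Å of the metal ion) can be illustrated with a short Python sketch. This is not the MIB implementation: the function name, the input format and the coordinates are hypothetical, and only the 3.5 Å cutoff comes from the abstract.

```python
import math


def residues_within(metal_xyz, atoms, cutoff=3.5):
    """Return residue labels with at least one atom within cutoff of the metal.

    atoms is an iterable of (residue_label, (x, y, z)) tuples, coordinates in
    angstroms. The 3.5 angstrom default is the cutoff quoted in the abstract.
    """
    hits = set()
    for residue_label, xyz in atoms:
        if math.dist(metal_xyz, xyz) <= cutoff:
            hits.add(residue_label)
    return hits


# Hypothetical coordinates for one metal ion and a few nearby atoms.
metal = (0.0, 0.0, 0.0)
atoms = [
    ("HIS94", (2.0, 0.0, 0.0)),
    ("HIS94", (4.0, 0.0, 0.0)),    # second atom of the same residue, too far
    ("HIS96", (0.0, 2.1, 0.0)),
    ("GLU117", (3.0, 3.0, 0.0)),   # about 4.2 angstroms away, outside cutoff
    ("HIS119", (0.0, 0.0, -2.2)),
]
template = residues_within(metal, atoms)
print(sorted(template))
```

A residue is included as soon as any one of its atoms lies within the cutoff, which is one common convention; whether MIB measures distance per atom or per residue centre is not stated in the abstract.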

Session A-028: Molecular Dynamics Simulation of a 17β-Estradiol Specific DNA Aptamer
COSI: 3Dsig
  • Alexander Eisold

Session A-029: Expanding CATH-Gene3D with applications in the analysis of disease-associated mutations
COSI: 3Dsig
  • Natalie Dawson, University College London, United Kingdom
  • Ian Sillitoe, University College London, United Kingdom
  • Sayoni Das, UCL, United Kingdom
  • Paul Ashford, University College London, United Kingdom
  • Millie Pang, University College London, United Kingdom
  • Christine Orengo, University College London, United Kingdom

Short Abstract: CATH classifies 3D structures from the PDB into superfamilies of protein domains that are evolutionarily related. Since protein structure tends to be much more highly conserved than sequence, CATH superfamilies are often able to trace further back in evolution than sequence methods alone. Currently, CATH classifies more than 435,000 domain structures (from ~90% of PDB structures) into evolutionary superfamilies. Once these distant structure-based evolutionary relationships have been established, the Gene3D resource uses state-of-the-art sequence comparison technology to augment these superfamilies with more than 50 million protein domain sequences from ~20,000 cellular genomes. Many of these superfamilies contain protein sequences with detailed functional annotations, which enable a deep understanding of the evolutionary mechanisms by which functions evolve. A recent development is the identification of functional families within CATH superfamilies and the establishment of a new function prediction protocol, which has been highly ranked by the CAFA independent assessment. Approaches have also recently been developed to determine the impacts of disease-causing mutations on protein structure and function, with a particular focus on pathogenic mutations close to known functional sites (e.g. residues involved in catalysis, interfaces). Functional families highly enriched in disease mutations have been identified and structural data used to highlight 3D residue clusters of mutation enrichment. We present strategies for using the CATH-Gene3D functional families for exploring the impacts of disease-associated residue mutations in germline diseases and various cancers, and identifying the specific protein domains that are enriched in such mutations.

Session A-030: 40-fold increase in coverage of structure-based annotations for UniProt entries via the SIFTS resource
COSI: 3Dsig
  • Jose M. Dana, PDBe (EMBL-EBI), United Kingdom

Short Abstract: The Structure Integration with Function, Taxonomy and Sequences resource (SIFTS) was established in 2002 and continues to operate as a collaboration between the Protein Data Bank in Europe (PDBe) and UniProt. The resource is instrumental in the transfer of structure-based annotations for protein sequence data through provision of up-to-date residue-level mappings between entries from the PDB and from UniProt. SIFTS also provides residue-level annotations from other biological resources: currently IntEnz, GO, Pfam, InterPro, SCOP, CATH, PubMed and the NCBI taxonomy database.

Until 2017, SIFTS mappings between PDB and UniProt were calculated with respect to the canonical sequences of proteins. A number of improvements were recently carried out by PDBe enabling overlapping mappings and, as a result, SIFTS now supports mappings to isoform sequences and to UniRef90 clusters (all sequences in UniProt which have 90% or greater sequence identity).

Extending the mapping to UniRef90 cluster members expands the structural coverage of UniProt almost 40-fold, from ~40,000 UniProt accessions mapped directly to PDB entries to more than 1.5 million UniProt accessions with at least 90% sequence identity and 70% sequence coverage by the structures in the PDB. Specifically, our analysis shows that while the PDB contains fewer than 3000 unique human proteins, there are a further 3300 proteins from other organisms, for which there is at least one structure in the PDB and which are highly similar, at the sequence level, to human proteins otherwise not available in the PDB.
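
The coverage criterion and the quoted fold increase can be made concrete with a small Python sketch. The record type, field names and accession labels below are hypothetical and do not reflect the actual SIFTS file formats or API; only the 90% identity and 70% coverage thresholds and the approximate accession counts are taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical record linking a UniProt accession to a PDB entry."""
    uniprot_acc: str
    pdb_id: str
    identity: float  # fraction of identical residues, 0..1
    coverage: float  # fraction of the sequence covered by the structure, 0..1


def structurally_covered(mappings, min_identity=0.90, min_coverage=0.70):
    """Accessions with at least one mapping meeting both thresholds."""
    return {
        m.uniprot_acc
        for m in mappings
        if m.identity >= min_identity and m.coverage >= min_coverage
    }


# Made-up example records, for illustration only.
mappings = [
    Mapping("ACC_A", "pdb1", identity=1.00, coverage=0.95),
    Mapping("ACC_B", "pdb1", identity=0.93, coverage=0.80),
    Mapping("ACC_C", "pdb2", identity=0.95, coverage=0.40),  # too little coverage
    Mapping("ACC_D", "pdb3", identity=0.85, coverage=0.90),  # too low identity
]
covered = structurally_covered(mappings)

# Fold increase implied by the approximate counts quoted in the abstract.
fold_increase = 1_500_000 / 40_000
print(sorted(covered), fold_increase)
```

With the rounded figures from the abstract the ratio comes out at 37.5, consistent with the "almost 40-fold" statement.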

Session A-031: Modeling long variable regions in proteins by secondary structure prediction and global optimization
COSI: 3Dsig
  • Beomchang Kang, Seoul National University, South Korea
  • Gyu Rie Lee, Seoul National University, South Korea
  • Chaok Seok, Seoul National University, South Korea

Short Abstract: Structure prediction of variable loop regions among homologous proteins has to be tackled by an ab initio modeling method because the amount of evolutionary information is not enough. Variable regions are often involved in protein functions and contribute to functional specificity. Therefore, modeling variable loop regions can be important in functional and design studies. Structure prediction of long variable regions (>12 residues) is still an unsolved problem especially due to the difficulty of effective sampling in the high dimensional conformational space. The large conformational space can be reduced using secondary structure information when secondary structures within the variable regions can be predicted with high accuracy. The dimension of the conformational space can also be reduced by employing a coarse-grained representation of protein. Here, we present a new loop modeling method with enhanced sampling power which combines secondary structure prediction, fragment assembly, analytical loop closure, coarse-grained representation, quality assessment, and global optimization. Secondary structure prediction, fragment assembly, and analytical loop closure are used to generate initial seed conformations for global optimization. Global optimization is then performed in a coarse-grained representation. During global optimization, conformations in the coarse-grained representation can be refined and scored in an all-atom representation. Performance of the new method is presented in comparison with other loop sampling and loop modeling methods such as DISGRO, FALC, GalaxyLoop, and Rosetta.

Session A-032: Mining Functionally Conserved Building Blocks in Proteins
COSI: 3Dsig
  • Florian Kaiser, University of Applied Sciences Mittweida, TU Dresden, Germany
  • Sebastian Salentin, TU Dresden, Germany
  • Michael Schroeder, TU Dresden, Germany
  • Dirk Labudde, University of Applied Sciences Mittweida, Germany

Short Abstract: Proteins with shared catalysis mechanisms, ligand binding, or fold have often evolved to retain similar interaction patterns to conserve functionality. While some of these patterns are reflected by shared sequence motifs, distant proteins may use only a small set of similar structural building blocks.

By unifying geometrical and molecular interaction patterns in a mining algorithm, the presented approach has the potential to identify molecular building blocks with functional conservation and can be applied in protein function prediction and drug target screening.

Session A-033: Effective on-demand searching in structural databases
COSI: 3Dsig
  • Lukas Pravda, Central European Institute of Technology, Czech Republic
  • David Sehnal, Central European Institute of Technology, Czech Republic
  • Radka Svobodova Varekova, Central European Institute of Technology, Czech Republic
  • Jaroslav Koca, Central European Institute of Technology, Czech Republic

Short Abstract: Most in silico experiments rely on data collection. Indeed, identification of biomolecular substructures (patterns) within biomolecular databases, such as the Protein Data Bank, is a common procedure in structural bioinformatics and related fields. We seek well-defined molecular patterns such as binding or catalytic sites, interaction sites, and protein structural or sequence motifs. These are in turn used to aid structural and functional characterization and comparison of proteins, analysis of newly determined protein structures, identification of similar binding sites in off-target proteins, discovery of new inhibitors, facilitation of protein-protein interaction and more. This is usually done using a plethora of single-use in-house programs, often in combination with dedicated software tools. Development of such solutions is generally error-prone and time-consuming. Hence the question is, can we do any better? Do we really need all these single-purpose programs? Or can we extract biologically important sites in an easy, user-defined and customizable way? We have developed PatternQuery (PQ - http://ncbr.muni.cz/PatternQuery) – an online service for searching structural databases such as the Protein Data Bank. It enables description of a relationship between atoms, residues, and other structural elements using a simple, yet robust, query language. Each query specifies the composition, topology, connectivity, and 3D structure of a pattern. This makes it possible to relate primary, secondary, and tertiary structure information simultaneously. The entire PDB can be queried in less than an hour. All the results are made available for download and presented in a clear graphical form for online inspection.

Session A-034: Inference of functional states from conformational changes in protein complexes
COSI: 3Dsig
  • Markus Gruber

Short Abstract: The association of chains into complexes is a common way for proteins to augment their capabilities. In particular, several examples are known, where a difference in the association of the chains of the same protein complex reflects different functional states of the molecule. Here, we compare the atomic models of identical and homologous pairs of protein complexes to automatically identify differences in the association of their constituent chains. Results suggest that changes in the association of the chains are strongly tied to functional states of protein molecules and may prove useful to infer functional insights.

Session A-035: HotSpot Wizard 3.0: automated design of site-specific mutations and smart libraries in protein engineering
COSI: 3Dsig
  • Lenka Sumbalova, Brno University of Technology, Czech Republic
  • Jan Stourac, Loschmidt Laboratories, Czech Republic
  • Tomas Martinek, Brno University of Technology, Czech Republic
  • David Bednar, Loschmidt Laboratories, Czech Republic
  • Jiri Damborsky, Loschmidt Laboratories, Czech Republic

Short Abstract: HotSpot Wizard is an interactive web server for prediction of amino acid residues suitable for mutagenesis and construction of libraries of mutants with increased activity, changed specificity or stability. Positions suitable for mutagenesis are evaluated based on protein structure using a combination of structural, functional and evolutionary information obtained from 7 internet databases and 22 computational tools. The application was designed with an emphasis on ease of use, without requiring advanced knowledge of the studied system. For this reason, all default parameter values were set based on extensive analysis so as to suit as wide a spectrum of input data as possible. Four different strategies are automatically evaluated for every protein structure: i) identification of evolutionarily variable, and therefore safely mutable, residues located in functionally relevant catalytic pockets or tunnels; ii) detection of highly flexible regions, whose mutagenesis can enhance stability by locally limiting mobility; iii) identification of mutations based on back-to-consensus analysis, which increases the chance of finding mutations important for protein stability; and iv) identification of evolutionarily correlated residues, whose correlation indicates their significance for the protein’s function or stability. Analysis of the results is carried out directly in the web interface, which provides a user-friendly visualization tool. Moreover, HotSpot Wizard provides a module for designing a protein mutant library, with automatic detection of suitable target amino acids and corresponding degenerate codons. Several new features are planned for version 3.0, to be released in early 2018. The stability of single-point or multiple-point mutants can be predicted using the Rosetta scoring function.
Unlike in the previous version, users can now also enter a protein sequence as the input for a HotSpot Wizard calculation. A search is then performed for structures or models in the database of experimental structures (RCSB PDB) or in repositories of homology models (ModBase, SWISS-MODEL, NEGS). Users can now also run homology modelling of a structure using the programs Modeller and I-Tasser, followed by structure quality evaluation. The current version of the application is freely available for academic users at: http://loschmidt.chemi.muni.cz/hotspotwizard.
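
To show what a degenerate codon encodes in a mutant library, the following Python sketch expands a codon written with IUPAC ambiguity codes into its concrete codons and translates them with the standard genetic code. It is a generic illustration and not HotSpot Wizard's library design module; the function names are hypothetical, and the example codon NNK is a commonly used choice rather than one named in the abstract.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered by base order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon."""
    choices = (IUPAC[base] for base in degenerate_codon.upper())
    return ["".join(codon) for codon in product(*choices)]


def encoded_residues(degenerate_codon):
    """Set of one-letter amino acids ('*' marks a stop) that can be encoded."""
    return {CODON_TABLE[codon] for codon in expand(degenerate_codon)}


nnk_codons = expand("NNK")
nnk_residues = encoded_residues("NNK")
print(len(nnk_codons), "codons;", "".join(sorted(nnk_residues)))
```

Comparing the residue sets of different degenerate codons is one simple way to judge how well a codon matches a desired set of target amino acids while avoiding stop codons.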

Session A-036: Understanding Protein Interactions at a Molecular Level with Web Tools Using Inter-Residue Contacts and Intermolecular Contact Maps
COSI: 3Dsig
  • Romina Oliva

Short Abstract: Web tools for the analysis of 3D structures of protein-protein complexes and for the scoring of docking poses are presented, which are based on a novel approach, using inter-residue contacts and their visualization in intermolecular contact maps.

Session A-037: Analysis of Protein interaction surfaces using a profile method with rigid-body docking decoys
COSI: 3Dsig
  • Nobuyuki Uchikoga, Tokyo Institute of Technology, Japan
  • Yuri Matsuzaki, Tokyo Institute of Technology, Japan
  • Masahito Ohue, Tokyo Institute of Technology, Japan
  • Yutaka Akiyama, Tokyo Institute of Technology, Japan

Short Abstract: To understand protein interaction mechanisms, we investigate protein interaction interfaces using properties of protein interaction surfaces derived from sets of protein complex structures generated by rigid-body docking, referred to as docking decoys. Because each docking decoy set includes information on sets of possible interacting amino acid pairs, we used profiles of protein interaction surfaces to improve the precision of protein-protein interaction (PPI) prediction in the process of cluster analysis [Uchikoga & Hirokawa 2010 BMC Bioinfo. 11:236]. After docking, the protein interaction surfaces of the whole decoy set contain various interaction patterns of amino acid pairs, which are clues for understanding protein interaction mechanisms. Based on information about interacting amino acid residues, we proposed a profile assembled from the profiles of all docking decoy interaction surfaces, referred to as a broad interaction profile (BIP). BIPs differ among protein pairs. Each protein (receptor) has various candidate partner proteins (ligands), including true and false interaction partners. We investigated differences in BIPs between various protein pairs in a docking benchmark dataset and found that BIPs differ between true and false protein interaction partners [Uchikoga et al. 2016 Biophys. Physicobiol. 13:105-]. Additionally, in this work, we apply the BIP method to the analysis of PPIs involved in bacterial chemotaxis systems. We performed all-to-all docking analysis using the rigid-body docking software MEGADOCK. Cluster analysis was used to examine differences between true interacting protein pairs and false pairs.

Session A-038: A complete Web resource for Galactosemia-related proteins
COSI: 3Dsig
  • Anna Marabotti, University of Salerno, Italy
  • Bernardina Scafuri, CNR-Institute of Food Science, Italy
  • Antonio d'Acierno, CNR-Institute of Food Science, Italy
  • Angelo Facchiano, CNR-Institute of Food Science, Italy

Short Abstract: We have developed a free, Web-accessible database in which we have collected information about the predicted structural and functional effects of missense mutations of the three enzymes of the Leloir pathway, whose impairment is linked to the three different forms of the rare genetic disease galactosemia: classic galactosemia (OMIM #230400), associated with GALT deficiency; galactokinase deficiency (OMIM #230200), associated with GALK deficiency; galactose epimerase deficiency (OMIM #230350), associated with GALE deficiency. Each disease has different clinical manifestations with different severity, depending also on the mutation.
We have performed a thorough study in order to predict these effects starting from the crystallographic structures of these enzymes. For each enzyme, we have modelled each missense mutation associated in the literature with galactosemia and we have evaluated its impact on different structural and functional features (secondary structures, solvent accessibility, intra- and intersubunit interactions, interactions with the substrate, protein stability) using online servers or well-known analysis tools installed locally, or developing ad hoc tools. All the data obtained have been stored in a database able to manage information about the wild type and the mutant proteins, both in homozygous and in heterozygous form. The Web interface has been developed keeping in mind two opposite needs: the possibility to interact with non-experienced users, and the opportunity for expert researchers to gain full information about the predicted effects of mutations on the structural and functional features of these enzymes.
This tool is freely accessible at the Web address: http://www.protein-variants.eu/galactosemia

Session A-039: A Docking Based Approach to Analyze Interaction Surfaces of Protein-protein Interactions
COSI: 3Dsig
  • Yuri Matsuzaki, Tokyo Institute of Technology, Japan
  • Jaak Simm, KU Leuven, Belgium
  • Nobuyuki Uchikoga, Tokyo Institute of Technology, Japan

Short Abstract: Core elements of cell regulation are made up of protein-protein interaction (PPI) networks. However, many parts of the cell regulatory systems include unknown PPIs. Predicting relevant interacting partners from their tertiary structure is a challenging topic where computer science methods have potential to contribute. Protein-protein rigid docking based prediction methods have been applied for this purpose by several projects. We have developed a high throughput PPI prediction system “MEGADOCK” which performs all-to-all rigid docking. The prediction system accepts a set of protein tertiary structures as input and generates a list of possible interacting pairs from all the combinations. An advantage of docking-based methods is that they provide possible complex structures of predicted pairs, which give insight into novel interaction mechanisms. To conduct a large number of docking calculations effectively on parallel computing environments, MEGADOCK employs a hybrid parallelization (MPI/OpenMP) technique as well as parallelization on GPUs. One application is a prediction of novel interactions among non-small cell lung cancer related proteins (1921×1921 = 3,690,241 dockings). The predicted pairs were checked against 6 public PPI databases (DIP, HPRD, MIPS, MINT, BioGRID, IntAct) and we obtained 35 unknown pairs. We then examined these pairs further by using transcription data and obtained 7 confident pairs. Binding affinities were observed for 6 of the 7 pairs by a Surface Plasmon Resonance experiment. Our ongoing application targets host-pathogen PPIs, such as interactions between key enzymes of Dengue virus and human proteins.

Session A-040: A novel method for large-scale structural comparison of protein pockets using a reduced vector representation
COSI: 3Dsig
  • Tsukasa Nakamura

Session A-041: ZoomVar: automated annotation of NGS data onto 3D protein interactions
COSI: 3Dsig
  • Anna Laddach, King's College London, United Kingdom
  • Sun Sook Chung, King's College London, United Kingdom
  • Franca Fraternali, King's College London, United Kingdom

Short Abstract: A plethora of methods have been developed for the mapping of nsSNVs to protein interactions and structures. Unfortunately, to the best of our knowledge, no tools currently exist for the large-scale automated mapping of variants from NGS data to experimentally resolved binary complexes. To fill this niche we present a tool, ZoomVar, which consists of a database and query script. The tool enables users to annotate NGS data in several formats (e.g. ANNOVAR, VEP output) and retrieve residue-level information from a 3D integrated protein-protein interaction (PPI) network. The tool relies on a computational pipeline which annotates human UniProt proteins with information on structure, interactions, interfaces and functional sites, both at the residue and protein level. Information from homologues has been used to increase the structural coverage of the PPI network. Results from the pipeline have been stored in the ZoomVar database. Currently, the database contains structural data for 13600 proteins and 9791 binary interactions. We have annotated nsSNVs from variant databases (dbSNP, ClinVar, COSMIC) using ZoomVar. In line with previous results, we find PPI interfaces and protein cores to be enriched in disease-associated nsSNVs. We investigate additional features of the protein nodes of the PPI network, including the per-residue promiscuity of binding, by defining mono- and multi-partner binding residues mapped with nsSNVs. Interestingly, we find multi-partner binding sites to be significantly more enriched in disease-associated nsSNVs than mono-partner ones. The database and query script will shortly be made freely available for remote query and download.
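
The distinction between residues that bind a single partner and residues that bind several partners can be sketched as a simple grouping step in Python. This is not the ZoomVar pipeline or its database schema; the function name, the input tuples and the protein labels are hypothetical.

```python
from collections import defaultdict


def classify_binding_residues(interface_contacts):
    """Label each interface residue as 'mono' or 'multi' partner binding.

    interface_contacts is an iterable of (protein, residue_number, partner)
    tuples, one per observed interface contact. A residue seen in interfaces
    with more than one distinct partner is labelled 'multi'.
    """
    partners = defaultdict(set)
    for protein, residue_number, partner in interface_contacts:
        partners[(protein, residue_number)].add(partner)
    return {
        residue: ("multi" if len(partner_set) > 1 else "mono")
        for residue, partner_set in partners.items()
    }


# Made-up interface contacts for a protein labelled "P1".
contacts = [
    ("P1", 45, "P2"),
    ("P1", 45, "P3"),
    ("P1", 45, "P2"),  # repeated observation of the same partner
    ("P1", 88, "P2"),
]
labels = classify_binding_residues(contacts)
print(labels)
```

Counting distinct partners with a set means repeated observations of the same partner, for example from several structures of one complex, do not turn a residue into a multi-partner one.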

Session A-042: What can human variation tell us about proteins?
COSI: 3Dsig
  • Stuart A. MacGowan

Short Abstract: Human sequencing projects have generated population variant datasets from thousands of individuals [2-3]. For the exome, protein structure is an effective context in which to interpret the effects of missense variation. Studies have shown that disease variants are enriched in buried sites [4] and protein interaction surfaces [5], whilst somatic [6] and pathogenic germline variants [7] often cluster in 3D. Beyond structure, the genomic distribution of genetic variation is affected by gene essentiality [2, 8], protein domain architecture [9] and other genomic features [3].

Session A-043: The Impact of Native State Switching on Protein Sequence Evolution
COSI: 3Dsig
  • Avital Sharir-Ivry, McGill University, Canada
  • Yu Xia, McGill University, Canada

Short Abstract: For proteins with a single well-defined native state, protein 3D structure is a major determinant of sequence evolution. On the other hand, many proteins adopt multiple, distinct native structures under different conditions (“conformational switches”), yet the impact of such native state switching on protein evolution is not fully understood. Here, we performed a proteome-wide analysis of how protein structure impacts sequence evolution for protein conformational switches in Saccharomyces cerevisiae using pooled analysis of sites with similar packing or burial. We observed a strong linear relationship between residue evolutionary rate and residue packing for conformational switches. In addition, we found that conformational switches evolve significantly and consistently more slowly than proteins with a single native state, even after controlling for degree of residue burial or packing. Next, we focused on proteins that switch conformations upon molecular binding. We found that interfacial residues in these conformational switches evolve more slowly than interfacial residues in proteins with a single native state, and that the bound conformation is a better predictor of residue evolutionary rate than the unbound conformation. Our findings suggest that, unlike flexible or disordered proteins, which are generally less constrained in sequence evolution, conformational switches evolve significantly more slowly than proteins with a single native state. Moreover, for conformational switches, the necessity to encode multiple distinct native structures under different conditions imposes strong evolutionary constraints on the entire protein, rather than just a few key residues. Our results provide new insights into the structure–evolution relationship of protein conformational switches and a deeper understanding of their evolutionary design principles.

Session A-044: Do trends in biomacromolecular structure quality inspire optimism?
COSI: 3Dsig
  • Vladimír Horský, National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University, Czech Republic
  • Veronika Bendová, Department of Mathematics and Statistics, Faculty of Science, Masaryk University, Czech Republic
  • Radka Svobodová Vařeková, National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University, Czech Republic
  • Sameer Velankar, Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), United Kingdom
  • Jaroslav Koča, National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University, Czech Republic

Short Abstract: The general availability of data about biomacromolecular structures is a key success of modern life sciences. In fact, 13 Nobel prizes were awarded for research based on this data. Acquired structures are stored in databases, the most prominent one in structural biology being the Protein Data Bank (PDB). Not all that glitters is gold, however. Structure errors have caused retractions of articles from reputable journals. In response, the scientific community began developing tools for validating biomacromolecular complexes. The PDB began offering validation reports that enable its users to assess a wide range of quality criteria for each individual structure. We therefore became curious whether their emergence had a positive influence on the quality of newly submitted structures. And since ligands do not enjoy as much care as biomacromolecules, we were also curious about trends in ligand quality.
Our team has therefore carried out a wide-scale analysis of trends in the quality and size of biomacromolecules and their ligands, considering 88 factors. Structure metadata and quality data came from the PDB, while ligand quality data were sourced from our own database, ValidatorDB.
Some trends were expected to exist (e.g., newer structures have better quality), while the existence of others was a surprise (e.g., ligand quality is stagnant at best, currently utilized structure validation methods do not validate ligands well). Explored trends are available in the ValTrendsDB database (ncbr.muni.cz/ValTrendsDB).

Session A-045: Scalable Data Analytics of the PDB with the MacroMolecular Transmission Format (MMTF) and Big Data Technologies
COSI: 3Dsig
  • Peter Rose, UC San Diego, United States
  • Anthony Bradley, UC San Diego, United States
  • Antonin Pavelka, UC San Diego, United States
  • Alexander Rose, UC San Diego, United States
  • Yana Valasatava, UC San Diego, United States
  • Jose Duarte, UC San Diego, United States
  • Andreas Prlic, UC San Diego, United States

Short Abstract: Advances in experimental techniques have led to an explosion in both the number and size of macromolecular structures in the Protein Data Bank (PDB). For this reason, the transfer and parsing of macromolecular data has become increasingly time-consuming. In this work we present the Macromolecular Transmission Format (MMTF), a new compact, extensible macromolecular file format. MMTF offers over 75% compression over mmCIF, and is over an order of magnitude faster to parse than the standard mmCIF format. We describe the new MMTF format, its Application Programming Interface, and demonstrate its use with Big Data Frameworks to enable nearly interactive data analytics on the PDB archive.

Session A-046: Structure-based drug repositioning identifies novel Hsp27 inhibitors, which efficiently suppress drug resistance development in cancer cells
COSI: 3Dsig
  • Michael Schroeder

Session A-047: Bio-vectors: K-mer Embeddings of Biological Sequences for Bioinformatics Applications
COSI: 3Dsig
  • Ehsaneddin Asgari, University of California, Berkeley, United States
  • Mohammad R.K. Mofrad, University of California, Berkeley, United States

Short Abstract: Biophysical and biochemical principles govern biological sequences (e.g., DNA, RNA, and protein sequences) in much the same way that the grammar of a natural language determines the structure of clauses and sentences. This analogy motivates treating biological sequences as the output of a language and adopting or developing language processing methods to perform analyses and predictions for bioinformatics tasks. For this purpose, we propose two specific aims. (1) Developing language model-based representation learning for biological sequences, to obtain prior knowledge from existing sequence resources. Distributional representations of words have recently become popular in natural language processing (NLP) as efficient unsupervised representations that help in downstream NLP tasks. In this work, we propose distributed vector representations of biological sequence segments (k-mers), called bio-vectors, which play a key role in deep learning for bioinformatics. We propose an intrinsic evaluation of bio-vectors by measuring the continuity of the underlying biophysical and biochemical properties. In addition, for extrinsic evaluation, we have employed this representation in the classification of protein families as well as in the sequence labeling tasks of intron-exon prediction and domain identification. We show that bio-vectors outperform the baseline of one-hot vector representations and engineered features. (2) Performing a computational linguistics comparison of genomic language variations to quantify the distance between the language models of two genomic variations, with applications in comparative genomics. Training bio-vectors is analogous to neural probabilistic language modeling of sequence k-mers. Considering this fact, we propose a new quantitative measure of distance between genomic language variations based on the divergence between networks of k-mers in different genetic variations, called word embedding language divergence.
The proposed method is a step toward defining a new quantitative measure of high-level similarity between genomic variations, with applications in characterization/classification of sequences of interest.

Session A-048: Docking to homology models highlights the molecular determinants of ligand binding to the AhR
COSI: 3Dsig
  • Sara Giani Tagliabue, University of Milano-Bicocca, Italy
  • Laura Bonati, University of Milano-Bicocca, Italy

Short Abstract: To shed light on ligand-binding processes, molecular docking can be applied to either experimental structures or homology models of the receptor. In the latter case, strategies able to include protein flexibility may help to obtain reliable predictions. In this work, a computational protocol is proposed to study ligand binding to the homology model of the Aryl hydrocarbon Receptor (AhR).
AhR is a ligand-dependent transcription factor that responds to exogenous and endogenous chemicals, producing biological and toxic effects. The mechanism is initiated by molecular interactions within the ligand binding domain (LBD). Given that AhR elicits different responses for different ligands, understanding the molecular determinants of binding could help to elucidate its mechanism of action.
Because no experimental information is available for the AhR LBD structure, we developed 10 homology models of the LBD (using MODELLER) on the basis of HIF2α X-ray structures in complex with diverse inhibitors. To simulate ligand binding by including protein flexibility, we performed docking (using Glide XP) to the ensemble of 10 mAhR models and refined the binding poses by Molecular Dynamics. The per-residue contributions to the binding free energy of each ligand were calculated using the MM-GBSA approach.
The 12 AhR agonists analyzed here show diverse structures and physico-chemical characteristics. With our approach we identified the molecular determinants for binding of these different chemicals and pointed out three characteristic arrangements within the cavity, each selected by molecules with similar steric and electronic properties. Mutagenesis studies are currently underway to confirm these predictions.

Session A-049: Globular Protein Design from Ancestral Supersecondary Structural Elements
COSI: 3Dsig
  • Mohammad Elgamacy, Max Planck Institute for Developmental Biology, Germany
  • Murray Coles, Max Planck Institute for Developmental Biology, Germany
  • Andrei Lupas, Max Planck Institute for Developmental Biology, Germany

Short Abstract: Combinatorial reshuffling of subdomain-sized peptides may have provided a very economic means for sequence space navigation and thus protein fold evolution. Previously, through a bioinformatic study we identified a set of highly conserved, subdomain-sized motifs recurring across distant folds, a cue that such motifs may have predated the existing pedigree of folds. This has led to the hypothesis that these ancestral fragments may have provided the basic building blocks for modern protein folds. We also demonstrated repetition of these fragments as a mechanism for creating new folds. The aim of this work was to investigate an alternative mechanism via recombination of heterologous fragments, especially since we were unable to detect any such recombination incidents between the ancestral fragments in modern proteins. To provide an exemplar, we attempted to reconstruct a polymerase-beta N-terminal domain out of two conserved supersecondary structures derived from two unrelated folds. We have done so using a computational strategy that introduces a minimal number of mutations to the constituent fragments. The resulting NMR structure agreed with the designed coordinates with atomic accuracy, demonstrating that a recombination event and a few mutations are sufficient to evolve a new domain.

Session A-050: New Insights into statistical potentials for describing protein binding affinity and aggregation properties
COSI: 3Dsig
  • Fabrizio Pucci

Session A-051: A novel signal transducer element intrinsic to class IIIa/b adenylate cyclases and guanylate cyclases
COSI: 3Dsig
  • Jens Baßler, Max Planck Institute for Developmental Biology, Germany
  • Stephanie Beltz, University of Tübingen, Germany
  • Miriam Ziegler, University of Tübingen, Germany
  • Joachim Schultz, University of Tübingen, Germany
  • Andrei Lupas, Max Planck Institute for Developmental Biology, Germany

Short Abstract: Class III adenylate cyclases (AC) are signaling proteins that produce the second messenger cAMP. Recently, we identified a novel cyclase transducer element (CTE) on the N-terminus of the AC catalytic domain. Biochemical characterization found the element crucial for the regulation of AC activity in response to upstream receptor activation. In bacterial ACs, this element is present whenever the upstream signaling domain is also found in His kinases, but absent when not. Our data provide new insight into the evolutionary relationship between two-component regulatory systems and ACs, as well as a structural rationale for the functioning of various laboratory chimeras between the two. Notably, the CTE also exists in the adenylate and guanylate cyclases of vertebrates, which are involved in many cellular signaling processes, including the propagation of signals coming from G protein-coupled receptors (GPCRs). This provides indirect evidence for a regulatory function of the membrane anchors of vertebrate ACs and hints towards an additional regulatory level in GPCR signaling.

Session A-052: Dexterity: A framework to use a smartphone as a 3D wand
COSI: 3Dsig
  • Jenny Vuong, Data61, CSIRO, Australia
  • Benedetta Frida Baldi, Garvan Institute of Medical Research, Australia
  • Christopher J Hammant, Garvan Institute of Medical Research, Australia
  • Seán I O'Donoghue, Data61, CSIRO & Garvan Institute of Medical Research, Australia

Short Abstract: Advances in web-based 3D graphics are making 3D datasets more accessible to a rapidly growing community of both specialists and lay people. A key challenge common to almost all 3D applications is onboarding, since the standard methods used for 3D controls (e.g., rotate, translate, zoom, etc.) can be difficult for new users to learn. Additionally, these 3D controls often change with different input devices, requiring the user to learn multiple control combinations that trigger the same functionality.
Another key challenge is depth cueing. Both challenges can be partly addressed using dedicated control devices, such as a 3D wand. However, for web applications it is generally best to avoid depending on the availability of such devices, as they are difficult to acquire (quite often only purchasable at specialist shops) and/or costly.
Thus we propose Dexterity, a web-accessible prototype framework that allows a smartphone to be used as a 3D wand to control 3D objects. Using standard JavaScript libraries and web-based functionalities (e.g. Socket.io for WebSockets, deviceorientation JavaScript event, etc.), we implemented a simple web-based application that can be accessed using a mobile browser on any smartphone.
Given the wide availability of smartphones, Dexterity thus provides a generic and cost-effective solution that helps address the challenges of onboarding and depth perception.

Session A-053: Centrality analysis and DynaMine flexibility prediction with RINspector
COSI: 3Dsig
  • Guillaume Brysbaert, University of Lille, France
  • Kevin Lorgouilloux, University of Lille, France
  • Wim Vranken, VUB, Belgium
  • Marc Lensink, University of Lille, France

Short Abstract: Representing residues as nodes and interactions as edges, protein structures can be represented by so-called residue interaction networks (RINs). Centrality analyses performed on these networks evidence key residues that have been shown to carry importance for structural integrity and protein function. We have developed RINspector, an app for the Cytoscape network analysis and visualization software, which can perform several centrality analyses on RINs. RINs can be imported, or generated by the structureViz app which also allows for visualization of the protein structure in Chimera. In addition, the DynaMine server can be queried to retrieve predicted residue flexibilities of the given protein chain, but also of mutations therein, thus allowing a direct visualization of the effect of mutation on local flexibility. Results of any one representation (structure, network, graph) can be reflected in the others. RINspector is available in the Cytoscape app store.
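To make the idea of a centrality analysis on a residue interaction network concrete, here is a minimal sketch in plain Python. It computes closeness centrality by breadth-first search on a toy, unweighted RIN. The abstract does not say which centrality measures RINspector implements, so the choice of closeness, the function name, and the residue labels are illustrative assumptions, not the app's actual code.

```python
from collections import deque


def closeness_centrality(adjacency):
    """Closeness centrality for an unweighted, undirected network.

    adjacency maps each residue label to the list of residues it contacts.
    Hypothetical helper for illustration; not part of RINspector.
    """
    scores = {}
    for source in adjacency:
        # Breadth-first search gives shortest path lengths from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        # (reachable residues) / (sum of distances); 0.0 for isolated nodes.
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Toy RIN: three residues in a chain of contacts (made-up labels).
toy_rin = {
    "ALA1": ["GLY2"],
    "GLY2": ["ALA1", "SER3"],
    "SER3": ["GLY2"],
}
scores = closeness_centrality(toy_rin)
most_central = max(scores, key=scores.get)
```

In this toy network the middle residue has the shortest average path to every other residue, so it gets the highest score. In a real RIN, residues with high scores would be candidates for the structurally or functionally important positions the abstract refers to.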

Session A-054: Understanding enterovirus uncoating by Normal Mode Analysis and Perturbation Response Scanning
COSI: 3Dsig
  • Caroline Jane Ross, South Africa

Short Abstract: Enteroviruses, a genus of the Picornaviridae family, cause many human diseases. Currently there are no antivirals against enterovirus infections. Capsid expansion is critical for the release of RNA into the host cell. Although this process hints at possible drug targets, it remains poorly understood. As a model, we investigated capsid expansion of Enterovirus 71 by Normal Mode Analysis and Perturbation Response Scanning (PRS). We also conducted a bioinformatic screen of all available sequences of enterovirus capsid proteins. We identified the dominant motions and conserved hotspots that may function in capsid expansion. We propose that expansion may be altered by drugs targeted at these regions. Our approach is computationally feasible and can be applied to other virus families.

Session A-055: Predicting Protein Dynamics and Allostery Using Multi-Protein Atomic Distance Constraints
COSI: 3Dsig
  • Joe Greener, Imperial College London, United Kingdom
  • Michael Sternberg, Imperial College London, United Kingdom

Short Abstract: Over recent years there have been many methods proposed to predict and explore allostery in proteins. Despite the potential of allosteric drugs, these methods are not yet reliable and allostery remains poorly understood. We present our Exploration of Protein Structural Ensembles (ExProSE) computational method to investigate allostery in the context of protein dynamics and the conformational ensemble. This method takes two protein structures in different conformations and generates structures that span the conformational space. By adding extra constraints, the effect of binding at a potential allosteric site can be investigated. ExProSE is compared to existing approaches. The application to cyclin-dependent kinase 2 (CDK2), an important regulator of the cell cycle, is explored and a new binding site is predicted.

Session A-056: Protein Structures and their features in UniProtKB
COSI: 3Dsig
  • Nidhi Tyagi, EMBL-EBI, United Kingdom
  • UNIPROT CONSORTIUM, EMBL-EBI, SIB, PIR

Short Abstract: With increasing submission of protein 3D structures to PDB every week, there is a need for correct annotation of these biomolecules. UniProtKB provides a comprehensive and thoroughly annotated protein resource to the scientific community. The reviewed UniProtKB/SwissProt section provides expertly curated annotation for each protein entry while the unreviewed UniProtKB/TrEMBL section contains automatically generated data. UniProt ensures accurate mapping of PDB structures to UniProtKB records which facilitates linking between the resources, and importing of data such as ligand-binding sites from PDB protein structures into UniProtKB.
The criteria to map a PDB entry to a UniProtKB entry are as follows: (a) the two sequences should share high sequence identity (>90%); (b) mapping preference is given to UniProtKB entries from reference and complete proteomes; (c) mapping is done at the exact taxonomical level (strain level for lower organisms); (d) mapping is done to the longest protein sequence. A semi-automatic mapping pipeline has been developed which allows for automated mapping of many structures, with manual intervention required for more complex cases. This accurate mapping allows for the establishment of accurate cross-references between UniProtKB and PDB and for the subsequent import of small molecule data such as ligand-binding sites from PDBe.
To date, UniProt has successfully completed the non-trivial and labour-intensive exercise of cross-referencing ~350,000 polypeptide chains and 122,800 PDB entries to 40,795 UniProtKB entries. In addition, data from PDB protein structures has been automatically imported into UniProtKB/TrEMBL, enriching the annotation of unreviewed records. Protein structural information in UniProt thus serves as a vital dataset for various projects.
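The four mapping criteria listed in this abstract can be read as a filter followed by a ranking. The sketch below is one possible reading, not the UniProt pipeline itself. It assumes that sequence identity and taxonomy act as hard filters and that proteome status and sequence length act as ranked preferences; the `Candidate` class, the `pick_mapping` function, and the accessions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    accession: str               # made-up identifier, not a real UniProtKB accession
    identity: float              # percent sequence identity to the PDB chain
    in_reference_proteome: bool  # entry belongs to a reference/complete proteome
    taxon_id: int                # taxonomy identifier of the entry
    length: int                  # protein sequence length


def pick_mapping(candidates, chain_taxon_id, min_identity=90.0):
    """Return the preferred candidate for a PDB chain, or None.

    Assumed reading of the criteria: (a) and (c) filter, (b) and (d) rank.
    """
    eligible = [
        c for c in candidates
        if c.identity > min_identity and c.taxon_id == chain_taxon_id
    ]
    if not eligible:
        return None  # would be left for manual curation
    # Prefer reference-proteome entries first, then the longest sequence.
    return max(eligible, key=lambda c: (c.in_reference_proteome, c.length))


candidates = [
    Candidate("HYP001", 95.0, True, 9606, 300),
    Candidate("HYP002", 99.0, False, 9606, 500),
    Candidate("HYP003", 98.0, True, 9606, 350),
    Candidate("HYP004", 99.0, True, 10090, 400),  # wrong taxon
    Candidate("HYP005", 85.0, True, 9606, 600),   # identity too low
]
best = pick_mapping(candidates, chain_taxon_id=9606)
```

Returning `None` when nothing passes the filters stands in for the manual intervention the abstract mentions for more complex cases.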

Session A-057: Computational Identification of Putative Ancestral Protein Fragments
COSI: 3Dsig
  • Leonhard Heizinger, University of Regensburg, Germany
  • Rainer Merkl, University of Regensburg, Germany

Short Abstract: The existence of a simple primordial form of life named last universal common ancestor (LUCA) is generally accepted for the Paleoarchean era, i.e., at least 3.5 billion years ago. The experimental characterization of reconstructed LUCA enzymes strongly suggests that the evolution of many highly efficient enzymes and enzyme complexes had already been completed in the LUCA era. Thus, given the age of the earth, many proteins must have evolved in the pre-LUCA era, a relatively short time span of approximately 500 million years. Several lines of evidence suggest that small, primordial peptides existed first, fused to proteins and gave rise to the limited number of folds still observed in nature. Using highly sensitive computational approaches based on sequential and structural alignments, we were able to detect a large number of common fragments shared by supposedly evolutionarily unrelated proteins. We present the algorithms for their identification and some striking cases.

Session A-058: DisProt 7.0: a major update of the database of disordered proteins
COSI: 3Dsig
  • Silvio Tosatto

Session A-059: Molecular dynamics simulations of triple-helical oligonucleotides
COSI: 3Dsig
  • Julian Nazet, Universität Regensburg, Germany
  • Rainer Merkl, Universität Regensburg, Germany

Short Abstract: Gene expression can be regulated by targeting genomic DNA with proteins or triplex-forming oligonucleotides (TFOs). TFOs are major-groove ligands that bind to specific DNA sequences and may consist of DNA or RNA molecules. In recent years, it has become clear that RNA is an important regulatory element of gene expression. Thus, we were interested in assessing the capability of RNA molecules to form triple helices, which depends on the chemical nature of the nucleotides and their sequence [1]. If the TFO is a polypyrimidine, antiparallel triplices of the type py(pu · py) arise. If the TFO is a polypurine, the parallel triplices consist of pu(pu · py) elements. Using molecular dynamics (MD) simulations, we wanted to determine the stability of specific DNA·DNA·RNA triplices in parallel or antiparallel conformation. To begin with, several MD simulations were performed at 298 K and 353 K. For the subsequent assessment of RNA binding, three criteria were used: the RMSF value, the binding energy of the RNA strand to the DNA, and the number of hydrogen bonds per residue. Results are detailed with respect to the nature and the sequence of the RNA molecules.
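Of the three criteria named in this abstract, the RMSF (root-mean-square fluctuation) has a standard definition: for each atom, the square root of the time-averaged squared deviation from its mean position. The sketch below shows that calculation in plain Python. It assumes the frames have already been superimposed on a common reference, and it is not the authors' analysis code; the function name and the toy coordinates are illustrative.

```python
import math


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation.

    trajectory is a list of frames; each frame is a list of (x, y, z)
    tuples, one per atom, in the same atom order in every frame.
    Assumes the frames are already aligned to a reference structure.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames.
        mean = [
            sum(frame[i][k] for frame in trajectory) / n_frames
            for k in range(3)
        ]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory: atom 0 is static, atom 1 oscillates along x.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
fluctuations = rmsf(toy_trajectory)
```

A low RMSF for the nucleotides of the third strand would be consistent with a stably bound strand, which is presumably why it appears among the criteria alongside binding energy and hydrogen-bond counts.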

Session A-060: Characterization of the GPR3 binding cavity using combined in silico-in vitro approaches
COSI: 3Dsig
  • Eda Suku, University of Verona, Italy

Short Abstract: Alzheimer’s disease (AD) is a neurodegenerative disease characterized by loss of brain connectivity [1]. Lately, it was discovered that the orphan and constitutively active G-protein coupled receptor 3 (GPR3), belonging to Class-A GPCRs, is involved in AD through direct interaction with β-arrestin 2 [2]. Although GPR3 is considered a promising new therapeutic target for AD, it has neither a solved structure nor a known binding site yet. We modeled GPR3 and predicted its binding cavity using the GOMoDo web-server [3]. The obtained model was manually checked to inspect the conservation of the class A GPCR structural fingerprints. DPI was docked into the modeled GPR3 using the Haddock server on GOMoDo. The best complex structure was used as input for Molecular Dynamics simulations (MDs). The 200 ns MDs were performed using the molecular mechanics/coarse-grained (MM/CG) hybrid approach [5], developed in our laboratory. Our model was validated by performing wet-lab alanine scanning mutagenesis on residues Cys267 and Phe120, putatively involved in halogen and π-stacking interactions with the ligand, respectively. We have also identified a mutant that completely abolishes the basal activity of the receptor by modifying the putative allosteric sodium (Na+) binding cavity. The latter could help characterize the determinants underlying the β-arrestin pathway, which is directly correlated with plaque formation.
References
[1] Holtzman et al. Science Translational Medicine, 2011: 77sr1-77sr1.
[2] Thathiah et al. Nature Medicine, 2013: 43-49.
[3] Sandal et al. PLoS One, 2013: e74092.
[4] De Vries et al. Nature Protocols, 2010: 883-897.
[5] Neri et al. Physical Review Letters, 2005: 218102.

Session A-061: Computationally investigating and comparing the binding capacity of lactate and malate dehydrogenases from the Apicomplexa Plasmodium falciparum and Babesia microti to Homo sapiens
COSI: 3Dsig
  • Daniel Barry Roche, Institut de Biologie Computationnelle (IBC), Université de Montpellier, France
  • Lena Sauer, Institut de Biologie Computationnelle (IBC), Université de Montpellier, France and Institute for Virology, Philipps University, Hans-Meerwein-Str. 2, Marburg, Germany, France
  • Sahar Usmani-Brown, Yale School of Public Health and Yale School of Medicine, United States
  • Sylvain Milanesi, Institut de Biologie Computationnelle (IBC), Université de Montpellier, France
  • Choukri Ben Mamoun, Yale School of Public Health and Yale School of Medicine, United States
  • Emmanuel Cornillot, Institut de Biologie Computationnelle and Institut de Recherche en Cancérologie de Montpellier, Université de Montpellier, France

Short Abstract: Malate (MDH) and lactate (LDH) dehydrogenases are essential metabolic enzymes that are structurally homologous and share a similar catalytic mechanism, yet retain specificity for their respective substrates. It has previously been shown that in Babesia microti the LDH results from horizontal gene transfer of an animal host ldh gene. The Babesia microti LDH shows higher homology to human LDH than to LDH from other Apicomplexa. In Apicomplexa, the ldh gene results from duplication of an mdh gene, which acquired LDH activity via neofunctionalization. We suspect convergent evolution between the two types of LDH because 1) B. microti presents MDH activity without corresponding encoding genes; and 2) we have shown that the Babesia microti LDH reacts with APAD, which is used to test for Plasmodium-acquired malaria via the reaction with LDH. This APAD affinity suggests that malaria may be misdiagnosed when the patient actually has babesiosis. We also investigate whether the B. microti LDH has moonlighting MDH activity in addition to its LDH specificity, using the Plasmodium falciparum and Homo sapiens LDH and MDH as controls. Our modelling and docking results show that the binding pocket of Babesia microti LDH is larger and more open, due to a number of mutations that have broken a non-covalent interaction and shortened one helix, thus leaving the binding pocket more open to react with more substrates.

Session A-062: A computational design strategy for discovering the cellular targets of histone lysine methyltransferases
COSI: 3Dsig
  • Diego Alonso-Martinez

Short Abstract: The activity of histone lysine methyltransferases (HKMTs) is crucial for the regulation of many gene expression programs and the development and progression of many diseases such as cancer. Currently, there are no biologically-consistent methods to fully characterise the specific protein targets of given HKMTs. Here we present a novel HKMT “methylome” profiling assay, which utilises mixed-integer programming to engineer synthetic cofactors selective for a given HKMT based on structural differences, molecular dynamics (MD) simulations to rank cofactor designs and mass spectrometry (MS) proteomics for target identification.

Session A-063: Developing computational descriptors to analyse and classify Glycosylphosphatidylinositol (GPI) proteins in Apicomplexan
COSI: 3Dsig
  • Rodrigo Canovas, LIRMM and IBC, University of Montpellier, France
  • Lena Sauer, LIRMM and IBC and Institute for Virology, Philipps University, Hans-Meerwein-Str. 2, France
  • Sylvain Milanesi, LIRMM and IBC, University of Montpellier, France
  • Hossam Shams-Eldin, Institute for Virology, Philipps University, Hans-Meerwein-Str. 2, Germany
  • Ralph Schwarz, Institute for Virology, Philipps University, Hans-Meerwein-Str. 2, Germany
  • Daniel Barry Roche, LIRMM and IBC, University of Montpellier, France
  • Emmanuel Cornillot, LIRMM and IBC and Institut de Recherche en Cancérologie de Montpellier, France

Short Abstract: Malaria continues to be a leading global health problem, with up to 438,000 deaths in 2015. Plasmodium falciparum is responsible for about 80% of malaria infections worldwide, closely followed by Plasmodium vivax, which is the main cause of malaria infections outside of the African continent. One of the main problems with the treatment of malaria infections is the development of drug resistance against current chemotherapeutic and immune prophylaxis, in particular against Artemisinin and its derivatives. The development of a new generation of malaria-targeting drugs appears to be an urgent necessity. Glycosylphosphatidylinositol (GPI) and GPI-anchored proteins represent potential new drug targets. GPI proteins are proteins anchored to the outer plasma membrane, found in the majority of living organisms, including mammals, yeast and protozoa, in addition to archaebacteria. Parasitic GPI proteins have numerous functions, including surface coat proteins, receptors, adhesion molecules and enzymes, in addition to possible roles in host-parasite interactions and immune escape. The sensitivity and specificity of machine-learning algorithms predicting GPI-proteins are weakened by the diversity of the N- and C-terminus specific signals. We present a feature-based approach, predicting GPI-proteins from the full-length amino-acid sequence. We show that predicting the hydrophobic properties of the N- and C-termini, in addition to the central core of the protein, was more accurate for the prediction of GPI-proteins than current approaches. Our approach was used to define the GPI-proteome of 8 Plasmodium species, showing a core set of 14 proteins, in addition to remarkable speciation features among the three main evolutionary branches of Plasmodium.

Session A-064: Using Ancestral Sequence Reconstruction to Characterize an Allosteric Bi-Enzyme Complex
COSI: 3Dsig
  • Kristina Heyn

Short Abstract: Ancestral sequence reconstruction (ASR) is the inference of primordial amino acid sequences from contemporary ones with the help of a phylogenetic tree [1]. Extant sequences occupy the leaves of this tree and the sequences corresponding to the internal nodes and the root are intermediates. The in silico and biochemical characterization of these intermediates can help to elucidate the structural determinants of a whole protein family. Imidazole glycerol phosphate synthase is a bi-enzyme complex consisting of the cyclase subunit HisF and the glutaminase subunit HisH. We have used ASR and protein design to unravel the structural basis of the HisH-HisF interaction between the two enzymes. To this end, we compared the binding of a given HisH protein to i) a modern HisF enzyme (a leaf), to ii) evolutionary intermediate HisF proteins (internal nodes), and to iii) HisF from the last universal common ancestor (LUCA-HisF, root) that differ in the composition of their interfaces [3]. The in silico analyses of these interfaces made clear that one residue is key to the binding affinity of the HisH-HisF complex. The predicted effect on complex stability induced by the reciprocal exchange of the corresponding interface residues was confirmed by means of biochemical studies [4]. Thus, we could demonstrate that a combination of in silico methods and wet-lab experiments allowed us to identify an interface hot-spot in a straightforward manner.

Session A-065: TongDock: Prediction of Symmetric and Asymmetric Protein-Protein Complex Structures by FFT and Improved Parametrisation
COSI: 3Dsig
  • Taeyong Park, Seoul National University, South Korea
  • Hasup Lee, Samsung Advanced Institute of Technology, South Korea
  • Minkyung Baek, Seoul National University, South Korea
  • Chaok Seok, Seoul National University, South Korea

Short Abstract: Physiological functions of individual proteins are ultimately determined by atomic-level interactions of the proteins with various biomolecules including other proteins. However, a large portion of protein-protein complexes and their atomic interactions are hard to capture experimentally because of their transient nature and/or weak binding affinity. In this research, we have developed a protein-protein docking program called TongDock that predicts both symmetric oligomer structures and asymmetric complex structures. TongDock is similar to a widely used docking program called ZDOCK in that they both sample bound structures in a grid space using FFT (Fast Fourier Transform). However, TongDock has the following advantages over ZDOCK. First, prediction power was improved compared to ZDOCK by a different parametrisation scheme for the scoring function, which is based on a more exhaustive search in the parameter space. Second, more flexible interface and block options than in ZDOCK are implemented so that users can easily apply available experimental or evolutionary information on binding. Finally, symmetric homo-oligomer prediction is possible using Oligo-TongDock and Doligo-TongDock for predicting oligomers of Cn and Dn symmetries, while predicting oligomers of Dn symmetry is not available with MZDOCK, a symmetric version of ZDOCK. Benchmark test results are presented for the ZDOCK benchmark set 4.0 and the PISA set.
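
As a rough illustration of the FFT-based grid sampling mentioned in the abstract above, the sketch below scores every cyclic shift of a ligand grid against a receptor grid using the correlation theorem. The one-dimensional grids, the cell values and all function names are assumptions made for illustration only; this is not the scoring function of TongDock or ZDOCK.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlation_scores(receptor, ligand):
    """score[s] = sum_i receptor[i] * ligand[(i - s) % n], for all shifts s at once."""
    n = len(receptor)
    fr = fft([complex(v) for v in receptor])
    fl = fft([complex(v) for v in ligand])
    # Correlation theorem: multiply one spectrum by the conjugate of the other.
    product = [a * b.conjugate() for a, b in zip(fr, fl)]
    return [round((v / n).real, 6) for v in fft(product, inverse=True)]


def direct_scores(receptor, ligand):
    """Same scores computed shift by shift, for comparison."""
    n = len(receptor)
    return [sum(receptor[i] * ligand[(i - s) % n] for i in range(n))
            for s in range(n)]


# Hypothetical toy grids: +1 marks favourable surface cells, -5 a forbidden core cell.
receptor = [0, 1, 1, 0, -5, 0, 0, 0]
ligand = [1, 1, 0, 0, 0, 0, 0, 0]
scores = correlation_scores(receptor, ligand)
best_shift = scores.index(max(scores))
```

The FFT route evaluates all translations in O(n log n) rather than O(n^2), which is the reason grid-based docking programs rely on it in three dimensions.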

Session A-066: All-Atom Molecular Dynamics Simulations of a Membrane Protein Stabilizing β-sheet
COSI: 3Dsig
  • Maral Aminpour, University of Alberta, Canada
  • Hiofan Hoi, University of Alberta, Canada
  • Sinoj Abraham, University of Alberta, Canada
  • Carlo Montemagno, University of Alberta, Canada

Short Abstract: Integral membrane proteins (IMPs) play crucial roles in all cells. However, functional and structural studies of IMPs are hindered by their hydrophobic nature and the fact that they are generally unstable following extraction from the membrane environment. Recently, BPs were used to keep IMPs stable [1]. These BPs are 8-amino-acid peptides with alternating polar and apolar residues and an octyl side chain at each end. The major incentive to explore the β-sheets and their derivatization with functionalized groups is that they may enable stoichiometric and oriented crosslinking of IMPs with the solid substrate. However, it is extremely time-consuming to explore experimentally and systematically the effect of modifications to the chemical structure of β-sheets, such as the length of the β-sheet, the distribution of hydrophilic and hydrophobic residues and the inter-strand hydrogen bond interactions. Here, we employed MD simulations to investigate the β-sheet formation of BP1 as well as two modifications of BP1, namely propargyl-BP1 and azido-BP1, obtained by adding a propargyl/azido group at the N-terminus of BP1. The latter two functionalized BPs are designed not only to stabilize the membrane protein but also to provide a means to covalently immobilize the IMPs on a solid substrate. We are planning to use AqpZ, the water channel from E. coli, as a model protein in the future to study β-sheet/IMP complexes. The structure and function of AqpZ have been well studied through MD simulation, suggesting that we can use MD to address the effect of the β-sheets on AqpZ behaviour.

Session A-067: A Complexity Measurement for de novo Protein Folding
COSI: 3Dsig
  • Michael Brown, University of Maryland, University College, United States
  • James Coker, University of Maryland, University College, United States

Short Abstract: Predicting how a protein folds based solely on its amino acid sequence is an ongoing challenge for the fields of Bioinformatics and Computer Science. Previous attempts to solve this problem have relied on algorithms and a specific set of benchmark proteins. However, there is currently no method for determining if the set of benchmark proteins share a similar level of complexity with proteins of similar size. As a result, a larger variety of benchmarks should be used to avoid this problem and a measure of complexity established to determine the validity of all benchmarks. We propose here the Ouroboros Complexity Measurement for the de novo folding of proteins. This metric is easy to compute (not an NP-hard problem) and allows protein complexity to be compared.

Session A-068: Investigation of protein mutations associated with heart failure using bioinformatic tools
COSI: 3Dsig
  • Jennifer Atkins, University of Reading, United Kingdom
  • Lara Makewita, University of Reading, United Kingdom
  • Melissa Gargaro, Imperial College London, United Kingdom
  • Liam McGuffin, University of Reading, United Kingdom
  • Samuel Boateng, University of Reading, United Kingdom
  • Thomas Sorensen, Diamond Light Source, United Kingdom

Short Abstract: Cardiovascular disease (CVD) is an ever-growing burden both worldwide and in the UK, and is one of the leading causes of death worldwide[1]. Often, CVDs progress into heart failure, where the heart can no longer pump effectively. The progression of CVDs to heart failure has been linked with single-gene mutations[2], including mutations in MLP, a multi-function protein found within the cytosol and nucleus of myocytes[3]. To investigate mutations within MLP, homology modellers including IntFOLD4[4] were utilised to ascertain the potential structure and relevance of individual mutations in terms of their side chains. A combination of online protein bioinformatics tools was then used to assess the impact of these mutations upon function. Classical Molecular Dynamics was then used to observe the stability and flexibility of the mutants versus the wild type. Using bioinformatic tools, it is possible to direct wet lab experiments, saving time and reagents. From this study, mutants can be categorised into structural or functional impact groups, thus determining whether a structural biology route or cell culture based studies would be most telling in how the mutations cause heart failure. From this, therapeutic routes can be considered and tailored based upon the mutants’ effect. 1. WHO. Global status report on noncommunicable diseases 2010. World Health Organisation (2011). 2. Morita, H., Seidman, J. & Seidman, C. E. Genetic causes of human heart failure. The Journal of Clinical Investigation. 115, 518–526 (2005). 3. Buyandelger, B. et al. MLP (muscle LIM protein) as a stress sensor in the heart. European journal of physiology. 462, 135–42 (2011). 4. McGuffin, L. J., Atkins, J. D., Salehe, B. R., Shuid, A. N. & Roche, B. IntFOLD: an integrated server for modelling protein structures and functions from amino acid sequences. Nucleic Acids Research. 1–5 (2015).

Session A-069: FireProt: web server for automated design of thermostable proteins
COSI: 3Dsig
  • Miloš Musil, VUT FIT, Czech Republic
  • Jan Stourac, Loschmidt Laboratories, Czech Republic
  • Jaroslav Bendl, Brno University of Technology, Czech Republic
  • Jan Brezovsky, Loschmidt Laboratories, Czech Republic
  • Zbynek Prokop, Loschmidt Laboratories, Czech Republic
  • Tomas Martinek, VUT FIT, Czech Republic
  • Jaroslav Zendulka, VUT FIT, Czech Republic
  • David Bednar, Loschmidt Laboratories, Czech Republic
  • Jiri Damborsky, Loschmidt Laboratories, Czech Republic

Short Abstract: Stable proteins are used in numerous biomedical and biotechnological applications. Unfortunately, naturally occurring proteins cannot usually withstand the harsh industrial environment, since they have mostly evolved to function under mild conditions. Therefore, there is a continuous interest in increasing protein stability to enhance their industrial potential. A number of in silico tools for the prediction of the effect of mutations on protein stability have been developed recently. However, only single-point mutations with a small effect on protein stability are typically predicted with the existing tools, and these have to be followed by laborious protein expression, purification, and characterization. A much higher degree of stabilization can be achieved by the construction of multiple-point mutants. Here, we present the FireProt method and the web server for the automated design of multiple-point mutant proteins that combines structural and evolutionary information in its calculation core. FireProt utilizes sixteen bioinformatics tools, including several force field calculations. Highly reliable designs of thermostable proteins are constructed by two distinct protein engineering strategies, based on the energy and evolution approaches, and the multiple-point mutants are checked for potentially antagonistic effects in the designed protein structure. Furthermore, the time demands of the FireProt method are radically decreased by the utilization of smart knowledge-based filters, protocol optimization and effective parallelization. The server is complemented with an interactive, easy-to-use interface that allows users to directly analyze and optionally modify designed thermostable proteins. The server is freely available at http://loschmidt.chemi.muni.cz/fireprot.

Session A-070: Drug-target interaction similarity for drug-target interaction prediction and drug repositioning
COSI: 3Dsig
  • Daniele Parisi

Short Abstract: The drug discovery process is long, complex[1] and scarcely productive[2], mostly because of lack of efficacy of the candidate drug[3]. With drug repositioning[4], the information from previous studies could lead a molecule to the market, saving half of the time and money. Although the attitude towards drug repositioning is optimistic[5] due to some successful projects (e.g. Sildenafil, Thalidomide) and the many tools already developed, at the moment this approach does not yet represent a profitable business model[1]. With this work I present some unpublished results to show the potential of the collaboration between bioinformaticians and medicinal chemists for a successful drug-repositioning protocol.

Session A-071: A trial for elucidating the effect of carnitine transport dynamics to pathogenicity of renal carnitine deficiency
COSI: 3Dsig
  • Akiko Higuchi, Graduate School of Frontier Sciences, The University of Tokyo, Japan
  • K. Anton Feenstra, IBIVU/Bioinformatics, Vrije Universiteit Amsterdam, Netherlands
  • Kei Yura, Graduate School of Humanities and Sciences, Ochanomizu University, Japan

Short Abstract: The Solute Carrier (SLC) transporter superfamily is known to play a key role in the mass transport system. The superfamily consists of 52 families, and at least 386 different transporter genes have been identified. Many pathogenic mutations in SLC22, one of the largest families, are linked to renal carnitine deficiency diseases (Nicola Longo, 2016). Crystal structures of many human SLC members are already known, but the structure of SLC22 remains to be solved. Thus, the mechanism underlying this severe disease is still unknown. In this study, we computationally investigated the structural features and the biophysical impact of pathogenic mutations on SLC22 and SLC2, whose sequences share high similarity. First we integrated structural and mutational properties of SLC families, and then built a database of the families. We further analyzed the features of pathogenic mutations and their evolutionary relationships. By analyzing the database, we found that mutations of several conserved arginines, particularly to tryptophan and glutamine, were frequently involved in pathogenic mutations in both families. Most of these highly conserved residues in SLC2 form a large cluster in three dimensions on the cytoplasmic side. We further focused on those mutations and assessed the biophysical effect of the mutation of these arginine residues using atomistic (GROMOS) and coarse-grained (MARTINI) molecular dynamics simulations of the protein-membrane system. By comparing the observed dynamics preferences between wild-type and mutated proteins, and with known conformational states from the crystal structures, a better understanding of the impact of mutations on pathogenesis will be obtained.

Session A-072: Seeing the Trees through the Forest: Sequence-based Homo- and Heteromeric Protein-protein Interaction sites prediction using Random Forest
COSI: 3Dsig
  • K. Anton Feenstra

Short Abstract: With ever-increasing numbers of protein sequences becoming available, annotating genome data is a challenging task. Protein-protein interactions (PPIs) play a central role in virtually all cellular processes. Identification of interface (IF) sites between interacting proteins is essential to understand complex formation and investigate their functions. Few of the available genomes, however, have a good coverage of experimentally validated annotations for PPIs. This makes the prediction of protein interaction sites from sequence information increasingly attractive. We here explore the potential of sequence information and derived properties of interacting residues as features to predict interacting amino acids

Session A-073: Transport pathways in membrane transport proteins.
COSI: 3Dsig
  • Sayane Shome, Iowa State University, United States
  • Edward Yu, Iowa State University, United States
  • Robert Jernigan, Iowa State University, United States

Short Abstract: Substrate transport through membrane transporters is critical for many biological processes. One of the most interesting questions is how to understand the substrate specificity of transporters. Due to the limitations of experimental methods, computational approaches can be applied advantageously to screen a large number of possible transported molecules. The experimental determination of the mechanistic details of transport is difficult. We have employed steered molecular dynamics simulations to determine the critical factors responsible for the transport and how they interact with protein components along the pathway. Systems that we have investigated include the transport of: 1) sulfonamide drugs by the AbgT transporter YdaH protein, 2) inorganic carbon (CO2 and bicarbonate ion) by the Low CO2 inducible protein Lci1 and 3) long-chain cyclic lipids (Hopanoid and Steroid) by the RND-like HpnN protein. VMD software has been used. Protein-embedded lipid systems were minimized for 250,000 steps, followed by equilibration until the system temperature reached 310 K. Steered molecular dynamics simulations at constant velocity (cv-SMD) were carried out on the equilibrated systems via NAMD, with the direction and magnitude of the pulling being adjusted for the system of interest. Based on the simulation trajectories, we have determined the transport pathway through the protein for the ligand transport in these three cases, as well as the roles of functionally important residues along the pathway. These pathways have been confirmed by experimental findings.

Session A-074: Inferring protein phylogeny by modelling the evolution of secondary structure
COSI: 3Dsig
  • Jhih-Siang Lai, School of Chemistry and Molecular Biosciences, The University of Queensland, Australia
  • Bostjan Kobe, School of Chemistry and Molecular Biosciences, The University of Queensland, Australia
  • Mikael Boden, School of Chemistry and Molecular Biosciences, The University of Queensland, Australia

Short Abstract: Ancestral sequence reconstruction has had recent success in decoding the origins and the determinants of complex protein functions. However, attempts to reconstruct extremely ancient proteins and phylogenetic analyses of remote homologues must deal with the sequence diversity that results from extended periods of evolutionary change. In the last twenty years, the number of protein structures in the Protein Data Bank has increased twenty-fold. Using the same principles pioneered by Dayhoff, we seize this wealth of structure data and develop a protein secondary structure evolutionary model, based on differences between discrete secondary structure states observed in modern proteins and those hypothesized in their immediate ancestors. We implement maximum likelihood-based phylogenetic inference tools based on our evolutionary model. We apply these tools to the sequence-diverse but structurally-conserved Toll/interleukin-1 receptor (TIR) domains and show that resulting clades in a phylogenetic tree are more consistent with their biological properties than those of the same inference based on an amino acid model. The approach also allows us to infer ancestral secondary structure; we compare these predictions with those of structure homology modelling and sequence-based secondary structure predictors. The secondary structure evolutionary model extracts information not available from modern structures or the ancestral protein sequences alone. Our evolutionary model has the capacity to highlight relationships that are evolutionarily rooted in structure, and therefore complements the use of sequence-based phylogenetic analysis.

Session A-075: Structural characterization of the IC pocket in the human Cx50 hemichannel
COSI: 3Dsig
  • Claudia Pareja-Barrueto, Computational Biology Laboratory (DLab), Fundacion Ciencia y Vida, Santiago, Chile. Centro Interdisciplinario de Neurociencia de Valparaíso, Valparaíso, Chile., Chile
  • Jose Gomez, Computational Biology Laboratory (DLab), Fundacion Ciencia y Vida, Santiago, Chile. Centro Interdisciplinario de Neurociencia de Valparaíso, Valparaíso, Chile., Chile
  • Felipe Villanelo, Computational Biology Laboratory (DLab), Fundacion Ciencia y Vida, Santiago, Chile. Centro Interdisciplinario de Neurociencia de Valparaíso, Valparaíso, Chile., Chile
  • Peter Minogue, Department of Pediatrics, University of Chicago, Chicago, IL USA., United States
  • Viviana Berthoud, Department of Pediatrics, University of Chicago, Chicago, IL USA., United States
  • Eric Beyer, Department of Pediatrics, University of Chicago, Chicago, IL USA., United States
  • Tomas Perez-Acle, Computational Biology Laboratory (DLab), Fundacion Ciencia y Vida, Santiago, Chile. Centro Interdisciplinario de Neurociencia de Valparaíso, Valparaíso, Chile., Chile

Short Abstract: Previous in silico studies, developed in our laboratory, suggest that gating of human Cx26 hemichannels can be modulated by an intracellular water pocket (IC pocket). The presence of a water pocket has also been reported in Cx32 hemichannels. Both Cx26 and Cx32 belong to the β subfamily of connexins. To elucidate whether members of other connexin subfamilies also contain a water pocket, we chose human Cx50 (hCx50), a member of the α subfamily of connexins. Cx50 is expressed in the eye lens, and several mutations in its gene have been associated with congenital cataracts in humans. To test for the presence of the IC pocket in hCx50 and its possible functional role, we performed all-atom molecular dynamics simulations on wild type hCx50 hemichannels and on hemichannels formed by hCx50 containing mutations of some of the amino acid residues lining the IC pocket. Our analyses are focused on the structure and dynamics of water molecules inside the IC pocket and ionic currents across the channel pore. We calculated occupancy, survival probability, and dipole vector orientation of water molecules, and made structural comparisons. In addition, we performed functional mode and essential dynamics analyses to identify relevant collective atomic motions of the hemichannel. Our results suggest that hCx50 hemichannels contain an IC pocket. Moreover, water occupancy and dynamics imply the presence of highly structured waters that make electrostatic interactions with particular amino acid residues lining the IC pocket. Thus, the existence of an IC pocket is a general feature of hemichannels formed by different connexins. Ongoing studies will evaluate its importance for hCx50 function. Acknowledgements This work was supported by Fondecyt grant #1160574, PFB16 Fundación Ciencia para la Vida and ICM-Economía P09-022-F, CINV, Conicyt graduate fellowships #21161628 and NIH grant EY08368. We acknowledge the University of Chicago RCC for access to supercomputing time. 
Powered@NLHPC: This research was partially supported by the supercomputing infrastructure of the NLHPC (ECM-02).
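
As an aside on the water analysis named in the abstract above, the sketch below shows one common way to estimate a survival probability from a trajectory: for each starting frame, take the fraction of waters inside the pocket that stay inside continuously for a given number of further frames, then average over starting frames. This definition, the per-frame sets of water IDs and the function name are assumptions for illustration; the abstract does not state which definition or software the authors used.

```python
def survival_probability(frames, lag):
    """Mean fraction of waters present at a frame that stay present for `lag` more frames.

    frames: list of sets, each holding the IDs of waters inside the pocket at one frame.
    lag: number of consecutive later frames a water must remain in the pocket.
    """
    fractions = []
    for start in range(len(frames) - lag):
        present = frames[start]
        if not present:
            continue  # no waters to follow from this starting frame
        survivors = set(present)
        for frame in frames[start + 1:start + lag + 1]:
            survivors &= frame  # keep only waters that never left
        fractions.append(len(survivors) / len(present))
    return sum(fractions) / len(fractions) if fractions else 0.0


# Hypothetical four-frame trajectory with water IDs 1 to 4.
toy_frames = [{1, 2, 3}, {1, 2}, {1, 4}, {1, 4}]
one_frame_survival = survival_probability(toy_frames, 1)
```

A slowly decaying curve of this quantity against the lag is the kind of signal that would be read as tightly bound, structured water.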

Session A-076: Deep learning based subdivision approach for large scale macromolecules structure recovery from electron cryo tomograms
COSI: 3Dsig
  • Min Xu, Carnegie Mellon University, United States

Short Abstract: Motivation: Cellular Electron CryoTomography (CECT) enables 3D visualization of cellular organization at near-native state and in sub-molecular resolution, making it a powerful tool for analyzing structures of macromolecular complexes and their spatial organizations inside single cells. However, a high degree of structural complexity together with practical imaging limitations makes the systematic de novo discovery of structures within cells challenging. It would likely require averaging and classifying millions of subtomograms potentially containing hundreds of highly heterogeneous structural classes. Although it is no longer difficult to acquire CECT data containing such numbers of subtomograms due to advances in data acquisition automation, existing computational approaches have very limited scalability or discrimination ability, making them incapable of processing such amounts of data.

Results: To complement existing approaches, in this article we propose a new approach for subdividing subtomograms into smaller but relatively homogeneous subsets. The structures in these subsets can then be separately recovered using existing computation intensive methods. Our approach is based on supervised structural feature extraction using deep learning, in combination with unsupervised clustering and reference-free classification. Our experiments show that, compared with existing unsupervised rotation invariant feature and pose-normalization based approaches, our new approach achieves significant improvements in both discrimination ability and scalability. More importantly, our new approach is able to discover new structural classes and recover structures that do not exist in training data.

Session A-077: Modelling and Functional Characterization of Peptides to Enhance Secretion of Specific Components by Stem Cells for Biomedical Applications
COSI: 3Dsig
  • Krishnamoorthy Navaneeth, Imperial College London, United Kingdom
  • Yuan‐tsan Tseng, Imperial College London, United Kingdom
  • Poornima Gajendrarao, Imperial College London, United Kingdom
  • Adrian Chester, Imperial College London, United Kingdom
  • Magdi Yacoub, Imperial College London, United Kingdom

Short Abstract: Engineering living tissues or organs critically depends on the ability of scaffolds to attract, house and instruct populating cells. One of the strategies to achieve functionalization of scaffolds relies on the use of designer peptides to decorate scaffolds. Peptide linkers have been shown to be important for merging functional motifs and attaching motifs on the surface of bioactive materials. However, intense structural customization of the linkers is required prior to examining them under experimental conditions. Thus, we here apply computer-aided molecular design to construct the linkers, including essential properties such as stability, accessibility of motif/motifs (to enhance functionality) and binding to scaffolds. The molecules that are prioritized by modelling have been selected for in vitro experiments to assess their functionality. Our study shows that the linkers based on valine and alanine can be used for merging dual bioactive motifs, which enhance the stimulation of collagen and fibronectin in human adipose derived stem cells (hADSCs) under experimental conditions. Molecular dynamics simulations showed that P3 (SKTTKS‐V4A3‐SKTTKS), with palindromic (SKTTKS) motifs, and P5 (SKTTK‐V4A2‐KTTKS) maintained structural integrity and favourable surface electrostatic distributions that are required for functionality. In vitro studies showed that peptides P3 and P5 significantly increased the production of collagen and fibronectin in a concentration-dependent manner, compared to the active peptide motif (KTTKS). The 4-day treatment showed that the stem cell markers of hADSCs remained stable with P3. By further applying the modelling strategy, we are developing linkers with a surface attachment property. We focus on the structural role of key residues such as serine, lysine and cysteine with glycine in different lengths.
The design of customized linkers may offer several advantages for producing intelligent biomaterials with enhanced bioactivity and to target specific sites for biomedical applications.

Session A-078: Homology modeling in a dynamical world
COSI: 3Dsig
  • Alexander Monzon

Short Abstract: A key concept in Template-Based Modeling (TBM) is the high correlation between sequence and structural divergence. The main practical consequence of this correlation is that homologous proteins that are similar at the sequence level will also be similar at the structural level, allowing the selection of a proper template for a target sequence. Pioneering work by Chothia and Lesk[1] found a non-linear and well correlated relationship between sequence and structural divergence. However, a given protein sequence can exist in different structures (conformers), whose structural differences describe its conformational diversity (CD). In this work, we explored the impact that CD has on the relationship between structural and sequence divergence.

Session A-079: Molecular Modelling and Phenotypic Study of Disease-causing Mutations in cMyBP-C: Relevance to Hypertrophic Cardiomyopathy
COSI: 3Dsig
  • Navaneethakrishnan Krishnamoorthy, Sidra medical and research center, Qatar
  • Sahar Da’as, Sidra medical and research center, Qatar
  • Poornima Gajendrarao, Imperial College London, Qatar
  • Iacopo Olivotto, Careggi University Hospital, Italy
  • Magdi Yacoub, Imperial College London, United Kingdom

Short Abstract: The regulation of the contractile function of the heart depends largely on the molecular interactions of the sarcomeric proteins. Hypertrophic cardiomyopathy (HCM) is an inherited disease that affects approximately 1 in 500 individuals worldwide and is mainly caused by mutations in the sarcomeric proteins. The cardiac myosin binding protein C (cMyBP-C) is one of the sarcomeric proteins with multiple domains (C1-C10), and mutations in this protein are the second most common cause of HCM. The complex C1-motif-C2 at the N-terminus of cMyBP-C is an important region for the regulation of cardiac muscle contraction. However, the mechanism by which the disease-causing mutations can impact the domains during health and disease is unknown. Hence, we investigate the potential structural basis/patterns for understanding the mechanism in some of the mutations that cause severe phenotypes. Here, we are using several molecular modelling tools, including molecular dynamics (MD) simulations, on the wild type and single mutations (Arg177His, Ala216Thr (both identified in Egypt) and Ser217Gly (identified in Qatar) within exon 5, and Glu258Lys (Egypt and founder effect in Italy) within exon 6 of the domain C1). The molecular events captured from the trajectory suggest the following: i) induction of local structural changes near the mutational spot; ii) a decrease of structural stability followed by a major impact on the surface; iii) significant changes in the network of intra-molecular interactions in the regions near the mutation. To understand the genotype-phenotype correlation, we have performed mutation analysis of the two exons of the C1 domain of cMyBP-C in the zebrafish model. Morpholinos targeting exon 5 or exon 6 recapitulated the human mutations and resulted in a specific cardiac phenotype.
The exon 6 mutation resulted in severe cardiac phenotype exhibited by more zebrafish morphant embryos with enlarged cardiac chambers and reduced heart rate compared to other mutations. These results support our molecular events of domain C1, suggesting that mutations within exon 5 have minimal effect on electrostatic properties at the surface. Interestingly, the exon 6 mutation inversely impacts the structural properties and has major effect on the surface of C1 and may lead to malfunction of the protein. This comparative modelling study provides patterns in the structure-function consequences of the key mutations in cMyBP-C, which could be one of the potential reasons for the observed severe phenotypes, and might lead to abnormal cardiac function.

Session A-080: Frequent Subgraph Mining for Biologically Meaningful Structural Motifs
COSI: 3Dsig
  • Sebastian Keller

Short Abstract: We present a graph based approach to determine common structural motifs in related proteins using frequent subgraph mining (FSM). To this end, we adapted an existing FSM algorithm to increase its specificity towards biologically relevant and structurally conserved motifs and to make it more lenient towards inaccuracies in biological data.

Session A-081: DEEP LEARNING IN TEXT MINING FOR PROTEIN DOCKING USING FULL-TEXT ARTICLES
COSI: 3Dsig
  • Varsha D. Badal, The University of Kansas, United States
  • Petras J. Kundrotas, The University of Kansas, United States
  • Ilya A. Vakser, The University of Kansas, United States

Short Abstract: Residues extracted from PubMed abstracts by text mining can be used as constraints in protein-protein docking (Badal et al., PLoS Comp. Biol. 2015, 11:e1004630). However, the pool of the mined residues contains many false positives (residues not relevant to protein binding), which can be partially removed by natural language processing (Badal et al., 2017, submitted). Deep learning methods can potentially provide further reduction of these false positives. However, abstracts provide more formal and strictly crafted text, which may lack in variety/richness for training of the deep learning models. We investigated whether deep-learning models trained on the limited available full texts (PMC-open access) can be applied to the filtering of residues in the PubMed abstracts. We used deep recursive neural network, which composes word vectors (Irsoy & Cardie, Adv. Neural Inf. Proc. Syst. 2014, 2096-2104). Word vectors of the residue-containing sentences from the PMC full text articles were generated by word2vec (Mikolov, et al., arXiv:1301.3781, 2013). We propose to label words and trees with sentiments from 0 to 4 (0,1,2 labeling negative samples, i.e. describing non-interface residues and 3,4 denoting positive samples, relevant to the interface residues) to train and classify sentences containing residues. The approach was tested on a set of protein complexes from DOCKGROUND (http://dockground.compbio.ku.edu). The results showed that the model is capable of distinguishing the abstract sentences containing interface residues from those containing non-interface residues. We further investigated the local sentiment (surrounding words/phrase) using a window of words of various lengths around the residue.
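
The last step of the abstract above, taking a window of words around a residue mention, can be pictured with the minimal sketch below. The regular expression for three-letter residue names followed by a position, the whitespace tokenisation and the function name are illustrative assumptions, not the authors' pipeline.

```python
import re

# Three-letter amino-acid code followed by a sequence position, e.g. "Arg177".
RESIDUE = re.compile(
    r"\b(?:Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|"
    r"Phe|Pro|Ser|Thr|Trp|Tyr|Val)\d+\b"
)


def residue_windows(sentence, size):
    """Return (residue, surrounding tokens) pairs using `size` tokens on each side."""
    tokens = sentence.split()
    windows = []
    for i, token in enumerate(tokens):
        match = RESIDUE.search(token)
        if match:
            context = tokens[max(0, i - size):i + size + 1]
            windows.append((match.group(0), context))
    return windows


example = "Mutation of Arg177 abolished binding to the partner."
found = residue_windows(example, 2)
```

Varying `size` corresponds to the different window lengths the abstract mentions; each window would then be passed to whatever classifier assigns the local sentiment.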

Session A-082: Improved Rosetta protein structure prediction with customised fragments libraries based on structural class annotations
COSI: 3Dsig
  • Jad Abbass, Kingston University London, United Kingdom
  • Jean-Christophe Nebel, Kingston University London, United Kingdom

Short Abstract: Since experimental techniques are time-consuming and costly, in silico protein structure prediction is essential to produce conformations of protein targets. When homologous structures are not available, fragment-based protein structure prediction has become the approach of choice. However, it still has many issues, including poor performance when targets’ lengths are above 100 residues, excessive running times and sub-optimal energy functions. Taking advantage of the reliable performance of structural class prediction software, limitations of fragment-based methods are addressed by integrating structural constraints in their fragment selection process.
Using Rosetta, a state-of-the-art fragment-based protein structure prediction package, the proposed pipeline is evaluated on 70 former CASP targets containing up to 150 amino acids. Using CATH-based structural class annotations, enhancement of structure prediction performance is highly significant in terms of both GDT_TS (at least +2.6, p-values < 0.0005) and RMSD (−0.4, p-values < 0.005). Further analysis also shows that methods relying on class-based fragments produce conformations which are more relevant to the user and converge quicker towards the best model as estimated by GDT_TS (up to 10% on average). This substantiates the hypothesis that usage of structurally relevant templates leads not only to reducing the size of the conformation space to be explored, but also to focusing on a more relevant area. Experiments conducted during CASP11 confirmed those results: out of 12 targets, "Rosetta_at_Kingston" – our group – was better than "Baker" in 6 and better than "BAKER-ROSETTASERVER" in 6 in terms of GDT_TS.

Session A-083: DARC shade of Chemokine Receptors
COSI: 3Dsig
  • Tarun Jairaj Narwani, DSIMB, INSERM UMR_S 1134, France
  • Agata Kranjc Pietrucci, DSIMB, INSERM UMR_S 1134, France
  • Sophie Abby, CNRS, UMR 5525, Univ. Grenoble Alpes, France
  • Alexandre G. de Brevern, DSIMB, INSERM UMR_S 1134, France

Short Abstract: Chemokine Receptors (CR) are a class of G-protein coupled receptors (GPCR) that bind specifically to two major (CC, CXC) and two minor (CX3C and XC) classes of chemokines. They are involved in inflammation, haemostasis and cancer metastasis pathways. The Duffy Antigen / Chemokine Receptor (DARC) binds non-specifically to both major classes of chemokines. Adding to its atypicality, it fails to transduce any signal since it lacks the conserved DRY motif in the C-terminal of GPCRs. Therefore, it is often described as a scavenger receptor and catalogued as an Atypical Chemokine Receptor. Furthermore, DARC plays a major role in the host-entry mechanism of Plasmodium vivax (Pv), the second most endemic strain of malaria. Understanding the structural dynamics of the DARC-parasite interaction (through the Duffy Binding Protein) is of crucial importance in the battle against Plasmodium vivax. The literature on DARC’s structure is scarce, with only one structural model depicting the structure of monomeric DARC. We generated a highly robust structural model of dimeric DARC interacting with a dimeric PvDBP. The protocol for structural template selection is backed by evolutionary relationships, conserved cysteines and TM regions, and three different assessment scores. Moreover, the structural model is embedded in a lipid bilayer mimicking the real erythrocyte membrane composition. A steered molecular dynamics study of this model, tracing properties such as disulphide bonds, accessibility and local structural adaptation (using a specific structural alphabet, protein blocks), would provide crucial insights into the mechanism and may help us discover a druggable target in DARC.

Session A-084: New binding site of the quorum sensing molecule N-3-Oxododecanoyl Homoserine Lactone with the transcriptional regulator LasR of Pseudomonas aeruginosa: Insights from Molecular Docking and Dynamics Simulations
COSI: 3Dsig
  • Hovakim Grabski, Russian-Armenian University, Armenia
  • Lernik Hunanyan, Russian-Armenian University, Armenia
  • Susanna Tiratsuyan, Russian-Armenian University, Armenia
  • Hrachik Vardapetyan, Russian-Armenian University, Armenia

Short Abstract: In 2017 the World Health Organization announced a list of the most dangerous superbugs, among them Pseudomonas aeruginosa, an antibiotic-resistant opportunistic human pathogen. This organism attacks patients suffering from diseases such as AIDS, cancer and cystic fibrosis. Current therapies lack efficacy because P. aeruginosa creates and inhabits surface-associated biofilms conferring increased resistance to antibiotics and host immune responses. Biofilm formation is controlled through a cell–cell communication system called quorum sensing. The main quorum sensing system of P. aeruginosa includes the receptor protein LasR. However, the transcriptional regulator LasR remains poorly understood, because detailed molecular information is not available to date. In the present study, we analyzed the molecular properties of the receptor protein LasR as well as the mode of its interactions with the signal molecule N-3-oxododecanoyl homoserine lactone (3-O-C12-HSL). The structure of the entire transcriptional regulator LasR was reconstructed. We performed docking and molecular dynamics (MD) simulations of LasR with 3-O-C12-HSL. To our knowledge, this is the first report showing that the native ligand of LasR can interact with the beta turns in the short linker region of the receptor. It can be concluded that 3-O-C12-HSL can bind both to the ligand binding domain and to the beta turns in the short linker region of the LasR protein, which is a new binding site. Results from this study may be used for future drug development endeavors based on a new anti-infective strategy against quorum sensing.

Session A-085: Large-scale predictions of protein-peptide interactions in Arabidopsis thaliana
COSI: 3Dsig
  • Rashmi Hazarika, KU Leuven, Belgium
  • Vera van Noort, KU Leuven, Belgium

Short Abstract: Peptide-mediated interactions in which a short linear motif binds to a globular domain play a key role within the cell, mediating several cellular processes such as signal transduction and other regulatory pathways including the DNA replication machinery. Peptides may bind to other proteins for reasons such as inhibition, signaling, catalysis and producing macrostructures, and such interactions account for approximately 40% of all protein-protein interactions. Previously, we have identified 189 transcriptionally active regions (TARs) in response to plant oxidative stress, which might encode 607 stress induced peptides (SIPs). Some of these SIPs may mimic the binding motif of one of the partners of the protein-protein complex and bring about conformational changes in their partner upon binding. When no information about the peptide binding site on receptors is available, there is a need for computational approaches to predict peptide binding sites on protein surfaces, as these models can serve as a starting point for experimental characterization of protein-peptide interactions, specifically in the model plant Arabidopsis thaliana. The goal of our study is to carry out large-scale predictions of how these SIPs might interact with receptors. In our study, we extracted 1009 A. thaliana proteins from the Protein Data Bank and used blastp with default settings to screen structures with a motif match. We found that 306 SIPs show matches against 613 PDB structures. These SIPs were further split into 10-mers and screened against the PDB structures using spatial position-specific scoring matrices (s-PSSM) with the peptide-binding site prediction server PepSite. We short-listed 576 protein-peptide pairs (P-value<0.1) from the PepSite output for building atomistic models using the protein-peptide docking protocol pepATTRACT.
For each peptide motif, three idealized peptide conformations (extended, α-helical and poly-proline) were generated from sequence using PeptideBuilder, and this peptide ensemble was docked rigidly against the protein domain using the ATTRACT coarse-grained force field. The top-ranked 1000 structures were subjected to two stages of atomistic refinement using the flexible interface refinement method iATTRACT. We used the distance-restraint-based local docking protocol of pepATTRACT to restrict the sampling during the rigid-body sampling and flexible refinement stages towards the PepSite-predicted interface residues. We selected 189 high-confidence protein-peptide pairs for further structure refinement by subjecting them to a short molecular dynamics simulation in Born implicit solvent with the AMBER program. We found that several peptides in our dataset bind to a pocket/catalytic site which normally binds a cofactor/substrate, suggesting that these peptides might be able to modulate the activity of the protein. We have generated a list of residues on PDB structures with which the SIPs might interact and also built models for these interactions. This dataset will help biologists to experimentally verify these interactions, given that only a few peptides have been identified so far in the model organism A. thaliana.

Session A-086: Identifying Multiple Active Conformations of G Protein-Coupled Receptors Using Focused Conformational Sampling
COSI: 3Dsig
  • Ravinder Abrol

Short Abstract: G protein-coupled receptors (GPCRs) are membrane proteins critical in many cellular signal transduction pathways. The pleiotropic signaling of GPCRs is enabled by their conformational flexibility, which allows them to exist in multiple states, where functionally important active states are high in energy. This makes experimental studies of these active states very challenging. Most computational methods can only identify lowest-energy states, so we developed a focused conformational sampling method that is capable of identifying multiple active states of GPCRs. It was able to correctly predict the active conformation of two GPCRs starting only from the inactive state, and explained previous experiments, which has been an insurmountable challenge for standard molecular dynamics simulations.

Session A-087: Deep Learning strategy for Improving ranking of protein fold recognition method ORION
COSI: 3Dsig
  • Jean-Christophe Gelly, Univ Paris Diderot, France
  • Guillaume Postic, Univ Paris Diderot, France
  • Charlotte Perin, Univ Paris Diderot, France

Short Abstract: Ranking potentially interesting templates is a key issue for comparative-modeling-based protein structure prediction. We have introduced a machine learning strategy based on Deep Learning to improve fold recognition performance, ranking and detection of targets.

Session A-088: Large-scale structure prediction enabled by reliable model quality assessment and improved contact predictions for small families.
COSI: 3Dsig
  • Arne Elofsson, Stockholm University, Sweden
  • Mirco Michel, Stockholm University, Sweden
  • David Menendez Hurtado, Stockholm University, Sweden

Short Abstract: Motivation: Accurate contact predictions can be used for predicting the structure of proteins. Until recently these methods were limited to very big protein families, decreasing their utility. However, progress in contact prediction has made it possible to predict accurate contact maps for many small families. Here, we ask whether it is possible to model these families and whether we can identify the cases where the models are correct. Results: We find that it is possible to correctly model and identify about 53% of the families that have more than 1000 effective sequences and 23% of the families with more than 100 effective sequences. Using these numbers we estimate that up to 2000 (25%) of the Pfam families without a known structure can be modelled. Availability: All models of the Pfam families will be made available at c3.pcons.net as soon as all jobs have finished (within a week from submission). All programs used here are freely available. Contact: arne@bioinfo.se Supplementary information: No supplementary data

Session A-089: Automated Realization of RNA Structure from Interaction Topology
COSI: 3Dsig
  • Matthew Wicker

Session A-090: PartSeg – computational segmentation, reconstruction and structural alignment of three-dimensional images from biological experiments
COSI: 3Dsig
  • Grzegorz Bokota, Centre of New Technologies, University of Warsaw, Poland
  • Michal Kadlof, Centre of New Technologies, University of Warsaw, Poland
  • Paweł Trzaskoma, Nencki Institute of Experimental Biology, Poland
  • Adriana Magalska, Nencki Institute of Experimental Biology, Poland
  • Agnieszka Czechowska, Nencki Institute of Experimental Biology, Poland
  • Agnieszka Walczak, Nencki Institute of Experimental Biology, Poland
  • Grzegorz Wilczynski, Nencki Institute of Experimental Biology, Poland
  • Dariusz Plewczynski, Centre of New Technologies, University of Warsaw, Warsaw, Poland, Poland

Short Abstract: Motivation: Recent advancements in bioimaging techniques require further development of computational methods, tools and algorithms aimed at segmentation, reconstruction and spatial alignment of cellular objects. Automated methods do not always give a satisfactory result, therefore manual inspection of data still remains necessary. On the other hand, high-throughput studies of large cellular populations provide an amount of data that outgrows the possibility of manual inspection. Therefore, there is a growing need for faster and user-friendly software for analysis of raw data, with an affordable graphical user interface (GUI) and subsequent batch processing. Results: We present PartSeg, a comprehensive software package implementing several computational algorithms that can be used for segmentation and analysis of three-dimensional microscopic images. We start with automatic segmentation of three-dimensional images, based on an implementation of a graph theory algorithm that determines the connected components of a given 3D structure. Secondly, we reconstruct 3D objects within the assigned space. Finally, we structurally align pairs of objects by determining their principal axes and finding their optimal relative orientation in 3D space. Contact: d.plewczynski@cent.uw.edu.pl Availability and implementation: PartSeg is available as open source software at the public repository https://bitbucket.org/3dome/partseg, www project page http://nucleus3d.cent.uw.edu.pl/PartSeg
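
The connected-components step mentioned in the abstract above can be illustrated with a minimal sketch. This is not PartSeg's implementation: the function name `label_components`, the nested-list volume layout and the choice of 6-connectivity are all assumptions made for illustration.

```python
from collections import deque


def label_components(volume):
    """Label 6-connected foreground components of a 3D binary volume.

    `volume` is a nested list indexed as volume[z][y][x]; truthy voxels are
    foreground. Returns (labels, count): `labels` has the same shape and
    holds 0 for background or a component id starting at 1.
    Hypothetical helper, not part of PartSeg.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    labels = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    # Face-sharing neighbours only (6-connectivity); this is an assumption.
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    count = 0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not volume[z][y][x] or labels[z][y][x]:
                    continue
                # New unlabelled foreground voxel: flood-fill its component.
                count += 1
                labels[z][y][x] = count
                queue = deque([(z, y, x)])
                while queue:
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in neighbours:
                        pz, py, px = cz + dz, cy + dy, cx + dx
                        if (0 <= pz < nz and 0 <= py < ny and 0 <= px < nx
                                and volume[pz][py][px]
                                and not labels[pz][py][px]):
                            labels[pz][py][px] = count
                            queue.append((pz, py, px))
    return labels, count


# Two separate objects in a 2 x 2 x 3 volume.
volume = [[[1, 0, 1], [1, 0, 0]],
          [[0, 0, 1], [0, 0, 0]]]
labels, count = label_components(volume)
```

A breadth-first flood fill keeps the sketch dependency-free; for real image stacks an array-based labelling routine would be far faster.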

Session A-091: The determination of force field parameters of the conserved copper coordinating active site of AA9 proteins
COSI: 3Dsig
  • Vuyani Moses, Rhodes University, ZA

Session A-092: An efficient algorithm for improving structure-based prediction of transcription factor binding sites
COSI: 3Dsig
  • Alvin Farrel, University of North Carolina at Charlotte, United States
  • Jun-Tao Guo, University of North Carolina at Charlotte, United States

Short Abstract: Gene expression is regulated by transcription factors (TFs) binding to specific target DNA sites. Understanding how and where these transcription factors bind at genome scale represents an essential step toward our understanding of gene regulation networks. Previously we developed a structure-based method for prediction of transcription factor binding sites (TFBSs) using an integrative energy that combines a knowledge-based potential and two atomic energy terms. While the method improves the prediction accuracy over the knowledge-based potentials, it is not computationally efficient due to the exponential increase in the number of binding sequences to be evaluated for longer TF binding sites. In this paper, we present an efficient pentamer algorithm by splitting DNA binding sequences into overlapping fragments along with a simplified integrative energy function for TFBS prediction. Our results show that the new pentamer algorithm and energy function improve TFBS prediction accuracy while dramatically reducing the time complexity, especially for prediction of longer binding sites by TF dimers which have significant improvements in both accuracy and speed. To our knowledge, this is the first fragment-based method for structure-based TF binding sites prediction.
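
The efficiency argument above can be made concrete with a small sketch. The abstract does not say how fragment energies are scored or recombined, so the code below only illustrates splitting a site into overlapping pentamers and comparing the number of sequences to evaluate; the function names are hypothetical.

```python
def overlapping_fragments(site, k=5):
    """Return all overlapping k-mers of a DNA site, in order."""
    if len(site) < k:
        raise ValueError("site is shorter than the fragment length")
    return [site[i:i + k] for i in range(len(site) - k + 1)]


def exhaustive_count(length):
    """Number of distinct DNA sequences of a given length (4 bases)."""
    return 4 ** length


def fragment_count(length, k=5):
    """Upper bound on k-mer evaluations if each window is scored separately."""
    return (length - k + 1) * 4 ** k


fragments = overlapping_fragments("ACGTACG")
full_12 = exhaustive_count(12)      # all 12-mers
windowed_12 = fragment_count(12)    # 8 windows x 4**5 pentamers
```

Under this simplified counting, the cost grows linearly with site length instead of exponentially, which matches the abstract's observation that the gain is largest for longer binding sites.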

Session A-093: Property map collective variable as a useful tool for a force field correction
COSI: 3Dsig
  • Dalibor Trapl, University of Chemistry and Technology, Prague, Department of Biochemistry and Microbiology, Czech Republic
  • Vojtech Spiwok, University of Chemistry and Technology, Prague, Department of Biochemistry and Microbiology, Czech Republic

Short Abstract: The accuracy of molecular simulations depends on an empirical molecular mechanics potential known as a force field. While force fields designed for proteins or nucleic acids are considered accurate, force fields for drug-like molecules still need many improvements. Here, we present a novel method for the determination of a force field correction tailored to a general drug-like compound. Using a property map collective variable, it is possible to approximate a certain conformationally dependent property by a weighted average of this property for a series of representative landmark structures. We used this approach to approximate the difference between the potential energies calculated by different force fields. To validate this method we used seven AMBER force fields and performed a set of 20-ns-long metadynamics simulations of Ace-Ala-Nme in water. The obtained free energy surfaces of the corrected force fields (e.g. AMBER94 corrected to AMBER03) and the intended ones (e.g. AMBER03 without correction) were in good agreement. Thus our method appears suitable for adjusting a force field for a general drug-like molecule.
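
The weighted average over landmark structures described above might look like the following sketch. The exponential weight exp(-lam * d) and the parameter name `lam` are assumptions for illustration; the abstract only states that a weighted average is used and does not specify the weights or the distance measure.

```python
import math


def property_map(distances, properties, lam):
    """Approximate a property as a weighted average over landmark structures.

    distances[i]  : distance of the current conformation to landmark i
    properties[i] : known value of the property for landmark i
    lam           : assumed decay parameter; weights are exp(-lam * d)
    """
    if len(distances) != len(properties) or not distances:
        raise ValueError("need one property value per landmark")
    # Subtract the smallest distance before exponentiating; the shift
    # cancels in the ratio and avoids underflow for large lam * d.
    d_min = min(distances)
    weights = [math.exp(-lam * (d - d_min)) for d in distances]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, properties)) / total


# Example: energy difference between two force fields at three landmarks.
estimate = property_map([0.1, 0.8, 1.5], [2.0, -1.0, 0.5], lam=10.0)
```

With these weights, a conformation close to one landmark inherits essentially that landmark's value, while a conformation between landmarks receives an interpolated value.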

Session A-094: Next-Step Conditioned Deep Convolutional Neural Networks Improve Protein Secondary Structure Prediction
COSI: 3Dsig
  • Akosua Busia, Google Brain, United States
  • Navdeep Jaitly, Google Inc, United States

Short Abstract: Recently developed deep learning techniques have significantly improved the accuracy of various speech and image recognition systems. In this paper we show how to adapt some of these techniques to create a novel chained convolutional architecture with next-step conditioning for improving performance on protein sequence prediction problems. We explore its value by demonstrating its ability to improve performance on eight-class secondary structure prediction. We first establish a state-of-the-art baseline by adapting recent advances in convolutional neural networks which were developed for vision tasks. This model achieves 70.0% per amino acid accuracy on the CB513 benchmark dataset without use of standard performance-boosting techniques such as ensembling or multitask learning. We then improve upon this state-of-the-art result using a novel chained prediction approach which frames the secondary structure prediction as a next-step prediction problem. This sequential model achieves 70.3% Q8 accuracy on CB513 with a single model; an ensemble of these models produces 71.4% Q8 accuracy on the same test set, improving upon the previous overall state of the art for the eight-class secondary structure problem.
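
For readers unfamiliar with the metric, per-residue Q8 accuracy is the fraction of residues whose predicted eight-class label matches the reference. The sketch below assumes one-character labels per residue; the specific label alphabet and the function name are illustrative only.

```python
def per_residue_accuracy(predicted, observed):
    """Fraction of positions where the predicted label equals the reference."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical eight-class strings for a 5-residue fragment.
accuracy = per_residue_accuracy("HHEE-", "HHTT-")
```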

Session A-095: Use of cross-docking simulations for identification of protein-protein interactions sites: the case of proteins with multiple binding sites
COSI: 3Dsig
  • Nathalie Lagarde, Laboratoire de Biochimie Théorique, France
  • Lydie Vamparys, Laboratoire de Biochimie Théorique, France
  • Benoist Laurent, Laboratoire de Biochimie Théorique, France
  • Alessandra Carbone, Laboratoire de Biologie Computationnelle et Quantitative,
  • Sophie Sacquin-Mora, Laboratoire de Biochimie Théorique,

Short Abstract: Protein–protein interactions (PPI) play a central role in biological systems. Some in silico methods are available to investigate PPI, in particular, protein-protein docking simulations. To evaluate the ability of a cross-docking method to detect multiple binding sites on protein surfaces, we used the MAXDo algorithm with a rigid-body docking approach and a coarse-grain protein model. We compared the use of two different scoring schemes accounting for multiple binding sites, for evaluating the binding sites prediction. Alternative interfaces different from the reference experimental interfaces were predicted and turned out to be interfaces with other partners.

Session A-096: Study of relationships between base positions and DNA backbone conformations
COSI: 3Dsig
  • M. Isabel Agea, Laboratory of Informatics and Chemistry Faculty of Chemical Technology, University of Chemistry and Technology (Prague), Czech Republic
  • Filip Lankaš, Laboratory of Informatics and Chemistry Faculty of Chemical Technology, University of Chemistry and Technology (Prague), Czech Republic
  • Daniel Svozil, Laboratory of Informatics and Chemistry Faculty of Chemical Technology, University of Chemistry and Technology (Prague), Czech Republic

Short Abstract: Base-pair and base-step morphology, expressed by base pair and base step parameters, has been widely analyzed to describe sequence-dependent deformability in DNA. However, the inherent flexibility of the sugar-phosphate backbone influences the arrangement of bases so that the local structure of DNA helix depends on the interplay between the backbone conformation and optimal position of bases. Thus, the study of relationships between base positions and DNA backbone conformations is important for better understanding of the deformability of double helical DNA that is critical for its packaging in the cell and recognition by other molecules. In the presented work, we study the relationships between backbone torsional space and base step parameters. Backbone conformations are grouped into 44 previously published classes and their statistical analysis shows that slide has the highest discriminative power between individual classes.

Session A-097: A Novel Method for Analysis of Ligand Binding and Unbinding Based on Molecular Docking
COSI: 3Dsig
  • Jiří Filipovič, Masaryk University, Czech Republic
  • Ondřej Vávra, Masaryk University, Czech Republic
  • Jan Plhák, Masaryk University, Czech Republic
  • David Bednář, Masaryk University, Czech Republic
  • Sergio M. Marques, Masaryk University, Czech Republic
  • Jan Brezovský, Masaryk University, Czech Republic
  • Luděk Matyska, Masaryk University, Czech Republic
  • Jiří Damborský, Masaryk University, Czech Republic

Short Abstract: Understanding protein-ligand interactions is crucial for drug design, enzymology research and enzyme engineering. The interaction of a protein and a ligand molecule often takes place in the protein's binding or active site. Such functional sites may be hidden inside the protein core, and therefore the transport of a ligand from the outside environment to the protein interior needs to be understood. Here we present CaverDock, which implements a novel method for analysis of these important transport processes. It iteratively places the ligand along the tunnel in such a way that the ligand movement is contiguous and its energy is minimized. The output of the calculation is the ligand trajectory and the energy profile of the transport process. CaverDock uses a modified version of the program AutoDock Vina for ligand placement and implements a parallel heuristic algorithm to search the space of possible trajectories. Our method lies between geometrical approaches and molecular dynamics simulations. Contrary to geometrical methods, it provides an evaluation of chemical forces. However, it is not as computationally demanding as methods based on molecular dynamics. The typical input of CaverDock requires a setup for molecular docking and a tunnel geometry obtained from Caver. Typical computational time is tens of minutes on a single node, allowing virtual screening of a large pool of molecules. We demonstrate CaverDock's usability by comparing ligand trajectories in different tunnels of wild-type and engineered proteins, and by computing energetic profiles for a large set of substrates and inhibitors.

Session A-098: Molecular Dynamics Analysis of HIV-1 gp120 Outer Domain
COSI: 3Dsig
  • Katie Farney, Vaccine Research Center, National Institute of Allergy and Infectious Disease, NIH, USA
  • Mallika Sastry, Vaccine Research Center, National Institute of Allergy and Infectious Disease, NIH, USA
  • Jeffrey Boyington, Vaccine Research Center, National Institute of Allergy and Infectious Disease, NIH, USA
  • Ling Xu, Vaccine Research Center, National Institute of Allergy and Infectious Disease, NIH, USA
  • Gary Nabel, Vaccine Research Center, National Institute of Allergy and Infectious Disease, NIH, USA
  • Carole Bewley, National Institute of Diabetes and Digestive and Kidney Diseases, NIH, USA
  • Peter Kwong, Vaccine Research Center, National Institute of Allergy and Infectious Disease, NIH, USA
  • Gwo-Yu Chuang, Vaccine Research Center, National Institute of Allergy and Infectious Disease, NIH, USA

Short Abstract: The heavily glycosylated HIV-1 envelope protein (Env) has been at the forefront of vaccine efforts and the gp120 outer domain (OD) has been proposed as a potential immunogen. However, immunization with a number of different glycosylated OD variants has thus far failed to elicit broadly neutralizing antibodies. Understanding the conformational dynamics of the OD molecule may facilitate the optimization of OD as an immunogen. Here, we performed 250 nanoseconds of molecular dynamics simulation of both glycosylated and glycan-free OD to investigate the conformational flexibility of these molecules. A homology model of the R2OD4 clade B sequence was built using the crystal structure of OD4.2.2 as the main template. Two final protein models were made: one free of glycans and one with N-acetylglucosamine monosaccharides attached at seven glycosylation sequons. As expected, the V3, V4, and V5 loops within OD maintained extreme flexibility, but glycosylated OD was much more flexible than glycan-free OD. Root mean square fluctuation (RMSF) and hydrogen bonding analyses, however, indicated a stable mini-core with key residues maintaining their electrostatic interactions over time. This observation was in agreement with hydrogen deuterium exchange experiments by NMR, which revealed that specific residues in the α-2 helix are protected from bulk solvent and able to form a stable hydrophobic core. Proving highly accurate, computational dynamic analyses provide a fast, easy way to elucidate the conformational flexibility of immunogen targets and to better understand viral proteins.
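
The RMSF analysis mentioned above measures, for each atom, the root mean square deviation of its position from its time-averaged position. A minimal sketch follows, assuming the trajectory frames have already been superposed onto a common reference; the data layout and function name are illustrative and not taken from the poster.

```python
import math


def rmsf(frames):
    """Per-atom root mean square fluctuation.

    `frames` is a list of frames; each frame is a list of (x, y, z) tuples,
    one per atom, in the same atom order. Frames are assumed to be already
    aligned to a common reference structure.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for a in range(n_atoms):
        # Time-averaged position of atom a.
        mean = [sum(f[a][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average over all frames.
        msd = sum(
            sum((f[a][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Two atoms, two frames: the first atom moves, the second is static.
values = rmsf([[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
               [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)]])
```

High values mark flexible regions such as loops, while consistently low values are what would point to a stable core.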

Session A-099: Accurate and reliable prediction of relative ligand binding potency in drug discovery - Applications in scaffold hopping transformations
COSI: 3Dsig
  • Jianxin Duan, Schrodinger,

Session A-100: Mutations and Variations in Health and Disease: Protein Interaction Networks and 3D Structure
COSI: 3Dsig
  • Franca Fraternali, Randall Division of Cell and Molecular Biophysics, King's College London, United Kingdom

Short Abstract: Two parallel efforts are reshaping the microscopic view of the functioning of the human cell and its correlation with healthy and diseased cellular states: a) large-scale proteomics studies with associated functional annotation of the extracted proteins and b) the detection of human variants in healthy individuals and in disease-related cellular states. We aim at merging this information by using in-house developed algorithms for the topological analysis of networks and by extraction of short-loop motifs that highlight associations between proteins and therefore allow for the extraction of disease-related communities of proteins. We extract topologically and functionally connected communities of proteins that are affected by disease mutations and analyse the impact on the underlying 3D-structures to prioritise novel candidates for targeted drug screening.

Session A-101: Next generation structure-based antibody drug design with ABodyBuilder and PEARS.
COSI: 3Dsig
  • Jinwoo Leem, United Kingdom

Short Abstract: Antibodies are an important class of biopharmaceuticals. Currently, experimental antibody design is resource-intensive, and computational antibody design methods offer promise to accelerate this process. In particular, accurate structural modelling is essential, as it provides the basis for antibody-antigen docking, binding affinity prediction, and humanisation. One of the key stages in the modelling pipeline is side chain prediction. Here we describe our antibody-specific predictor, PEARS, as an extension of our ABodyBuilder methodology to model large volumes of antibody sequence data, typical of next-generation sequencing (NGS) datasets.

Session A-102: Structural and functional analysis of alternative microexons of proteins observed in RNA-seq studies.
COSI: 3Dsig
  • Matsuyuki Shirota, Tohoku University, Japan

Session A-103: Density-based clustering in structural bioinformatics: application to beta turns and antibody CDRs
COSI: 3Dsig
  • Roland Dunbrack

Session A-104: Using normal modes analysis to characterize the flexibility of protein tunnels and channels
COSI: 3Dsig
  • Pierre Bedoucha, University of Bergen, Norway

Short Abstract: Protein 3D structures are tightly related to protein functions. There are nevertheless missing links between protein structure and function, and studies have shown that protein dynamics is one of them.
The transport of compounds of various sizes through cell membranes is partly ensured by transmembrane proteins, such as channels or carriers. Their 3D structure is often characterized by an opening providing a route through which ions or small molecules travel. During the transport process these tunnels may change geometry, adjusting their access and permeability to regulate the protein’s function. These structural changes happen on time scales that are often too long to be captured by molecular dynamics simulations.
We propose to use Normal Mode Analysis (NMA) to investigate the effect of intrinsic protein dynamics on the properties of tunnels in protein 3D structures. NMA has indeed been repeatedly shown to be an efficient and reliable method to describe slow and large-amplitude movements in proteins (Fuglebakk et al., BBA, 2015). We combine NMA with the use of CAVER (Chovancova et al., PCBI, 2012), a tool for the analysis and visualization of protein tunnels and cavities.
We have developed a computational framework linking CAVER and NMA results. We have validated it on a few proteins for which we have compared the use of coarse-grained (CG) and all-atom representations to model channel flexibility.
We confirmed that NMA is a fitting means to investigate tunnel and channel plasticity.

Session A-105: Engineering Improvement of a Potent Human-Derived Monoclonal Antibody Against Respiratory Syncytial Virus Using Structure-Based Computer Modeling
COSI: 3Dsig
  • Sean Le

Short Abstract: Respiratory syncytial virus (RSV) infections are a common cause of hospitalizations in infants, resulting in 57,000-120,000 hospitalizations in the U.S. annually. Synagis®, the current therapy for RSV infection prevention, has limited effectiveness and must be given five to six times during the RSV season. A recently discovered monoclonal antibody, AR-201, has been shown to be more potent than Synagis®. Our goal is to use structure-based computer modeling to engineer AR-201 to prolong its time (half-life) in the blood, thus reducing its dosing frequency, optimally to once during the RSV season. Our approach to improving AR-201 half-life is to modify binding of AR-201 to the neonatal Fc receptor (FcRn), which salvages antibodies from degradation and recycles them to the blood stream. We selected seven published IgG mutants which showed increased half-life in other antibodies. We introduced these mutations into AR-201 and predicted their 3D structures. The quality of the 3D structures was assessed and optimized in order to create more realistic structures. The optimized 3D structures were then docked onto FcRn and their computer models were generated. We calculated the predicted binding affinities between the AR-201 mutants and FcRn. Our results showed mutant C6A-78A to have the highest calculated binding affinity to FcRn, indicating that it has the highest potential for prolonged half-life. This work will guide the further development of AR-201 into a highly potent drug with increased half-life. Although still in its infancy, molecular computer modeling can save substantial time and effort during the drug development process.

Session A-106: Mapping and 2D visualization of secondary structure elements in cytochromes
COSI: 3Dsig
  • Ivana Hutařová Vařeková, National Centre for Biomolecular Research, Masaryk University, Czech Republic
  • Jan Hutař, National Centre for Biomolecular Research, Masaryk University, Brno, Czech Republic, Czech Republic
  • Radka Svobodová Vařeková, Central European Institute of Technology, Masaryk University, Czech Republic
  • Karel Berka, Department of Physical Chemistry, Palacky University, Czech Republic

Short Abstract: Secondary structure elements (SSEs) such as helices and sheets are important parts of protein structure. Their composition and organization are often characteristic of proteins from a certain protein family, and they participate in the formation of the protein fold. Thanks to advanced structure determination techniques, we have a lot of structural data about individual protein families, with variations originating from different organisms, binding various ligands and containing diverse mutations. Mapping and 2D visualization of their SSEs could provide very useful insight, e.g. visualization of their differences, identification of conserved or other key regions, etc. Unfortunately, current approaches focused on SSE 2D visualization (e.g., PROMOTIF, Pro-Origami, Hera) do not take into account information about the real distances between SSEs or the common protein family fold. Therefore, even when two proteins from the same family differ only slightly, their SSE 2D diagrams can be totally different. In our approach, the SSEs are analysed within a protein set, and the conserved (“skeleton”) SSEs forming the conserved fold are detected. Afterwards, the SSEs are visualized in 2D in such a way that the structural information is kept. The applicability of this approach is shown in a case study focused on cytochromes P450. This protein family, which is important for drug design, currently has more than 580 structures available from about 30 organisms, and each cytochrome contains more than 20 SSEs. Our approach can be extended to most other protein structural families, which will allow family-wide annotations and comparisons in a simple visual manner.

Session A-471: LiteMol suite: A comprehensive platform for fast delivery and visualization of macromolecular structure data
COSI: 3Dsig
  • Radka Svobodová Vařeková, National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University, Czech Republic
  • David Sehnal, National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University, Czech Republic
  • Mandar Deshpande, Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), United Kingdom
  • Saqib Mir, Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), United Kingdom
  • Karel Berka, Palacky University Olomouc, Czech Republic
  • Adam Midlik, National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University, Czech Republic
  • Lukáš Pravda, National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University, Czech Republic
  • Sameer Velankar, Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), United Kingdom
  • Jaroslav Koča, National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University, Czech Republic

Short Abstract: Recent advances in 3D structure determination techniques such as Cryo-EM have facilitated the study of large macromolecular machines, leading to a rapid increase in the number, size, and complexity of biomacromolecular structures available in the Protein Data Bank (PDB). As a result, the online archives face a major challenge in enabling access to this diverse and rich data in informative and intuitive ways to more than 250M users, who view the data in PDB each year.
To address this challenge, we have developed the LiteMol suite, a comprehensive open-source solution for the fast delivery and interactive 3D visualization of large-scale structures, experimental data, and biological context annotations from resources such as Pfam or UniProt. The solution includes a next-generation web browser-based 3D molecular viewer (LiteMol Viewer), supported by CoordinateServer and DensityServer services for near-instant delivery of model and experimental data using the newly developed BinaryCIF format. The format is compatible with the existing standards used by PDB and the wider community while substantially reducing the file size. Our innovative approach works in all modern web browsers and mobile devices, and is up to orders of magnitude faster than its competitors. LiteMol suite is integrated into the Protein Data Bank in Europe (PDBe) with thousands of daily users. In parallel, the LiteMol suite also became a part of SIB and CNRS services, and its integration into other key life science web applications is planned.
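
As a rough illustration of how a binary, column-oriented format can reduce file size, the sketch below chains delta encoding with run-length encoding on a column of residue numbers. It is a simplified illustration under stated assumptions, not the BinaryCIF codec itself: the function names and the sample column are hypothetical, and a real encoder would also pack the resulting integers into bytes.

```python
from typing import List


def delta_encode(values: List[int]) -> List[int]:
    """Keep the first value, then store differences between neighbours."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Invert delta_encode with a running sum."""
    out: List[int] = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def run_length_encode(values: List[int]) -> List[int]:
    """Collapse runs of equal values into flat (value, count) pairs."""
    encoded: List[int] = []
    for v in values:
        if encoded and encoded[-2] == v:
            encoded[-1] += 1          # extend the current run
        else:
            encoded.extend([v, 1])    # start a new run
    return encoded


def run_length_decode(pairs: List[int]) -> List[int]:
    """Invert run_length_encode."""
    out: List[int] = []
    for value, count in zip(pairs[0::2], pairs[1::2]):
        out.extend([value] * count)
    return out


# Hypothetical column: sequential residue numbers 1..1000.
residue_numbers = list(range(1, 1001))
packed = run_length_encode(delta_encode(residue_numbers))
restored = delta_decode(run_length_decode(packed))
```

For this sequential column the thousand integers collapse to a single (value, count) pair, and decoding restores the original values exactly. Less regular columns, such as coordinates, would compress far less with this particular chain.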

Session A-472: Drug search for leishmaniasis: a structure-based drug discovery approach for detecting anti-Leishmania hits
COSI: 3Dsig
  • Rodrigo Ochoa, University of Antioquia, Colombia
  • Stanley Watowich, University of Texas Medical Branch, United States
  • Andrés Flórez, German Cancer Research Center, Germany
  • Sara Robledo, University of Antioquia, Colombia
  • Carlos Muskus, University of Antioquia, Colombia

Short Abstract: Leishmaniasis is a neglected tropical disease treated mainly with chemotherapy. However, treatment is challenging due to increasing drug resistance, cost and the lack of available treatments. Thanks to the availability of crystal structures of proteins solved for different species of the parasite, we implemented a structure-based drug discovery approach combining the advantages of molecular dynamics (MD) simulations and molecular docking. The structures of 53 Leishmania spp. proteins were selected and downloaded from the Protein Data Bank (PDB). Initially, the proteins were submitted to MD simulations of five nanoseconds using GROMACS v5.0, with the aim of capturing protein flexibility. From each MD trajectory, a set of 10 snapshots was docked using the AutoDock Vina software. The screening was performed against a library of approximately 600,000 drug-like compounds from the ZINC database, following a Relaxed Complex Scheme (RCS) methodology. Most of the docking hits were detected against three proteins: UDP-glucose pyrophosphorylase, dihydroorotate dehydrogenase (DHODH) and tyrosyl-tRNA synthetase. In vitro testing of eight of the best compounds against DHODH showed that three had acceptable activity against Leishmania parasites, and one showed strong activity in killing the parasites without affecting the human-derived cell lines. A website (http://ubmc-pecet.udea.edu.co/index.php/dsfl) with all the data about the proteins used, their hits and the docking scores is available to the public, in order to accelerate the drug discovery pipeline for neglected tropical diseases such as leishmaniasis.
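
One way to combine docking results over several MD snapshots can be sketched as follows. The abstract does not state how scores from the 10 snapshots were aggregated, so keeping the best (most negative) score per compound is an assumption here. The compound identifiers and scores are invented for illustration, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# (compound_id, snapshot_index, docking_score in kcal/mol); more negative is better.
DockingRecord = Tuple[str, int, float]


def best_score_per_compound(records: Iterable[DockingRecord]) -> Dict[str, float]:
    """Keep the most favourable score of each compound over all snapshots."""
    best: Dict[str, float] = {}
    for compound, _snapshot, score in records:
        if compound not in best or score < best[compound]:
            best[compound] = score
    return best


def rank_hits(records: Iterable[DockingRecord], top_n: int) -> List[Tuple[str, float]]:
    """Return the top_n compounds ordered from best to worst score."""
    best = best_score_per_compound(records)
    return sorted(best.items(), key=lambda item: item[1])[:top_n]


# Invented example data: three compounds docked against three snapshots.
records: List[DockingRecord] = [
    ("cmpd_A", 0, -7.2), ("cmpd_A", 1, -8.1), ("cmpd_A", 2, -6.9),
    ("cmpd_B", 0, -9.4), ("cmpd_B", 1, -8.8), ("cmpd_B", 2, -9.0),
    ("cmpd_C", 0, -5.1), ("cmpd_C", 1, -5.6), ("cmpd_C", 2, -5.3),
]
top_hits = rank_hits(records, top_n=2)
```

Other aggregation choices, such as averaging over snapshots, would favour compounds that bind consistently across conformations rather than those that fit one conformation very well.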

Session A-479: Dewetting the intracellular water pocket of the human connexin 26 hemichannel via a water-selective repulsive potential
COSI: 3Dsig
  • Villanelo, Fundación Ciencia & Vida, Universidad de Valparaíso, Chile
  • Garate JA, Fundación Ciencia & Vida, Universidad de Valparaíso, Chile
  • Perez-Acle T, Fundación Ciencia & Vida, Universidad de Valparaíso, Chile

Short Abstract: Connexins (Cxs) are eukaryotic transmembrane proteins that form hydrophilic channels composed of six Cx monomers, called hemichannels (HCs). The extracellular docking of two opposed HCs, connecting the cytoplasm of two adjacent cells, forms a gap junction channel (GJC). HCs and GJCs are expressed in most human tissues and are involved in several functions such as cellular differentiation, electrical synapses and the immune response. These channels allow the passage of solutes up to 1 kDa. Of note, HCs and GJCs exhibit voltage-dependent gating in response to transmembrane and transjunctional potential differences, respectively. Two protein regions flanking the pore are related to gating: the N-terminal helix (NTH), located on the intracellular side, and the parahelix (PH), facing the extracellular side, are related to fast and slow gating, respectively [1]. The most studied member, and the only source of structural information, is human Cx26 (hCx26) [2]. We recently described an intracellular water pocket, the IC pocket, that is present in each hCx26 monomer [3]. This pocket is located behind the NTH and is filled with structural water molecules entering from the pore lumen. Mutations disturbing the IC pocket in both hCx26 and mouse Cx32 hinder channel function, either abolishing voltage dependence or changing ionic conductance [3,4]. Accordingly, we hypothesized that the IC pocket water volume could modulate the NTH, thereby affecting channel gating. In this work, we assess this hypothesis by implementing a water-selective repulsive potential (WSRP) to dewet the hCx26 IC pocket.

Session A-481: Protein S-palmitoylation site prediction using position-specific scores
COSI: 3Dsig
  • Yuri Mukai, Meiji University, Japan
  • Tatsuki Kikegawa, Meiji University, Japan
  • Tsubasa Ogawa, Meiji University, Japan
  • Kenji Etchuya, Meiji University, Japan

Short Abstract: Protein palmitoylation sites were predicted with high accuracy using PSSM. In the case of transmembrane proteins, the accuracy was improved by the membrane topology consideration.
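
The general idea of position-specific scoring can be sketched as below. This is a generic log-odds scoring matrix built from sequence windows centred on a cysteine, not the authors' actual model. The training windows are invented, the uniform background frequency and the pseudocount are simplifying assumptions, and the function names are hypothetical.

```python
import math
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(windows: List[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Build a log-odds matrix (one column per window position)."""
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm: List[Dict[str, float]] = []
    for pos in range(length):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for w in windows:
            if w[pos] in counts:
                counts[w[pos]] += 1.0
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return pssm


def score_window(pssm: List[Dict[str, float]], window: str) -> float:
    """Sum the position-specific scores; unknown residues contribute 0."""
    return sum(col.get(aa, 0.0) for col, aa in zip(pssm, window))


# Invented five-residue windows centred on a modified cysteine.
known_sites = ["GLCKR", "ALCKK", "GVCRR", "GLCSK"]
pssm = build_pssm(known_sites)
site_like = score_window(pssm, "GLCKK")
unlike = score_window(pssm, "DECDE")
```

A candidate cysteine would then be called a predicted site when its window score exceeds a threshold chosen on held-out data. For transmembrane proteins, membrane topology could be used as an additional filter, as the abstract suggests.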

Session A-482: How is structural divergence related to evolutionary information?
COSI: 3Dsig

Session A-503: Automated evaluation of quaternary structures from protein crystal structures
COSI: 3Dsig
  • Jose M. Duarte, RCSB Protein Data Bank, United States
  • Spencer Bliven, Paul Scherrer Institute, Switzerland
  • Aleix Lafita, Paul Scherrer Institute, Switzerland
  • Guido Capitani, Paul Scherrer Institute, Switzerland
  • Stephen Burley, RCSB Protein Data Bank,

Short Abstract: Crystallography is the most powerful technique for generating atomic-level structures of proteins and other biological macromolecules. However, it does not always yield definitive insights into the quaternary structures of biological macromolecules. In order to provide better tools for determining the most likely quaternary structure in proteins, we have developed the new EPPIC 3 method. It uses evolutionary considerations as the ultimate arbiters of the biological relevance of interfaces and assemblies, thereby offering an approach complementary to other available methods that rely on thermodynamic considerations.
EPPIC 3 extends our previous Evolutionary Protein-Protein Interface Classifier (EPPIC) by going beyond classifying pairwise interfaces. It identifies all possible topologically valid assemblies present in a protein crystal and provides predictions as to likely quaternary structures.
Assembly enumeration is achieved by representing the crystal lattice as a periodic graph. Finding valid assemblies is then reduced to the problem of finding subgraphs complying with a set of rules, which guarantees closed assemblies (point group symmetries). Finally, the assemblies are scored based on the individual scores of the constitutive interfaces, providing in the end a single probability of an assembly being the biological one, together with a confidence estimate.
The software is accessible through an easy-to-use web graphical interface at http://www.eppic-web.org.
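
The notion of a closed assembly in a periodic graph can be illustrated with a small sketch. It checks only one necessary condition, namely that a set of engaged interfaces yields a finite complex rather than an infinite chain or layer. It does not reproduce the full EPPIC 3 rule set or its scoring. The edge representation (molecule indices within the unit cell plus an integer lattice translation) and the function name are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Tuple

Vec = Tuple[int, int, int]
# (u, v, t): molecule u in cell c contacts molecule v in cell c + t.
Edge = Tuple[int, int, Vec]


def is_closed_assembly(start: int, edges: List[Edge]) -> bool:
    """Return True if the component containing `start` is finite.

    Each molecule is assigned a unit-cell offset while walking the graph.
    Reaching the same molecule with a different offset means a cycle with a
    net lattice translation, i.e. the assembly repeats infinitely.
    """
    adjacency: Dict[int, List[Tuple[int, Vec]]] = {}
    for u, v, t in edges:
        adjacency.setdefault(u, []).append((v, t))
        # Traversing an interface backwards negates its translation.
        adjacency.setdefault(v, []).append((u, (-t[0], -t[1], -t[2])))

    offset: Dict[int, Vec] = {start: (0, 0, 0)}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        ou = offset[u]
        for v, t in adjacency.get(u, []):
            ov = (ou[0] + t[0], ou[1] + t[1], ou[2] + t[2])
            if v not in offset:
                offset[v] = ov
                queue.append(v)
            elif offset[v] != ov:
                return False
    return True


# A dimer formed across a cell boundary is finite.
dimer = [(0, 1, (1, 0, 0))]
# A molecule contacting its own copy in the next cell forms an infinite chain.
chain = [(0, 0, (1, 0, 0))]
# Two interfaces whose cycle has a net translation also give an infinite chain.
wrapped = [(0, 1, (0, 0, 0)), (1, 0, (1, 0, 0))]
```

In this toy setting, candidate assemblies would be enumerated by switching interface types on or off and discarding the combinations for which the check fails.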

