RNA and Protein Structural Biology
Ontologies and NLP
Pathways, Networks and Proteomics
Sequence Analysis, Phylogeny and Evolution
Genomics and Gene Expression
Gene Regulation, microRNAs
Databases
RNA and Protein Structural Biology
Poster A-2
(There will also be an oral presentation of this poster.)
A Conserved Sparse Dicodon Framework Which Correlates Sequence and Structure: Implications for Gene Finding
David Halitsky (Cumulative Inquiry, Inc); Arthur Lesk (Dept of Haematology, CIMR); Jacques Fresco (Princeton University)
Abstract: Analysis of dicodon pairs in mRNA sequences can identify structurally similar features in the encoded proteins via a sparse signal characterized by the number and order of certain dicodons occurring within codon subsequences of specific lengths. The signal reliably detects structurally similar features with virtually no underlying sequence similarity.
Poster A-3
De Novo Assembly of Transmembrane Helices of Polytopic Membrane Proteins Using Sequence Conservation Patterns
Yungki Park (Center for Bioinformatics, Saarland University); Volkhard Helms (Center for Bioinformatics, Saarland University)
Abstract: A novel two-step method for modeling structures of transmembrane helix bundle proteins was developed: generation of libraries of folds, followed by selection of the best fold based on sequence conservation patterns. For a broad spectrum of test proteins, it consistently generated model structures with Cα RMSDs of 3 to 5 Å.
Poster A-4
Protein-Protein Docking Methods Used to Study Complex Protein Interactions
Dana Haley-Vicente (Accelrys); Tim Glennon (Accelrys)
Abstract: Understanding protein-protein interactions is important for insight into signal transduction pathways. Here we have applied protein-protein docking together with evolutionary trace, fold, hydrophobic, and electrostatic analyses to determine and understand the interaction between a regulator of G-protein signaling protein and the alpha subunit of G-proteins.
Poster A-5
Comprehensive LAboratory information Management system (CLAM): the Structural Module
Tjaart de Beer (University of Pretoria); Fourie Joubert (University of Pretoria)
Abstract: The aim of this project is to construct an Open Source, web based functional genomic information system called CLAM (Comprehensive LAboratory information Management). CLAM will contain modules for genotyping, proteomics, genetics, phylogenetics, microarray, comparative genomics and structural biology data analysis. This poster will focus on the structural module in CLAM.
Poster A-6
Metal binding sites: pre-organized scaffolds in the unbound state
Mariana Babor (Weizmann Institute of Science); Harry Greenblatt (Weizmann Institute of Science); Marvin Edelman (Weizmann Institute of Science); Vladimir Sobolev (Weizmann Institute of Science)
Abstract: Protein metal binding sites in the unbound state, and their rearrangements upon metal binding were analyzed. More than 40% of the metal binding sites show a capacity for flexibility, but in the vast majority of cases, part of the first coordination shell is already in place in the pre-bound form.
Poster A-7
Functional Prediction of Protein Mutants Using a Four-Body Potential
Majid Masso (George Mason University); Iosif Vaisman (George Mason University)
Abstract: Studies exploring single point mutants of HIV-1 protease and T4 lysozyme suggest that prediction of mutant enzyme catalytic activity is realizable by employing supervised learning in conjunction with mutant attribute vectors, based on a four-body statistical potential, that characterize constituent amino acid environmental changes from wild-type.
Poster A-8
(There will also be an oral presentation of this poster.)
Enzyme Mechanism Annotation and Classification
Daniel Almonacid (Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge); Gemma Holliday (Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge); Peter Murray-Rust (Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge); Janet Thornton (EMBL-EBI, Wellcome Genome Campus); John Mitchell (Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge)
Abstract: MACiE is a unique database containing enzyme-catalysed reaction mechanisms. Reaction steps as well as overall reactions are included. Data mining of this database has already provided better insight into nature's catalytic diversity. Our ongoing work addresses the evolution and classification of enzymes.
Poster A-9
Automated creation of in silico analogue ligand libraries from a lead molecule template
Wolf Cochrane (University of Pretoria); Fourie Joubert (University of Pretoria)
Abstract: LIGLIB (LIGand LIbrary Builder) is an open source tool that allows users to create an in silico library of molecules that are analogues of a lead chemical compound given as input to the software. The software is available as a plug-in to Chimera.
Poster A-11
(There will also be an oral presentation of this poster.)
A novel approach to structural alignment using realistic structural and environmental information
Yu Chen (Bioinformatics program, University of Michigan); Gordon Crippen (College of Pharmacy, University of Michigan)
Abstract: We present a new structural alignment approach using realistic structural as well as environmental information. Statistics are defined to measure the goodness of alignments in structure cores. With this method, we can distinguish structures in different oligomeric states, and can flexibly align multiple domain proteins without domain splitting.
Poster A-12
Identifying Functional Signatures from Structural Alignments
Kai Wang (University of Washington); Ram Samudrala (University of Washington)
Abstract: We developed a method called Functional Signature from Structural Alignments (FSSA), to estimate the log odds of a residue being functionally important versus structurally important. The FSSA signatures can be used to interpret the functional importance of each residue, or classify proteins into functional categories.
Poster A-13
Doing a double take: function based target selection for structural genomics
Iddo Friedberg (The Burnham Institute); Phillip Lord (University of Manchester); Andrei Osterman (The Burnham Institute); Adam Godzik (The Burnham Institute)
Abstract: Structural genomics target selection schemes usually favor proteins predicted to have new folds. Here we argue that more targets should be selected within a given fold, to provide accurate templates not only for fold space, but also for function space, which is more finely grained.
Poster A-14
A new set of docking potentials for efficient discrimination between native and non-native conformations of protein complexes
Dror Tobi (Department of Computational Biology School of Medicine, University of Pittsburgh); Ivet Bahar (Department of Computational Biology School of Medicine, University of Pittsburgh)
Abstract: We generated putative docked complexes for a set of 63 non-redundant complexes, which were used in a linear programming algorithm to generate coarse-grained Docking Potentials. The resulting set of potentials shows promising results for discriminating the native complex among decoys generated with the unbound form of the interacting proteins.
Poster A-16
Structural identification and prediction of amphipathic alpha-helices
Mamta Bajaj (School of Biological Sciences, University of Nebraska-Lincoln); Hideaki Moriyama (Department of Chemistry, University of Nebraska-Lincoln); Etsuko Moriyama (School of Biological Sciences and Plant Science Initiative, University of Nebraska-Lincoln)
Abstract: We developed a new method for identifying amphipathic alpha-helices based on PDB coordinate information, and identified 26 amphipathic alpha-helices that are not annotated as amphipathic alpha-helices in the PDB. Based on this dataset, we developed a new prediction method for amphipathic alpha-helices from primary structure information.
Poster A-17
Accurate Recognition of Protein-DNA Interaction Using Optimized Potential with Multi-body Consideration
Zhijie Liu (Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Georgia); Ying Xu (Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Georgia)
Abstract: A knowledge-based potential that considers distance-dependent multi-body interactions was developed to quantitatively evaluate the binding affinity between protein and DNA. The potential achieved significant agreement between predictions and the experimental data, and succeeded in identifying DNA binding motifs of transcription factors at the genome scale.
Poster A-18
Few strict rules determine permissible arrangements of strands in the Sandwich Proteins
Yih-Shien Chiang (Department of Health Informatics, SHRP, University of Medicine and Dentistry of New Jersey); Tatyana Gelfand (Department of Mathematics, Rutgers University); Thanasis Fokas (Department of Applied Mathematics and Theoretical Physics, University of Cambridge); Alexander Kister (Department of Health Informatics, SHRP, University of Medicine and Dentistry of New Jersey); Israel Gelfand (Department of Mathematics, Rutgers University)
Abstract: Analysis of the arrangements of strands in beta-sandwich proteins has led us to propose a set of rules that determine the main principles of the packing of strands in structures. These constraint rules allow one to determine all permissible motifs of the sandwich-like proteins. Keywords: Protein prediction, supersecondary structure.
Poster A-19
A Protein Structure Comparison system using 3D LRA
Chan-Yong Park (Electronics and Telecommunications Research Institute); Sung-Hee Park (Electronics and Telecommunications Research Institute); Dae-Hee Kim (Electronics and Telecommunications Research Institute); Seon-Hee Park (Electronics and Telecommunications Research Institute); Chi-Jung Hwang (Chung Nam University)
Abstract: Protein structure comparison using the LRA (Locally Relative Angle) is an algorithm built on an efficient protein structure representation. The algorithm consists of two parts. The indexing part stores the LRA of every protein, and the retrieval part adds the comparison process that compares them with the LRA of a query.
Poster A-20
Secondary Structure in the Target as a Confounding Factor in Synthetic Oligomer Microarray Design
Vladyslava Ratushna (Virginia Polytechnic Institute and State University); Jennifer Weller (George Mason University); Cynthia Gibas (Virginia Polytechnic Institute and State University)
Abstract: Prediction and thermodynamic analysis of secondary structure formation in a genome-wide set of transcripts from Brucella suis 1330 demonstrate that the properties of the target molecule have the potential to strongly influence the rate and extent of hybridization between transcript and tethered oligonucleotide probe in a microarray experiment.
Poster A-21
Protein Loop Modeling using Genetic Algorithms
Chiuan-Jung Chen (Department of Computer Science and Information Engineering, National Taiwan University); Chen-hsiung Chan (Department of Computer Science and Information Engineering, National Taiwan University); Cheng-Yan Kao (Department of Computer Science and Information Engineering, National Taiwan University)
Abstract: We have developed a robust protein loop modeling algorithm using a genetic algorithm for conformation search. Using RMSD as the fitness function, the prediction accuracy reaches 0.59 Å for 60 loops with a length of 8 residues. Our loop modeling algorithm can evaluate the strengths of various scoring functions.
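As an illustration of the RMSD fitness mentioned in this abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes the two conformations are already superimposed and given as equal-length lists of (x, y, z) coordinates. The names `rmsd` and `fitness` are hypothetical, and negating the RMSD so that a genetic algorithm can maximize it is an assumption made here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumption: the two conformations are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


def fitness(candidate, native):
    """Hypothetical GA fitness: a lower RMSD to the native loop gives a higher fitness."""
    return -rmsd(candidate, native)
```

A candidate identical to the native loop scores a fitness of zero, the best possible value under this convention.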
Poster A-22
A Fast Similarity Search System for Protein 3D Structure Databases Using Spatial Topological Relationships and Rtree Index
Sung-Hee Park (Database & Bioinformatics Laboratory, Chungbuk National University); Keun Ho Ryu (Database & Bioinformatics Laboratory, Chungbuk National University); David Gilbert (Bioinformatics Research Centre, Department of Computing Science, University of Glasgow)
Abstract: We develop a prototype system for fast similarity search for protein 3D structure databases based on spatial index and topological relationship patterns. Our approach can rapidly generate a small candidate set to be subsequently used in more accurate and slow alignment methods. The server will be available at http://dblab.chungbuk.ac.kr/~simsearch.jsp
Poster A-23
(There will also be an oral presentation of this poster.)
On the importance of being left-handed
Marian Novotny (Uppsala University); Gerard Kleywegt (Uppsala University)
Abstract: The handedness of helices has not received much attention in the past. Therefore, an extensive survey of left-handed helices was undertaken to analyse their frequency, length, amino acid composition and possible structural or functional role. The survey suggests that left-handed helices are rare, but structurally or functionally significant.
Poster A-24
The Energetics and Stability of Transmembrane Helix Packing: a Density of States Simulation
Zhong Chen (Dept. of Biochemistry and Molecular Biology, University of Georgia); Ying Xu (Dept. of Biochemistry and Molecular Biology, University of Georgia)
Abstract: Packing of transmembrane helices was successfully modeled by Wang-Landau simulations that calculate the density of states, from which the stabilities of different packing topologies were obtained. Contrary to common belief, helix-lipid interactions seem to be as important as helix-helix interactions for structure formation of some membrane proteins.
Poster A-25
In silico structure-based design of a potent, mutation resilient, small peptide inhibitor against Rifampicin-resistant tuberculosis
Deepak Bunger (Post Graduate Institute of Medical Education and Research (PGIMER), Chandigarh); Gita Subba Rao (All India Institute of Medical Sciences (AIIMS), New Delhi)
Abstract: Mycobacterium tuberculosis RNA polymerase (RNAP) is a key enzyme involved in the replication of the bacterium and is a potential target for therapeutic intervention following infection. We present here the design of a peptide inhibitor of RNAP. The designed peptide has the potential of being a novel and promising drug candidate against Rifampicin-resistant M. tuberculosis.
Poster A-26
An Improved Fully-Connected Hidden Markov Model for Rational Vaccine Design
Chenhong Zhang (Department of Computer Science, University of Saskatchewan); Anthony Kusalik (Department of Computer Science, University of Saskatchewan); Mik Bickis (Mathematical Sciences Group, University of Saskatchewan)
Abstract: The predictive accuracy of a rational vaccine design program based on a fully-connected HMM is improved via a biochemistry-based matrix initialization heuristic and a topology reduction heuristic. With the combination of approaches, the program outperforms HMMER on two alleles tested, HLA-A*0201 and HLA-B*3501.
Poster A-27
A graph theoretical approach for the Identification of Protein Domains
Frank Emmert-Streib (Stowers Institute for Medical Research); Arcady Mushegian (Stowers Institute for Medical Research)
Abstract: We propose a graph theoretical approach for the problem of protein domain identification by representing its three-dimensional structure as a graph. The domains of the protein are then identified as partitions of the graph. These partitions are obtained by maximizing an objective function corresponding to the mutual maximization of cycle distributions.
Poster A-28
Homology modeling of the AdoMetDC domain from the bifunctional malarial enzyme S-adenosylmethionine decarboxylase/Ornithine decarboxylase.
Gordon Wells (University of Pretoria, Department of Biochemistry); Lyn-Marie Birkholtz (University of Pretoria, Department of Biochemistry); Fourie Joubert (University of Pretoria, Bioinformatics and Computational Biology Unit); Rolf Walter (Bernhard Nocht Institute for Tropical Medicine, Department of Biochemical Parasitology); Abraham Louw (University of Pretoria, Department of Biochemistry)
Abstract: The AdoMetDC domain of the bifunctional malarial enzyme S-adenosylmethionine decarboxylase/Ornithine decarboxylase was modeled based on the human and potato crystal structures. From this and related site-directed mutagenesis a number of novel properties can be predicted, which may aid the discovery of novel parasite-specific inhibitors.
Poster A-29
Predicting lipid accessible surface areas of transmembrane residues
Zheng Yuan (Institute for Molecular Bioscience and ARC Centre in Bioinformatics, The University of Queensland); Shane Zhang (Institute for Molecular Bioscience and ARC Centre in Bioinformatics, The University of Queensland); Mellisa Davis (Institute for Molecular Bioscience and ARC Centre in Bioinformatics, The University of Queensland); Mikael Boden (School of Information Technology and Electrical Engineering, The University of Queensland); Rohan Teasdale (Institute for Molecular Bioscience and ARC Centre in Bioinformatics, The University of Queensland)
Abstract: A support vector regression approach has been used to predict lipid accessible surface areas (LASAs) of transmembrane residues. Based on a non-redundant dataset of 59 transmembrane helix proteins, we achieve a correlation coefficient of 0.66 between predicted and observed LASAs in jackknife tests. The mean absolute error can be reduced to 19.6 Å². Tested on 14 beta-barrel membrane proteins, the correlation coefficient and mean absolute error are 0.70 and 19.2 Å², respectively. This approach is useful for predicting transmembrane domain arrangement.
Poster A-30
High-throughput exploration of functional residues in protein structures
Gabriele Ausiello (Centre for Molecular Bioinformatics, Dept. of Biology, Uni. of Tor Vergata, Rome); Andreas Zanzoni (Centre for Molecular Bioinformatics, Dept. of Biology, Uni. of Tor Vergata, Rome); Daniele Peluso (Centre for Molecular Bioinformatics, Dept. of Biology, Uni. of Tor Vergata, Rome); Allegra Via (Centre for Molecular Bioinformatics, Dept. of Biology, Uni. of Tor Vergata, Rome); Manuela Helmer-Citterich (Centre for Molecular Bioinformatics, Dept. of Biology, Uni. of Tor Vergata, Rome)
Abstract: pdbFun (pdbfun.uniroma2.it), a server for mass functional analysis of protein structures at residue level, integrates different databases and methods for 3D functional annotation together with a local structural comparison algorithm. pdbFun permits fast, detailed and high-throughput exploration of the whole PDB reorganized as an annotated residues DB.
Poster A-31
Sequence conservation and secondary structure - finding structural traits by using conservation as a magnifying glass.
Einat Sitbon (Weizmann Institute of Science); Shmuel Pietrokovski (Weizmann Institute of Science)
Abstract: What are structurally distinguishing features of conserved protein sequence regions? We found beta-strands abundant, alpha-helices rare, and certain combinations of secondary structures specific, in conserved sequence regions. These findings are relevant to basic science of protein structure, and to protein function prediction.
Poster A-32
Disparity in the nuclear localization signal of Stat1 and Stat3: Use of molecular modeling and visualization techniques for comparative analysis of relevant aspects of the crystal structures
Agnes Tan (Institute of Molecular and Cell Biology)
Abstract: Stat proteins possess distinct functions in the cytoplasm and in the nucleus. We have studied the crystal structures of Stat1 and Stat3 in view of differences in the nuclear localization signal. We have correctly predicted the relative importance of Arg214 in Stat3, and elucidated the packing of Leu407/411.
Poster A-33
Optimal relationship between average conformational entropy and average energy of residue interactions for fast protein folding
Oxana Galzitskaya (Institute of Protein Research, Russian Academy of Sciences); Sergiy Garbuzynskiy (Institute of Protein Research, Russian Academy of Sciences)
Abstract: Based on the known experimental data and using theoretical modeling of protein folding, we demonstrate that there exists an optimal relationship between the average conformational entropy and the average energy of contacts per residue for fast protein folding. Our result is in agreement with the experimental folding rates for 59 proteins.
Poster A-34
Computational analysis of RNA binding proteins based on composition, sequence and structural information
Parthiban Vijaya (Cologne University BioInformatics Center / International Max Planck Research School); Michael Gromiha (Computational Biology Research Center (AIST)); Abhinandan Madenhalli (Cologne University BioInformatics Center); Dietmar Schomburg (Cologne University BioInformatics Center)
Abstract: RNA binding proteins are involved in key roles in the regulation of gene expression. Critical analyses of protein-RNA complexes at sequence/structural level are needed to understand the RNA-protein interactions and related molecular processes. Statistical methods were developed to analyse complexes which facilitate better understanding of biological processes.
Poster A-35
B-cell epitope predictions based on three-dimensional structural information of protein antigens.
Pernille Haste Andersen (Center for Biological Sequence Analysis, BioCentrum-DTU, Technical University of Denmark); Morten Nielsen (Center for Biological Sequence Analysis, BioCentrum-DTU, Technical University of Denmark); Ole Lund (Center for Biological Sequence Analysis, BioCentrum-DTU, Technical University of Denmark)
Abstract: We have constructed a data set of conformational B-cell epitopes and used it to analyse the structural characteristics of B-cell epitopes. A well-performing B-cell epitope predictor has been developed, which includes characteristics of the three-dimensional structure of a pathogenic protein.
Poster A-36
(There will also be an oral presentation of this poster.)
Route as trees: The parsing view on protein folding
Julia Hockenmaier (University of Pennsylvania); Aravind Joshi (University of Pennsylvania); Ken Dill (University of California at San Francisco)
Abstract: Protein folding is a parallel, hierarchical process. Therefore, folding routes should be viewed as trees, not linear pathways. The Cocke-Kasami-Younger parsing algorithm is an efficient, accurate technique to search all folding route trees. It predicts the Plaxco et al. result that folding speed is inversely correlated with native contact order.
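For readers unfamiliar with the contact order referred to in this abstract, the following is a minimal Python sketch of relative contact order as it is commonly defined: the mean sequence separation of native contacts divided by chain length. It illustrates only that quantity, not the parsing algorithm described in the poster. The function name and the representation of contacts as (i, j) residue-index pairs are assumptions made for this example.

```python
def relative_contact_order(contacts, chain_length):
    """Relative contact order: mean sequence separation of contacts / chain length.

    contacts: iterable of (i, j) residue-index pairs that are in contact
              in the native structure (hypothetical input format).
    chain_length: number of residues in the chain.
    """
    contacts = list(contacts)
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    if not contacts:
        return 0.0
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (chain_length * len(contacts))
```

Structures dominated by local contacts (small |i - j|) give a low value, while structures with many long-range contacts give a high value.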
Poster A-37
Protein Structure from Contact Maps: An Hierarchical Approach
Alan Ableson (Queen's University); Jim Davies (Queen's University); Tony Kuo (Queen's University); Eduardo Zuviria (Queen's University); Janice Glasgow (Queen's University)
Abstract: One approach to prediction of protein structure from sequence is to predict a contact map and structural features, and then reconstruct the protein from its predicted contact map. This poster proposes a method for structure determination from contact maps using the experience embedded in the PDB.
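To make the notion of a contact map concrete, the following is a minimal Python sketch that derives a Boolean contact map from one representative coordinate per residue. It shows the forward direction only (structure to map), not the reconstruction method proposed in the poster. The 8 Å cutoff and the function name are assumptions chosen for illustration.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Symmetric Boolean contact map from a list of (x, y, z) residue coordinates.

    Two distinct residues are in contact when their distance is <= cutoff.
    The default cutoff (in angstroms) is an assumed, commonly used value.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = True
    return cmap
```

Reconstruction from a contact map is the inverse problem: finding coordinates whose map matches the given one.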
Poster A-38
A probabilistic approach to the prediction of non-covalent residue contacts
David Cook (University of Glasgow); Pawel Herzyk (University of Glasgow)
Abstract: Here we present a new approach to the prediction of non-covalent residue contacts in proteins. Preliminary results demonstrate that incorporating multiple scales derived from a hidden site class model into a correlated mutation algorithm is able to improve the accuracy over a single scale/matrix model.
Poster A-39
How old is your fold?
Sanne Abeln (University of Oxford, Department of Statistics); Henry Winstanley (University of Oxford, Department of Statistics); Charlotte M. Deane (University of Oxford, Department of Statistics)
Abstract: We have created the first relative age estimation technique for protein folds. The ages presented show correlation with other protein age estimators and are used to investigate evolutionary pressure on fold topology and complexity. This shows for example very different age patterns of alpha/beta folds compared to small folds.
Poster A-40
Computational simulations suggest multiple routes for substrates and products in mammalian cytochrome P450s
Karin Schleinkofer (Department of Bioinformatics, Biocenter, University of Würzburg); Sudarko Sudarko (Department of Chemistry, Faculty of Mathematics and Natural Sciences, University of Jember); Peter J. Winn (European Molecular Biology Laboratory); Susanne K. Luedemann (European Molecular Biology Laboratory); Rebecca C. Wade (EML Research)
Abstract: By molecular dynamics simulation of a membrane-bound mammalian P450, the microsomal CYP2C5, substrate access and product egress routes are proposed that differ from those found in previous simulations of soluble bacterial P450s. This highlights the adaptability of the P450 fold to different substrates and to cellular localization.
Poster A-41
Finding motifs in RNA 3-D structures by complete enumeration of cycles of relations
Majid Behbahani (CIISE, Concordia University); Sébastien Lemieux (CIISE, Concordia University)
Abstract: We propose an algorithm that identifies motifs from an RNA 3-D structure by enumerating and comparing all cycles of a given length in the graph of relations (GOR). The GOR is an extension of the secondary structure including tertiary interactions. Using this algorithm several well known motifs were identified.
Poster A-42
Role of sequence and evolutionary information in DNA-binding sites in proteins
Shandar Ahmad (Department of Bioscience and Bioinformatics, Kyushu Institute of Technology); Akinori Sarai (Department of Bioscience and Bioinformatics, Kyushu Institute of Technology)
Abstract: We implemented a neural network based algorithm to utilize evolutionary information of amino acid sequences in terms of their position specific scoring matrices (PSSMs) for a better prediction of DNA-binding sites. The average of sensitivity and specificity using PSSMs is up to 8.7% better than the prediction with sequence information only.
Poster A-43
Predicting RNA secondary structure at temperatures other than 37 °C
Zhi (John) Lu (University of Rochester); David Mathews (University of Rochester)
Abstract: In order to study RNA secondary structure formation in organisms such as thermophiles and in experiments performed at temperatures other than 37 °C, nearest neighbor enthalpy parameters are derived from experimental results. Using enthalpy and free energy parameters for 37 °C, RNA secondary structure can be predicted at different temperatures.
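One standard way to combine an enthalpy with a 37 °C free energy is sketched below in Python. It assumes that enthalpy and entropy changes are independent of temperature, so that ΔS = (ΔH − ΔG37) / 310.15 K and ΔG(T) = ΔH − TΔS. Whether the poster uses exactly this extrapolation is an assumption, and the function and parameter names are hypothetical.

```python
T37_KELVIN = 310.15  # 37 degrees Celsius expressed in kelvin


def free_energy_at(delta_h, delta_g37, temp_celsius):
    """Extrapolate a free energy change to another temperature.

    delta_h:   enthalpy change (e.g. kcal/mol), assumed temperature-independent
    delta_g37: free energy change at 37 degrees Celsius, in the same units
    Returns the free energy change at temp_celsius, in the same units.
    """
    temp_kelvin = temp_celsius + 273.15
    # Entropy change implied by the 37 degree parameters.
    delta_s = (delta_h - delta_g37) / T37_KELVIN
    return delta_h - temp_kelvin * delta_s
```

By construction the function returns `delta_g37` at 37 °C. For a motif with negative enthalpy and entropy changes, the result becomes less favorable as the temperature rises.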
Poster A-44
e-Protein: A Distributed Pipeline for Structure-based Proteome Annotation using GRID Technology
Shikta Das (London e-Science Centre, Department of Computing, Imperial College, London); Andrew McGough (London e-Science Centre, Department of Computing, Imperial College, London); Keiran Fleming (Structural Bioinformatics Unit, Faculty of Life Sciences, Imperial College, London); John Darlington (London e-Science Centre, Department of Computing, Imperial College, London); Michael Sternberg (Structural Bioinformatics Unit, Faculty of Life Sciences, Imperial College, London)
Abstract: The e-Protein project aims to provide a fully automated distributed pipeline for large-scale structural and functional annotation of all major proteomes utilising GRID technologies. We are using ICENI - a grid middleware - allowing biologists to browse and monitor available services and compose workflow components within a graphical interface.
Poster A-45
Multiple Mapping Method: A new approach to improve sequence to structure alignments
Brajesh Rai (Department of Biochemistry, Albert Einstein College of Medicine); Andras Fiser (Department of Biochemistry and Seaver Foundation Center for Bioinformatics, Albert Einstein College of Medicine)
Abstract: A new approach, Multiple Mapping Method, has been developed to optimally combine fragments from alternative input alignments. On a benchmark dataset of 6500 template-target protein pairs, the alignments generated by this method consistently outperformed the average accuracy of input alignments.
Poster A-46
A study of the folding energy spectrum of RNAs
Jerome Waldispuhl (Boston College); Peter Clote (Boston College)
Abstract: An important aspect of the RNA folding process concerns the distribution and free energy of kinetic traps. We present a new algorithm which computes, for a given RNA sequence, the Boltzmann partition function for all locally optimal secondary structures, and show new results which may help to characterize biological sequences.
Poster A-47
Identification of folding essential residues by looking at an extensive DB of the structure descriptors in Diamond STING
Paula Kuser (EMBRAPA/CNPTIA); Michel Yamagishi (EMBRAPA/CNPTIA); Luiz Borro (EMBRAPA/CNPTIA); Adauto Mancini (EMBRAPA/CNPTIA); Roberto Higa (EMBRAPA/CNPTIA); Goran Neshich (EMBRAPA/CNPTIA)
Abstract: The Diamond STING suite of programs for comprehensive analysis of structure, function and stability is presented. We show here the in silico process of identifying the folding essential amino acids (previously determined by experiment) by means of range selection over a set of the STING_DB parameters.
Poster A-48
Prediction of coaxial stacking configuration of helices in RNA multibranch loops.
Rahul Tyagi (University of Rochester Medical Center); David Mathews (University of Rochester Medical Center)
Abstract: A dynamic programming algorithm to predict the coaxial stacking configuration in RNA multibranch loops using free energy nearest neighbour parameters is presented. We show that coaxial stacking in crystal structures can be predicted with considerable success using thermodynamic parameters.
Poster A-49
(There will also be an oral presentation of this poster.)
All-Atom Modeling of RNA Conformational Changes
David Mathews (University of Rochester Medical Center); David Case (The Scripps Research Institute)
Abstract: The GG mismatches in the duplex rGCAGGCGUGC have been shown by NMR to be dynamic. Here, the minimum energy pathway for the conformational change is modeled with all-atom calculations using Nudged Elastic Band and the AMBER forcefield.
Poster A-50
Automated Protein Backbone Tracing in Electron Density Maps using Belief Propagation
Frank DiMaio (UW-Madison Computer Sciences Department); Jude Shavlik (UW-Madison Computer Sciences Department); George Phillips (UW-Madison Biochemistry Department)
Abstract: One particularly time-consuming step in X-ray crystallography is interpretation of the electron density map. This paper describes an approach to automated backbone tracing in poor-quality density maps using belief propagation (BP). Several enhancements to BP are presented, making the algorithm feasible even for proteins several thousand residues in length.
Poster A-51
Structural Annotation Pipeline for Malaria proteins
Yolandi Joubert (University of Pretoria); Fourie Joubert (University of Pretoria)
Abstract: A pipeline for structural annotation of the P. falciparum proteins is being constructed. A series of established structure-related bioinformatics tools are included in the pipeline. Analyses are performed on a Linux cluster, with results being submitted to a PostgreSQL database. Once completed, it could be extended to other genomes.
Poster A-52
Function Inference Using Family-Specific Subgraph Fingerprints Mined from Protein Families
Deepak Bandyopadhyay (Department of Computer Science, University of North Carolina at Chapel Hill); Jun Huan (Department of Computer Science, University of North Carolina at Chapel Hill); Jinze Liu (Department of Computer Science, University of North Carolina at Chapel Hill); Jan Prins (Department of Computer Science, University of North Carolina at Chapel Hill); Jack Snoeyink (Department of Computer Science, University of North Carolina at Chapel Hill); Alexander Tropsha (Department of Medicinal Chemistry and Natural Products, School of Pharmacy, University of North Carolina at Chapel Hill); Wei Wang (Department of Computer Science, University of North Carolina at Chapel Hill)
Abstract: We propose a method for functional family inference by querying a new structure for occurrences of family-specific structural fingerprints mined from protein families using a graph representation. We compare against sequence, fold and other local structure based methods, and demonstrate applications to structural genomics targets and predicted structures.
Poster A-53
Evolutionary knowledge-based potentials for protein structure prediction
Alejandro Panjkovich (Pontificia Universidad Catolica); Andrej Sali (University of California, San Francisco); Marc Marti-Renom (University of California, San Francisco); Francisco Melo (Pontificia Universidad Catolica)
Abstract: A novel approach implementing evolutionary information into mean force potentials (MFPs) is presented and demonstrated to perform significantly better than current MFPs in fold assessment. The evolutionary potentials (EvPs) presented here are built in a fold-specific manner based on multiple sequence alignments and threading techniques.
Poster A-54
An Examination of Protein Stability Using Delaunay Tessellation that Includes Surface Hydration Effects
Gregory Reck (George Mason University); Iosif Vaisman (George Mason University)
Abstract: Several variations of a statistical potential function that include surface water effects are derived from Delaunay tessellation of a representative set of 1352 hydrated proteins. Each of the potential functions is correlated with previously reported experimental stability data from 366 single-point mutations of HIV reverse transcriptase.
Poster A-55
Diresidue neural network for the prediction of disulfide connectivity and ligand-bound cysteines
Fabrizio Ferre (Boston College); Peter Clote (Boston College)
Abstract: A novel diresidue neural network approach is used for the prediction of cysteine disulfide connectivity and for the discrimination between ligand-bound, disulfide bond-involved and free cysteines. This method can be used to address problems for which protein sequences are more adequately modeled using diresidue, rather than monoresidue, position-specific scoring matrices.
Poster A-56
Comparative analysis of large protein structural families by TOPOFIT
Alex Abyzov (Biology Department, Northeastern University); Chesley Leslin (Biology Department, Northeastern University); Mounir Errami (Biology Department, Northeastern University); Valentin ILYIN (Biology Department, Northeastern University)
Abstract: Using the novel TOPOFIT method, we present a comparative analysis of several large protein families demonstrating: a clear identification of the common structural invariant, unambiguous and distinct clusters of proteins, and strong correlation of active sites with the structural invariants, while reflecting the role of variable parts in specificity, recognition and flexibility.
Poster A-57
Mining 3D-motifs using physical-chemical constraints: application to Cardiolipin binding sites
Dmitrii Polshakov (Ohio State University/Department of Chemistry/Department of Computer Science); Keith Marsolo (Ohio State University/Department of Computer Science); Srinivasan Parthasarathy (Ohio State University/Department of Computer Science)
Abstract: A new approach toward the discovery of biologically-meaningful structural motifs in proteins is presented. Using 3D-coordinates and a scaled set of physical-chemical properties, the approach is validated on several sets of functionally-related proteins. In addition, the first structural search on a subset of the membrane proteins containing Cardiolipin is performed.
Poster A-58
Computational Prediction of RNA-Binding Sites in Proteins Based on Amino Acid Sequence
Michael Terribilini (Iowa State University); Jae-Hyung Lee (Iowa State University); Changhui Yan (Iowa State University); Robert Jernigan (Iowa State University); Vasant Honavar (Iowa State University); Drena Dobbs (Iowa State University)
Abstract: We have developed a Naïve Bayes classifier for predicting RNA-binding residues in proteins using only protein sequence as input. The classifier identifies interface residues with 86% accuracy and a correlation coefficient of 0.35. To our knowledge, this approach provides the best available sequence-based prediction of protein-RNA interaction sites.
Poster A-59
Novel Approach to Multi-scale Modeling of Protein Structure, Folding, Dynamics and Function
Pratul Agarwal (Oak Ridge National Laboratory); Al Geist (Oak Ridge National Laboratory)
Abstract: An integrated view of protein structure, folding, dynamics and function is emerging where protein complexes are viewed as dynamical entities. We are developing novel theoretical and computational approaches to enable simulations of protein complexes on biologically relevant time-scales, and to investigate the link between dynamics, folding and function (enzyme catalysis/biomolecular recognition).
Poster A-60
FILTREST3D: program for discrimination of protein structure models that match the restraints from experimental data
Marta Kaczor (International Institute for Molecular and Cell Biology); Michal Gajda (International Institute for Molecular and Cell Biology); Janusz Bujnicki (International Institute for Molecular and Cell Biology)
Abstract: We developed a method and a web server for discriminating among a large number of protein structure models with "fuzzy" restraints derived from mutagenesis, chemical modification, and crosslinking experiments. Restraints include distances between residues, amino acid burial and secondary structure. The method was tested on a set of ROSETTA decoys for restriction enzymes using restraints from mutagenesis and CD spectroscopy.
Poster A-61
Structural determinants of pKa shifts in RNA
Christopher Tang (Columbia University); Emil Alexov (Columbia University); Anna Marie Pyle (Yale University/HHMI); Barry Honig (Columbia University/HHMI)
Abstract: We describe the calculation of pKa shifts in RNA structures. We show that shifts in pKas are quantitatively accurate when compared to experiment and we describe the structural features of RNA that are responsible for changes in these ionization constants.
Poster A-62
Molecular Dynamics of SXR-SMRT Interactions Revealed the Preference of NR-interacting Domain ID2 over ID1
Ching (Nina) Wang (GSBS, UMDNJ-RWJMS); Chia-Wei Li (GSBS, UMDNJ-RWJMS); J. Don Chen (GSBS, UMDNJ-RWJMS); William Welsh (GSBS, UMDNJ-RWJMS)
Abstract: The steroid and xenobiotic receptor (SXR) is a member of the orphan nuclear receptors that mediates the mammalian xenobiotic response. SMRT can repress SXR-mediated transactivation by binding to the cofactor site through its two interacting domains, ID1 and ID2. Our molecular dynamics studies revealed interactions essential for the preference of ID2 over ID1.
Poster A-63
Domain-based Small Molecule Binding Site Annotation
Kevin Snyder (The Blueprint Initiative, 9th floor, 522 University Ave, Toronto ON; Samuel Lunenfeld Research Institute/University of Toronto, Toronto ON); Howard Feldman (The Blueprint Initiative, 9th floor, 522 University Ave, Toronto ON; Samuel Lunenfeld Research Institute/University of Toronto, Toronto ON); Brigitte Tuekam (The Blueprint Initiative, 9th floor, 522 University Ave, Toronto ON; Samuel Lunenfeld Research Institute/University of Toronto, Toronto ON); John Salama (The Blueprint Initiative, 9th floor, 522 University Ave, Toronto ON; Samuel Lunenfeld Research Institute/University of Toronto, Toronto ON); Michel Dumontier (The Blueprint Initiative, 9th floor, 522 University Ave, Toronto ON; Samuel Lunenfeld Research Institute/University of Toronto, Toronto ON); Christopher Hogue (The Blueprint Initiative, 9th floor, 522 University Ave, Toronto ON; Samuel Lunenfeld Research Institute/University of Toronto, Toronto ON)
Abstract: SMID-BLAST (http://smid.blueprint.org/) is a freely available, multi-purpose tool for the annotation and prediction of protein-small molecule interactions and binding sites. The tool uses NCBI's RPS-BLAST algorithm to identify domains in the query sequence and then looks these up in SMID, a database of domain-small molecule interactions generated from the PDB.
Poster A-64
(There will also be an oral presentation of this poster.)
A Novel Covariance Model Based RNA Motif Finding Algorithm
Zizhen Yao (University of Washington, Seattle); Walter L. Ruzzo (University of Washington, Seattle)
Abstract: CMfinder predicts RNA motifs in unaligned sequences. It is an expectation maximization algorithm using Covariance Models for motif description, carefully crafted heuristics for effective motif search, and a novel Bayesian framework for structure prediction combining folding energy and sequence covariation. It performs better than alternatives, and integrates directly with genome-scale homology search.
Poster A-65
Detecting Functional Sites in Protein Structures Using Dynamics Perturbation Analysis
Dengming Ming (Computer and Computational Science Division, Los Alamos National Laboratory); Michael E. Wall (Computer and Computational Science Division and Bioscience Division, Los Alamos National Laboratory)
Abstract: Recently, we introduced a theoretical framework to quantify allosteric effects in proteins. We have developed an algorithm which makes use of this framework to predict functional sites in protein structures. Here we present this algorithm and results of its performance in predicting ligand-binding sites for 298 protein/ligand structures.
Poster A-66
NMRQ: A Web Server for the Validation, Comparison and Analysis of Protein Structures Solved by NMR
Gary Van Domselaar (Depts. of Computing Science & Biological Sciences, University of Alberta); Paul Stothard (Depts. of Computing Science & Biological Sciences, University of Alberta); Trent Bjorndahl (Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta); David Wishart (Depts. of Computing Science & Biological Sciences, University of Alberta)
Abstract: NMRQ is a web server for assessing and visualizing the quality of NMR-derived protein structures. NMRQ uses chemical shift assignments, NOE restraints, structure ensemble superposition, and structure geometry to score and rank models and structural features. Results are presented as publication-quality graphical and textual HTML reports.
Poster A-68
A novel method for comparing topological models of protein structures enhanced with ligand information
Mallika VEERAMALAI (Bioinformatics Research Centre, Dept of Computing Science, University of Glasgow); David GILBERT (Bioinformatics Research Centre, Dept of Computing Science, University of Glasgow)
Abstract: Protein structure comparison methods play a vital role in understanding structural and functional relationships between proteins, and are also essential for estimating the evolutionary distance between proteins and protein families. Here, we present a novel protein structure comparison method based on 'TOPS Strings+ models', which are topological models enhanced with ligand interaction information.
Poster A-69
PROLOOP-C
Balaji Jayaraman (University of Missouri Kansas City School of Computing and Engineering); Deendayal Dinakarpandian (University of Missouri Kansas City School of Computing and Engineering)
Abstract: Most proteins have parts of their structure that adopt an unstructured or random coil conformation. In some cases, these coils play an important functional role. ProLoop-C is a database of sequence-independent clustering of random coils based on structural similarity.
Poster A-70
(There will also be an oral presentation of this poster.)
ncRNA genefinding in C. elegans
Shawn Stricklin (Department of Genetics, Washington University at St. Louis); Valerie Reinke (Department of Genetics, Yale University School of Medicine); Viktor Stolc (Center for Nanotechnology, NASA Ames Research Center); Sean Eddy (Howard Hughes Medical Institute, Washington University at St. Louis)
Abstract: We describe a computationally-directed screen for noncoding RNA (ncRNA) genes based upon comparison of the genomes of C. elegans, C. briggsae, C. remanei, and more distantly-related nematodes. The ncRNA candidates identified by a multipronged computational approach are assayed using custom tiling microarrays targeting loci in all three nematode genomes.
Poster A-71
Exploiting Sequence and Structure Homologs to Identify Protein-Protein Binding Sites
Jo-Lan Chung (Department of Chemistry and Biochemistry, San Diego Supercomputer Center, University of California, San Diego); Wei Wang (Department of Chemistry and Biochemistry, University of California, San Diego); Philip Bourne (Department of Pharmacology, San Diego Supercomputer Center, University of California, San Diego)
Abstract: Structurally conserved residues, determined by multiple structure alignments, were combined with other residue properties to predict protein-protein binding sites. The prediction results improved significantly and supported the hypothesis that in many cases protein interfaces require some residues to provide rigidity to minimize the entropic cost upon complex formation.
Poster A-72
Analysis and Prediction of Protein Ubiquitination Sites
Predrag Radivojac (Indiana University); Lilia Iakoucheva (The Rockefeller University)
Abstract: We describe the development of a predictor of ubiquitination sites from protein sequence. Our results indicate a prevalence of E, D and phosphorylated residues in close proximity to Ub sites. The data provide evidence that Ub sites are preferentially located within disordered or flexibly ordered protein regions.
Poster A-73
A structure-based algorithm for the recognition of antifreeze proteins
Andrew Doxey (University of Waterloo); Mahmoud Yaish (University of Waterloo); Marilyn Griffith (University of Waterloo); Brendan McConkey (University of Waterloo)
Abstract: We present a simple and efficient structure-based algorithm capable of recognizing antifreeze proteins (AFPs) and their putative ice-binding surfaces. The algorithm discriminates AFPs from other structures in the protein data bank with high accuracy. We have applied the algorithm to identify a novel plant AFP.
Poster A-74
Predicting Protein Active Sites Using Protein Motion
Vinhthuy Phan (The University of Memphis); Sunder Tatta (The University of Memphis); Yongmei Wang (The University of Memphis)
Abstract: We describe a new method of predicting protein active sites based on motion. Information about active sites helps biologists understand protein-protein interaction. Using the Elastic Network model to estimate global motion of proteins, we predict the most deformed regions and hypothesize that these regions co-localize with active sites.
Poster A-75
Modelling Cotranslational Protein Folding
Fabien P.E. Huard (Department of Statistics, Macquarie University); Charlotte M. Deane (Department of Statistics, University of Oxford); G.R. Wood (Department of Statistics, Macquarie University)
Abstract: Cotranslational protein folding is acknowledged to occur. Simplified models of proteins (HP models) are used to explore the effect of key factors (such as surmountable energy barrier) on the difference between the native state of a cotranslationally folded protein and that of a protein folded from a fully extended state.
Poster A-76
(There will also be an oral presentation of this poster.)
Isostericity Matrices: Tools for Analyzing Recurrent Motifs and Structurally Aligning Homologous RNAs
Neocles Leontis (Bowling Green State University); Eric Westhof (Institut de Biologie Moleculaire et Cellulaire); Craig Zirbel (Bowling Green State University); Ali Mokdad (Bowling Green State University); Jesse Stombaugh (Bowling Green State University); Michael Sarver (Bowling Green State University)
Abstract: Isostericity Matrices (IM) for non-Watson-Crick basepairs are important tools for deriving sequence signatures of recurrent RNA motifs, scoring and refining RNA sequence alignments, and evaluating motif conservation across phylogeny. Progress in automating procedures for productive, iterative RNA structural sequence alignment based on IM will be described.
Poster A-77
Conditional Random Fields for RNA Structural Alignment
Kengo Sato (Keio University, Department of Biosciences and Informatics); Yasubumi Sakakibara (Keio University, Department of Biosciences and Informatics)
Abstract: We propose a novel approach that uses Conditional Random Fields to estimate the parameters for RNA structural alignment, including the substitution scores of base pairs and the state transition scores. The resulting parameters are chosen to best discriminate between correct and incorrect alignments.
Poster A-78
The Effects of Quadratic (Two-Body) vs. Linear (Simplified Two-Body) Scoring Functions in Core Structure Threading
Natasha L. Sefcovic (NCBI / NLM / NIH and Biology Department, Johns Hopkins University); Aron Marchler-Bauer (NCBI / NLM / NIH); Anna R. Panchenko (NCBI / NLM / NIH); Stephen H. Bryant (NCBI / NLM / NIH)
Abstract: We directly compared a dynamic programming (DP) threading program to a Monte Carlo (MC) threading program to study the effects of quadratic vs. linear scoring functions. The MC program performed slightly better as measured by ROC, but surprisingly, not because of the scoring functions.
Poster A-79
Visualizing Bacterial tRNA Identity Determinants and Antideterminants Using Function Logos and Inverse Function Logos
Eva Freyhult (The Linnaeus Centre for Bioinformatics, Uppsala University); Vincent Moulton (School of Computing Sciences, University of East Anglia); David Ardell (The Linnaeus Centre for Bioinformatics, Uppsala University)
Abstract: Two extensions to sequence logo graphs, function logos and inverse logos, are introduced. These are useful for finding features that distinguish a subclass of sequences from a general sequence family and underrepresented sequence features or functions, respectively. We apply function and inverse function logos to structurally aligned bacterial tDNAs.
Poster A-80
Two applications of Delaunay contact matrices to the analysis of protein structures
Todd Taylor (George Mason University); Iosif Vaisman (George Mason University)
Abstract: We detail two applications of Delaunay contact matrices to the analysis of protein structures. First, a variation of the Ising-lattice domain definition method of W. Taylor is described. Second, we use MDS to find the dimensionality of Delaunay contact graphs and thereby define three characteristic length scales in proteins.
Poster A-81
Localization of protein binding sites within families of homologous proteins
Dmitry Korkin (University of California, San Francisco); Fred Davis (University of California, San Francisco); Andrej Sali (University of California, San Francisco)
Abstract: We analyze whether binding sites of homologous proteins are localized, i.e., whether they share similar relative positions on protein surfaces, irrespective of the identities of their binding partners. The analysis shows that ~71% of the 1,884 SCOP domain families have binding sites with localization values greater than expected by chance.
Poster A-82
Unification of discrete and continuous effects on protein interfaces: an extension of the concept of hydrophobic effect and its application
Martin Jambon (The Burnham Institute); Christophe Geourjon (PBIL-IBCP); François Delfaud (MEDIT SA)
Abstract: We developed a computationally efficient system to represent proteins and identify functionally equivalent sites in structures that may not share any sequence or fold similarity, even locally. This involves microsites made of discrete and continuous components. This system is part of SuMo, online at http://sumo-pbil.ibcp.fr
Poster A-83
Portable virtual reality system using haptic device and naked eye 3D display, for molecular modeling.
Isao Okada (Tokyo Medical Dental University); Hiroshi Mizushima (Tokyo Medical Dental University); Takayuki Ohnishi (Tokyo Medical Dental University); Hiroshi Nagata (Tokyo Medical Dental University); Hiroshi Tanaka (Tokyo Medical Dental University)
Abstract: We have been developing a virtual reality system for molecular modeling using a haptic device. We have now developed an easy-to-carry system, based on a portable PC, for demonstrations at conferences or at other labs. We also integrated a naked-eye 3D display system for greater realism.
Poster A-84
Characterization of Protein Structure Using Geometry and Topology
Bala Krishnamoorthy (Washington State University, Pullman, WA); J. Scott Provan (University of North Carolina at Chapel Hill, NC); Alexander Tropsha (University of North Carolina at Chapel Hill, NC)
Abstract: The alpha complex filtration of a protein represented by its alpha carbon atoms is analyzed. The topology of the neighborhood of a strand of residues is characterized by the largest connected components and holes in its filtration. A "motif" for 3D structure is characterized by the number of persistent components and holes, and their relative sizes.
Poster A-85
Sequence-Dependent Conformational Energy of DNA Derived from Molecular Dynamics Simulations: Towards the Understanding of Indirect Readout in Protein-DNA Recognition
Marcos J. Arauzo-Bravo (Kyushu Institute of Technology); Shandar Ahmad (Kyushu Institute of Technology); Satoshi Fujii (Kyushu University); Shigeori Takenaka (Kyushu University); Hidetoshi Kono (Neutron Research Center and Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute); Nobuhiro Go (Neutron Research Center and Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute); Akinori Sarai (Kyushu Institute of Technology)
Abstract: To estimate the sequence dependence of DNA conformation, we performed molecular dynamics simulations of DNA including all possible tetranucleotide steps. From the MD trajectories we derived the equilibrium conformations and harmonic force field potentials. The force fields were applied to estimate the sequence specificity of free DNA and protein-DNA complexes.
Poster A-87
IUPUI: Intrinsic Unstructured Protein Unsupervised-supervised Identifier - A Software Tool
Jack Yang (Indiana University School of Medicine, IUPUI); Mary Yang (Purdue University School of Electrical and Computer Engineering); Predrag Radivojac (Indiana University School of Medicine, IUPUI); Marc Cortese (Indiana University School of Medicine, IUPUI); Vladimir Uversky (Indiana University School of Medicine, IUPUI); Keith Dunker (Indiana University School of Medicine, IUPUI)
Abstract: Regions of proteins that have no definite tertiary structure are known as Intrinsically Unstructured Protein (IUP) regions. We developed a software tool to aid in identifying such regions, called Intrinsically Unstructured Protein Unsupervised-supervised Identifier (IUPUI). We demonstrated the effectiveness of the IUPUI predictor, which compared favorably to existing approaches.
Poster A-88
The Victor/FRST Function for Model Quality Estimation
Silvio Tosatto (Dept. of Biology & CRIBI Biotech Centre, University of Padova)
Abstract: Scoring functions are widely used in the final step of model selection in protein structure prediction. Pairwise, solvation and torsion angle potentials, used here in a novel combination, contain largely orthogonal information. By combining these features with a linear weighting function, we constructed a robust energy function for discrimination of native-like structures.
Poster A-89
Understanding the Origin of "High Energy" of ATP: Ab initio Studies of the Tri- and Diphosphate Fragments of Adenosine Triphosphate
Priti Hansia (Molecular Biophysics Unit, Indian Institute of Science); Nandini Guruprasad (Molecular Biophysics Unit, Indian Institute of Science); Saraswathi Vishveshwara (Molecular Biophysics Unit, Indian Institute of Science)
Abstract: Methyl triphosphate and methyl diphosphate in their different protonation states have been investigated using high-level quantum mechanical calculations. The optimized geometries, the molecular orbitals contributing to the high energy of ATP, and the dependence of vibrational frequencies on the number of phosphate groups and the charged states are reported.
Ontologies and NLP
Poster B-1
Extraction of Transcript Diversity from Scientific Literature
Parantu Shah (EMBL); Peer Bork (EMBL)
Abstract: We developed an information extraction method specifically for extracting information about alternative transcripts and associated information from the scientific literature.
Poster B-2
PTKB: Protein Translocation Knowledge-Base
Zhiyong Lu (University of Colorado School of Medicine); Philip Ogren (University of Colorado School of Medicine); Andrew Dolbey (University of Colorado School of Medicine); Larry Hunter (University of Colorado School of Medicine)
Abstract: Protein translocation, by which proteins are inserted into or across membranes, is essential to all living organisms. We propose to use NLP techniques to automatically transform GeneRIFs mentioning intracellular transport into a formal knowledge representation that captures what is being transported, from where, to where and by what mechanisms.
Poster B-3
ORIGIN - An educational Ontology about the Central Dogma of Biology
Nuno T Alves (Portugal Telecom); Vitor Fonseca (IST/INESC-ID); Arsénio M Fialho (IST/BSRG); Ana T Freitas (IST/INESC-ID); H Sofia Pinto (IST/INESC-ID)
Abstract: This work presents an ontology about the Central Dogma of Molecular Biology processes for prokaryotic organisms, which will be connected to an inference engine to allow question answering. The main concepts represented include definitions of processes and activities, and the roles and relations of the entities in the processes.
Poster B-4
MAO: Multiple Alignment Ontology
Julie Thompson (Institut de Genetique et de Biologie Moleculaire et Cellulaire); Patrice Koehl (UC Davis); Stephen Holbrook (Lawrence Berkeley National Laboratory); Kazutaka Katoh (Institute for Chemical Research, Kyoto University); Eric Westhof (Institut de Biologie Moleculaire et Cellulaire); Dino Moras (Institut de Genetique et de Biologie Moleculaire et Cellulaire); Olivier Poch (Institut de Genetique et de Biologie Moleculaire et Cellulaire)
Abstract: MAO is an ontology for data retrieval and exchange for DNA/RNA, protein sequence and structure alignment methods. MAO concepts cover the main features of multiple alignments and attributes are defined for residue conservation, structural location and function. MAO is available via the OBO web site (http://obo.sourceforge.net/).
Poster B-5
(There will also be an oral presentation of this poster.)
Where do we GO next? Refining the content of the Gene Ontology
Midori Harris (The Gene Ontology Consortium, EMBL-EBI)
Abstract: The Gene Ontology (GO) Consortium is committed to the continued refinement of its ontologies in response to the needs of database curators and many other users. The GO update procedure ensures that ontology changes are useful, accurate, logically consistent, and well documented.
Poster B-6
Mining Data from Mouse Mutagenesis Projects using Ontologies
Simon Greenaway (MRC Mammalian Genetics Unit); Georgios Gkoutos (MRC Mammalian Genetics Unit); Ann-Marie Mallon (MRC Mammalian Genetics Unit); John Hancock (MRC Mammalian Genetics Unit)
Abstract: We have applied our recently developed ontological schema for the description of mouse phenotypes to phenotype data from a major mouse mutagenesis project carried out at Harwell, U.K., and used data mining techniques to identify internal correlations in the data. Results of the analysis will be presented.
Poster B-7
(There will also be an oral presentation of this poster.)
CGHGate: Array-CGH, Case Reports, Phenotypes and Biomedical Literature for Human Genome Annotation
Steven Van Vooren (Katholieke Universiteit Leuven, Department of Electrical Engineering, SISTA/BIOI); Nicole Maas (Center for Human Genetics, Universitaire Ziekenhuizen Gasthuisberg); Joris Vermeesch (Center for Human Genetics, Universitaire Ziekenhuizen Gasthuisberg); Yves Moreau (Katholieke Universiteit Leuven, Department of Electrical Engineering, SISTA/BIOI); Bart De Moor (Katholieke Universiteit Leuven, Department of Electrical Engineering, SISTA/BIOI)
Abstract: As Microarray-CGH is introduced into clinical practice for the identification of submicroscopic genomic aberrations, tools to handle related data become essential for clinical geneticists. CGHGate is a web application that combines a constitutional cytogenetics database and tools for search, visualisation, genome annotation and data- and text-mining.
Poster B-8
(There will also be an oral presentation of this poster.)
Text-mining challenges for protein family database annotation.
Anna Divoli (The University of Manchester); Teresa Attwood (The University of Manchester)
Abstract: Different biological databases have different requirements for, and standards of, annotation. This work concerns the development of text-mining software to assist curators, particularly those of protein family databases, using manually-crafted annotations to improve the design of a new decision-support tool, BioIE. We report here the results of this study.
Poster B-9
Combined data-mining of literature, gene/protein databases, and gene expression data identifies uncharacterized cancer-specific targets
Pavel Pospisil (Department of Radiology, Harvard Medical School, Harvard University); Lakshmanan Iyer (Bauer Center for Genomics Research, Harvard University); Amin Kassis (Department of Radiology, Harvard Medical School, Harvard University)
Abstract: We present a systematic data-mining study to identify cancer-gene targets. It covers literature, annotated sequence, and structure databases, as well as gene expression data sets. The results have allowed us to distinguish targets for the entrapment of radiolabeled compounds in the extracellular spaces of solid tumors.
Poster B-10
(There will also be an oral presentation of this poster.)
Ontological Visualization of Protein-Protein Interactions Using the Gene Ontologies
Harold Drabkin (Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME); Christopher Hollenbeck (Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY); David Hill (Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME); Mary Dolan (Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME); James Kadin (Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME); Judith Blake (Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME)
Abstract: Cellular processes require interaction of many proteins. Determining the collective network of such interactions will further understanding of the role of individual proteins. The GO is used to provide functional annotation of proteins. We present a methodology for integrating and visualizing protein-protein interaction networks utilizing information encoded in GO annotations.
Poster B-11
Direct protein name recognition in full text, application to literature mining for receptor/G protein coupling interactions
Lei Shi (Weill Medical College of Cornell University); Fabien Campagne (Weill Medical College of Cornell University)
Abstract: The poster will present a computationally efficient method to extract protein names from full text without using a pre-existing protein name lexicon (such as collected from protein databases). We will also discuss its application to gathering information about the coupling of G Protein-Coupled Receptors to G proteins.
Poster B-12
Literature Based Functional Analysis of Microarray Data
Lai Wei (University of Tennessee Health Science Center); Kevin Heinrich (University of Tennessee); Lijing Xu (University of Tennessee Health Science Center); Michael Berry (University of Tennessee); Lawrence Pfeffer (University of Tennessee Health Science Center); Ramin Homayouni (University of Tennessee Health Science Center)
Abstract: We have developed an automated method which utilizes Latent Semantic Indexing (LSI) of titles and abstracts in MEDLINE citations to rank genes based on conceptual relationships to user defined keyword queries. Here, we demonstrate that this method provides a flexible tool for functional analysis of microarray expression data.
Poster B-13
Determining Domain-Specific Semantic Categories for Biological Named Entity Recognition System
Hyun-Sook Lee (Bioinformatics Research Team, Electronics and Telecommunications Research Institute, 161, Gajeong-dong, Daejeon, 305-350); Hyunchul Jang (Bioinformatics Research Team, Electronics and Telecommunications Research Institute, 161, Gajeong-dong, Daejeon, 305-350); Jasoo Lim (Bioinformatics Research Team, Electronics and Telecommunications Research Institute, 161, Gajeong-dong, Daejeon, 305-350); Soo-Jun Park (Bioinformatics Research Team, Electronics and Telecommunications Research Institute, 161, Gajeong-dong, Daejeon, 305-350); Seon-Hee Park (Bioinformatics Research Team, Electronics and Telecommunications Research Institute, 161, Gajeong-dong, Daejeon, 305-350)
Abstract: To recognize named entities in bio-medical literature, appropriate semantic categories for a given domain must be determined. This paper proposes a method of selecting domain-specific semantic categories automatically using UMLS, without the non-trivial task of building domain knowledge. This method helps a named entity recognition system handle various domains effectively.
Poster B-14
A Corpus Tagging Tool and Rules for Biological Relation Events
Hyunchul Jang (Electronics and Telecommunications Research Institute); Hyun-Sook Lee (Electronics and Telecommunications Research Institute); Soo-Jun Park (Electronics and Telecommunications Research Institute); Seon-Hee Park (Electronics and Telecommunications Research Institute); Kyu-Chul Lee (Chungnam National University)
Abstract: We are creating a tagged corpus from MEDLINE abstracts to extract biological relationships. We are tagging named entities and their relation events. The categories of named entities and types of events cover most of UMLS semantic types. We made a tagging tool and defined tagging rules for this.
Poster B-15
Attack of the Clones: HL7 Clones vs. GenomicClones
Amnon Shabo (Shvo) (IBM Research Lab in Haifa)
Abstract: This poster presents HL7 standard specifications developed by the Clinical Genomics SIG. The core specification is the "Genotype" model which fuses bioinformatics markups into an HL7 schema, enabling the realization of the "Encapsulate & Bubble-up" paradigm and bridging the gap from genotypic to phenotypic clinical data.
Poster B-16
Structured Online Submission of Entity Relationships by Authors: Changing the Paradigm of Text Mining
Choon Kong Yap (Bioinformatics Institute); Sudhanshu Patwardhan (Bioinformatics Institute); Jagadish Hosagrahar Visvesvaraya (University of Michigan)
Abstract: Beyond Information Extraction, current literature mining tools fail to infer all relationships and derive accurate knowledge. Finer nuances of a paper can be captured and best represented only by the author. We propose structured online submissions of entity relationships by authors themselves, thus changing the paradigm of text mining.
Poster B-17
BIGRE: An Ontology Driven Bioinformatics Service Integration Environment
Olivier DUGAS (Université Libre de Bruxelles); Joseph MAVOR (Université Libre de Bruxelles); Pierre BUYLE (Faculté Universitaire Notre-Dame de la Paix); Quentin DALLONS (Faculté Universitaire Notre-Dame de la Paix); Amin MANTRACH (Université Libre de Bruxelles); Utku SALIHOGLU (Université Libre de Bruxelles); Hugues BERSINI (Université Libre de Bruxelles); Vincent ENGLEBERT (Faculté Universitaire Notre-Dame de la Paix); Marc COLET (Université Libre de Bruxelles)
Abstract: BIGRE is a distributed generic service integration framework based on service semantics that aims to solve the service interoperability problem. BIGRE uses ontologies describing service technical features and bioinformatics concepts. The Client, Mediators and Wrapper are designed for easy accessibility by end users and service integrators.
Poster B-18
Probe2GO: amplifying the GO annotation of Affymetrix probe sets
Enrique Muro (Ontario Genomics Innovation Centre, Ottawa Health Research Institute); Carolina Perez-Iratxeta (Ontario Genomics Innovation Centre, Ottawa Health Research Institute); Miguel Andrade (Ontario Genomics Innovation Centre, Ottawa Health Research Institute)
Abstract: We present and evaluate a strategy for amplifying the GO annotations of entries from a sequence database. We applied it to the probes of the Affymetrix gene expression microarrays. The amplified annotations and the evidence supporting the annotation are accessible from a web server.
Poster B-19
Discovering Biomedical Domain-specific Action Vocabularies for Targeted Literature Mining
Merine Thomas (Dept of Computer and Information Science, IUPUI); Mathew Palakal (Dept of Computer and Information Science, IUPUI); Sudhanshu Patwardhan (Bioinformatics Institute, A*STAR); Muralidharan Kannan (Dept of Computer and Information Science, IUPUI)
Abstract: Each domain in biomedicine has a set of verbs or actions that get used more frequently and in a unique pattern in that particular domain compared to other domains. Discovering such sub-vocabularies has potential impact on setting rules for literature mining and the study gives confidence in the hypothesis.
Poster B-20
(There will also be an oral presentation of this poster.)
Human-Mouse Anatomical Ontology Mapping: Terminological and Structural Support
Sarah Luger (University of Edinburgh, Edinburgh, Scotland); Stuart Aitken (University of Edinburgh, Edinburgh, Scotland); Bonnie Webber (University of Edinburgh, Edinburgh, Scotland)
Abstract: Exploiting discoveries in mouse at a systems level requires structuring mouse anatomical information in line with human anatomical information. We have found that we need to analyze both terminology and structure in order to support the alignment of anatomical ontologies between species, and propose automated methods for alignment.
Poster B-21
Extracting Genetic Pathways from Text and Grounding at the Spatio-Temporal Level
Gail Sinclair (University of Edinburgh); Bonnie Webber (University of Edinburgh); Duncan Davidson (Human Genetics Unit, Medical Research Council)
Abstract: In developmental biology, it is critical to link knowledge concerning genetic pathways with processes going on at cellular and tissue level. We are exploring methods of detecting and extracting information about such links from free text, by way of the description of events and their relations in space and time.
Poster B-22
Relating discrete annotation schemes in the functional space through literature analysis
Monica Chagoyen (Centro Nacional de Biotecnologia - CSIC); Carlos Oscar S. Sorzano (Escuela Politecnica Superior, Universidad San Pablo-CEU); Pedro Carmona-Saez (Centro Nacional de Biotecnologia - CSIC); Jose M. Carazo (Centro Nacional de Biotecnologia - CSIC); Alberto Pascual-Montano (Dpto. Arquitectura de Computadores. Universidad Complutense de Madrid)
Abstract: We propose a methodology to create functional similarity measurements from data annotations using conceptual featural representations obtained from the analysis of relevant literature. The literature contains our current state of knowledge regarding gene function. Therefore, it is a good source of data from which to establish functional associations.
Poster B-23
A Novel Sentence Clustering Approach for Functional Annotation of Gene Expression Clusters
Jeyakumar Natarajan (Bioinformatics Research Group, University of Ulster); Eric G. Bremer (Brain Tumor Research Program, Children's Memorial Hospital, and Feinberg School of Medicine, Northwestern University); Catherine DeSesa (SPSS, Inc, Chicago); Catherine J. Hack (Bioinformatics Research Group, University of Ulster); Werner Dubitzky (Bioinformatics Research Group, University of Ulster)
Abstract: Information on gene function was extracted from full text using natural language processing and sentence clustering and then used to interpret gene clusters identified in microarray data. Initial results have shown that the method effectively extracts information from full text; furthermore, this information could not be identified through analysis of abstracts alone.
Poster B-24
Nearest Neighbor Categorization for CASP Function Prediction
Karin Verspoor (Los Alamos National Laboratory); Judith Cohn (Los Alamos National Laboratory); Susan Mniszewski (Los Alamos National Laboratory); Cliff Joslyn (Los Alamos National Laboratory)
Abstract: We present methods for protein function prediction, represented by Gene Ontology (GO) annotations. We identify neighbors of input sequences, collect GO nodes associated with these neighbors in Swiss-Prot, and categorize GO nodes utilizing Gene Ontology Categorizer technology. The resulting nodes are interpreted as the function of the original sequence.
Poster B-25
Machete: Carving Out Paths to Knowledge in Bioscience Literature
Shannon Bradshaw (The University of Iowa); Marc Light (The University of Iowa)
Abstract: Biologists face the daunting task of organizing volumes of scientific information. After retrieval of relevant articles, useful passages must be located. Compounding the problem is that the same information is extracted time and again by many individuals. Machete is a system targeted at these problems in managing bioscience knowledge.
Poster B-26
Transforming Full-Text Literature to Formalized Facts
Qing Dong (Stanford University); Rob Nash (Stanford University); Nicholas Stover (Stanford University); Christopher Lane (Stanford University); Shuai Weng (Stanford University); Rama Balakrishnan (Stanford University); Karen Christie (Stanford University); Maria Costanzo (Stanford University); Kara Dolinski (Princeton University); Stacia Engel (Stanford University); Dianna Fisk (Stanford University); Jodi Hirschman (Stanford University); Eurie Hong (Stanford University); Cynthia Krieger (Stanford University); Rose Oughtred (Princeton University); Marek Skrzypek (Stanford University); Chandra Theesfeld (Stanford University); Gail Binkley (Stanford University); Stuart Miyasato (Stanford University); Anand Sethuraman (Stanford University); Mayank Thanawala (Stanford University); Rey Andrada (Stanford University); David Botstein (Princeton University); J. Michael Cherry (Stanford University)
Abstract: To extract formalized facts from scientific literature, SGD builds an automated pipeline to collect full-text documents. Most of the documents archived have been reviewed by scientific curators and can serve as a training set for text-mining algorithms. We implemented Textpresso, a vocabulary-based information retrieval and extraction system developed at WormBase.
Poster B-27
BLIMP: Biomedical Literature Mining Publications Forum; A Web-Based Resource
Hagit Shatkay (School of Computing, Queen's University, Ontario); Limin Zheng (School of Computing, Queen's University, Ontario)
Abstract: BLIMP is an online resource for compiling and sharing a complete bibliography on biomedical text mining. Bridging among the diverse research communities and publication venues, it holds hundreds of entries, features a search engine tailored for its scope, and supports submission of new items. (See http://blimp.cs.queensu.ca)
Poster B-28
Mining for Novel TNF Ligands using Unison, an Open Source Database for Target Discovery
Reece Hart (Genentech, Inc.)
Abstract: We describe Unison, our Open Source protein sequence and structure mining tool, and illustrate its applicability to mining for novel tumor necrosis factor ligands. Unison integrates protein threading, HMM and PSSM alignment, localization, transmembrane, signal sequence, GPI, and other predictions to enable complex, holistic mining queries and facilitate therapeutic target discovery.
Poster B-29
(There will also be an oral presentation of this poster.)
Hubs of Knowledge: using the functional link structure in Biozon to mine for biologically significant entities
Paul Shafer (Cornell University); Timothy Isganitis (Cornell University); Golan Yona (Cornell University)
Abstract: We describe a system that builds upon the complex infrastructure of Biozon and applies methods equivalent to Google's PageRank to rank documents that match queries. We explore different models and study the spectral properties of their data graphs. A working ranking system of biological entities is available at biozon.org.
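For readers unfamiliar with PageRank-style ranking, the following is a minimal sketch of the general idea on a toy graph. It is not the Biozon implementation: the node names, the damping value of 0.85, the iteration count, and the handling of nodes without outgoing links are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Generic power-iteration PageRank over a dict of node -> list of targets.

    Illustrative only; the parameters and the toy graph below are assumptions,
    not details taken from the poster.
    """
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node receives a uniform "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph of linked biological entities.
toy_graph = {
    "protein_A": ["family_X", "structure_S"],
    "protein_B": ["family_X"],
    "structure_S": ["family_X"],
    "family_X": ["protein_A"],
}

scores = pagerank(toy_graph)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph the entity with the most incoming links ranks first, which is the intuition behind treating highly linked entities as hubs.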
Poster B-30
GeneTegrate: a platform for integrating biology
Yanay Ofran (Department of Biochemistry and Molecular Biophysics, Columbia University); Guy Yachdav (Department of Biochemistry and Molecular Biophysics, Columbia University); Yechiam Yemini (Columbia University Center for Computational Biology and Bioinformatics (C2B2)); Sarah Gilman (Department of Computer Science, Columbia University); Burkhard Rost (Department of Biochemistry and Molecular Biophysics, Columbia University); Mark Treshock (Department of Computer Science, Columbia University)
Abstract: GeneTegrate provides a unifying semantic modeling layer to enrich, simplify and accelerate the analyses of distributed heterogeneous biological data. The system integrates many different levels of biological knowledge, from a single atom to a genetic network. More importantly, GeneTegrate displays the integrated data graphically, making the cognitive assimilation straightforward.
Poster B-31
MineOmics: Development of a Text Mining Tool that Provides Gene Information in Specific Disease and Biological Context
Matthew Tiller (Centers for Disease Control and Prevention); Eric Aslakson (Centers for Disease Control and Prevention); Suzanne Vernon (Centers for Disease Control and Prevention)
Abstract: Researchers are overwhelmed by voluminous high-throughput omic data. We introduce a pluggable text mining tool, called MineOmics. This tool utilizes statistical techniques and support vector machines to glean relevant information from electronic text repositories.
Poster B-32
Event Ontology: Biological ontology for annotating biological pathways and sub-pathways
Tatsuya KUSHIDA (Institute for Bioinformatics Research and Development, Japan Science and Technology Agency); Satoko YAMAMOTO (Institute for Bioinformatics Research and Development, Japan Science and Technology Agency); Takao ASANUMA (Institute for Bioinformatics Research and Development, Japan Science and Technology Agency); Emi HATTORI (Information and Mathematical Science Laboratory, Inc., Tokyo, JAPAN); Yuki YAMAGATA (Institute for Bioinformatics Research and Development, Japan Science and Technology Agency); Toshihisa TAKAGI (Graduate School of Frontier Sciences, University of Tokyo); Ken Ichiro FUKUDA (Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo JAPAN)
Abstract: Event Ontology is an ontology that organizes the terms of pathway objects such as sub-pathways, biological processes and experimental environments appearing in biological pathways (e.g., signal transductions, disease pathways, metabolic pathways, etc.). The terms in the Event Ontology were manually extracted from scientific articles and textbooks.
Poster B-33
Maximum Entropy Modeling of Recognizing Biomedical Named Entities on the Literatures
Jaesoo Lim (Bioinformatics Research Team, Electronics and Telecommunications Research Institute, 161, Gajeong-dong, Daejeon, 305-350); Hyun-Sook Lee (Bioinformatics Research Team, Electronics and Telecommunications Research Institute, 161, Gajeong-dong, Daejeon, 305-350); Hyunchul Jang (Bioinformatics Research Team, Electronics and Telecommunications Research Institute, 161, Gajeong-dong, Daejeon, 305-350); Soo-Jun Park (Bioinformatics Research Team, Electronics and Telecommunications Research Institute, 161, Gajeong-dong, Daejeon, 305-350); Seon-Hee Park (Bioinformatics Research Team, Electronics and Telecommunications Research Institute, 161, Gajeong-dong, Daejeon, 305-350)
Abstract: We suggest a Maximum Entropy model based on rich contextual features with a closed vocabulary. We evaluated the resulting system on the GENIA corpus in the same way as the Bio-Entity Recognition Task at BioNLP/NLPBA 2004. The system exhibited a recall and a precision of 0.6332 and 0.6014, respectively.
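The abstract reports recall and precision separately. As a reading aid, the sketch below combines them into the balanced F-score, a common single-number summary for named entity recognition. The F-score is not stated in the abstract; it is derived here purely by arithmetic from the two reported figures, and the function name is illustrative.

```python
def f_score(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision <= 0.0 and recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Values as reported in the abstract above.
reported_recall = 0.6332
reported_precision = 0.6014

f1 = f_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")
```

With the reported values this gives a balanced F-score of about 0.617.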
Poster B-34
(There will also be an oral presentation of this poster.)
Literature Data Mining and Protein Ontology Development at the Protein Information Resource (PIR)
Zhang-Zhi Hu (Protein Information Resource, Georgetown University Medical Center); Inderjeet Mani (Department of Linguistics, Georgetown University); Hongfang Liu (Department of Information System, University of Maryland); Vijay Shanker (Department of Computer and Information Science, University of Delaware); Vincent Hermoso (Protein Information Resource, Georgetown University Medical Center); Anastasia Nikolskaya (Protein Information Resource, Georgetown University Medical Center); Darren Natale (Protein Information Resource, Georgetown University Medical Center); Cathy Wu (Protein Information Resource, Georgetown University Medical Center)
Abstract: A literature mining resource iProLINK is developed to provide data sources for research on literature-based curation and protein ontology development. A rule-based system RLIMS-P is benchmarked and used to extract phosphorylation information from PubMed abstracts, and a family classification-based protein ontology developed to complement other ontologies.
Poster B-35
Text Mining of MEDLINE Abstracts
Venu Dasigi (Southern Polytechnic State University); Sirisha Kanda (Southern Polytechnic State University); Sham Navathe (Georgia Institute Of Technology)
Abstract: A database is being constructed for supporting text mining on MEDLINE abstracts. This paper focuses on the usage of the database. Creation of different views of the data for supporting different mining applications such as gene clustering, as well as practical issues relating to its size, are described.
Poster B-36
Towards a comprehensive catalog of gene-disease and gene-drug relationships in cancer.
Christine M.E. Schueller (Biomax Informatics AG); Andreas Fritz (Biomax Informatics AG); Eduardo Torres Schumann (Biomax Informatics AG); Karsten Wenger (Biomax Informatics AG); Kaj Albermann (Biomax Informatics AG); George A. Komatsoulis (National Cancer Institute Center for Bioinformatics (NCICB)); Peter A. Covitz (National Cancer Institute Center for Bioinformatics (NCICB)); Lawrence W. Wright (National Cancer Institute Office of Communications); Frank Hartel (National Cancer Institute Center for Bioinformatics (NCICB))
Abstract: The National Cancer Institute (NCI) partnered with Biomax in order to expand the cancer gene section of the NCI Thesaurus to its full extent. Linguistic text analysis of Medline as well as thorough manual annotation using various ontologies were applied to populate a reference terminology with biologically meaningful content.
Poster B-37
Prediction of Protein Function Across Gene Ontology Terms
Roman Eisner (University of Alberta); Alona Fyshe (University of Alberta); Russell Greiner (University of Alberta); Paul Lu (University of Alberta); Brandon Pearcy (University of Alberta); Brett Poulin (University of Alberta); Duane Szafron (University of Alberta); David Wishart (University of Alberta)
Abstract: We present a classification system which predicts the function of novel proteins. The predicted protein function(s) are terms within the Molecular Function aspect of Gene Ontology, which is a controlled vocabulary. We discuss our efforts to exploit the hierarchical structure of GO to increase our predictive accuracy and computational efficiency.
Poster B-38
Clustering of Pfam Protein Families in MeSH Space
Andreas Rechtsteiner (Los Alamos National Lab & Portland State University); Luis M Rocha (School of Informatics and Cognitive Science Program, Indiana University); Charlie E Strauss (Bioscience Division, Los Alamos National Lab)
Abstract: A large-scale, quantitative study of a literature mining algorithm for protein function prediction is presented. The test set is the Pfam protein sequence family classification. For 15200 proteins from 1611 Pfam families, the family is predicted based on the MeSH terms of the literature associated with the proteins.
Poster B-39
WFLOW: A Browser Based Web Services Workflow Editor
James Long (University of Alaska Fairbanks); Tom Marr (University of Alaska Fairbanks)
Abstract: Workflows may be built using web services as building blocks. WFLOW is a browser-based workflow editor that uses Tigra Tree Menu, Graphviz, and gSOAP to build, display, and invoke web services workflows on our bioinformatics cluster. Future work will incorporate an ontology for our WSDL to constrain graph semantics.
Poster B-41
Mutation Miner
Christopher Baker (Concordia University); René Witte (Universitaet Karlsruhe)
Abstract: Transfer of mutation-specific raw-text annotations to protein structures requires an algorithm that can integrate natural language processing, database queries, sequence retrieval, sequence alignment and residue mapping. A multi-component system is described for this purpose. We evaluate the use of text mining to drive protein structure visualization, providing the protein engineer with enhanced access to the knowledge reported by multiple investigators.
Poster B-42
Quantitative Assessment for Relationship between Sequence Similarity and Function Similarity
Trupti Joshi (Digital Biology Laboratory, University of Missouri-Columbia, Columbia, MO); Dong Xu (Digital Biology Laboratory, University of Missouri-Columbia, Columbia, MO, USA)
Abstract: To quantify assignment errors in gene function prediction using comparative sequence analysis, we studied the relationship between sequence similarity and function similarity in terms of the three aspects of Gene Ontology (biological process, molecular function, and subcellular localization). Our study provides a benchmark to estimate the confidence in function assignment.
Poster B-43
Wnt Pathway Analysis with Automated Natural Language Processing
Carlos Santos (University of Michigan / Bioinformatics Program); Daniela Eggle (University of Michigan / Bioinformatics Program); David States (University of Michigan / Bioinformatics Program)
Abstract: We present a natural language processing pipeline to extract and analyze biomolecular interaction assertions from biomedical text. Focusing on the Wnt signaling pathway, the pipeline expands the existing canonical reference pathway with interaction assertions relating to that pathway, as well as renders various organism-specific variations upon the canonical pathway.
Poster B-44
Improved Order Theoretical Techniques for GO Functional Annotation
Cliff Joslyn (LANL); Susan Mniszewski (LANL); Karin Verspoor (LANL); Judith Cohn (LANL)
Abstract: We present new order theoretical advances for the POSOC categorization algorithm applied to functional annotation: a pseudo-distance measure based on discrete Markov processes; an interval-valued rank measure in terms of vertical level in the GO; and order theoretical measures of horizontal distance based on so-called "fence" measures.
Poster B-45
Ontology-based pattern identification - a novel algorithm for gene function prediction
Yingyao Zhou (Genomics Institute of the Novartis Research Foundation); Jason Young (The Scripps Research Institute); Andrey Santrosyan (Genomics Institute of the Novartis Research Foundation); Kaisheng Chen (Genomics Institute of the Novartis Research Foundation); Frank Yan (Genomics Institute of the Novartis Research Foundation); Elizabeth Winzeler (Genomics Institute of the Novartis Research Foundation)
Abstract: Ontology-based Pattern Identification (OPI) is a novel data-mining algorithm that predicts gene function based on expression data and gene ontology. Instead of relying on a universal threshold of expression similarity, OPI automatically determines the optimal analysis settings that yield gene lists with highest statistical significance for function prediction.
Poster B-46
Semantic Model of NCI Thesaurus: Representing Genes and Alleles
Sherri de Coronado (National Cancer Institute Center for Bioinformatics); Gilberto Fragoso (National Cancer Institute Center for Bioinformatics); Francis Hartel (National Cancer Institute Center for Bioinformatics); Dan Lyman (IMC); Ranjana Srivastava (IMC)
Abstract: We present a vocabulary model of gene alleles developed for the NCI Thesaurus to satisfy various user needs for genes, alleles, diseases, drugs and related concepts including semantic relationships among gene classes, wild type and allelic variants, fusion genes, and oncogenes with related domains such as diseases, pathways and processes.
Poster B-47
Extraction and analysis of protein functional links from MEDLINE abstracts
Nikolai Daraselia (Ariadne Genomics, Inc); Sergei Egorov (Ariadne Genomics, Inc); Anton Yuryev (Ariadne Genomics, Inc); Andrey Yazhuk (Ariadne Genomics, Inc)
Abstract: We describe MedScan, a completely automated information extraction system based on full sentence parsing. MedScan is tailored towards protein function information extraction, and was used to extract about 500,000 protein functional links from the 2004 release of MEDLINE. A simple statistical analysis of the extracted data is presented.
Poster B-48
Extensions of the Gene Ontology in the Mouse Genome Informatics system
David P Hill (The Jackson Laboratory); Harold J Drabkin (The Jackson Laboratory); Mary E Dolan (The Jackson Laboratory); Li Ni (The Jackson Laboratory); Alexander D Diehl (The Jackson Laboratory); Christopher Hollenbeck (Rensselaer Polytechnic Institute); Joel E Richardson (The Jackson Laboratory); James A Kadin (The Jackson Laboratory); Judith A Blake (The Jackson Laboratory)
Abstract: The Mouse Genome Informatics (MGI) resource provides extensive information about laboratory mouse biology. The Gene Ontology (GO) is incorporated into MGI and used for functional annotations. GO advances within the MGI resource include user-friendly auto-text summaries, GO data analysis and visualization tools, extended annotation sets, and fully-integrated queries.
Poster B-49
Crossing of Subdiscipline Boundaries in the Biomedical Literature Explosion
Andrew Dolbey (Center for Computational Pharmacology, UCHSC); Lawrence Hunter (Center for Computational Pharmacology, UCHSC)
Abstract: In this poster, we show a case of subdiscipline boundary crossings in biomedical literature. The Medline citations for a single gene were collected. The spread of these citations across journals is tabulated, and then a graph of their distribution across subspecializations is shown.
Poster B-50
An autonomous web service sequence analysis agent
Ayton Meintjes (Bioinformatics Unit, University of Pretoria); Fourie Joubert (Bioinformatics Unit, University of Pretoria)
Abstract: This poster describes the development of an automated software agent which accepts a sequence of interest and then queries various data sources for similar or related sequences. Metadata relating to these sequences are then retrieved, and a subset of the results most likely to be of interest is then further investigated.
Poster B-51
The Challenge of Phenotype Data: developing methods to access mouse phenotypes and human associations
Janan Eppig (The Jackson Laboratory); Howard Dene (The Jackson Laboratory); Susan Bello (The Jackson Laboratory); Megan Cassell (The Jackson Laboratory); Donna Burkart (The Jackson Laboratory); Ira Lu (The Jackson Laboratory); Linda Washburn (The Jackson Laboratory); Monika Tomczuk (The Jackson Laboratory); Anna Anagnostopoulos (The Jackson Laboratory); Cynthia Smith (The Jackson Laboratory)
Abstract: The mouse is the premier organism used as a model to study human biology and disease. The Mouse Genome Informatics (MGI) program is developing the Mammalian Phenotype Ontology for annotating mouse phenotypic data and providing integrated access to these data in human readable and computationally tractable formats.
Pathways, Networks and Proteomics
Poster C-1
Role of c-jun N-terminal MAP Kinase in rF1 induced activation of murine peritoneal macrophages
Rajesh Sharma (School of Biotechnology, Banaras Hindu University); Ajit Sodhi (School of Biotechnology, Banaras Hindu University); H. V Batra (Division of Microbiology, DRDE)
Abstract: Fraction 1 of Yersinia pestis activated JNK MAP kinase. SP600125 inhibited the JNK phosphorylation. The rF1-induced JNK activity was correlated with the inhibition of NO caused by SP600125 in the rF1-treated macrophages. Taken together, the data suggest the involvement of the JNK pathway in rF1-induced activation of macrophages.
Poster C-2
Bioinformatics tools to encode and integrate microscopy time-lapse sequences for drug discovery: Lineage analysis the basis for novel cell-based assays
Imtiaz Khan (Cardiff University); Lee Campbell (Cardiff University); Paul Smith (Cardiff University); Rachel J Errington (Cardiff University)
Abstract: Exploiting the potential for pharmacological modulation of tumours is a key goal for drug discovery. We describe here novel bioinformatics tools for encoding cell behaviour derived from time-lapse microscopy. The strategy is to inform mathematical models capable of predicting in silico drug signatures for use in screening and therapeutics.
Poster C-3
A spatial resolution model for actin polymerization and fusion of lysosome phagosome.
Juilee Thakar (Department of Bioinformatics, University of Würzburg); Mark Kühnel (European Molecular Biology Laboratory); Gareth Griffiths (European Molecular Biology Laboratory); Thomas Dandekar (Department of Bioinformatics, University of Würzburg)
Abstract: Some pathogens, for example Mycobacterium tuberculosis, inhibit fusion of the phagosome and lysosome, leading to their survival in the phagosome. We developed a model to study the formation of actin filaments on the phagosome membrane, which in turn leads to fusion of the phagosome with the lysosome, in order to understand the critical steps in the process.
Poster C-4
Quantifying the relevance of different mediators in the human immune cell network
Paolo Tieri (Dept. Exp. Pathology Università di Bologna); Silvana Valensin (C.I.G. Università di Bologna); Vito Latora (Dept. Physics Università di Catania); Gastone Castellani (C.I.G. Università di Bologna); Daniel Remondini (C.I.G. Università di Bologna); Massimo Marchiori (W3C MIT Lab for Computer Science); Claudio Franceschi (C.I.G. Università di Bologna)
Abstract: Immune cells communicate through secreted mediator proteins. We present a method for quantifying the relevance of these mediators in an immune network where cells are nodes and mediators are the connecting links. Our results reveal that few mediators play a prominent role in the interactions among the immune cell types.
Poster C-5
Fusion of Multiple Decision Models in Proteomic Biomarker Discovery.
Asha Thomas (University of Louisville - Department of Computer Engineering and Computer Science); Georgia Tourassi (University of Louisville - Department of Computer Engineering and Computer Science); Adel Elmaghraby (University of Louisville - Department of Computer Engineering and Computer Science); Nigel G Cooper (University of Louisville - Anatomical Sciences and Neurobiology); Sumanth D Prabhu (University of Louisville - Medicine); Saeed A Jortani (University of Louisville - Pathology and Laboratory Medicine); Roland Valdes Jr (University of Louisville - Pathology and Laboratory Medicine)
Abstract: We investigated the feasibility of combining linear and nonlinear decision models to improve the discriminatory performance for protein mass spectrometry data of heart failure patients. The results obtained show that the fusion of multiple decision models is a promising approach in proteomic data analysis.
Poster C-7
Comprehensive Network Analysis of Glaucoma Pathophysiology
Yuri Nikolsky (GeneGo Inc.); Tatiana Nikolskaya (GeneGo, Inc.); Eugene Kirillov (GeneGo, Inc.); Eugene Rakhmatulin (GeneGo, Inc.); Svetlana Sorokina (GeneGo, Inc.); Tatiana Serebrijskaya (GeneGo, Inc.); Sean Ekins (GeneGo, Inc.); Andrej Bugrim (GeneGo, Inc); Dmitri Novikov (University of Illinois); Valery Shestopalov (University of Miami); Robert Haselkorn; Dmitry Ivanov; Vadim Brodianskir; Olga A. Agapova; M. Rosario Hernandez
Abstract: We developed a general approach for assembly, prioritization and analysis of biological networks implicated in complex diseases using heterogeneous experimental datasets and known human protein interactions. Specifically, we studied the networks affected in optic nerve head astrocytes in primary open angle glaucoma based on microarray gene expression and genetic data.
Poster C-8
Conditional network analysis: exploring network dynamics and identifying key modulator genes from gene expression data
Kai Wang (Joint Centers for Systems Biology, Columbia University); Nilanjana Banerjee (Joint Centers for Systems Biology, Columbia University); Adam Margolin (Joint Centers for Systems Biology, Columbia University); Ilya Nemenman (Joint Centers for Systems Biology, Columbia University); Katia Basso (Institute of Cancer Genetics, Columbia University); Riccardo Dalla-Favera (Institute of Cancer Genetics, Columbia University); Andrea Califano (Joint Centers for Systems Biology, Columbia University)
Abstract: We develop a systematic approach for identifying key modulators of transcriptional interactions. By reverse-engineering thousands of cellular networks conditioned on the expression of candidate modulator genes, we identify putative modulators which cause statistically significant changes in network topology. These modulators are enriched in GO categories involved in cellular regulation.
Poster C-9
GASP: GC/MS Analysis Software Package
Paulo Augusto Suano Nuin (McMaster University, Dept of Biology); Elizabeth Weretilnyk (McMaster University, Dept of Biology); Peter Summers (McMaster University, Dept of Biology); David Guevara (McMaster University, Dept of Biology); Brian Golding (McMaster University, Dept of Biology)
Abstract: The GC/MS Analysis Software Package (GASP) allows for the comparison of data between different GC/MS experiments, and makes possible a comparative analysis of all chromatographically separated compounds between different GC/MS runs.
Poster C-10
Reconstruction of Genetic Regulatory Networks Using the Network Inference Testbed Software Environment
Ronald Taylor (Pacific Northwest National Laboratory (US Dept of Energy)); William Cannon (Pacific Northwest National Laboratory (US Dept of Energy))
Abstract: The Network Inference Testbed (NIT) is being created at Pacific Northwest National Laboratory as an interactive environment for the evaluation of algorithms used in the reconstruction of the structure of regulatory networks. The NIT compares and trains genetic network inference methods on artificial networks and simulated gene expression perturbation data.
Poster C-11
PATIKA: An informatics infrastructure for cellular networks
Emek Demir (Bilkent University); Asli Ayaz (Bilkent University); Ozgun Babur (Bilkent University); Ahmet Cetintas (Bilkent University); Ugur Dogrusoz (Bilkent University); Emine Zeynep Erson (Bilkent University); Erhan Giral (Bilkent University); Cagri Aksay (Bilkent University); Fatma Arik (Bilkent University); Esra Ataer (Bilkent University); E. Belviranli; R. Colak; G. Cozen; A. Dilek; E. Kaya; H. Yildirim
Abstract: The PATIKA Project aims to build an informatics infrastructure that copes with inherently complex cellular pathway data. It provides software tools with sophisticated visualization technology around a central database, using an extensive ontology and data integration mechanisms. It also features advanced database querying, microarray data analysis, and automatic layout components.
Poster C-12
Discovery of biological networks from diverse functional genomic data
Chad Myers (Princeton University); Drew Robson (Princeton University); Adam Wible (Princeton University); Chandra Theesfeld (Saccharomyces Genome Database); Kara Dolinski (Princeton University); Olga Troyanskaya (Princeton University)
Abstract: We have developed a general system for discovery of biological pathways from diverse functional genomic data. Our methodology employs a Bayesian network to integrate 9 different types of evidence for protein relationships from over 950 publications and a graph search algorithm designed for recovering functionally coherent groups of proteins.
Poster C-13
A Sequence-Based Characterization Of Human Proteins Localized Within Cell Compartments
George Acquaah-Mensah (Massachusetts College of Pharmacy and Health Sciences); Sonia Leach (University of Colorado School of Medicine); Cary Miller (University of Colorado School of Medicine)
Abstract: Exploratory Data Analysis was used to characterize amino acid sequences of human proteins in sub-cellular localizations. Based on hydrophobicity, polarity, polarizability, normalized van der Waals volume and charge, descriptions of amino acid composition, transitions and distribution for proteins localized in the nucleus, cytosol, plasma membrane and mitochondrion are provided.
Poster C-14
Combining Alignment and N-grams in G-Protein Coupling Specificity Prediction
Betty Cheng (Language Technologies Institute, School of Computer Science, Carnegie Mellon University); Jaime Carbonell (Language Technologies Institute, School of Computer Science, Carnegie Mellon University); Judith Klein-Seetharaman (Department of Pharmacology, University of Pittsburgh Medical School)
Abstract: Understanding the signalling mechanism of G-protein coupled receptors requires knowledge of the G-proteins a given receptor can couple with. By combining n-grams and alignment information and using the whole receptor instead of focusing on the intracellular regions as in previous studies, our coupling specificity prediction method outperforms the current state-of-the-art.
Poster C-15
(There will also be an oral presentation of this poster.)
Multiple Knockouts Analysis of Genetic Robustness in the Yeast Metabolic Network
David Deutscher (Tel Aviv University); Isaac Meilijson (Tel Aviv University); Eytan Ruppin (Tel Aviv University)
Abstract: Genetic robustness, a constant phenotype in the face of genetic perturbations, is widespread in biological systems. By analyzing the results of multiple concurrent knockouts of the metabolic genes of S. cerevisiae, we provide the first large-scale study of metabolic network robustness, portraying its architecture and shedding new light on its evolution.
Poster C-16
Conversion of CellML into SBML
Maria Schilstra (Biocomputation Research Group, STRI, University of Hertfordshire); Joanne Matthew (Biocomputation Research Group, STRI, University of Hertfordshire); Michael Hucka (Control and Dynamical Systems, Caltech); Andrew Finney (Biocomputation Research Group, STRI, University of Hertfordshire)
Abstract: SBML and CellML are XML-based standard languages for describing biochemical models, with comparable scope and representation. We have created a CellML to SBML transformation tool (in XSLT) that is capable of converting 95% of the models in the CellML model repository into valid SBML without loss of information.
Poster C-17
Method for quantitation of MS-peaks from 18O/16O labeled phospho-peptides
Claus A. Andersen (Siena Biotech SpA, Discovery Research); Stefano Gotta (Siena Biotech SpA, Discovery Research); Roberto Raggiaschi (Siena Biotech SpA, Discovery Research); Andreas Kremer (Siena Biotech SpA, Discovery Research); Letizia Magnoni (Siena Biotech SpA, Discovery Research); Georg C. Terstappen (Siena Biotech SpA, Discovery Research)
Ab