Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

Posters - Schedules

Posters at ISMB/ECCB 2021 will be presented virtually. Authors will pre-record their poster talk (5-7 minutes) and upload it to the virtual conference platform, along with a PDF of their poster, beginning July 19 and no later than July 23. All registered conference participants will have access to the posters and presentations throughout the conference, and to the content until October 31, 2021. Q&A is available through a chat function, and poster presenters can schedule small group discussions with up to 15 delegates during the conference.

Information on preparing your poster and poster talk is available at: https://www.iscb.org/ismbeccb2021-general/presenterinfo#posters

Ideally authors should be available for interactive chat during the times noted below:

Session A: Sunday, July 25 between 15:20 - 16:20 UTC
Session B: Monday, July 26 between 15:20 - 16:20 UTC
Session C: Tuesday, July 27 between 15:20 - 16:20 UTC
Session D: Wednesday, July 28 between 15:20 - 16:20 UTC
Session E: Thursday, July 29 between 15:20 - 16:20 UTC
A Machine Learning approach for pre-miRNA discovery in SARS-CoV-2
COSI: COVID-19
  • Gabriela Merino, SINC-CONICET-FICH-UNL/IIB-UNER, Argentina
  • Leandro Bugnon, SINC-CONICET-FICH-UNL, Argentina
  • Jonathan Raad, SINC-CONICET-FICH-UNL, Argentina
  • Federico Ariel, IAL-CONICET-UNL, Argentina
  • Diego Milone, SINC-CONICET-FICH-UNL, Argentina
  • Georgina Stegmayer, SINC-CONICET-FICH-UNL, Argentina

Short Abstract: We have developed a novel approach based on machine learning (ML) for identifying precursors of microRNAs (pre-miRNAs) in the genome of the novel coronavirus SARS-CoV-2. The discovery of miRNAs in the novel virus is of high importance in the context of the current health crisis for the improvement of diagnostic and treatment strategies. For the discovery of pre-miRNAs, three ML methods were used in combination: a novel deep convolutional neural network (mirDNN), a deep self-organizing map (deeSOM), and a one-class support vector machine (OC-SVM). Each method provided a list of candidate pre-miRNAs in the viral genome, each supported by a score. In this study, pre-miRNAs were identified as those having scores in the top 10th percentile in all methods. With this approach, 12 candidate structures were discovered in the viral genome and validated with small RNA-seq data. The expression of 8 mature miRNA-like sequences was confirmed in SARS-CoV-2-infected human cells. The predicted miRNAs were found to target a subset of human genes, of which 109 are transcriptionally deregulated upon infection; 28 of those genes are down-regulated in infected human cells and related to respiratory diseases and viral infection, previously associated with SARS-CoV-1 and SARS-CoV-2.
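
The consensus rule described in this abstract (keep only candidates that score in the top tier of every method) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the dictionary layout and the toy scores are assumptions made for illustration, and the three real scorers (mirDNN, deeSOM, OC-SVM) are replaced by plain score tables.

```python
import math


def top_fraction(scores, fraction=0.10):
    """Return the ids whose score lies in the top `fraction` of `scores`.

    `scores` maps a candidate id to a numeric score (higher is better).
    At least one candidate is always kept.
    """
    if not scores:
        return set()
    keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:keep])


def consensus_candidates(score_tables, fraction=0.10):
    """Keep only candidates that are top-ranked by every method."""
    tops = [top_fraction(table, fraction) for table in score_tables]
    return set.intersection(*tops) if tops else set()


# Hypothetical scores from three methods for the same 20 hairpin candidates.
method_a = {f"hp{i}": i / 20 for i in range(20)}
method_b = {f"hp{i}": 1.0 - abs(i - 18.5) / 20 for i in range(20)}
method_c = {f"hp{i}": (i % 10) / 10 for i in range(20)}

print(sorted(consensus_candidates([method_a, method_b, method_c])))
```

Taking the intersection rather than the union trades recall for precision: a candidate must convince all methods, which suits a setting where every prediction is followed by costly experimental validation.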

A Machine Learning Model for Predicting Deterioration of COVID-19 Inpatients
COSI: COVID-19
  • Omer Noy, Tel-Aviv University, Israel
  • Dan Coster, Tel-Aviv University, Israel
  • Maya Metzger, Tel-Aviv University, Israel
  • Itai Attar, Tel-Aviv University, Israel
  • Shani Shenhar-Tsafraty, Tel-Aviv Sourasky Medical Center, Tel-Aviv University, Israel
  • Shlomo Berliner, Tel-Aviv Sourasky Medical Center, Tel-Aviv University, Israel
  • Galia Rahav, Sheba Medical Center, Tel-Aviv University, Israel
  • Ori Rogowski, Tel-Aviv Sourasky Medical Center, Tel-Aviv University, Israel
  • Ron Shamir, Tel-Aviv University, Israel

Short Abstract: The COVID-19 pandemic has been spreading worldwide since December 2019, presenting an urgent threat to global health. Due to the limited understanding of disease progression and of the risk factors for the disease, it is a clinical challenge to predict which hospitalized patients will deteriorate. Moreover, several studies suggested that taking early measures for treating patients at risk of deterioration could prevent or lessen condition worsening and the need for mechanical ventilation. We developed a predictive model for early identification of patients at risk for clinical deterioration by analyzing electronic health records of COVID-19 inpatients at the two largest medical centers in Israel. Our model employs machine learning methods and uses routine clinical features. Deterioration was defined as a high NEWS2 score adjusted to COVID-19. In prediction of deterioration within the next 7-30 hours, the model achieved an area under the ROC curve of 0.84 and area under the precision-recall curve of 0.74. It achieved values of 0.76 and 0.7 respectively in external validation on data from a different hospital.
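
For readers unfamiliar with the area under the ROC curve quoted above, the following is a minimal pure-Python sketch of how that metric can be computed from labels and predicted risk scores. It uses the pairwise-ranking definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one). The function name and the toy data are assumptions for illustration and have nothing to do with the study's patient data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds 1 for patients who deteriorated and 0 otherwise;
    `scores` holds the predicted risk for the same patients.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: four patients, two of whom deteriorated.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The quadratic loop is fine for a sketch; for large cohorts a rank-based formulation or an established library implementation would be the usual choice.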

A multi-modal data harmonisation approach for the discovery of COVID-19 drug targets
COSI: COVID-19
  • Tyrone Chen, Monash University, Australia
  • Melcy Philip, Monash University, Australia
  • Kim-Anh Lê Cao, The University of Melbourne, Australia
  • Sonika Tyagi, Monash University, Australia

Short Abstract: Despite the volume of experiments performed and data available, the complex biology of the coronavirus SARS-CoV-2 is not yet fully understood. Existing molecular profiling studies have focused on analysing a single omics data type, which captures changes in a small subset of the molecular perturbations caused by the virus. As the logical next step, results from multiple such omics analyses may be integrated in parallel to highlight the interrelationships of disease-driving biomolecules. We demonstrate that valuable information may be masked by using the former fragmented views in analysis, and biomarkers resulting from such an approach cannot fully reveal disease etiology. Hence, we present a reproducible and open-access data harmonisation R package that can be scaled to future multiomics analyses to study a phenotype in a holistic manner. To demonstrate the effectiveness of our pipeline, we applied it to a drug screening task. We integrated multiomics data to find the low-level statistical associations between data features in two case studies. Strongly correlated features within each of these two datasets were used for drug-target analysis, resulting in a list of 84 drug-target candidates, including 7 high confidence targets, amsacrine, bosutinib, ceritinib, crizotinib, nintedanib and sunitinib as potential starting points for drug development.

Accelerating phylogenetic tree optimization using efficient placement heuristics
COSI: COVID-19
  • Yatish Turakhia, University of California, Santa Cruz, United States
  • Cheng Ye, University of California, San Diego, United States
  • Bryan Thornlow, University of California, Santa Cruz, United States
  • Jakob McBroome, University of California, Santa Cruz, United States
  • Angie Hinrichs, University of California, Santa Cruz, United States
  • Nicola De Maio, EBI-EMBL, United States
  • Nick Goldman, EBI-EMBL, United States
  • David Haussler, University of California, Santa Cruz, United States
  • Russell Corbett-Detig, University of California, Santa Cruz, United States

Short Abstract: The unprecedented accumulation of SARS-CoV-2 sequencing data has completely overwhelmed our current phylogenetic inference and interpretation tools. With tens of thousands of new sequences being deposited to online databases every day on top of the existing million-plus sequences, phylogenetic placement tools, such as UShER, seem to provide the only practical approach to maintain and update a comprehensive global phylogeny for studying viral evolution and transmission dynamics. However, such sequential addition of new samples onto existing trees can result in sub-optimal tree structures, which worsen gradually as more sequences are incorporated. This necessitates a periodic re-optimization of the tree, but prior tree optimization tools are inadequate to meet the enormous computational demands of the SARS-CoV-2 data. In this talk, I will present a novel tree optimization heuristic that adapts UShER's highly-optimized placement module and data structures to optimize the tree for maximum parsimony using lazy SPR (subtree pruning and regrafting) moves. Our preliminary results on SARS-CoV-2 phylogenies suggest that this heuristic performs competitively with prior tools in optimizing the tree for maximum parsimony, and is the only available tool that can handle the current scale and new influx of SARS-CoV-2 data. At UCSC, we are now using this tool to periodically optimize the SARS-CoV-2 global phylogenetic tree that we maintain and share publicly.
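
To make the optimization target concrete, here is a small Python sketch of the textbook Fitch algorithm, which counts the minimum number of mutations a tree needs to explain one alignment column. It is not UShER's implementation and does not perform SPR moves. The nested-tuple tree encoding, the function name and the sample names are assumptions for illustration. The sketch only shows that moving a subtree can change the parsimony score, which is the quantity an SPR-based optimizer tries to lower.

```python
def fitch_score(tree, leaf_states):
    """Minimum number of state changes for one site on a rooted binary tree.

    `tree` is either a leaf name (str) or a pair (left_subtree, right_subtree).
    `leaf_states` maps each leaf name to its nucleotide at the site.
    """
    def visit(node):
        if isinstance(node, str):
            return {leaf_states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        shared = left_set & right_set
        if shared:
            # Children agree on at least one state: no extra mutation needed.
            return shared, left_cost + right_cost
        # Children disagree: one mutation is required on this split.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical site pattern for four samples.
states = {"s1": "A", "s2": "A", "s3": "G", "s4": "G"}

scattered = (("s1", "s3"), ("s2", "s4"))   # like states are split apart
regrafted = (("s1", "s2"), ("s3", "s4"))   # like states are grouped together

print(fitch_score(scattered, states), fitch_score(regrafted, states))
```

A full optimizer would sum this score over all sites and accept a prune-and-regraft move only when the total decreases.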

Adaptive sentinel testing in workplace for COVID19 pandemic
COSI: COVID-19
  • R. Krishna Murthy Karuturi, The Jackson Laboratory, United States
  • Edison Liu, The Jackson Laboratory, United States
  • Yi Li, The Jackson Laboratory, United States
  • Joshy George, The Jackson Laboratory, United States

Short Abstract: Employee testing and isolation is one of the critical strategies for creating a safe workplace during the pandemic for the majority of organizations. Adaptive testing frequency reduces cost while keeping the epidemic under control in the workplace. Important differentiating characteristics of the workplace are that employee exposure to COVID19 could differ from that of the local community, that there is potential dual exposure to COVID19 at work and at home, and that transmission at the workplace can be tracked accurately. We developed a bi-modal SEIR model and an R-shiny tool to estimate testing frequency using population incidence, risks of acquiring infection from community and workplace, workforce size, and sensitivity of testing. Simulations revealed that employee adherence to protective measures (e.g., masking, social distancing, and avoiding crowds) and minimizing the number of onsite employees have large effects on testing frequency, followed, to a lesser extent, by reducing the workplace transmission rate through workplace mitigation protocols and by higher sensitivity of the test deployed. While offering logistical feasibility, sentinel testing, compared to cohort testing, leads to only a marginal increase in the number of infections at high community incidence rates. We applied our model to estimate testing frequency for all campuses of The Jackson Laboratory and show that our model accurately guides the testing regimen.
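
The idea of a workforce exposed to two infection sources, with periodic testing removing infectious employees, can be sketched as a simple discrete-time SEIR simulation. This is an illustrative toy and not the authors' bi-modal model or their R-shiny tool: the function name, all parameter values, the daily time step and the assumption that every infectious employee is tested on a test day are hypothetical.

```python
def simulate_workplace_seir(n_employees, days, beta_work, community_rate,
                            sigma=1 / 5, gamma=1 / 7,
                            test_interval=None, sensitivity=0.9):
    """Deterministic daily-step SEIR for a workforce with two exposure routes.

    beta_work      : workplace transmission rate per day (assumed value)
    community_rate : daily probability of infection outside work (assumed)
    sigma, gamma   : rates of leaving the exposed and infectious states
    test_interval  : test everyone every `test_interval` days (None = never)
    sensitivity    : fraction of infectious employees a test round detects
    """
    s, e, i, r = float(n_employees), 0.0, 0.0, 0.0
    cumulative = 0.0
    for day in range(1, days + 1):
        # Force of infection combines workplace and community exposure.
        force = beta_work * i / n_employees + community_rate
        new_exposed = min(s, force * s)
        new_infectious = sigma * e
        new_recovered = gamma * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        cumulative += new_exposed
        if test_interval and day % test_interval == 0:
            # Detected employees are isolated, modelled here as removal.
            detected = sensitivity * i
            i -= detected
            r += detected
    return {"S": s, "E": e, "I": i, "R": r,
            "cumulative_infections": cumulative}


for interval in (None, 14, 7, 3):
    result = simulate_workplace_seir(1000, 90, beta_work=0.2,
                                     community_rate=0.0005,
                                     test_interval=interval)
    print(interval, round(result["cumulative_infections"], 1))
```

Sweeping `test_interval` in this way mirrors, in a very reduced form, the question the abstract poses: how often testing is needed for a given community incidence and workplace transmission rate.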

Bayesian detection and uncertainty quantification of change points of COVID-19 cases in the Midwest: Timeliness of non-pharmaceutical interventions.
COSI: COVID-19
  • Alessandro Maria Selvitella, Purdue University Fort Wayne, United States
  • Kathleen Lois Foster, Ball State University, United States

Short Abstract: In this work, we study the time evolution of COVID-19 and in particular its changes with respect to non-pharmaceutical interventions (e.g., lockdowns, social distancing, face masks, stay-at-home orders, and many others). We will concentrate in particular on understanding the relationship between qualitative changes in the curve of COVID-19 cases in the Midwest and two government policy orders ("Face Mask" and "Stay at Home") by using Bayesian detection and uncertainty quantification of the first change point of the COVID-19 case curve. We found evidence that there were qualitative rate changes in the diffusion of COVID-19 before the "Stay at Home" and "Face Mask" orders were implemented in all Midwest states except Illinois. This calls for possibly quicker governmental actions in those states.

Big Data Analytics and Visualization for COVID-19 Intervention Studies
COSI: COVID-19
  • Aditya Rao, Tata Consultancy Services Ltd, India
  • Rajgopal Srinivasan, Tata Consultancy Services Ltd, India
  • Vangala Govindakrishnan Saipradeep, Tata Consultancy Services Ltd, India
  • Thomas Joseph, Tata Consultancy Services Ltd, India
  • Sujatha Kotte, Tata Consultancy Services Ltd, India
  • Naveen Sivadasan, Tata Consultancy Services Ltd, India

Short Abstract: The COVID-19 pandemic has led to a massive and collective pursuit by the research community to find effective diagnostics, drugs and vaccines guided by useful information from literature. We applied text-mining on MEDLINE abstracts and the CORD-19 corpus to extract a rich set of pair-wise correlations between various biomedical entities. We built a comprehensive pair-wise entity association network involving 15 different entity types using both text-mined associations as well as novel associations obtained using link prediction. The resulting network also contains a specialized COVID-19 subnetwork that provides a network view of COVID-19 related literature. We built a UI to explore the information captured in the correlation network. CoNetz consisted of pairwise associations involving ~174,000 entities covering 15 different entity types. The specialized COVID-19 subnetwork consisted of ~7.8 million pair-wise associations involving ~43,000 entities.

Chest X-ray Classification for Detecting COVID-19 Using Convolutional Neural Network
COSI: COVID-19
  • S Suba, International Institute of Information Technology, Hyderabad, India
  • Nita Parekh, International Institute of Information Technology, Hyderabad, India

Short Abstract: The world is facing the most difficult challenge of this century with the COVID-19 pandemic. Despite all measures taken, including vaccination, the respiratory disease is striking with higher infection and fatality rates in many countries. Hospitals and health departments are unable to handle the surge in demand for life-sustaining instruments and facilities such as ventilators and oxygen supply. Faster diagnosis and patient prioritization are the need of the hour, and chest X-rays (CXRs) have shown promise. Since portable X-ray machines are cheap and can easily be taken to the point of care, new methods to identify COVID-19 from CXRs can be helpful in such situations. Here we propose a simple Convolutional Neural Network (CNN) approach for classifying CXRs into Covid and Normal categories. With an accuracy of 98%, a recall of 99% and a precision of 97.05%, the performance of the model is comparable to that of state-of-the-art deep learning networks. It also has far fewer parameters (~2M) than most deep networks (over 10M). This makes the proposed model deployable on portable devices such as smartphones and usable in clinical settings. The dataset used for training the model is, to our knowledge, the largest openly available resource.
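
As a reminder of how the three reported metrics relate to each other, the short Python sketch below derives accuracy, precision and recall from the four cells of a binary confusion matrix. The function name and the counts are made up for illustration and are not taken from the poster.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision and recall from confusion-matrix counts.

    tp: Covid images predicted Covid     fp: Normal images predicted Covid
    tn: Normal images predicted Normal   fn: Covid images predicted Normal
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts for a 20-image test set.
print(classification_metrics(tp=8, fp=2, tn=9, fn=1))
```

In a screening setting recall is usually the metric to watch, since a missed Covid case (a false negative) is costlier than a false alarm.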

Comparison of spike protein gene sequence between Pakistani and mutant strains of SARS-CoV-2
COSI: COVID-19
  • Jamshed Arslan, Salim Habib University (formerly, Barrett Hodgson University), Pakistan
  • Arooj Shafiq, Salim Habib University (formerly, Barrett Hodgson University), Pakistan
  • Humaira Jamshed, Habib University, Pakistan, Pakistan

Short Abstract: Background: The approval and the subsequent administration of COVID-19 vaccines have given the world a sigh of relief. However, the emergence of escape variants (strains of SARS-CoV-2 that can escape antibody neutralization) is raising questions on the usefulness of currently approved COVID-19 vaccines. Likewise, the mutant UK, South African, Brazilian and Indian strains of SARS-CoV-2 are increasingly being reported in various parts of the world. To our knowledge, a comparative analysis of the strain prevalent in Pakistan with the mutant strains is scarce.
Objective: To compare and contrast the gene sequence encoding the spike protein of SARS-CoV-2 between the commonly reported variety in Pakistan and the mutant UK, South African, Brazilian and Indian strains. The choice of spike protein stems from the fact that it is the primary target of currently available vaccines.
Methods: For comparison of the viral strains, we focused on two clinically relevant parts of the SARS-CoV-2 spike protein: the receptor-binding domain (RBD) and the N-terminal domain (NTD). The NCBI database was searched for gene sequences. Multiple sequence alignment and structural analysis were used to investigate genetic variance and its implications.
Results and Discussion: As expected, differences have been observed between Pakistani and mutant strains. Variations in spike protein may give us the extent of viral evolution under the selection pressures of the vaccine. We do not yet know the implications of these differences for the efficacy of COVID-19 vaccines, but these variations may render spike protein resistant to current vaccines. This indicates the need for targeting viral antigens other than the spike protein. We propose that multiple antigens (such as membrane M, envelope E and nucleocapsid N proteins) should be targeted simultaneously. The polyclonal response in such cases can deter the emergence of escape strains.

Connectivity imputation implicates angiogenesis and finds therapeutic targets for neurological manifestations of COVID-19
COSI: COVID-19
  • Di Zhou, Tufts University, United States
  • Donna Slonim, Tufts University, United States
  • Diana Sapashnik, Tufts University, United States
  • Rebecca Newman, Tufts University, United States
  • Christopher Michael Pietras, Tufts University, United States

Short Abstract: Neurological symptoms and long-term neurological sequelae of COVID-19 are common. Causative mechanisms are poorly understood and effective therapies remain to be found. Here, we aim to discover therapeutic targets for patients with neurological manifestations. We combined true and imputed neuronal connectivity mapping data to identify compounds that reverse gene expression signatures of SARS-CoV-2 infection in two models of brain cells, and we identified targets of the most-connected compounds in these models. Interestingly, we found little overlap of drugs across models, but substantial overlap of their targets. Gene network modules known to capture molecular functions were assessed for target gene enrichment. We then examined functions of both the target genes and the wider enriched modules for clues to targetable regulators of neurological symptoms. In addition to immune processes centered around cellular adhesion and T-cell activation, we identified calcium signaling, Src kinases, and angiogenesis via the VEGF, PDGFR, and PI3K pathways as key targetable processes in the CNS consequences of COVID-19. Our work demonstrates the impact of cell-specific connectivity mapping and its potential for addressing neurological diseases.

COVID-19 incidence in the Indiana's secondary school system through a Conditional Gaussian model and an age-structured compartmental model
COSI: COVID-19
  • Alessandro Maria Selvitella, Purdue University Fort Wayne, United States
  • Kale Menchhofer, Purdue University Fort Wayne, United States
  • Nathan Mills, Purdue University Fort Wayne, United States
  • Kathleen Lois Foster, Ball State University, United States

Short Abstract: COVID-19 was declared a pandemic by the World Health Organization in March 2020. One of the most noteworthy circumstances of the COVID-19 outbreak in the United States was the closure of virtually all schools throughout the country. Since their closure, one of the most pressing issues pertaining to COVID-19 is how to properly reopen schools without sparking a surge in cases throughout the community. In this work, we will concentrate on two distinct models with the intent of capturing important factors in the diffusion of the coronavirus in Indiana’s secondary school system. For the sake of interpretability, we confined our analysis to the simplest models capturing the phenomenon under study. In the first model, we analyze the number of cases in each school, subdividing them by county. The distribution of the number of cases in schools within a given county is modeled with a Conditional Gaussian Distribution; namely, we model the number of cases in each county as a linear function of the sum of the student cases in that county plus a Gaussian error. The second model is a compartmental model with age structure (4 compartments of young interacting with 4 compartments of adults). We find that: (i) The conditional sum of the student cases per county scales linearly with the number of cases of the county, speculatively suggesting the possibility of concentrating the testing in schools and using the scaling factor to estimate the incidence of COVID-19 in the full population; (ii) The simulations of the compartmental model with parameters in line with those of Indiana showed that even if adults keep their contact with other adults to a minimum, transmission from the young can be extremely detrimental to the more at-risk population. This shows that optimal school reopening strategies can potentially benefit not only the school population, but the entire community.
Taken in conjunction, these results underline once more the importance of adopting proper school reopening strategies and how they relate to the diffusion of the coronavirus outside the school environment.
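
The first model in this abstract (county cases as a linear function of student cases plus Gaussian noise) amounts to a simple linear regression. The Python sketch below fits such a model by ordinary least squares. It is a generic illustration, not the authors' analysis: the function name and the numbers are invented, and no Indiana data are used.

```python
import math


def fit_linear_gaussian(student_cases, county_cases):
    """Fit county = intercept + slope * student + Gaussian noise by least squares.

    Returns (slope, intercept, sigma), where sigma is the estimated standard
    deviation of the Gaussian error term.
    """
    n = len(student_cases)
    if n != len(county_cases) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(student_cases) / n
    mean_y = sum(county_cases) / n
    sxx = sum((x - mean_x) ** 2 for x in student_cases)
    if sxx == 0:
        raise ValueError("student case counts must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(student_cases, county_cases))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(student_cases, county_cases))
    sigma = math.sqrt(residual_ss / (n - 2))
    return slope, intercept, sigma


# Invented per-county totals, for illustration only.
students = [3, 10, 25, 40, 62]
counties = [40, 110, 270, 390, 650]
slope, intercept, sigma = fit_linear_gaussian(students, counties)
print(round(slope, 2), round(intercept, 2), round(sigma, 2))
```

Under such a model, the fitted slope plays the role of the scaling factor mentioned in finding (i): multiplying observed student cases by it gives a rough estimate of county-wide cases, with `sigma` indicating how uncertain that estimate is.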

COVIDHunter: An Accurate, Flexible, and Environment-Aware Open-Source COVID-19 Outbreak Simulation Model
COSI: COVID-19
  • Mohammed Alser, ETH Zurich, Switzerland
  • Jeremie S. Kim, Carnegie Mellon University, ETH Zurich, Switzerland
  • Nour Almadhoun Alserr, ETH Zurich, Switzerland
  • Stefan W. Tell, ETH Zurich, Switzerland
  • Onur Mutlu, ETH Zurich, Switzerland

Short Abstract: Early detection and isolation of COVID-19 patients are essential to successfully mitigating the disease spread. With a limited number of both vaccinations and daily COVID-19 tests performed in every country, simulating COVID-19 reproduction and the effects of mitigation strategies remains among the most effective ways to relieve healthcare systems and guide policy-makers. We introduce COVIDHunter, a flexible and accurate COVID-19 outbreak simulation model that evaluates mitigation measures applied to a population and provides suggestions for future mitigation measures. COVIDHunter quantifies the spread of COVID-19 in a geographical region by simulating the COVID-19 reproduction rate while considering external factors such as environmental conditions (e.g., climate, temperature) and mitigation measures.

Using Switzerland as a case study, COVIDHunter recommends that policy-makers maintain current mitigation measures for 30 days to prevent overwhelming hospital capacity. Relaxing the mitigation measures by 50% for 30 days increases both the daily capacity need for hospital beds (including ICU beds and ventilators) and daily number of deaths exponentially by an average of 5.1x. Unlike existing state-of-the-art models (IBZ, LSHTM, ICL, IHME), COVIDHunter accurately monitors and predicts the daily number of COVID-19 cases, hospitalizations, and deaths. COVIDHunter is adaptable to various scenarios with different environmental conditions and mitigation measures.

CoVigator – geographical and temporal navigation through SARS-CoV-2 genomic variants
COSI: COVID-19
  • Thomas Bukur, TRON - Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz gGmbH, Germany
  • Pablo Riesgo Ferreiro, TRON - Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz gGmbH, Germany
  • Patrick Sorn, TRON - Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz gGmbH, Germany
  • Ranganath Gudimella, TRON - Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz gGmbH, Germany
  • Thomas Rösler, TRON - Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz gGmbH, Germany
  • Martin Löwer, TRON - Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz gGmbH, Germany
  • Barbara Schrörs, TRON - Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz gGmbH, Germany

Short Abstract: The outbreak of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to the global COVID-19 pandemic, with over 147 M cases and over 3.1 M deaths (WHO, April 28, 2021). A first wave of effective vaccines has been developed at unprecedented speed. However, the discovery of SARS-CoV-2 spike-glycoprotein mutants and their potential to escape vaccine-induced protection demonstrates the importance of monitoring SARS-CoV-2 sequences to enable early detection and tracking of such genomic variants of concern.
We have developed CoVigator, a knowledge base to enable geographical and temporal navigation through SARS-CoV-2 genomic variants. The tool enables researchers to investigate SARS-CoV-2 variants and derived epitopes to guide development of future vaccine targets and to tackle potential escape variants.
CoVigator routinely downloads virus genome assemblies or raw sequencing data from GISAID and ENA respectively. We perform quality controls on the raw data, normalize metadata, align sequences to the reference genome, call variants, annotate variants with their functional effect, compute the co-occurrence matrix that allows us to cluster variants and finally estimate the immunogenicity of every non-synonymous variant. The results are set in temporal and geographic context and can be investigated via our interactive dashboard on genome and gene level (covigator.tron-mainz.de).

Deep learning of lung lesions detection and quantification from chest computed tomography images uncover clinical relevance for COVID-19
COSI: COVID-19
  • Chuanqing Wu, Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, HUST, China
  • Yang Liu, The Jackson Laboratory for Genomic Medicine, 06032, CT USA, United States
  • Jingxiong Tao, Department of Radiology, General Hospital of the Yangtze River Shipping, Wuhan Brain Hospital, China, China
  • Abhishek Agarwal, The Jackson Laboratory for Genomic Medicine, 06032, CT USA, United States
  • Yue Zhao, The Jackson Laboratory for Genomic Medicine, 06032, CT USA, United States
  • Yuan Li, Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, HUST, China
  • Dianshi Wang, Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, HUST, China
  • Zhouyuan Du, Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, HUST, China
  • Peng Hu, Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, HUST, China
  • Na Fan, Department of Radiology, General Hospital of the Yangtze River Shipping, Wuhan Brain Hospital, China, China
  • Weiwei Yang, Department of Radiology, General Hospital of the Yangtze River Shipping, Wuhan Brain Hospital, China, China
  • Shengjun Li, Department of Radiology, Yingcheng Chinese Medical Hospital, China, China
  • Joshy George, The Jackson Laboratory for Genomic Medicine, 06032, CT USA, United States
  • Kaixiong Tao, Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, HUST, China
  • Sheng Li, The Jackson Laboratory for Genomic Medicine, 06032, CT USA, United States

Short Abstract: At its outset, the COVID-19 pandemic strained the diagnostic and treatment capacities of countries around the world. While the RT-PCR test is the most accurate diagnostic method, chest computed tomography (CT) images are helpful for evaluating disease severity. Recently, deep learning models have been proposed for detecting lesions on lung CT images. However, there are few methods for identifying overlapping COVID-19 lesions and classifying lesions on lung CT images. To remedy this situation, we built a deep learning Mask R-CNN segmentation model on medical images that obtains high accuracy in identifying COVID-19 lesions, with an average precision of 0.63 (Figure 1a). We also developed a 3D reconstruction of CT scans for COVID-19 lesion volume quantification and evaluation. We further employed a classification model for predicting ventilator demand of new patients (Figure 1b). The lung lesion quantification shows that clinical characteristics, namely advanced age, BMI, gender, and chronic disease (diabetes), are significantly linked to lesion volumes. Moreover, the ventilator demand classifier achieves an F1 score of 0.86 and a ROC AUC of 0.92 on the test dataset. We show the first evidence that CT images can be used to predict whether a patient is more likely to need a ventilator. Our model can be used to identify patients at risk of severe illness, ensuring that they receive appropriate care as early as possible and allowing for effective allocation of health resources.

Deep variational graph autoencoders for novel host-directed therapy options against COVID-19
COSI: COVID-19
  • Alexander Schoenhuth, University of Bielefeld, Germany
  • Sumanta Ray, University of Bielefeld, India
  • Snehalika Lall, Indian Statistical Institute, India
  • Anirban Mukhopadhyay, University of Kalyani, India
  • Sanghamitra Bandyopadhyay, Indian Statistical Institute, India

Short Abstract: Motivation: The COVID-19 pandemic still raises urgent questions with respect to therapeutic options. Drugs that can be repurposed promise rapid implementation in clinical practice because of their prior approval. There is still room for considerable improvement, because advanced artificial intelligence techniques for screening drug repositories have not been fully exploited so far.

Result: We construct a comprehensive network by combining year-long curated drug-protein/protein-protein interaction data on the one hand, and the most recent SARS-CoV-2 protein interaction data on the other hand. We learn the structure of the resulting encompassing molecular interaction network and predict missing links using variational graph autoencoders (VGAEs), a recent advance in deep learning that has not yet been systematically exploited in drug repurposing. We focus on hitherto unknown links between drugs and human proteins that play key roles in the replication cycle of SARS-CoV-2. Thereby, we establish novel host-directed therapy (HDT) options whose high plausibility is confirmed by realistic simulations. As a consequence, many of the predicted links are likely to be crucial for the virus to thrive on the one hand, and can be targeted with existing drugs on the other hand.
Availability: github.com/sumantaray/Covid19

Demographic Analysis of SARS-CoV-2 Mutations in India
COSI: COVID-19
  • Kushagra Agarwal, International Institute of Information Technology, Hyderabad, India
  • Nita Parekh, International Institute of Information Technology, Hyderabad, India

Short Abstract: The current pandemic due to COVID-19 has affected over 30M people in India to date. To understand how the virus is mutating and to assess the impact of contact tracing, quarantine, and lockdown, we carried out demographic analysis of genetic variations in 4708 Indian SARS-CoV-2 isolates collected up to 11th Jan 2021. The results were compared to our previous study conducted on 685 Indian isolates during the early period 27th Jan – 27th May 2020. Phylogenetic and Principal component analyses were carried out to identify region-specific clusters. From the early period study, we discovered a novel subclade I/GJ-20A and a pair of co-occurring Maharashtra-specific mutations, which might explain the high number of deaths in the states of Gujarat and Maharashtra. The extended analysis on the larger dataset identified 7126 unique mutations and revealed that subclade I/GJ-20A continues to dominate in Gujarat, while the subclade I/A3i (with predominance in Telangana state) and the pair of Maharashtra-specific mutations had very few new samples, indicating their containment. However, another subclade I/MH-20B was seen to emerge in Maharashtra, two clusters I/Tel-A-20B, I/Tel-B-20B and a pair of co-occurring mutations were observed in Telangana, and a novel subclade I/AP-20A was found in Andhra Pradesh. Region-specific sequencing efforts are required in a vast country like India to understand the dynamics and to follow up the various virus strains emerging in the country.

Density and diversity of disordered SLiMs drive divergence of pathogenicity in coronaviruses
COSI: COVID-19
  • Heidy Elkhaligy, Florida International University, United States
  • Jessica Siltberg-Liberles, Florida International University, United States
  • Christian A Balbin, Florida International University, United States
  • Alberto Sigler, Florida International University, United States
  • Daniel Morales, Florida International University, United States
  • Jessica L Gonzalez, Florida International University, United States
  • Teresa Liberatore, Florida International University, United States
  • Christopher Mederos, Florida International University, United States
  • Patricia Milanes, Florida International University, United States
  • Gisselle Prida, Florida International University, United States
  • Kyana Rodriguez, Florida International University, United States
  • William Vidal, Florida International University, United States

Short Abstract: Viruses mimic their host's proteins to alter the cell machinery using functional signatures formed by consecutive amino acids called short linear motifs (SLiMs). SLiMs may occur by chance due to mutational processes, and the false positive rate is high. The chance of identifying a functional SLiM can be improved by considering surface accessibility and intrinsic disorder. To investigate how SLiMs vary across SARS-CoV-2, the cause of the COVID-19 pandemic, and its related betacoronaviruses, protein families for all SARS-CoV-2 proteins were constructed. SLiMs, intrinsic disorder, and surface accessibility were predicted for all sequences and mapped to their corresponding multiple sequence alignment. Comparative analysis of the three main clades shows more accessible disordered SLiMs in the SARS and MERS clades compared to the outgroup clade. Analysis of the human coronaviruses reveals specific SLiMs in severe coronaviruses that were absent in milder coronaviruses. The observed increase in density and diversity of SLiMs in accessible and disordered regions for the more severe coronaviruses indicates additional opportunities to interfere with the host cell machinery. Moreover, MERS-CoV, the most fatal of all human coronaviruses, exhibits unique motifs. Thus, SLiM density and diversity may contribute to the divergence of viral pathogenicity and clinical manifestations in COVID-19.

Detecting common low complexity regions for SARS-CoV-2 and human proteomes to prevent potential multidirectional risk factors in vaccine development
COSI: COVID-19
  • Aleksandra Gruca, Department of Computer Networks and Systems, Silesian University of Technology, Poland
  • Joanna Ziemska-Legięcka, Institute of Biochemistry and Biophysics Polish Academy of Sciences, Poland
  • Patryk Jarnot, Department of Computer Networks and Systems, Silesian University of Technology, Poland
  • Elzbieta Sarnowska, Department of Molecular and Translational Oncology, Maria Sklodowska-Curie National Research Institute of Oncology, Poland
  • Tomasz J. Sarnowski, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Poland
  • Marcin Grynberg, Institute of Biochemistry and Biophysics Polish Academy of Sciences, Poland

Short Abstract: In this work we report on the existence of low complexity regions (LCRs) in the SARS-CoV-2 proteome which are similar to those present in human proteins located in neural tissues, B-cells and T-cells, or involved in such important processes as alternative splicing or cytokinesis. Our findings are especially important in the context of the anti-COVID-19 vaccine development which requires thoughtful choice of viral targets (epitopes).

Although the epitopes for neutralizing SARS-CoV-2 antibody are known, the public information about the specific antigens which were used in vaccine development is not available. Several articles discuss epitope choices in the SARS-CoV-2 proteome, others focus on phylogenetic traits and history of the Coronaviridae genome/proteome. However, none of them explicitly analyse viral protein LCRs in this context.

Using our methods specifically designed to compare LCRs we show that five LCRs in three proteins (nsp3, S and N) encoded by the SARS-CoV-2 genome are highly similar to regions from the human proteome. As many as 21 predicted T-cell epitopes and 27 predicted B-cell epitopes overlap with the five SARS-CoV-2 LCRs similar to human proteins.

Our findings are crucial to the process of selecting new epitopes for drugs or vaccines, which should omit such regions. In this work we indicate that epitopes cannot be selected based only on factors like phylogenetic conservation or potential epitope targets. Finding five LCRs that are highly similar to regions of the human proteome poses a serious risk to vaccine or drug design, as vaccines targeting these LCRs may be ineffective or, alternatively, lead to the development of autoimmune diseases.

Discovery of Broad Spectrum Coronavirus Drug Repurposing Candidates Via Computational Target Identification and Drug-Target Interaction Predictions
COSI: COVID-19
  • Stephen MacKinnon, Cyclica Inc., Canada
  • Michael Sugiyama, Ryerson University, Canada
  • Haotian Cui, University of Toronto, Canada
  • Dasha Redka, Cyclica Inc., Canada
  • Vijay Shanahi, Cyclica Inc., Canada
  • Bo Wang, University of Toronto, Canada
  • Costin Antonescu, Ryerson University, Canada

Short Abstract: The next major challenge in our global fight against COVID-19 involves future variants capable of evading vaccines and developing drug resistance. Presumably, host-based targets may offer an alternative therapeutic strategy with a high barrier to drug resistance. Here, we identify novel COVID-19 host targets and drug repurposing candidates, using two complementary machine learning technologies. Non-obvious, host-based targets were first identified using a Graph Convolutional Network (GCN) trained to model clinically relevant network proximity distances in a multiscale interactome. The multiscale interactome combines relationships between genes, proteins, drugs, biological pathways and disease, including viral-host protein-protein interactions networks. Additional drug repurposing candidates were then retrieved from PolypharmDB for several GCN-identified host targets. PolypharmDB is a database of 10,244 clinically-tested drugs cross-screened with 8535 human proteins, based on a deep learning Drug Target Interaction (DTI) prediction model. Twenty-six FDA-approved drugs identified by this screen were selected for cellular infectivity assays, of which four had demonstrable bioactivities. One notable hit demonstrated potent antiviral activity against five genetically different human coronaviruses, while its predicted target was also confirmed by siRNA gene silencing. Together, we present a promising drug repurposing candidate and a new therapeutic target for future drug design programs.

BioRxiv Preprint: www.biorxiv.org/content/10.1101/2021.04.13.439274v1

Drug repurposing for COVID-19 through biological process and pathway analysis of the transcriptional signature elicited by SARS-CoV-2 infection
COSI: COVID-19
  • Poulami Chaudhuri, Tata Consultancy Services, India
  • Akriti Jain, Tata Consultancy Services, India
  • Sutapa Datta, Tata Consultancy Services, India
  • Rajgopal Srinivasan, Tata Consultancy Services, India

Short Abstract: There is continuing interest in identifying drugs that can be repurposed for the treatment of COVID-19. We explored the LINCS database to screen for perturbagens that can reverse the expression profile of differentially expressed genes (DEGs) identified post SARS-CoV-2 infection (MOI 0.2) of primary lung cells and DEGs identified from transcriptional profiling of deceased COVID-19 patients. The perturbagens were evaluated for their potential against COVID-19 based on parameters such as their ability to act as antiviral agents, antiplasmodial agents (especially antimalarial), their FDA approval status, the number of genes perturbed by them, their effect on immune related pathways and/or inflammation pathways. Based on the analysis we identified three different categories of small molecules i.e. fourteen exclusive perturbagens that target the early phase of infection, twelve perturbagens that target the advanced stage of the disease and sixteen overlapping perturbagens which may be used throughout the course of infection. Some of these drugs are antivirals, while others could be involved in countering immune dysregulation caused by the infection. Several of these drugs have been identified as likely candidates by other methods also, adding to our confidence in the potential candidates identified for re-purposing.

Drug repurposing of COVID-19 with a network module perspective
COSI: COVID-19
  • Ines Rivero-Garcia, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Spain
  • Erik Sonnhammer, Stockholm University, Sweden
  • Miguel Castresana-Aguirre, Stockholm University, Science for Life Laboratory, Sweden
  • Luca Guglielmo, Stockholm University, Science for Life Laboratory, Sweden
  • Dimitri Guala, Stockholm University, Science for Life Laboratory, Sweden

Short Abstract: For fast-evolving diseases such as the ongoing COVID-19, traditional de novo drug design may fail at providing timely treatment options. Additionally, targeting a single gene might fail to provide a therapeutic effect due to the complexity of the disease phenotype and the functional gene associations that sustain cellular function. Genes associated with a given disease tend to cluster together within the functional interactome, forming disease modules that represent different pathobiological mechanisms and whose integrity is perturbed in disease scenarios. Here we approach drug repurposing from a network and module-based perspective to identify the disease modules formed by 332 human proteins that interact with SARS-CoV-2 proteins and examine how drugs target these modules. We found that these human proteins form 4 modules and are targeted by 52 drugs, of which 51 target the same module. Although there are several polypharmacological drugs targeting this disease, Fostamatinib is the only one that targets different modules. This imbalance in the number of drugs targeting the different modules opens the possibility of employing single or combinatorial polypharmacological drugs that can target all modules (i.e. pathophysiological mechanisms) involved in the disease to potentially achieve higher treatment efficacy.

Efficient Algorithms for Optimized mRNA Sequence Design
COSI: COVID-19
  • He Zhang, Baidu Research USA, United States
  • Liang Zhang, Baidu Research USA; Oregon State University, United States
  • Ang Lin, Stemirna Therapeutics Inc., China
  • Ziyu Li, Baidu Research USA, United States
  • Congcong Xu, Stemirna Therapeutics Inc., China
  • Kaibo Liu, Baidu Research USA, United States
  • Boxiang Liu, Baidu Research USA, United States
  • Xiaopin Ma, Stemirna Therapeutics Inc., China
  • Fanfan Zhao, Stemirna Therapeutics Inc., China
  • Hangwen Li, Stemirna Therapeutics Inc., China
  • David Mathews, University of Rochester, United States
  • Yujian Zhang, Stemirna Therapeutics Inc., China
  • Liang Huang, Baidu Research USA; Oregon State University, United States

Short Abstract: A messenger RNA (mRNA) vaccine can benefit from an mRNA sequence that is stable and highly productive in protein expression, properties that have been shown to correlate with greater mRNA secondary structure folding stability and optimal codon usage. However, sequence design remains a hard problem due to the exponentially many synonymous mRNA sequences that encode the same protein. We propose and implement an efficient algorithm that can solve this problem in O(n^3) time theoretically, where n is the mRNA sequence length. We observe that this algorithm achieves quadratic runtime in practice when n<8,000, and can design the SARS-CoV-2 spike genome in 7.9 mins. We further develop a linear-time approximate version, LinearDesign, based on beam pruning heuristics, which can finish spike genome design in 4 mins with only 5.5% MFE loss. We also extend this algorithm to incorporate codon optimality, so that it can jointly optimize folding free energy and codon usage. Our novel algorithm enlarges the design space greatly, reaching a large region that has never been explored before. We design seven mRNA sequences of the SARS-CoV-2 spike protein, which perform better on chemical stability, protein expression and immunogenicity than the codon-optimized benchmark in wet-lab assays.
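
To illustrate why the number of synonymous mRNA sequences grows exponentially with protein length, the following minimal sketch multiplies the number of codons available for each amino acid under the standard genetic code. It is not part of the authors' algorithm; the function name and the toy peptide are assumptions made for illustration only.

```python
# Number of codons encoding each amino acid in the standard genetic code.
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_synonymous_mrnas(protein):
    """Count the distinct coding sequences that translate to `protein`.

    Hypothetical helper: the count is the product of the per-residue
    codon degeneracies, so it grows exponentially with protein length.
    """
    total = 1
    for amino_acid in protein:
        total *= CODON_DEGENERACY[amino_acid]
    return total


# Toy peptide (not from the abstract): Met-Ala-Leu has 1 * 4 * 6 encodings.
toy_count = count_synonymous_mrnas("MAL")
```

Because every residue other than Met and Trp contributes a factor of at least 2, exhaustive enumeration quickly becomes infeasible for a protein of realistic length, which is the motivation for an efficient search algorithm.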

Functional profiling of respiratory tract microbiomes indicates altered microbial pathways in COVID-19 patients
COSI: COVID-19
  • Niina Haiminen, IBM, United States
  • Filippo Utro, IBM, United States
  • Ed Seabolt, IBM, United States
  • Laxmi Parida, IBM, United States

Short Abstract: When SARS-CoV-2 enters a host, it interacts with the micro-organisms already inhabiting the body. Understanding the virus-host-microbiome interactions could yield additional insights into the biological processes perturbed by viral invasion. Notably, alterations in the gut microbiome species and metabolites have been documented during respiratory viral infections, potentially impacting the lungs via gut-lung microbiome crosstalk.

To characterize microbial functions in the lungs during SARS-CoV-2 infection, we carried out a functional analysis of recent RNA sequencing data from bronchoalveolar lavage fluid. We applied PRROMenade with a collection of annotated bacterial and viral protein domains from the IBM Functional Genomics Platform. The cohorts of eight COVID-19 patients, twenty-five community-acquired pneumonia cases, and twenty healthy controls clearly separated based on their microbial functional profiles.

Distinguishing metabolic pathway signatures were discovered by identifying outlying pathway scores computed from the functional profiles. The findings, containing overlaps with previous studies, include decreased potential for lipid metabolism and glycan biosynthesis and metabolism, and increased potential for carbohydrate metabolism in COVID-19 microbiomes. The results also suggest additional altered pathways, possibly specific to the lower respiratory tract microbiome, calling for further research on host-microbiome interactions during SARS-CoV-2 infection, potentially supporting the development of probiotics to improve clinical outcomes.

Genome-wide mapping of SARS-CoV-2 RNA structures identifies therapeutically-relevant elements
COSI: COVID-19
  • Martijn van Hemert, Leiden University Medical Center, Netherlands
  • Janusz Bujnicki, Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Poland
  • Danny Incarnato, University of Groningen, Netherlands
  • Ilaria Manfredonia, University of Groningen, Netherlands
  • Chandran Nithin, International Institute of Molecular and Cell Biology in Warsaw, Poland
  • Almudena Ponce-Salvatierra, International Institute of Molecular and Cell Biology in Warsaw, Poland
  • Pritha Ghosh, International Institute of Molecular and Cell Biology in Warsaw, Poland
  • Tomasz Wirecki, International Institute of Molecular and Cell Biology in Warsaw, Poland
  • Tycho Marinus, University of Groningen, Netherlands
  • Natacha Ogando, Leiden University Medical Center, Netherlands
  • Eric Snijder, Leiden University Medical Center, Netherlands

Short Abstract: SARS-CoV-2 is a betacoronavirus with a linear single-stranded, positive-sense RNA genome, whose outbreak caused the ongoing COVID-19 pandemic. The ability of coronaviruses to rapidly evolve, adapt, and cross species barriers makes the development of effective and durable therapeutic strategies a challenging and urgent need. As for other RNA viruses, genomic RNA structures are expected to play crucial roles in several steps of the coronavirus replication cycle. Despite this, only a handful of functionally-conserved coronavirus structural RNA elements have been identified to date. We performed RNA structure probing to obtain single-base resolution secondary structure maps of the full SARS-CoV-2 coronavirus genome both in vitro and in living infected cells.
Probing data recapitulate the previously described coronavirus RNA elements (5′UTR and s2m), and reveal new structures. Of these, ∼10% show significant covariation among SARS-CoV-2 and other coronaviruses, hinting at their functionally conserved roles. Secondary structure-restrained 3D modeling of these segments further allowed for the identification of putative druggable pockets. In addition, we identify a set of single-stranded segments in vivo, showing high sequence conservation, suitable for the development of anti-sense oligonucleotide therapeutics. Collectively, our work lays the foundation for the development of RNA-targeted therapeutic strategies to fight SARS-related infections.
To our knowledge this has been the first published study of the whole 30,000 nt SARS-CoV-2 RNA genome structure studied in vitro and in vivo:
Nucleic Acids Res. 2020 Dec 16; 48(22): 12436–12452.
Published online 2020 Nov 10. doi: 10.1093/nar/gkaa1053

Government measures against the COVID-19 pandemic must be determined according to the socio-economic status of the country
COSI: COVID-19
  • Alessandro Maria Selvitella, Purdue University Fort Wayne, United States
  • Kathleen Lois Foster, Ball State University, United States

Short Abstract: Ever since COVID-19 began spreading across the world, governments and researchers have been scrambling to figure out where it would go next and how to respond. Some of the most pressing questions relate to factors that govern the spread of COVID-19, what populations are most vulnerable and how to allocate resources effectively. Societal and economic status of a country (e.g. baseline health, wealth/distribution of wealth, government effectiveness, and education status) can impact the spread (epidemiology) and efficacy of control measures. How do socio-economic factors, predating the pandemic, relate to the number of cases, deaths, and the ratio of deaths/cases due to COVID-19 early in the pandemic? Our analysis shows that: (i) Governments may benefit from heterogeneous allocation of healthcare resources and a decentralization of their healthcare system; (ii) Blanket policies are sub-optimal, as greater government health expenditure and access to essential health services are not a guarantee of success in combatting the pandemic; and (iii) Countries with more informed populations that had greater economic equity, employment rates, and personal and economic freedom, but were guided by an effective government, appear to have done better in the first months of the pandemic.

Identification and characterisation of a neutralizing antibody epitope for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein
COSI: COVID-19
  • Muhammet Celik, Department of Biotechnology, Konya Food and Agriculture University, Konya, Turkey, Turkey
  • Lim Wan Ching, Centre for Bioinformatics, School of Data Sciences, Perdana University, Kuala Lumpur, Malaysia, Malaysia
  • Choi Sy Bing, Centre for Bioinformatics, School of Data Sciences, Perdana University, Kuala Lumpur, Malaysia, Turkey
  • Asif M. Khan, Perdana University, Kuala Lumpur, Malaysia / Bezmialem Vakif University, Beykoz, Istanbul, Turkey, Turkey
  • Vladimir Brusic, Faculty of Science and Engineering, University of Nottingham, Ningbo, China, China

Short Abstract: The spike protein of SARS-CoV-2 is a primary target of the host immune response. Herein, we aimed to computationally identify and characterize neutralizing B-cell epitopes for SARS-CoV-2 spike. At the time of this analysis, there was a lack of such reported information. Thus, a known neutralized strain of the earlier close homolog SARS-CoV-1, targeting the S1 receptor binding domain (PDB: 2GHW), was used as a proxy. Epitope residues on spike, recognised within a distance of 5 Å by the residues of neutralizing antibody 80R, were determined. Twenty-nine residues of the S1 RBD were observed to be interacting with 34 residues of 80R. The epitope residues were correspondingly align-mapped to SARS-CoV-2 spike. Separately, 2240 non-redundant SARS-CoV-2 spike protein sequences were collected from public repositories (June 2020). Only seven epitope residues showed variability among the reported virus strains but were of negligible incidence (<0.002%). The minimal variability of the neutralizing antibody-interacting residues is in agreement with the reports of high efficacy (>95%) of vaccines that target the spike protein. The approach herein demonstrates the value of using existing 3D structures of neutralising antibodies in complex with a virus protein for identification of putative neutralising epitope residues for a novel, homologous virus protein.
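
The distance-based definition of epitope residues can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the residue labels and the toy coordinates are hypothetical, and a real analysis would read atom coordinates from the PDB entry. The sketch also assumes that "within 5 Å" includes the boundary.

```python
import math


def epitope_residues(antigen_atoms, antibody_atoms, cutoff=5.0):
    """Return antigen residues with any atom within `cutoff` Å of the antibody.

    Both arguments map a residue label to a list of (x, y, z) atom
    coordinates in Å. Labels and coordinates here are illustrative only.
    """
    antibody_coords = [
        xyz for atoms in antibody_atoms.values() for xyz in atoms
    ]
    contacts = []
    for residue, atoms in antigen_atoms.items():
        if any(
            math.dist(a, b) <= cutoff
            for a in atoms
            for b in antibody_coords
        ):
            contacts.append(residue)
    return contacts


# Toy data: one antibody atom and three antigen residues on a line.
toy_antigen = {
    "AG1": [(0.0, 0.0, 0.0)],   # 4 Å from the antibody atom
    "AG2": [(7.0, 0.0, 0.0)],   # 3 Å from the antibody atom
    "AG3": [(20.0, 0.0, 0.0)],  # 16 Å away, not a contact
}
toy_antibody = {"AB1": [(4.0, 0.0, 0.0)]}

toy_contacts = epitope_residues(toy_antigen, toy_antibody)
```

Swapping the two arguments gives the antibody residues that contact the antigen, which is how the two sides of an interface could be counted separately.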

Identification and evolutionary diversification of novel immunoglobulin and ion channel proteins in SARS-CoV-2 and related viruses
COSI: COVID-19
  • Dapeng Zhang, Saint Louis University, United States
  • Yongjun Tan, Saint Louis University, United States
  • Theresa Schneider, Saint Louis University, United States
  • Matthew Leong, Saint Louis University, United States
  • Prakash Shukla, University of Utah School of Medicine, United States
  • Mahesh Chandrasekharan, University of Utah School of Medicine, United States
  • L Aravind, NCBI/NLM/NIH, United States

Short Abstract: The ongoing COVID-19 pandemic strongly emphasizes the need for a better understanding of the function and evolution of its causative agent SARS-CoV-2. Despite intense scrutiny, several structural/accessory proteins of SARS-CoV-2 remain enigmatic. By using a series of dedicated computational methods, we have successfully uncovered several previously unrecognized families of immunoglobulin (Ig) proteins and ion channel proteins in SARS-CoV-2 and many other viruses. The novel Ig proteins include the mysterious ORF8 proteins from SARS-CoV/SARS-CoV-2 related viruses, many proteins from alpha-CoVs and unrelated animal viruses. We show that the ORF8 proteins from the SARS-CoV/SARS-CoV-2 clade are rapidly evolving, which suggests that they might function as immune modulators to delay/attenuate the host immune response against viruses. In addition, we unified the SARS-CoV ORF3a family with several families of viral proteins, including ORF5 from MERS-CoVs, ORF3c from beta-CoVs, ORF3b in alpha-CoVs, most importantly, the Matrix proteins from all CoVs, and more distant homologs from other nidoviruses. We presented computational evidence that these viral families might utilize specific conserved polar residues to constitute an aqueous pore within the membrane-spanning region. This suggests that the novel coronavirus Matrix/ORF3 ion channel proteins might play a role in virion assembly and membrane budding.

Identification of druggable pockets and structural analysis of the interaction between the SARS-CoV-2 Spike protein and the human ACE2 receptor
COSI: COVID-19
  • Mariem Ghoula, Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, France
  • Sarah Naceri, Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, France
  • Samuel Sitruk, Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, France
  • Delphine Flatters, Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, France
  • Anne-Claude Camproux, Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, France
  • Gautier Moroy, Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, France

Short Abstract: To develop anti-viral therapeutics for SARS-CoV-2, it is important to identify the amino acids stabilizing the SARS-CoV-2 RBD and ACE2 complex and to target specific regions of the complex in order to disrupt it. To this end, we performed a structural analysis of two crystallographic structures (PDB codes: 6M0J and 6LZG) in order to understand their binding mechanism. Then, the stability and flexibility of the complex and of the isolated SARS-CoV-2 RBD protein were studied through Molecular Dynamics simulations. The binding free energy calculations of the complex and the identification of contributing key hotspots were done using the Molecular Mechanics Poisson-Boltzmann Surface Area method. Using the PockDrug software, an extensive pocket search and druggability prediction were conducted to detect the main classes of pockets in the RBD protein. Altogether, our study helped us to identify interesting druggable pockets that comprise crucial key residues for the RBD-ACE2 interaction and that can be targeted by efficient inhibitors that could potentially prevent the virus infection. Moreover, it has shown us the impact of the new emerging mutations (K417N, N501T, E484K) and helped us understand their molecular mechanisms in the most worrying SARS-CoV-2 variants such as the UK, the South-African and the Brazilian ones.

Identification of highly conserved, HLA-A2, -A3, and -B7 supertype-restricted T-cell epitopes in the SARS-CoV-2 structural proteins
COSI: COVID-19
  • Melike Karakaya, Bezmialem Vakif University, Turkey / Biruni University, Turkey, Turkey
  • Li Chuin Chong, Beykoz Institute of Life Sciences and Biotechnology, Bezmialem Vakif University, Turkey, Malaysia
  • Mohammad Asif Khan, Bezmialem Vakif University, Turkey / Perdana University, Malaysia, Turkey

Short Abstract: The ongoing COVID-19 pandemic demands a better understanding of the host-pathogen interaction. Understanding the host immune response, in particular, is critical to develop novel effective intervention strategies against SARS-CoV-2. Herein, we identified highly conserved, HLA-supertype restricted T-cell epitopes that capture the diversity of the viral variants and are applicable to the human population at large. All reported sequences of the structural proteins (Spike, S; Envelope, E; Membrane, M; and Nucleocapsid, N) were downloaded from the GISAID EpiCoV database as of January 2021. The sequences (1,600,549) were processed for each protein, deduplicated using CD-HIT, and aligned with MUSCLE. Shannon’s entropy was used to quantify the protein sequence diversity by use of ABK-AVANA; an overlapping k-mer window of nine was used for immunological applications. The structural proteins were generally conserved, with a low mean nonamer entropy of ~0.13. Highly conserved nonamers were determined (positions of entropy < 0.4) and concatenated if overlapped. Fourteen concatenated sequences were selected for the prediction of HLA-A2, A3, and B7 supertype-restricted epitopes, which provide for a large population coverage (~86%). Thirty-four putative epitopes (S: 23; M: 8; N: 3) were predicted within nine of the concatenated sequences. These sequences merit further investigation as epitope-based vaccine candidates.
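
The nonamer entropy calculation can be illustrated with a minimal sketch. It is not the ABK-AVANA implementation: the function name and the toy alignment are hypothetical, entropy is assumed to be measured in bits, and gap handling and any sample-size correction are omitted.

```python
import math
from collections import Counter


def kmer_entropy(aligned_seqs, k=9):
    """Shannon entropy (bits) of the k-mers seen at each window start.

    `aligned_seqs` is a list of equal-length aligned protein sequences.
    Returns one entropy value per overlapping window of length k.
    """
    length = len(aligned_seqs[0])
    entropies = []
    for start in range(length - k + 1):
        counts = Counter(seq[start:start + k] for seq in aligned_seqs)
        total = sum(counts.values())
        entropy = sum(
            (c / total) * math.log2(total / c) for c in counts.values()
        )
        entropies.append(entropy)
    return entropies


# Toy alignment (not real viral data): the last residue varies 50/50.
toy_alignment = ["ACDEFGHIKL", "ACDEFGHIKL", "ACDEFGHIKV", "ACDEFGHIKV"]
toy_entropies = kmer_entropy(toy_alignment)

# Window starts below the conservation threshold quoted in the abstract.
conserved_starts = [i for i, h in enumerate(toy_entropies) if h < 0.4]
```

In the toy alignment the first window is fully conserved (entropy 0), while the second window contains the variable residue and has an entropy of 1 bit, so only the first window passes the 0.4 threshold.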

Identifying potential novel insights for COVID-19 pathogenesis and therapeutics using an integrated bioinformatics analysis of host transcriptome
COSI: COVID-19
  • Amal Mahmoud, Department of Biology, College of Science, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia, Saudi Arabia
  • Mahmoud Elhefnawi, Biomedical informatics and chemoinformatics group, Informatics and systems department, National Research Center, Egypt., Egypt
  • Salem El-Aarag, Bioinformatics Department, Genetic Engineering and Biotechnology Research Institute (GEBRI), University of Sadat City, Egypt

Short Abstract: The molecular mechanisms underlying the pathogenesis of COVID-19 have not been fully discovered. This study aims to decipher potentially hidden parts of the pathogenesis of COVID-19 and potential novel drug targets, and to identify potential drug candidates. Two gene expression profiles (GSE147507-GSE153970) were analyzed and overlapping differentially expressed genes (DEGs) were selected, for which top enriched transcription factors and kinases were identified and pathway analysis was performed. A protein-protein interaction (PPI) network of DEGs was constructed, hub genes were identified and module analysis was also performed. The DGIdb database was used to identify drugs for the potential targets (hub genes and the most enriched transcription factors and kinases for DEGs). A drug-potential target network was constructed and drugs were ranked according to their degree. The L1000FWD web-based utility was used to identify drugs that can reverse transcriptional profiles of COVID-19. We identified drugs currently in clinical trials and 8 novel potential drugs (Dasatinib, Bosutinib, Entrectinib, Ponatinib, Vandetanib, Sorafenib, Vemurafenib, and Exemestane). Besides the well-known pathogenic pathways, it was found that axon guidance is a potential pathogenic pathway. Sema7A, which may exacerbate hypercytokinemia, is considered a potential novel drug target. Another potential novel pathway is related to TINF2 overexpression which may induce potential telomere dysfunction and hence DNA damage that may exacerbate lung fibrosis.

Investigating the Interaction between SARS-CoV-2 NSP15 and Human RNF41 Using In Silico Methods
COSI: COVID-19
  • Annika Viswesh, Palo Alto High School, USA and Stanford University, USA, United States
  • Soichi Wakatsuki, Stanford University, USA, United States

Short Abstract: Patients with acute SARS-CoV-2 infection exhibit hyper-inflammatory response and Type 1 Interferon (IFN-1) deficiency. Studies show that SARS-CoV-2 NSP15 suppresses the immune response; however, this has not been investigated at a molecular level. RNF41 controls inflammation and IFN-1 production by binding to MYD88 and TBK1 in the immune signaling pathways. We hypothesized that SARS-CoV-2 NSP15 binds to RNF41 and inhibits RNF41 from regulating the immune signaling pathways. Molecular docking of the RNF41 C-terminal domain (CTD) to five NSP15 poses, MYD88, TBK1, and USP8 was performed. The previously unknown structure of the RNF41 Zinc-finger domain (ZFD) was generated using homology modeling and docked to different NSP15 poses after determining the RNF41 ZFD active sites using computational techniques. Results show NSP15, TBK1, MYD88, and USP8 bind to the same residues of RNF41 CTD, and NSP15 has the highest binding affinity to RNF41 CTD. Preliminary MD simulations support the docking results, confirming our hypothesis that binding between RNF41 CTD and NSP15 could cause the immune system's disruption. Further, NSP15’s binding sites were > ~8 Å away from its catalytic sites, indicating that NSP15’s cleaving function can continue even when NSP15 binds to RNF41. These results set the direction for researching drugs to target SARS-CoV-2 NSP15’s binding sites.

Machine-readable full text collection of COVID-19 preprints in Europe PMC
COSI: COVID-19
  • Michael Parkin, EMBL-EBI, United Kingdom

Short Abstract: During the COVID-19 pandemic many researchers have published their results rapidly via preprints. These preprints are often scattered across different platforms, and in non-standard formats. In recognition of this, Europe PMC (europepmc.org/), an open science platform that enables access to a worldwide collection of life science literature from trusted sources around the globe, launched a project in July 2020 to make the full text of COVID-19 preprints available for reading and reuse via a standard XML format. Currently, over 29,000 full text COVID-19 preprints from several servers (including medRxiv, bioRxiv, arXiv, ChemRxiv, Research Square, and SSRN) can be programmatically searched in Europe PMC, with over 18,000 open access preprints available via bulk download. Preprints are linked to journal-published articles, open peer review materials, as well as underlying data in community databases, including PDBe, ENA, and many more. We hope that this COVID-19 full text preprint collection will accelerate scientific research on COVID-19, and form a corpus for future history of science research.

Modelling COVID-19 Spread in India under Migration
COSI: COVID-19
  • Malay Bhattacharyya, Indian Statistical Institute, Kolkata, India

Short Abstract: Epidemiological models play a key role in understanding the spread of infectious diseases. Unfortunately, most of these models, be they stochastic or deterministic, consider parameters that are purely based on the disease scenario of the population. There is hardly any approach that considers the network dynamics between the geographic locations covered by the model. Most of the existing epidemiological models are based on a closed environment. It is highly promising to study how the spread of infections will change in several locations that experience migration. We chose to employ the model developed by Somermeijer. This model suggests weighting the population variables. We added this as a component of the basic epidemiological model. To gain more insight into the scenario, we carried out an interstate network analysis. We define the distance between Indian states as the shortest path distance between the corresponding pair of nodes in the interstate connectivity network. With the daywise data of infected individuals of COVID-19 across the different states in India, we have built up an interstate network. On studying the network parameters, we observed that the degree of a node has a higher significance than the betweenness centrality.
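
The network distance described above can be sketched with a breadth-first search over an unweighted graph. This is a minimal illustration under stated assumptions: the five-node toy network and its labels S1 to S5 are hypothetical placeholders, not the actual interstate connectivity data, and the author's own implementation is not described in the abstract.

```python
from collections import deque


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop distance from `source` to each reachable node.

    `graph` maps each node to the set of its neighbours (undirected,
    unweighted), e.g. states that are directly connected.
    """
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# Hypothetical connectivity network; S1..S5 stand in for states.
toy_network = {
    "S1": {"S2", "S3"},
    "S2": {"S1", "S3"},
    "S3": {"S1", "S2", "S4"},
    "S4": {"S3", "S5"},
    "S5": {"S4"},
}

distances_from_s1 = shortest_path_lengths(toy_network, "S1")

# Degree of a node: the number of states it is directly connected to.
degrees = {state: len(nbrs) for state, nbrs in toy_network.items()}
```

Running the search from every node yields the full pairwise distance matrix, and the same adjacency structure gives the node degrees that the abstract compares against betweenness centrality.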

Molecular interplay between SARS-CoV-2 and human proteins for viral activation and entry, potential drugs for combat and scope for new therapeutics
COSI: COVID-19
  • Naveen Vankadari, Monash University, Australia

Short Abstract: The pandemic Coronavirus Disease 2019 (COVID19) caused by SARS-CoV-2 is a serious public health concern with global morbidity of over 115 million and a mortality of 2.5 million. Whilst vaccination has been administered in many countries, several antiviral treatments are being clinically evaluated in parallel to fill the “therapeutic gap”. The development of potential drugs or vaccines requires an understanding of SARS-CoV-2 pathogenicity and mechanism of action. Thus, it is essential to understand the full repertoire of viral proteins and their interplay with host factors. Here, we show how the SARS-CoV-2 spike protein undergoes 3 stages of processing to allow virion activation and host cell infection. Our comprehensive structural and computational studies reveal why COVID19 is hypervirulent and indicate a possible reason for the failure of several antibody treatments. In addition, our resolved complex structures of the spike protein with different host cell receptors show the complexity of entry. We also demonstrate via experimental, biophysical and molecular dynamics studies how the host proteins CD26 (DPP4), CD147, Furin and TMPRSS2 process the viral spike glycoprotein and assist in viral entry in addition to ACE2. These results clarify the detailed mechanism by which the spike glycoprotein enters the host cell and also reveal new avenues for potential therapeutics to block different stages of viral entry and new pathways for vaccine development.

Most frequent South Asian haplotypes of ACE2 share identity by descent with East Eurasian populations
COSI: COVID-19
  • Anshika Srivastava, Banaras Hindu University, India
  • Gyaneshwer Chaubey, Banaras Hindu University, India

Short Abstract: It has been shown that the human Angiotensin-converting enzyme 2 (ACE2) is the receptor of the recent coronavirus SARS-CoV-2, and variation in this gene may affect the susceptibility of a population. Genetically, South Asians are more related to West Eurasian populations than to East Eurasians.
Therefore, we analysed the sequence data of ACE2 among 393 samples worldwide, focusing on South Asia, and performed a haplotype analysis of ACE2 polymorphisms.
We observed that the majority of South Asian haplotypes are closer to East Eurasians than to West Eurasians. The phylogenetic analysis suggested that the South Asian haplotypes shared with East Eurasians involved two unique event polymorphisms (rs4646120 and rs2285666). In contrast with the European/American populations, both of the SNPs have largely similar frequencies for East Eurasians and South Asians. We ascertained a significant positive correlation of the alternate allele (T or A) of rs2285666 with lower infection as well as case-fatality rates among Indian populations. Therefore, it is likely that among South Asians, host susceptibility to the novel coronavirus SARS-CoV-2 will be more similar to that of East Eurasians than to that of Europeans.

Mutational Signatures Help to Uncover the Role of Smoking in COVID-19 vulnerabilities
COSI: COVID-19
  • Yoo-Ah Kim, NCBI, National Library of Medicine, National Institutes of Health, United States
  • Teresa Przytycka, NCBI, National Library of Medicine, National Institutes of Health, United States
  • Ariella Saslafsky, NCBI, National Library of Medicine, National Institutes of Health, United States
  • Damian Wójtowicz, NCBI, National Library of Medicine, National Institutes of Health, United States
  • Ermin Hodzic, NCBI, National Library of Medicine, National Institutes of Health, United States
  • Bayarbaatar Amgalan, NCBI, National Library of Medicine, National Institutes of Health, United States

Short Abstract: Coronavirus disease (COVID-19) is an infectious disease caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). There is great heterogeneity within the population in susceptibility to the infection and in disease progression, depending on various factors such as smoking and age. However, studies focusing on the impact of smoking on COVID-19 have suggested contradictory conclusions. To provide a better understanding of the role of smoking in COVID-19 vulnerabilities, we utilize the TCGA LUAD (lung adenocarcinoma) dataset, which includes both tumor and control lung samples.
An important advantage of utilizing the TCGA data is the possibility of leveraging the mutation data to estimate each patient's exposure to smoking by inferring mutational signatures. The strength of the smoking signature can be used as a proxy for the amount of exposure to smoking and subsequently leveraged to interrogate the relationship between exposure to smoking and biological processes in both tumor and healthy tissues. Our analysis revealed a positive (though not always significant) correlation between smoking and expression of genes facilitating SARS-CoV-2 entry into human tissues. In addition, we show smoking-related activation of an immune response that can potentially induce a cytokine storm. While our study does not definitively resolve the relation between smoking and getting infected, it strongly suggests that smoking increases the odds of more serious progression for infected patients.

Network controllability analysis for drug repurposing in COVID-19
COSI: COVID-19
  • Nicoleta Siminea, National Institute of Research and Development for Biological Sciences, and University of Bucharest, Romania
  • Victor Popescu, Department of Information Technologies, Abo Akademi University, Turku, Finland
  • Jose Angel Sanchez Martin, Department of Computer Science, Technical University of Madrid, Spain
  • Daniela Florea, National Institute of Research and Development for Biological Sciences, Bucharest, Romania
  • Georgiana Gavril, National Institute of Research and Development for Biological Sciences, Bucharest, Romania
  • Ana-Maria Gheorghe, National Institute of Research and Development for Biological Sciences, Bucharest, Romania
  • Corina Itcus, National Institute of Research and Development for Biological Sciences, Bucharest, Romania
  • Krishna Kanhaiya, Department of Information Technologies, Abo Akademi University, Turku, Finland
  • Octavian Pacioglu, National Institute of Research and Development for Biological Sciences, Bucharest, Romania
  • Laura Ioana Popa, National Institute of Research and Development for Biological Sciences, Bucharest, Romania
  • Romica Trandafir, National Institute of Research and Development for Biological Sciences, Bucharest, Romania
  • Maria Iris Tusa, National Institute of Research and Development for Biological Sciences, Bucharest, Romania
  • Manuela Sidoroff, National Institute of Research and Development for Biological Sciences, Bucharest, Romania
  • Mihaela Paun, National Institute of Research and Development for Biological Sciences, and University of Bucharest, Romania
  • Eugen Czeizler, Abo Akademi University, Turku, and National Institute of Research and Development for Biological Sciences, Bucharest, Finland
  • Andrei Paun, National Institute of Research and Development for Biological Sciences, and University of Bucharest, Romania
  • Ion Petre, University of Turku, and National Institute of Research and Development for Biological Sciences, Bucharest, Romania

Short Abstract: We investigated a network-based approach to drug repurposing in COVID-19. The focus of our analysis is on two recently published sets of genes whose loss of function led to enrichments in two experiments (low/high multiplicity of infection) of cell survivability on SARS-CoV-2 infected cells. We constructed a directed protein-protein interaction network for each of these sets of host factors, including proteins upstream of the host factors at a distance of at most 2, and proteins downstream of drug targets at a distance of at most 2. We used interaction data from KEGG, OmniPath and SIGNOR. Using targeted network controllability and the NetControl4BioMed platform, we identified control paths of length at most three starting in drug targets and controlling the set of host factors. We focused on the drugs predicted to be most efficient in terms of the highest number of host factors they can control. We obtained 130 drugs (antineoplastic and immunomodulating agents, antithrombotic agents, sex hormones and other compounds) that we validated against existing experimental data (including viral entry, viral replication, in vitro infectivity, live virus infectivity and human cell toxicity) and clinical trial results. We conclude that network modeling methods can be efficient in drug repurposing studies for COVID-19.
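
The network construction step (collecting proteins within a distance of at most 2 upstream of host factors or downstream of drug targets) can be sketched as a depth-limited breadth-first search over a directed edge list. This is only an illustration of the neighbourhood expansion: the interaction list and the names P0 to P7, H1 and D1 are hypothetical, and the sketch does not implement targeted network controllability or any part of the NetControl4BioMed platform.

```python
from collections import deque

# Hypothetical directed interactions (source -> target); names are placeholders.
INTERACTIONS = [
    ("P0", "P1"), ("P1", "P2"), ("P2", "H1"), ("P3", "H1"), ("H1", "P4"),
    ("D1", "P5"), ("P5", "P6"), ("P6", "P7"),
]


def within_distance(edges, seeds, max_dist, direction="downstream"):
    """Return nodes reachable from the seeds in at most max_dist directed steps.

    direction="upstream" follows edges backwards (regulators of the seeds);
    direction="downstream" follows them forwards (targets of the seeds).
    """
    adjacency = {}
    for src, dst in edges:
        if direction == "upstream":
            src, dst = dst, src
        adjacency.setdefault(src, set()).add(dst)

    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue  # do not expand beyond the distance limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    # Exclude the seeds themselves from the result.
    return {node for node, d in dist.items() if d > 0}


upstream_of_host_factor = within_distance(INTERACTIONS, ["H1"], 2, "upstream")
downstream_of_drug_target = within_distance(INTERACTIONS, ["D1"], 2, "downstream")
```

Here P0 and P7 are left out because they sit three steps away from the seed, which mirrors the distance cut-off stated in the abstract.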

Ongoing Global and Regional Adaptive Evolution of SARS-CoV-2
COSI: COVID-19
  • Nash Rochman, The National Institutes of Health, United States
  • Yuri Wolf, The National Institutes of Health, United States
  • Guilhem Faure, The Broad Institute of MIT and Harvard, United States
  • Pascal Mutz, The National Institutes of Health, United States
  • Feng Zhang, The Broad Institute of MIT and Harvard, United States
  • Eugene Koonin, The National Institutes of Health, United States

Short Abstract: We analyzed more than 300,000 genomes of SARS-CoV-2 variants available as of January 2021. We demonstrate that the ongoing evolution of SARS-CoV-2 during the pandemic is characterized primarily by purifying selection, with a set of sites evolving under positive selection. The receptor-binding domain of the spike protein and the nuclear localization signal (NLS) associated region of the nucleocapsid protein are enriched with positively selected mutations. These replacements form a strongly connected network of apparent epistatic interactions and are signatures of major partitions in the SARS-CoV-2 phylogeny. Analysis of the phylogenetic distances between pairs of regions reveals four distinct periods of the pandemic linked to the emergence of key mutations. First, there was rapid diversification into region-specific phylogenies, ending February 2020. A major extinction event and global homogenization concomitant with the spread of D614G in the spike protein followed, ending March 2020. NLS associated variants across multiple partitions rose to global prominence March-July, during a period of stasis in terms of inter-regional diversity. Finally, beginning July 2020, multiple mutations, some of which enable antibody evasion, began to emerge in association with ongoing regional diversification. Understanding these trends, which might be indicative of speciation, is paramount to both ongoing and future public health responses.

Prediction and Classification of virus families/sub-families, and Human-MERS/SARS-CoV/SARS-CoV-2 PPIs Prediction using Machine Learning
COSI: COVID-19
  • Rakesh Kaundal, Utah State University, United States
  • David Guevara, Utah State University, United States

Short Abstract: Proteins usually need other proteins to be able to function or to be regulated. These reactions are called protein-protein interactions (PPI). It is important to know about these PPIs for drug target discovery because we can enhance or disrupt these interactions to relieve diseases. Predicting these interactions computationally is important to lower costs.
We have developed a four-phase prediction and classification system for virus families and sub-families, followed by prediction of whether two protein sequences form a human-MERS/SARS1/SARS2 interaction, using machine learning and neural networks. The predictor will be available through a web server (currently being implemented; training and testing are finished). The web server will work in four phases: it will predict whether the viral sequence is, indeed, a viral protein; whether the protein comes from a coronavirus; whether the protein comes from MERS, SARS1 or SARS2; and whether the two proteins interact.
We did some testing for phase 4, and with a random forest model, we got 97.25% accuracy with a 1:1 dataset between positive interactions and negative interactions, and with a convolutional neural network, we got 96.97% with a 1:1 dataset between positive interactions and negative interactions, and 96.65% with a 1:5 dataset. We will test with a neural network (deep learning) with attention mechanism and more features to further enhance the accuracy.

Prediction and evolution of the molecular fitness of SARS-CoV-2 variants: Introducing SpikePro
COSI: COVID-19
  • Fabrizio Pucci, Université Libre de Bruxelles, Belgium
  • Marianne Rooman, Université Libre de Bruxelles, Belgium

Short Abstract: The understanding of the molecular mechanisms driving the fitness of the SARS-CoV-2 virus and its mutational evolution is still a critical issue. We built a simplified computational model, called SpikePro, to predict the SARS-CoV-2 fitness from the amino acid sequence and structure of the spike protein. It contains three contributions: the inter-human transmissibility of the virus predicted from the stability of the spike protein, the infectivity computed in terms of the affinity of the spike protein for the ACE2 receptor, and the ability of the virus to escape from the human immune response based on the binding affinity of the spike protein for a set of neutralizing antibodies. Our model reproduces well the available experimental, epidemiological and clinical data on the impact of variants on the biophysical characteristics of the virus. For example, it is able to identify circulating viral strains that, by increasing their fitness, recently became dominant at the population level. SpikePro is a useful, freely available, instrument which predicts rapidly and with good accuracy the dangerousness of new viral strains. It can be integrated and play a fundamental role in the genomic surveillance programs of the SARS-CoV-2 virus that, despite all the efforts, remain time-consuming and expensive.

Prioritization and Proposition of Novel COVID-19 Therapies based on Network Representation Learning
COSI: COVID-19
  • Lauren Nicole DeLong, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Germany
  • Holger Fröhlich, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Germany
  • Andrea Zaliani, Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), Germany
  • Tamara Raschka, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Germany
  • Bruce Schultz, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Germany
  • Manuel Lentzen, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Germany
  • Neal Ravindra, Yale University US, United States
  • David van Dijk, Yale University US, United States

Short Abstract: In addition to vaccines, the World Health Organization sees novel medications as an urgent matter to fight the ongoing COVID-19 pandemic. Multi-relational biomedical information about host-pathogen interactions and further disease associated genes can be modeled using graph theory by representing biological entities, like proteins, as nodes, and the relationships between them as edges. The resulting graph provides an ideal structure for various Network Representation Learning (NRL) tasks. As AI-based network algorithms require sufficient data, NRL algorithms for drug target searching in the COVID-19 context are underexplored. Our work is therefore one of the first to exploit graph-based structures to uncover potential COVID-19 therapies. By utilizing our previously published GuiltyTargets NRL approach, our methods rank every protein in a lung proteome-filtered knowledge graph according to its likelihood of being a potential drug target for COVID-19. Using both bulk and single-cell RNA Sequencing datasets as node features, we recovered approximately 50 highly prioritized targets, 13 of which were consistently high. Of these, MAP2K7, CBSL, and GRK2 were experimentally validated as effective targets in SARS-CoV-2 infected Vero-E6 cells. The high connectivity between top targets highlights the significance of using network connectivity for target prediction (Figure 1). These results are undergoing further biological validation.

RAU: An Interpretable Automatic Infection Diagnosis of COVID-19 Pneumonia with Residual Attention U-Net
COSI: COVID-19
  • Xiaocong Chen, The University of New South Wales, Australia
  • Lina Yao, University of New South Wales, Australia
  • Yu Zhang, Lehigh University, United States

Short Abstract: The novel coronavirus disease 2019 (COVID-19) has been spreading rapidly around the world and has had a significant impact on public health and the economy. However, there is still a lack of studies on effectively quantifying the different lung infection areas caused by COVID-19. A basic but challenging task of the diagnostic framework is to distinguish infection areas in computed tomography (CT) images and so help radiologists determine the severity of the infection rapidly. To this end, we propose a novel deep learning algorithm for automated infection diagnosis of multiple COVID-19 pneumonia infection areas. Specifically, we use an aggregated residual network to learn a robust and expressive feature representation and apply a soft attention mechanism to improve the capability of the model to distinguish a variety of symptoms of COVID-19. On a public CT image dataset, the proposed method achieves a Dice similarity coefficient (DSC) of 0.91, which is 14.6% higher than selected baselines. Experimental results demonstrate the outstanding performance of our proposed model for the automated segmentation of COVID-19 chest CT images. Our study provides a promising deep learning-based segmentation tool that lays a foundation for the quantitative diagnosis of COVID-19 lung infection in CT images.
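
The DSC reported above is the Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|), which compares a predicted segmentation mask A with a reference mask B. A minimal Python sketch of the metric follows; the two small binary masks are hypothetical examples rather than CT data, and returning 1.0 when both masks are empty is a convention assumed here.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks (nested lists)."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_truth = [bool(v) for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labelled as infected in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_truth))
    total = sum(flat_pred) + sum(flat_truth)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 2x3 masks: 1 marks an infected pixel, 0 marks background.
predicted_mask = [[0, 1, 1],
                  [0, 1, 0]]
reference_mask = [[0, 1, 0],
                  [0, 1, 1]]

dsc = dice_coefficient(predicted_mask, reference_mask)
```

Each mask marks three pixels and they share two, so the score for this pair is 2·2 / (3 + 3) = 2/3.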

Revealing SARS-CoV-2 protein architectures and function by integrating modeling and in situ MS proteomics
COSI: COVID-19
  • Nir Kalisman, The Hebrew University of Jerusalem, Israel
  • Dina Schneidman, The Hebrew University of Jerusalem, Israel
  • Michal Linial, The Hebrew University of Jerusalem, Israel

Short Abstract: The genome of SARS-CoV-2, the causal virus of the COVID-19 pandemic, encodes 29 proteins. However, only a handful of them are associated with a known structure and function. In this study, we utilize a novel application called in situ cross-linking mass spectrometry (in situ CLMS) that provides rich spatial information on the structures of proteins as they occur in intact cells. We demonstrate the utility of this approach by targeting three SARS-CoV-2 proteins for which full atomic structures are missing. We show that integrating cross-links with external structural data is sufficient to model the full-length protein. Cells that expressed tagged Nsp1 were subjected to the in situ CLMS approach. We identified the interactions of Nsp1 with the 40S ribosomal subunit, which confirms its fundamental role in blocking translation in infected cells. Similarly, based on structure predictions of individual domains of Nsp2 by AlphaFold2, we successfully assembled Nsp2 into a single consistent model. The Nucleocapsid (N) protein plays a key role in genome packing and virion assembly. In situ CLMS was fundamental to assembling a model of the full dimer from available 3D structures of individual domains. These results highlight the importance of cellular context for achieving detailed atomic resolution of SARS-CoV-2 proteins.

SARS-CoV-2 variant timemaps
COSI: COVID-19
  • Rene Warren, BC Genome Sciences Centre, Canada
  • Inanc Birol, BC Genome Sciences Centre, Canada

Short Abstract: As the year 2020 came to a close, several new severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of concern (VOCs) were reported, and new VOCs continue to emerge due to the relatively fast spread and high mutation rate of SARS-CoV-2. However, it is difficult to comprehend the scale, in sequence space, geographical location and time, at which SARS-CoV-2 mutates and evolves in its human hosts. To get an appreciation for the rapid evolution of the coronavirus, we built interactive scalable vector graphics (SVG) maps that show nucleotide variations rapidly accumulating on the SARS-CoV-2 genome compared to that of the initial ground-zero SARS-CoV-2 isolate (Wuhan-Hu-1) sequenced in January 2020. To build our SVG maps, which include specific VOCs sampled from the six most populated continents, we periodically access the GISAID repository, chart nucleotide variations in the GISAID catalogue relative to the Wuhan-Hu-1 reference, organize them by date/jurisdiction and evaluate their predicted effect on the gene product. The information is plotted as interactive SVG files, which are hosted publicly (bcgsc.github.io/SARS2/) and can be queried with mouse-hover to gain rapid insights into variant emergence. We think these maps will be of utility to researchers in their exploration of SARS-CoV-2 variants.
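
The charting step described above (listing nucleotide variations relative to a reference and organizing them by date and jurisdiction) can be sketched in a few lines of Python. The sketch assumes sequences already aligned to the reference and of equal length; the 10-base reference, the sample records and the region names are hypothetical and are not GISAID data. It does not cover SVG rendering or effect prediction.

```python
from collections import Counter

# Hypothetical aligned sequences; real genomes are about 30 kb and come from GISAID.
REFERENCE = "ATGGCTAACG"
SAMPLES = [
    # (collection month, jurisdiction, aligned sequence)
    ("2020-03", "RegionA", "ATGGCTAACG"),
    ("2020-09", "RegionB", "ATGACTAATG"),
    ("2020-09", "RegionB", "ATGACTNACG"),
]

SKIPPED_SYMBOLS = {"N", "-"}  # ambiguous bases and alignment gaps


def list_substitutions(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1):
        if alt_base in SKIPPED_SYMBOLS:
            continue
        if ref_base != alt_base:
            variants.append((pos, ref_base, alt_base))
    return variants


def tally_by_date_and_region(reference, samples):
    """Count how often each substitution is seen per (date, jurisdiction)."""
    counts = Counter()
    for date, region, sequence in samples:
        for variant in list_substitutions(reference, sequence):
            counts[(date, region, variant)] += 1
    return counts


timemap = tally_by_date_and_region(REFERENCE, SAMPLES)
```

A table like `timemap` is the kind of input a plotting layer would need, with one entry per substitution, time bin and location.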

Single-nucleotide conservation state annotation of the SARS-CoV-2 genome
COSI: COVID-19
  • Jason Ernst, University of California, Los Angeles, United States
  • Soo Bin Kwon, University of California, Los Angeles, United States

Short Abstract: Given the global impact and severity of COVID-19, there is a pressing need for a better understanding of the SARS-CoV-2 genome and mutations. Multi-strain sequence alignments of coronaviruses (CoV) provide important information for interpreting the genome and its variation. We apply a comparative genomics method, ConsHMM, to the multi-strain alignments of CoV to annotate every base of the SARS-CoV-2 genome with conservation states based on sequence alignment patterns among CoV. The learned conservation states show distinct enrichment patterns for genes, protein domains, and other regions of interest. Certain states are strongly enriched or depleted of SARS-CoV-2 mutations, which can be used to predict potentially consequential mutations. We expect the conservation states to be a resource for interpreting the SARS-CoV-2 genome and mutations.

Study of hTMPRSS2 ectodomain-SARS-CoV2 S2’ subunit interactions and virtual screening of drug molecules to identify potential drugs for Covid-19
COSI: COVID-19
  • H.A. Nagarajaram, University of Hyderabad, Hyderabad, India
  • Konduru Guruprasad Varma, Centre for DNA Fingerprinting and Diagnostics (CDFD), Hyderabad, India

Short Abstract: The Spike protein of SARS-CoV2 undergoes proteolytic cleavage at the S1-S2 cleavage site by human TMPRSS2, and the S2 domain of the viral protein is further cleaved to become the S2’ fusion peptide. However, it is still not known how the fusion peptide fuses to the host cell and what host factors mediate this fusion. The N-terminal part of the TMPRSS2 ectodomain (hNECD) is composed of an LDLRA repeat (112-149) and one SRCR domain (150-242). SRCR domains are known to bind to different types of ligands, including viruses, and even mediate their endocytosis. Interestingly, in the case of SARS-CoV2 only receptor-mediated endocytosis has been observed. We therefore hypothesized that the fusion peptide S2’ binds to hNECD and helps in endocytosis of virus particles into the human cell. We performed homology modelling of hNECD followed by protein-protein docking studies with the known 3D structure of S2’. We analysed binding poses from the protein-protein docking studies and identified the best possible pose for the interaction of S2’ and hNECD. We performed virtual screening of FDA-approved drug molecules and Indian medicinal plant-based compounds targeting the hNECD interface for S2’ and identified the top 10 drug molecules and medicinal plant compounds that can inhibit the interaction of hNECD with S2’.

The TMPRSS2 common variant rs12329760 shows evidence of protection against severe COVID-19
COSI: COVID-19
  • Alessia David, Imperial College London, United Kingdom
  • Nicholas Parkinson, Roslin Institute, University of Edinburgh, Edinburgh, UK, United Kingdom
  • Thomas P. Peacock, Department of Infectious Diseases, Imperial College London, London, UK, United Kingdom
  • Erola Pairo-Castineira, Roslin Institute, University of Edinburgh, Edinburgh, UK, United Kingdom
  • Tarun Khanna, Centre for Integrative System Biology and Bioinformatics, Imperial College London, London, UK, United Kingdom
  • Aurelie Cobat, Laboratory of Human Genetics of Infectious Diseases, INSERM, Paris, France, France
  • Albert Tenesa, Roslin Institute, University of Edinburgh, Edinburgh, UK, United Kingdom
  • Vanessa Sancho-Shimuzu, Department of Paediatric Infectious Diseases & Virology, Imperial College London, London, UK, United Kingdom
  • Jean-Laurent Casanova, St. Giles Laboratory of Human Genetics of Infectious Diseases, The Rockefeller University, New York, NY, USA, United States
  • Laurent Abel, Laboratory of Human Genetics of Infectious Diseases, INSERM, Paris, France, EU, France
  • Wendy S. Barclay, Department of Infectious Diseases, Imperial College London, London, UK, United Kingdom
  • J. Kenneth Baillie, Roslin Institute, University of Edinburgh, Edinburgh, UK, United Kingdom
  • Michael J.E. Sternberg, Centre for Integrative System Biology and Bioinformatics, Imperial College London, London, UK, United Kingdom

Short Abstract: The human protein transmembrane protease serine type 2 (TMPRSS2) is required to activate the spike protein of SARS-CoV-2, thus facilitating entry into target cells. We hypothesized that naturally occurring TMPRSS2 variants affecting TMPRSS2 structure or function can affect the wide range of COVID-19 phenotypes, from asymptomatic to severe.
We built a three-dimensional structure of TMPRSS2 using homology modeling and used a range of bioinformatics tools to predict the effect of naturally occurring human TMPRSS2 genetic variants annotated in the gnomAD database. 136 rare variants and one common variant, rs12329760 (p.V160M), were predicted to be damaging. The rs12329760 variant is very common in the population, with minor allele frequency (MAF) ranging from 0.38 in East Asians to 0.15 in Latino individuals. We show, in 2244 critically ill COVID-19 positive patients, that rs12329760 is associated with a reduced likelihood of developing severe COVID-19 (OR 0.87, 95% CI: 0.79-0.97, p=0.01). This association was stronger in homozygous individuals when compared to the general population (OR 0.65, 95% CI: 0.50-0.84, p=1.3×10⁻³). We demonstrate in vitro that this variant affects the catalytic activity of TMPRSS2 and is less able to support SARS-CoV-2 spike-mediated entry into cells.
In conclusion, TMPRSS2 rs12329760 is a common variant associated with a modest but statistically significant decrease in the risk of severe COVID-19.
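
For readers unfamiliar with the statistics quoted above, an odds ratio (OR) and its 95% confidence interval can be computed from a 2×2 table of carriers and non-carriers among cases and controls. The Python sketch below uses the standard Wald interval on the log odds ratio. The counts are hypothetical and chosen only for illustration; the abstract does not state how the authors estimated their ORs, and a real association analysis would typically adjust for covariates such as ancestry.

```python
import math


def odds_ratio_with_ci(carrier_cases, noncarrier_cases,
                       carrier_controls, noncarrier_controls, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    z=1.96 corresponds to a two-sided 95% interval.
    """
    a, b = carrier_cases, noncarrier_cases
    c, d = carrier_controls, noncarrier_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR).
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * std_err)
    upper = math.exp(math.log(odds_ratio) + z * std_err)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_with_ci(20, 80, 30, 70)
```

An OR below 1 whose interval excludes 1, as reported in the abstract, indicates that carriers are under-represented among severe cases.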

Transcriptome signature analysis identifies plasma membrane cholesterol depletion as a potential factor for antiviral drug effect against SARS-CoV-2
COSI: COVID-19
  • Szilvia Barsi, Semmelweis University, Hungary
  • Alberto Valdeolivas, Heidelberg University, Faculty of Medicine, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
  • Dániel J. Tóth, Semmelweis University, Faculty of Medicine, Department of Physiology, Budapest, Hungary
  • Péter Várnai, Semmelweis University, Faculty of Medicine, Department of Physiology, Budapest, Hungary
  • László Hunyady, Semmelweis University, Faculty of Medicine, Department of Physiology, Budapest, Hungary
  • Julio Saez-Rodriguez, Heidelberg University, Faculty of Medicine, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
  • Bence Szalai, Semmelweis University, Faculty of Medicine, Department of Physiology, Budapest, Hungary

Short Abstract: COVID-19 is a global pandemic that has led to millions of deaths worldwide. Repurposing already approved drugs is a promising strategy to find new therapeutic opportunities against rapidly spreading diseases. Signature-based computational drug repurposing can accelerate the discovery of effective drugs. In this study, we analysed in vitro SARS-CoV-2 infected cell lines and gene expression signatures of effective antiviral drugs. Infection-induced and drug-induced molecular profiles were found to be similar because both activate adaptive antiviral responses, such as NFKB and JAK-STAT signaling. We found that similarity between infection-induced and drug-induced signatures is predictive of effective drugs against SARS-CoV-2. In addition, we identified clusters of drugs that strongly activate SREBF transcription factors, the main modulators of cholesterol synthesis. To investigate the effect of these antiviral drugs on cholesterol metabolism, we analysed cholesterol sensor localisation in cells during drug treatment, and showed that these drugs decreased the plasma membrane cholesterol level. These results suggest that effective drugs can provoke gene expression signatures similar to those of viral infection by activating cellular immune responses. Furthermore, some effective drugs have a cholesterol-depleting effect on the plasma membrane, and we hypothesize that they can thereby prevent viral entry.
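
The signature comparison at the core of this approach can be illustrated by ranking drugs according to how closely their expression signature matches the infection signature. The abstract does not name the similarity measure used, so cosine similarity is an assumption here, and the four-gene signatures and the names drug_a to drug_c are hypothetical values invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long signature vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must cover the same genes")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero signature carries no direction
    return dot / (norm_a * norm_b)


# Hypothetical differential expression values over the same four genes.
infection_signature = [1.0, 2.0, -1.0, 0.5]
drug_signatures = {
    "drug_a": [2.0, 4.0, -2.0, 1.0],    # same direction as infection
    "drug_b": [-1.0, -2.0, 1.0, -0.5],  # opposite direction
    "drug_c": [1.0, 0.0, 0.0, 0.0],     # partial overlap
}

similarities = {
    name: cosine_similarity(infection_signature, signature)
    for name, signature in drug_signatures.items()
}
# Most infection-like signature first.
ranking = sorted(similarities, key=similarities.get, reverse=True)
```

Classical signature reversal approaches prioritize drugs with the most negative similarity, whereas the finding reported in this abstract points to the most similar signatures, which is why the ranking is sorted in descending order.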

WikiPathways as a platform for COVID-19 pathway models
COSI: COVID-19
  • Martina Kutmon, Maastricht University, Netherlands
  • Nhung Pham, Maastricht University, Netherlands
  • Finterly Hu, Maastricht University, Netherlands
  • Friederike Ehrhart, Maastricht University, Netherlands
  • Egon Willighagen, Maastricht University, Netherlands
  • Alexander Pico, The Gladstone Institutes, UCSF, United States
  • Chris Evelo, Maastricht University, Netherlands

Short Abstract: COVID-19 is causing severe health problems all over the world. To identify effective treatments for COVID-19, detailed pathway models to analyze, understand, and predict downstream effects are essential. In the COVID-19 Disease Map project, pathway curators and repositories joined forces to build a knowledge repository of molecular mechanisms of COVID-19. The project aims to describe intensively curated molecular pathways depicting host-virus interactions useful for data analysis and modeling. WikiPathways (www.wikipathways.org), an established community-curated pathway database, has been a core contributor from the start.

Using pathway models from WikiPathways and the COVID-19 Disease Map, we developed an automated R workflow that uses pathway and network analysis approaches to analyze transcriptomics datasets, focusing on pathway crosstalk between virus-related and host immune response processes. The workflow also enables the extension of pathways with drug-target information, and the identification of missing knowledge in our pathway models, which can be used to target new investigations.

With this project, we highlight WikiPathways and COVID-19 Disease Map as central resources for COVID-19 related pathway information. The development of automated and reproducible data analysis workflows is essential to further expand our understanding of this virus infection in parallel with the incoming information and availability of datasets.



International Society for Computational Biology
525-K East Market Street, RM 330
Leesburg, VA, USA 20176
