Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide


Upcoming Conferences

A Global Community

  • ISCB Student Council

    dedicated to facilitating development for students and young researchers

  • Affiliated Groups

    The ISCB Affiliates program is designed to forge links between ISCB and regional non-profit membership groups, centers, institutes and networks that involve researchers from various institutions and/or organizations within a defined geographic region involved in the advancement of bioinformatics. Such groups have regular meetings either in person or online, and an organizing body in the form of a board of directors or steering committee. If you are interested in affiliating your regional membership group, center, institute or network with ISCB, please review these guidelines (.pdf) and send your exploratory questions to Diane E. Kovats, ISCB Chief Executive Officer. For information about the Affiliates Committee click here.

  • Communities of Special Interest

    Topically-focused collaborative communities

  • ISCB Member Directory

    Connect with ISCB worldwide

  • Green ISCB

    Environmental Sustainability Effort

  • Equity, Diversity, and Inclusion

    ISCB is committed to creating a safe, inclusive, and equal environment for everyone

Professional Development, Training, and Education

ISCBintel and Achievements

ISCBacademy 2022 Archived Webinars



To view previous webinars use the links below

2020 Webinars | 2021 Webinars


Please use the links below to view 2022 webinars:


Integrated analysis of single-cell data across technologies and modalities
by Rahul Satija

January 11, 2022 at 11:00 AM EST

The simultaneous measurement of multiple modalities represents an exciting frontier for single-cell genomics and necessitates computational methods that can define cellular states based on multimodal data. Here, we introduce “weighted-nearest neighbor” analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of 211,000 human peripheral blood mononuclear cells (PBMCs) with panels extending to 228 antibodies to construct a multimodal reference atlas of the circulating immune system. Multimodal analysis substantially improves our ability to resolve cell states, allowing us to identify and validate previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets and to interpret immune responses to vaccination and coronavirus disease 2019 (COVID-19). Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets and to look beyond the transcriptome toward a unified and multimodal definition of cellular identity.



Accelerating biomedical discovery with large-scale knowledge assembly and human-machine collaboration
by Benjamin Gyori

January 18, 2022 at 11:00 AM EST

The rate at which biomedical knowledge is produced (both at the level of new publications and data sets) is accelerating, and there is an increasing need to monitor, extract and assemble this knowledge in an actionable form. Classic mechanistic models take substantial human effort to construct and rarely scale to the level of omics datasets, while statistical approaches often do not make use of prior knowledge about mechanisms. To address these challenges, we present INDRA, an automated knowledge assembly system which integrates multiple text mining tools that process the scientific literature, and structured sources (pathway databases, drug-target databases, etc.). INDRA standardizes knowledge extracted from these sources and corrects errors, resolves redundancies, fills in missing information, and calculates confidence to create a coherent knowledge base. From this knowledge, various executable model types (ODEs, Boolean networks, etc.) and causal networks can be generated automatically for further analysis. We discuss technology built on top of INDRA, including human-machine dialogue systems, and EMMAA, a framework which makes available a set of self-updating and self-analyzing models of specific diseases and pathways. We present applications of these tools to automatically construct explanations for experimental observations in multiple disease areas.


COVID-19 Disease Map: building a computational repository of SARS-CoV-2 virus-host interaction mechanisms
by Marek Ostaszewski

February 1, 2022 at 11:00 AM EST

Disease Maps are computational and visual knowledge repositories constructed to catalogue, standardise, and model disease-related mechanisms. They help bridge the knowledge gap between biomedical experts and computational biologists, supporting contextualised data analysis and modelling of a given pathophysiology. Disease Maps are built using graphical and computational Systems Biology standards and can be used as interactive knowledge repositories, platforms for visual analytics of omics datasets, or integrated into large-scale computational workflows. With the global impact of COVID-19, we organised a community effort to develop a COVID-19 Disease Map to help researchers worldwide study the mechanisms of the SARS-CoV-2 – host interactions. Our effort engaged over 250 members, contributing as domain experts, diagram curators, analysts, and modellers. This talk will discuss the challenges of community biocuration and integration of a plethora of resources, from Systems Biology diagrams, through interaction databases and text mining results to modelling pipelines of varying granularity.


ISCBacademy Contact



Contact ISCBacademy by email.

You are also welcome to join the Slack Workspace to ask questions!



ISCBacademy Team



Batool Almarzouq - Content Coordinator

Nazeefa Fatima - HiTSeq Coordinator

Seth Munholland - Technical Coordinator

Venkata Satagopam - TransMed COSI Representative



Non-Member Fees



ISCBacademy webinars are offered as a free service to all ISCB members. Non-member rates are based on the ranked membership dues.

                High Income    Upper-Middle Income    Lower-Middle Income    Low Income
                Countries      Countries              Countries              Countries
Professional    $140           $55                    $30                    $20
Post-Doc        $90            $30                    $20                    $10
Student         $60            $30                    $20                    $10


ISCBacademy 2021 Archived Webinars



To view previous webinars use the links below

2020 Webinars | 2022 Webinars


Please use the links below to view 2021 webinars:


Approaching Indigenous communities on their own terms in microbiome research
By Matthew Anderson

January 14, 2021

Principles of individual consent and sample deidentification stand as pillars of modern biomedical research but are flawed with respect to certain populations. Indigenous peoples have historically been targeted by unethical practices that continue into the present even when following best practices for conducting research with human subjects. This has led some studies in American Indian/Alaskan Native (AI/AN) populations to include additional safeguards that are reinforced through these communities’ unique legal status as domestic dependent nations. Yet, use of microbiome datasets generally lacks restriction on data sharing and other protections because of their perceived inability to significantly impact public health or individual welfare, despite over a decade of work demonstrating the importance of microbial populations in human development, metabolism, and immunopathologies. Additionally, raw datasets can contain large proportions of human-derived reads that include information on the host and not just microbes. Current projects in partnerships with the Cheyenne River Sioux Tribe serve as new models of community partnerships to address issues of sovereignty in human and non-human datasets.



The ISCB Competency Framework: what is it and how does it support bioinformatics education and training?
By Cath Brooksbank

January 26, 2021

Demand for the application of data science techniques to life science research is accompanied by an increased need for bioinformatics expertise across a broad range of professionals – from lab-based molecular life-scientists through computer scientists to software engineers; furthermore, the applications of data-driven biology are just as varied, encompassing fundamental life-science, medicine, agriculture and environmental science. Educating and training the individuals who choose career paths in this varied and fast-moving field is therefore challenging, and educators can struggle to keep up with the needs of employers. 

The ISCB competency framework was developed by the ISCB Education Committee in consultation with a global community of bioinformatics professionals to bridge this gap. It provides a minimum information standard defining the competencies required, and the levels they’re required at, for a range of roles that require bioinformatics expertise, and it provides a tool to support bioinformatics educators to develop courses and curricula that meet the needs of employers.

In this webinar I will explain why the ISCB adopted a competency-based approach, describe the newly released version 3 of the framework, summarise how educators and trainers can use the framework to develop new learning interventions or update pre-existing ones, and outline how the ISCB is planning to support a competency-based approach to bioinformatics education and training in the future, both through continuing improvement of the framework and through initiatives to encourage the recognition of courses and curricula that make use of it.



SaGePhy: A phylogenetic simulation framework for gene and subgene evolution
By Soumya Kundu

January 28, 2021

SaGePhy (pronounced sage-phy) is a software package for improved phylogenetic simulation of gene and subgene evolution. SaGePhy can be used to generate species trees, gene trees, and subgene or (protein) domain trees using a probabilistic birth–death process that allows for gene and subgene duplication, horizontal gene and subgene transfer, and gene and subgene loss. SaGePhy implements a range of important features not generally found in other phylogenetic simulation frameworks; these include the ability to simulate (i) subgene or domain level events inside one or more gene families, (ii) both additive and replacing horizontal gene and subgene/domain transfers, (iii) distance-biased horizontal transfers, and (iv) probabilistic sampling of species tree and gene tree nodes, respectively, for gene- and domain-family birth. SaGePhy therefore makes it possible to perform more realistic simulation of gene and subgene/domain evolution.
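
The birth–death process mentioned above can be illustrated with a minimal sketch. This is not SaGePhy's implementation: it only tracks the number of lineages over time, with no tree topology, transfers, or domain-level events. The function name `simulate_birth_death` and its parameters are hypothetical, chosen for this illustration.

```python
import random


def simulate_birth_death(birth_rate, death_rate, t_max, seed=None, n0=1):
    """Simulate lineage counts under a constant-rate birth-death process.

    Illustrative only (not SaGePhy): each lineage duplicates at
    `birth_rate` and is lost at `death_rate`, independently of the others.
    Returns a list of (time, lineage_count) pairs, starting at (0.0, n0).
    """
    rng = random.Random(seed)
    per_lineage_rate = birth_rate + death_rate
    t, n = 0.0, n0
    history = [(t, n)]
    if per_lineage_rate <= 0:
        return history  # no events can ever occur
    while n > 0:
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(n * per_lineage_rate)
        if t > t_max:
            break
        # Choose the event type in proportion to the two rates.
        if rng.random() < birth_rate / per_lineage_rate:
            n += 1  # duplication
        else:
            n -= 1  # loss
        history.append((t, n))
    return history


if __name__ == "__main__":
    for time, count in simulate_birth_death(1.0, 0.5, t_max=3.0, seed=42):
        print(f"t={time:.3f}  lineages={count}")
```

Fixing `seed` makes a run reproducible, which is useful when simulated families are used as benchmark data.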



Responsibilities for the Stewardship of Indigenous Data in Open Science
by Stephanie Russo Carroll

February 18, 2021

As big data, open data, and open science advance to increase access to complex and large datasets for innovation, discovery, and decision-making, Indigenous Peoples’ rights to control and access their data within these data environments remain limited. Indigenous Data Sovereignty focuses on the protection of Indigenous rights and interests in the control and governance of Indigenous data. Indigenous data interests stretch across diverse disciplinary fields connecting community data governance ambitions with institutional and individual responsibilities in practice. Given this reach, a range of initiatives have been developed to strategically build new capabilities for strengthening control and governance of Indigenous data. These initiatives draw on a variety of methods and tactics across law, policy, ethics, and infrastructure. Applying these new tools and mechanisms in open science shifts Indigenous Peoples from invisibility within data ecosystems to vibrant contributors to open science.



Recognising Indigenous Rights in Digital Sequence Information
by Maui Hudson

March 17, 2021

Indigenous concerns about genomic research have been strongly articulated over the past few years with accompanying suggestions about how to improve relationships with Indigenous communities and the practice of research. Discussions are now moving towards how Indigenous rights can be recognised in the context of Digital Sequence Information, including the recognition of provenance and sharing of protocols and permissions through labelling systems like Local Contexts.



The Evolution of the Data Sharing Culture in Structural Biology
By Helen Berman

May 25, 2021

The Protein Data Bank was established 50 years ago in 1971. In this webinar I will describe its evolution from a small repository to a large international data resource. I will also discuss the roles that the many stakeholders played in creating a data sharing culture and how science has benefited from that culture.



How did they get there? Genetic History of Native Americans in the Central Andes
By Victor Borda

June 10, 2021

The Central Andes, which extend from Southern Ecuador to Southern Peru, were the homeland of civilizations that reached state-level society in pre-Columbian times. The term “Central Andes” does not include solely the highland mountains but also the regions affected by both slopes to the east (Amazon) and west (Pacific Coast). Cultural interactions involving these regions have been described for the last 5000 years. Here we describe genetic evidence that these cultural connections were accompanied by gene flow across the Andes and that Northern Peru was one of the main settings for these movements.



Early publication access and EMBL-EBI bio-molecular data tackle COVID-19
By Matt Pearce and Michael Parkin

June 28, 2021

The COVID-19 Data Portal (CDP) and Europe PMC’s full-text collection of COVID-19 preprints represent two efforts by EMBL-EBI to make data available to promote coronavirus research.

The COVID-19 Data Portal (CDP) was launched in April 2020 to provide access to SARS-CoV-2 and COVID-19 biomolecular data in an accessible manner. The data portal is part of the European COVID-19 Data Platform, which is provided by EMBL's European Bioinformatics Institute (EMBL-EBI), ELIXIR, partners from the ReCoDID and VEO projects and the European Open Science Cloud. There are national portals that complement the covid19dataportal.org and represent a broad international collaboration.

In recognition of many researchers publishing their COVID-19 results rapidly via preprints during the pandemic, Europe PMC (https://europepmc.org/), an EMBL-EBI database for life science literature, launched a project in July 2020 to make the full text of COVID-19 preprints available for reading and reuse via a standard XML format. Preprints are linked to journal-published articles, open peer review materials, as well as underlying data in community databases, including PDBe, ENA, and many more. The full text corpus of COVID-19 preprints with an open access license or similar is made available for download via a public API and FTP site, enabling deeper analysis.


Protein Structure Prediction in a Post-AlphaFold2 World
By Mohammed AlQuraishi

September 7, 2021

AlphaFold2 burst on the life sciences stage in late 2020 with the remarkable claim that protein structure prediction has been solved. In this talk I will argue that in some fundamental sense the core scientific problem of static structure prediction is finished, but that further maturation is necessary before AlphaFold2 and similar systems can address biological questions beyond those of structure determination itself. I will outline some of these necessary developments and highlight one in particular: the prediction of structure from individual protein sequences. I will describe present challenges and opportunities, and our efforts to tackle them by combining advances in protein language modeling with end-to-end differentiable structure prediction, presenting new results on the prediction of orphan and de novo designed proteins. Time permitting, I will end by speculating on what abundant availability of structural information might mean for the future of biology.



Bezos to Bottlenecks: The Chasm between Altruism & the Amerindigenous
By Joseph Yracheta

September 9, 2021 at 1:00PM EDT

This webinar is being offered free of charge to members and non-members

Background and Aim:
American Indians suffer from higher rates of several conditions like diabetes, chronic kidney disease, cardiovascular disease and disproportionate exposures to metals and/or other toxic environmental hazards. Indigenous people in the rest of the Americas (Latin Indigenous) and Polynesia show remarkable similarities despite not having a common ancestry. What they do share, however, is exposure to colonization and its long-lasting systemic effects. Gene-environment studies are key to creating interventions for these groups. This includes the internal environment of the cell and its myriad nucleic interactions in the cytosol, mitochondria, nucleus and virome.

Conclusions:
Few studies or institutions have explained the impact of multifactorial research or unique Amerindigenous Dynamic Architecture and Omic substructure to community decision makers. Nor have they tried to broker, in any meaningful way, the disconnect between funders and implementers.
Systemically biased socio-economic realities that negatively impact Indigenous communities are likely to be breached only by scientists, lawyers, ethicists and public relations experts from Indigenous communities. Successful research can only be achieved by creating a trustworthy system, not by creating trusting participants. Increasing the numbers of trained professionals in and around the research endeavor is the only way to account for and respond to the historic mistrust of Indigenous communities, where internal dialogue and explication of human and environmental interactions can lead to robust and transformative research.

Keywords:
American Indian, Omics, Environmental Exposure, Exposome, Amerindigenous, Community Engagement, ELSI, Informed Consent, Systemic Racism


Open Sourcing Ourselves - Together
By Mad Price Ball

September 14, 2021

"Open source" refers to the practice of making software freely available, re-usable, and adaptable. We might also ask: how can we apply "open source" to understanding ourselves as humans -- our genomes, health, or behavior? While navigating concerns about privacy and consent, the principles of "open" should also prompt us to consider what we can do to enable others. How can we make it more "open" for people to research themselves? Open source communities have come to understand that it takes more than just sharing code: it requires building a community. These same principles also apply to individual and collective research about our health. Drawing on my work with the Personal Genome Project and Open Humans, I share insights and lessons I've learned in efforts to collect, share, and analyze our personal data to better understand ourselves.



Resolving and avoiding design conflicts in ontology development and deployment
By Maria Keet

September 21, 2021 at 11:00AM EDT

Ontology development avails of science, engineering, and philosophy to represent the subject domain knowledge formally so that it can be used to enhance information systems. This process involves resolving ontological differences and making choices between conflicting axioms, which are due to various reasons. Examples include different foundational ontologies, alternate design patterns for the prospective ontology’s use case, and an ontology language’s expressivity limitations.
Instead of ad hoc decision-making, science and engineering-based modelling guidance with methods and tools can alleviate these issues to assist with the meaning negotiation and conflict resolution in a systematic way. In this talk, I will discuss common conflicts and typical steps toward resolution, including the tool availability for it. A similar situation with trade-offs exists when deploying ontologies for ontology-based data access and integration, which we shall touch upon as well. Use cases, tools, and experiments were in several subject domains, such as avian influenza, horizontal gene transfer, and metabolic pathways.



Alternative approach for discovering relationship between bacteriophages and antimicrobial resistance
By Roumyana Yordanova

October 5, 2021

The recent focus on the relationship between bacteriophages and antimicrobial resistance, in the context of contemporary microbiology related to medicine and pharmaceutics, is driven by the potential contribution of phages to the growing importance of antimicrobial resistance. A number of research studies confirm [1] or question [2] the role of bacteriophages in the dissemination of antimicrobial resistance genes.
A major objective of the CAMDA challenge is to acquire more knowledge about the relationship between viruses, their hosts and antimicrobial resistance genes, in order to determine whether antimicrobial resistance can indeed spread through phages. This study is focused on discovering relationships and possible dependencies between bacteriophages and antimicrobial resistance based on data collected from different city environments all over the world. The approach used in our analyses consists of several different methods which assess the differential abundance of phages, their diversity across samples, their impact on antimicrobial resistance categories and their associations with antimicrobial resistance genes (ARGs). The relationship between phages, their hosts and antimicrobial resistance is also explored by a Bayesian spatial model.


Injecting Life into Visualizations for Biomedical Research
By Marc Streit

October 12, 2021 at 11:00AM EDT

Biology has become increasingly data-driven. Visualization is now an important part of the data science toolbox. Many researchers, however, still think of visualization primarily as a means to communicate insights rather than a fundamental building block of the discovery process.

One effective way to make sense of large and heterogeneous biological data is to combine the strengths of visualization with the power of analytical reasoning, automated analysis, and modern AI capabilities. This powerful combination can lead to discoveries that neither a computer nor a human could make alone.

I will start this talk by giving examples of interactive web-based visualization tools that were designed for the purpose of drug discovery and cancer research. In the second part of the talk, I will show how low-dimensional embeddings of high-dimensional data can be used for understanding and explaining complex models and processes.



SARS-CoV-2 structural coverage map reveals viral protein assembly, mimicry, and hijacking mechanisms
By Seán O’Donoghue, Andrea Schafferhans, and Neblina Sikta

October 20, 2021 at 10:00AM EDT

We will discuss our recent modelling study of the 3D structures of all SARS-CoV-2 proteins. Using HMMs, we generated 2,060 models that span 69% of the viral proteome (https://doi.org/10.15252/msb.202010079). These models revealed viral mimicry and hijacking mechanisms that reverse post-translational modifications, block host translation, and disable host defenses. The models also revealed new insights into viral replication.

To make these models accessible, we devised a structural coverage map, a concise visual summary of what is known — and not known — about viral protein structures. We used the map to create the Aquaria-COVID resource (https://aquaria.ws/covid), designed to help researchers use the 79 structural states identified in our work to understand COVID-19 mechanisms, and to draw attention to the 31% of the viral proteome that remains structurally unknown or ‘dark’.

We will also discuss a new resource we developed to help combat emerging viral strains by streamlining the use of protein structures in variant analysis (https://doi.org/10.1101/2021.09.10.459756). All structural data on a variant can be accessed via simple URLs: for example, https://aquaria.app/SARS-CoV-2/S?L452R specifies the L452R variant in 'S', i.e., the 'spike' protein of SARS-CoV-2.
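
As an illustration of the URL scheme described above, the following sketch builds a variant link from its parts. It generalises from the single example given (`https://aquaria.app/SARS-CoV-2/S?L452R`); the helper name `aquaria_variant_url`, the assumption that other proteins and variants follow the same pattern, and the validation rule for the variant string are all hypothetical and not taken from Aquaria's documentation.

```python
import re

AQUARIA_BASE = "https://aquaria.app"

# Assumed format for a single amino-acid substitution, e.g. "L452R":
# reference residue, position, variant residue.
_SUBSTITUTION = re.compile(r"^[A-Z][0-9]+[A-Z]$")


def aquaria_variant_url(organism, protein, variant):
    """Build a variant URL following the pattern of the example in the text.

    Illustrative helper, not part of any Aquaria client library.
    """
    if not _SUBSTITUTION.match(variant):
        raise ValueError(f"unexpected variant format: {variant!r}")
    return f"{AQUARIA_BASE}/{organism}/{protein}?{variant}"


if __name__ == "__main__":
    # Reproduces the example from the abstract.
    print(aquaria_variant_url("SARS-CoV-2", "S", "L452R"))
```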



Multi-Omic Data and Clinical Risk Factor Integration to Build Interpretable Predictive Models for Type 1 Diabetes
By Bobbie-Jo Webb-Robertson

October 20, 2021 at 11:00AM EDT

Type 1 diabetes (T1D) is a chronic autoimmune disease that results from autoimmune destruction of insulin-producing pancreatic beta-cells. T1D progresses through stages and clinical diabetes is generally preceded by the presentation of diabetes-related autoantibodies (IA), but no symptoms. As the cause of the disease remains elusive, multiple diabetes cohorts, such as the Diabetes Autoimmunity Study in the Young (DAISY; http://www.daisycolorado.org/) and The Environmental Determinants of Diabetes in the Young (TEDDY; https://teddy.epi.usf.edu/), have been established to collect information longitudinally to gain insights into the biological mechanisms driving changes in the progression of the disease from a pre-symptomatic IA to symptomatic T1D state. These prospective cohort studies have reported potential demographic, immune, genetic, metabolomic, and proteomic markers statistically associated with IA or the progression from IA to T1D. However, these markers alone are not highly predictive of T1D outcomes at an individual level. This presentation will describe an approach for integration and feature selection of these various risk factors and multi-omics measurements via machine learning, enabling a better understanding of the biological mechanisms driving IA and/or T1D and identifying clinically relevant biomarkers to predict patient-level progression to these disease endpoints.



Metabolic modelling of microbial interactions in microbiomes
By Aarthi Ravikrishnan, Karthik Raman and Dinesh Kumar

October 22, 2021 at 9:00AM UTC

About the Tutorial

Recent years have seen the emergence of microbiomes as important axes of human health and disease. Microbial communities abound in various regions of the human body, notably the gut, skin and the oral cavity. Microbial communities are increasingly being used for industrial fermentations and wastewater treatment. Many algorithms and tools have been developed to study microbial communities, particularly the metabolic interactions that drive and sustain these microbial communities. In this tutorial, we seek to provide a brief overview of the key modelling paradigms that have been used to study microbiomes, particularly focussing on two broad classes of methods: (a) constraint-based modelling, which attempts to model microbial metabolic networks in terms of the fluxes of various constituent reactions, and (b) graph-based modelling, which models microbial interactions as part of a complex graph capturing the exchange of several metabolites between the constituent organisms, and consequently sheds light on the nature of the interactions between the organisms. In this tutorial, we will introduce the participants to the fundamental concepts of metabolic modelling with a special emphasis on microbial communities. Following this, we will delve deeper into different types of techniques to understand interactions in a microbial community. Lastly, we will showcase some representative tools and methods, which will enable the participants to apply the theories to real-life examples and understand the nature of interactions between different kinds of organisms in community settings.

At the end of the tutorial, the participants will have an understanding of:
- Broad applications of metabolic modelling to model microbiomes
- Databases and resources for microbiome modelling
- Key constraint-based methods that can be used to understand microbiomes
- Key graph-based methods that can aid in understanding metabolic exchanges
- Tools for microbiome modelling such as COBRA toolbox (specific algorithms) or MetQuest.
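
To make the graph-based idea concrete, here is a minimal sketch of a reachability ("network expansion") calculation on a toy two-organism community. It is not MetQuest or any COBRA toolbox algorithm, and the reactions and metabolite names are invented for illustration. The function `producible_metabolites` is a hypothetical helper: starting from seed metabolites, it repeatedly fires every reaction whose substrates are all available.

```python
def producible_metabolites(reactions, seeds):
    """Return every metabolite reachable from `seeds`.

    `reactions` is an iterable of (substrates, products) pairs of sets.
    A reaction can fire once all of its substrates are available.
    """
    scope = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= scope and not products <= scope:
                scope |= products
                changed = True
    return scope


# Toy community (invented for illustration, not curated metabolic models).
organism_a = [
    ({"glucose"}, {"pyruvate"}),
    ({"pyruvate"}, {"acetate"}),
]
organism_b = [
    ({"acetate"}, {"acetyl_coa"}),
    ({"acetyl_coa", "glucose"}, {"butyrate"}),
]
medium = {"glucose"}

alone_a = producible_metabolites(organism_a, medium)
alone_b = producible_metabolites(organism_b, medium)
together = producible_metabolites(organism_a + organism_b, medium)

# Metabolites that only become reachable when both organisms are present.
community_gain = together - (alone_a | alone_b)

if __name__ == "__main__":
    print(sorted(community_gain))
```

In this toy case organism B can make nothing from the medium on its own, so everything in `community_gain` depends on acetate supplied by organism A. This is the kind of metabolic dependency that graph-based methods aim to surface.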

Training Materials

https://github.com/RamanLab/ISCB-Academy-Tutorial-Community-Modelling

Target Audience

Familiarity with Python is necessary. Python and Matlab must be installed prior to the tutorial along with select packages and toolboxes. Instructions will be available in the GitHub repository by 1st October, 2021.

About the Hosts

Karthik Raman is an Associate Professor at the Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, IIT Madras. Karthik’s research group works on the development of algorithms and computational tools to understand, predict and manipulate complex biological networks. Broadly spanning computational aspects of synthetic and systems biology, key areas of research in his group encompass microbiome analysis, in silico metabolic engineering, biological network design and biological data analysis. Karthik also co-ordinates the Centre for Integrative Biology and Systems Medicine and is a core member of the Robert Bosch Centre for Data Science and Artificial Intelligence (RBC-DSAI). Karthik teaches courses on computational biology and systems biology at IIT Madras, and has also authored a textbook on Computational Systems Biology.

Aarthi Ravikrishnan is a postdoctoral fellow and a team lead at Genome Institute of Singapore. Her research interests are predominantly in microbiome analytics and developing computational methods to understand the role of microbiome in health. Her team focusses on understanding skin and gut microbiome through data generation and analyses. She, along with Karthik Raman, has co-authored a book on Systems-level modelling of microbial communities.



Indigenous Pharmacogenomics and Implications for Personalized Medicine
by Katrina G. Claw

October 22, 2021 at 1:00PM EDT

The integration of genomic technology into health care settings has the potential to transform healthcare through increased personalization of medical decisions. In particular, pharmacogenomics research on drug disposition and response can tailor and improve medication regimens for all patients by informing tests of function-altering variation in drug metabolism and transport genes. Unfortunately, Indigenous peoples remain underrepresented in pharmacogenomics research. Effective strategies to create research partnerships between tribal communities and genomic researchers are often lacking, yet such partnerships are needed for trustworthy research. We review what is currently known about pharmacogenetic variation in Indigenous communities and highlight work related to nicotine metabolism and tobacco cessation as an example of successful collaborations in pharmacogenetic research relating genotype-phenotype associations. We discuss the challenges and opportunities related to the implementation of personalized drug therapy in the community using ethical engagement and collaborative approaches.


Practicals in next-generation sequencing - Programming course in a generalist school can truly be fun, even in lockdown
By Marie Sémon

October 26, 2021 at 11:00AM EDT

Next-generation sequencing (NGS) has become a staple tool in biology over the past decades, which makes it necessary to teach students how to analyse such data. However, this is challenging, particularly for students unfamiliar with code and command-line tools. The Master of Biology programme at ENS de Lyon has set up a practical course in which we teach students to build a reproducible NGS data analysis pipeline that generates near-publication results from raw sequencing data. The students have to deposit their work in a git repository. We engage students by allowing them to choose from a broad range of projects, and strengthen group and student interaction through flipped classrooms. Despite the need to host the course remotely due to the pandemic, we achieved great success (as evidenced by student feedback) through the use of virtual machines for computing, chat applications for communicating and screen-sharing, and much involvement on the part of both students and teachers.


Scalable Inference of Phylogenetic Networks
By Claudia Solis-Lemus

November 2, 2021

Phylogenetic network inference plays an important role in the reconstruction of the tree of life, given the widespread gene flow among different organisms. However, there are many challenges in the inference of reticulate evolution, such as network reconstruction and interpretation, and the difficulty of summarizing network uncertainty. In this talk, I will explain the current difficulties in network statistical inference and present a new scalable method based on pseudolikelihood theory. I will also present extensions of standard trait evolution tools to networks, such as phylogenetic regression or ANOVA, ancestral trait reconstruction, and Pagel's lambda test of phylogenetic signal. All the new tools are implemented in the open-source Julia package PhyloNetworks.



Inferring functions of the essential genes for life
By Mark Wass

November 9, 2021 at 11:00AM EST

Identifying the smallest genome able to support life has been a long-term quest in synthetic biology. This has seen ongoing progress, and a few years ago a bacterial genome based on Mycoplasma mycoides, containing only 438 protein-coding genes, was engineered. Strikingly, the function of more than a third (149) of these proteins was unknown, demonstrating our limited knowledge and understanding of the functions essential for life. In this talk I will present our recent work using an array of bioinformatics approaches to characterise these proteins and infer their functions. I will discuss the insights we gained into the essential functions for life and also reflect on what our findings show for the area of protein function prediction.



Robust single-cell discovery of RNA targets of RNA-binding proteins and ribosomes
By Kristopher Brannan

November 23, 2021 at 11:00 AM EST

RNA-binding proteins (RBPs) are critical regulators of gene expression and RNA processing that are required for gene function. Yet the dynamics of RBP regulation in single cells is unknown. To address this gap in understanding, we developed STAMP (Surveying Targets by APOBEC-Mediated Profiling), which efficiently detects RBP–RNA interactions. STAMP does not rely on ultraviolet cross-linking or immunoprecipitation and, when coupled with single-cell capture, can identify RBP-specific and cell-type–specific RNA–protein interactions for multiple RBPs and cell types in single, pooled experiments. Pairing STAMP with long-read sequencing yields RBP target sites in an isoform-specific manner. Finally, Ribo-STAMP leverages small ribosomal subunits to measure transcriptome-wide ribosome association in single cells. STAMP enables the study of RBP–RNA interactomes and translational landscapes with unprecedented cellular resolution.



Elements of Style in Reproducible Workflow Creation and Maintenance: A Hands-on Tutorial
By Anne Deslattes Mays and Christina Chatzipantsiou

November 26, 2021 at 11:00 AM UTC

In this short 3-hour course, we will introduce the learner to certain elements of style in the construction and containerization of small single-function processes that facilitate reproducible workflow creation and execution. We will show how these processes may be kept up-to-date and how the creator can be alerted to their functional state (working or failing) by using a GitHub feature called GitHub Actions.

This hands-on course will use a small example to provide the structure, philosophy and approach to achieving this desirable outcome. This course seeks to demystify and make accessible powerful methods one can use to achieve platform independence and platform interoperability. Using a simple RNASeq pre-baked analysis example to demonstrate these techniques, we will break down and walk the learner through each of the construction steps. The learners will be introduced to Conda, Docker, GitHub and the standard workflow language, Nextflow. If time permits, we will also show how these containerized processes can be represented in a second standard workflow language implementation (e.g. Common Workflow Language or WDL).

By the end of the course, the learner will understand these Elements of Style and will know how Conda, Docker, GitHub, Zenodo, and Nextflow enable reproducible research. Moreover, these steps will be on GitHub for the learner to return to and reproduce after the end of the course. In taking this course, the learner will also be shown the power of JupyterLab notebooks to facilitate literate programming. Through their participation in the class, learners will learn and understand FAIR (findability, accessibility, interoperability and reusability) best practices. We ask all participants to create GitHub, Zenodo and ORCID accounts prior to the course. We ask for only minimal background knowledge of the command line and simple shell commands; the repository provides material for a bit of self-learning to help acquire this knowledge.

GitHub: https://github.com/ISCB-Academy/Elements-of-Style-Reproducible-Workflow-Creation-Maintenance-Tutorial

Capacity: 20



Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads
By Kishwar Shafin

November 29, 2021 at 2:00 PM EST

Long-read sequencing has the potential to transform variant detection by reaching difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data has demonstrated a long read length, but current interpretation methods for its novel pore-based signal have unique error profiles, making accurate analysis challenging. In this talk, I will introduce a haplotype-aware variant calling pipeline PEPPER-Margin-DeepVariant that produces state-of-the-art variant calling results with nanopore data. The nanopore-based method outperforms the short-read-based single nucleotide variant identification method at the whole genome-scale and produces high-quality single nucleotide variants in segmental duplications and low-mappability regions where short-read-based genotyping fails.



Longitudinal genome resolved metagenomics
by Christopher Quince

November 30, 2021 at 11:00 AM EST

The extraction of prokaryotic genomes directly from metagenome assemblies has uncovered a wealth of novel microbial diversity, both in the environment and in host-associated microbiomes. It is a particularly powerful technique when coupled with longitudinal sampling of the same community since it can then also be used to understand changes in community structure. I will give an overview of bioinformatics methods for resolving genomes directly from multiple metagenomic samples. I will briefly explain short-read assembly methods and binning, followed by the evaluation of bins as metagenome-assembled genomes (MAGs). These principles will be illustrated using a large-scale binning of MAGs from anaerobic digestion reactors. I will then introduce our pipeline, STRONG (STrain Resolution ON assembly Graphs: https://github.com/chrisquince/STRONG), for resolving sub-populations within MAGs. I will compare it with alternative methods that obtain strains from metagenomes de novo and apply it to a study of human fecal microbiome transplants.



DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of enhancers
by Bernardo Almeida

December 9, 2021 at 11:00 AM EST

Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood and de novo enhancer design is considered impossible. Here we built a deep learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells. The model learned relevant TF motifs and higher-order syntax rules, including functionally non-equivalent instances of the same TF motif that are determined by motif-flanking sequence and inter-motif distances. We validated these rules experimentally and demonstrated their conservation in humans by testing more than 40,000 wildtype and mutant Drosophila and human enhancers. Finally, we designed and functionally validated synthetic enhancers with desired activities de novo.



A Multi-Objective Genetic Algorithm to Find Active Modules in Multiplex Biological Networks
by Elva Maria Novoa del Toro

December 14, 2021 at 11:00 AM EST

One of the most challenging tasks in computational biology is the integration of complementary biological data produced from different experimental sources. Our goal here is to combine expression data and biological networks to identify “active modules”, i.e. subnetworks of interacting genes/proteins associated with expression changes in different biological contexts. We developed MOGAMUN, a multi-objective genetic algorithm that finds dense and overall deregulated subnetworks in a multiplex network. We compared the performance of MOGAMUN with different state-of-the-art methods for active module identification. We also applied MOGAMUN to identify active modules for a rare monogenic disease, Facioscapulohumeral muscular dystrophy (FSHD). MOGAMUN is available as a Bioconductor package.
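To make the multi-objective idea concrete, the following is a minimal Python sketch of Pareto dominance over two objectives: subnetwork density and a mean deregulation score. It is not MOGAMUN's code; the candidate labels, the scores and the function names `dominates` and `pareto_front` are hypothetical and only illustrate how candidates are retained when neither objective is allowed to override the other.

```python
from typing import List, Tuple

# (label, density, mean deregulation score); all values are made up for illustration
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    at_least_as_good = a[1] >= b[1] and a[2] >= b[2]
    strictly_better = a[1] > b[1] or a[2] > b[2]
    return at_least_as_good and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


candidates = [
    ("A", 0.80, 0.30),
    ("B", 0.60, 0.55),
    ("C", 0.40, 0.70),
    ("D", 0.50, 0.50),  # worse than B on both objectives
    ("E", 0.35, 0.60),  # worse than C on both objectives
]

front = pareto_front(candidates)
print([label for label, _, _ in front])
```

A genetic algorithm built on this idea would typically rank each generation by such fronts before selecting parents, so that dense but weakly deregulated subnetworks and sparse but strongly deregulated ones can both survive.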


COSI ISCBacademy Scientific Program Coordinator



ISCB seeks applications from students and postdoctoral fellows wishing to serve in the volunteer role of COSI ISCBacademy Scientific Program Coordinator. This role will support the ISCB Communities of Special Interest (COSI) by managing the scientific program and schedule for the individual COSI’s ISCBacademy webinar series.

ISCB started ISCBacademy (https://www.iscb.org/iscbacademy-webinars) in 2020 with the goal of connecting researchers outside of the traditional face-to-face conferences and events. The program showed early success in providing an alternative platform to present research to the broader scientific community while eliminating the need to travel. ISCB hopes to expand the program moving forward, offering more webinars monthly.

The COSI ISCBacademy Scientific Program Coordinator will work closely with COSI senior leadership. The duties include researching, organizing and recommending scientific papers for presentation, as well as inviting, coordinating, and moderating the presentation of the selected talks.

The goal of the program is to have at least five talks presented each year. Anyone seeking the role can expect to spend approximately six hours monthly supporting the COSI and managing the program. ISCB staff assist by providing marketing, webpage posting, platform set-up and technical management of the webinar.

Others who have served in this role have found it very rewarding. Because the coordinator works closely with senior leadership, they are able to expand their networks as well as learn important management and communication skills.

Interested individuals should submit using the online application form. The form will ask for:

  • Motivation Statement (up to 250 words)
  • Past experiences managing projects
  • Short CV

Learn more about the ISCB COSIs
Learn more about the ISCBacademy webinar program

ISCBacademy 2020 Archived Webinars



To view previous webinars use the links below

2021 Webinars | 2022 Webinars


Please use the links below to view 2020 webinars:


Revealing Principles of Subcellular RNA Localization by APEX-Seq
by Furqan Fazal

March 24, 2020

The human body is composed of trillions of cells, which are the building blocks of life. Each cell is highly organized and contains RNAs that code for proteins and serve regulatory roles. The location of an RNA species within a cell can dictate its folding [1], editing, splicing, translation, degradation, binding partners, catalytic activity, and even the fate of the protein that it encodes. However, characterizing the RNA contents of cellular compartments that cannot be biochemically isolated is challenging. Here we introduce APEX-seq [2], a method for RNA sequencing based on the direct proximity labeling of RNA using the peroxidase enzyme APEX2. APEX-seq in nine distinct subcellular locales produced a nanometer-resolution spatial map of the human transcriptome, revealing extensive patterns of localization for diverse RNA classes and transcript isoforms. We uncovered a radial organization of the nuclear transcriptome, which is gated at the inner surface of the nuclear pore for cytoplasmic export of processed transcripts. We identified two distinct pathways of messenger RNA localization to mitochondria, each associated with specific sets of transcripts for building complementary macromolecular machines within the organelle. APEX-seq should be widely applicable to many systems and model organisms, enabling comprehensive investigations of the dynamic spatial transcriptome.

  1. Sun L*, Fazal FM*, Li P*, Broughton JP, Lee B, Tang L, Huang W, Kool ET, Chang HY, Zhang QC. RNA structure maps across mammalian cellular compartments. Nature Structural and Molecular Biology (NSMB), 26, 322-330 (2019)
  2. Fazal FM*, Han S*, Parker KR, Kaewsapsak P, Xu J, Boettiger AN, Chang HY, Ting AY. Atlas of subcellular RNA localization revealed by APEX-seq. Cell, 178, 473–490 (2019)



Dynamic determinants of co-transcriptional gene regulation
by Ana Fiszbein

April 21, 2020

The architecture of mammalian genes enables the production of multiple transcripts that greatly expand the coding capacity of our genomes. Understanding how these transcripts are regulated is of particular importance in cancer genomics, as their aberrant regulation contributes to the ~10 million cancer-related deaths each year. We recently described a phenomenon called exon-mediated activation of transcription starts (EMATS) in which the splicing of internal exons impacts the spectrum of promoters used and expression level of the host gene. We showed that targeted-inhibition of splicing reduces the usage of promoters and suppresses gene expression, while evolutionary creation of a new splice site can activate cryptic promoters. My findings support a model in which splicing factors recruit transcription machinery to influence promoter choice and regulate the expression of thousands of mammalian genes.



DNCON2: improved protein contact prediction using two-level deep convolutional neural networks

by Jianlin Cheng

April 22, 2020

Significant improvements in the prediction of protein residue-residue contacts have been observed in recent years. These contacts, predicted using a variety of coevolution-based and machine learning methods, are the key contributors to the recent progress in ab initio protein structure prediction, as demonstrated in the recent CASP experiments. Continuing the development of new methods to reliably predict contact maps is essential to further improve ab initio structure prediction.

In this paper we discuss DNCON2, an improved protein contact map predictor based on two-level deep convolutional neural networks. It consists of six convolutional neural networks: the first five predict contacts at 6, 7.5, 8, 8.5 and 10 Å distance thresholds, and the last one uses these five predictions as additional features to predict final contact maps. On the free-modeling datasets in the CASP10, 11 and 12 experiments, DNCON2 achieves mean precisions of 35, 50 and 53.4%, respectively, higher than 30.6% by MetaPSICOV on the CASP10 dataset, 34% by MetaPSICOV on the CASP11 dataset and 46.3% by Raptor-X on the CASP12 dataset, when the top L/5 long-range contacts are evaluated. We attribute the improved performance of DNCON2 to the inclusion of short- and medium-range contacts in training, the two-level approach to prediction, the use of state-of-the-art optimization and activation functions, and a novel deep learning architecture that allows each filter in a convolutional layer to access all the input features of a protein of arbitrary length.
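The "top L/5 long-range" precision quoted above can be illustrated with a short Python sketch. It assumes the convention commonly used in CASP-style contact evaluation, where a long-range pair has a sequence separation of at least 24 residues and L is the sequence length; the abstract does not state these details, so treat them, the function name and the toy data as assumptions rather than the authors' evaluation code.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]  # residue indices (i, j) with i < j


def top_l5_long_range_precision(
    predicted: Dict[Pair, float],
    true_contacts: Set[Pair],
    seq_len: int,
    min_separation: int = 24,  # assumed long-range cutoff
) -> float:
    """Precision of the L/5 highest-scoring long-range predicted contacts."""
    # Keep only pairs that are far apart in sequence.
    long_range = [
        (score, pair)
        for pair, score in predicted.items()
        if pair[1] - pair[0] >= min_separation
    ]
    # Rank by predicted contact probability, highest first.
    long_range.sort(key=lambda item: item[0], reverse=True)
    top = long_range[: max(1, seq_len // 5)]
    if not top:
        return 0.0
    hits = sum(1 for _, pair in top if pair in true_contacts)
    # Divide by the number of pairs actually evaluated.
    return hits / len(top)


# Toy protein of length 30, so the top 30 // 5 = 6 long-range pairs are scored.
predicted = {
    (10, 12): 0.99,  # short-range, ignored
    (0, 24): 0.9,
    (1, 26): 0.8,
    (2, 28): 0.7,
    (0, 29): 0.6,
    (3, 27): 0.5,
    (4, 28): 0.4,
    (5, 29): 0.3,  # seventh long-range pair, falls outside the top 6
}
true_contacts = {(0, 24), (2, 28), (3, 27), (5, 29), (10, 12)}

precision = top_l5_long_range_precision(predicted, true_contacts, seq_len=30)
print(precision)
```

Restricting the score to a small number of confident, long-range pairs reflects the fact that these contacts are the most informative for folding and the hardest to predict.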

The source code of DNCON2 is available at https://github.com/multicom-toolbox/DNCON2/



A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
By Nevan Krogan

May 19, 2020

Efforts to develop antiviral drugs against COVID-19 or vaccines for its prevention have been hampered by limited knowledge of the molecular details of SARS-CoV-2 infection. This webinar will describe our efforts to address this challenge by expressing 26 of the 29 SARS-CoV-2 proteins in human cells and identifying the human proteins physically associated with each using affinity-purification mass spectrometry. Among 332 high-confidence SARS-CoV-2-human protein-protein interactions, we identified 66 druggable human proteins or host factors targeted by 69 compounds (29 FDA-approved drugs, 12 drugs in clinical trials, and 28 preclinical compounds). Within a subset of these, multiple viral assays identified two sets of pharmacological agents that displayed antiviral activity.



Deep Neural Networks for Interpreting RNA-binding Protein Target Preferences
by Mahsa Ghanbari

May 20, 2020 at 11:00AM EDT

Deep learning has become a powerful paradigm to analyze the binding sites of regulatory factors including RNA-binding proteins (RBPs), owing to its ability to learn complex features from possibly multiple sources of raw data. However, the interpretability of these models, which is crucial to improve our understanding of RBP binding preferences and functions, has not yet been investigated in significant detail. We have designed a multitask and multimodal deep neural network for characterizing in vivo RBP targets. The model incorporates not only the sequence but also the region type of the binding sites as input, which boosts its prediction performance. To interpret the model, we quantified the contribution of the input features to the predictive score of each RBP. Learning across multiple RBPs at once, we are able to avoid experimental biases and to identify the RNA sequence motifs and transcript context patterns that are the most important for the predictions of each individual RBP. Our findings are consistent with known motifs and binding behaviors and can provide new insights about the regulatory functions of RBPs.



Divergence in DNA Specificity among Paralogous Transcription Factors Contributes to Their Differential In Vivo Binding
by Raluca Gordan and Ning Shen

May 26, 2020 at 11:00AM EDT

Paralogous transcription factors (TFs) are oftentimes reported to have identical DNA-binding motifs, despite the fact that they perform distinct regulatory functions. Differential genomic targeting by paralogous TFs is generally assumed to be due to interactions with protein co-factors or the chromatin environment. Using a computational-experimental framework called iMADS (integrative modeling and analysis of differential specificity), we show that, contrary to previous assumptions, paralogous TFs bind differently to genomic target sites even in vitro. We used iMADS to quantify, model, and analyze specificity differences between 11 TFs from 4 protein families. We found that paralogous TFs have diverged mainly at medium- and low-affinity sites, which are poorly captured by current motif models. We identify sequence and shape features differentially preferred by paralogous TFs, and we show that the intrinsic differences in specificity among paralogous TFs contribute to their differential in vivo binding. Thus, our study represents a step forward in deciphering the molecular mechanisms of differential specificity in TF families.



Fast Coalescent-Based Computation of Local Branch Support from Quartet Frequencies
by Erfan Sayyari

June 12, 2020

Species tree reconstruction is complicated by effects of incomplete lineage sorting, commonly modeled by the multi-species coalescent model (MSC). While there has been substantial progress in developing methods that estimate a species tree given a collection of gene trees, less attention has been paid to fast and accurate methods of quantifying support. In this article, we propose a fast algorithm to compute quartet-based support for each branch of a given species tree with regard to a given set of gene trees. We then show how the quartet support can be used in the context of the MSC to compute (1) the local posterior probability (PP) that the branch is in the species tree and (2) the length of the branch in coalescent units. We evaluate the precision and recall of the local PP on a wide set of simulated and biological datasets, and show that it has very high precision and improved recall compared with multi-locus bootstrapping. The estimated branch lengths are highly accurate when gene tree estimation error is low, but are underestimated when gene tree estimation error increases. Computation of both the branch length and local PP is implemented as new features in ASTRAL.
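The link between quartet frequencies and branch length in coalescent units can be sketched with a short Python example. Under the multi-species coalescent, a gene-tree quartet around an internal branch of length d matches the species-tree topology with probability 1 - (2/3)·exp(-d), so inverting that relation gives a maximum-likelihood estimate of d from the observed frequency. This is a simplified illustration under that standard result; the estimator implemented in ASTRAL may differ (for example by incorporating a prior), and the function name is hypothetical.

```python
import math


def ml_branch_length(matching_quartets: float, total_quartets: float) -> float:
    """ML internal branch length (coalescent units) from quartet support.

    matching_quartets: number of gene-tree quartets agreeing with the species tree
    total_quartets:    number of informative gene-tree quartets for this branch
    """
    if total_quartets <= 0:
        raise ValueError("total_quartets must be positive")
    freq = matching_quartets / total_quartets
    if freq <= 1.0 / 3.0:
        # No more support than expected by chance: the estimate collapses to zero.
        return 0.0
    if freq >= 1.0:
        # Every gene tree agrees: the likelihood keeps increasing with length.
        return math.inf
    # Invert freq = 1 - (2/3) * exp(-d).
    return -math.log(1.5 * (1.0 - freq))


# 600 of 1000 gene-tree quartets support the branch.
print(round(ml_branch_length(600, 1000), 4))
```

The formula also shows why gene tree estimation error leads to underestimated lengths, as the abstract notes: error pushes the observed frequency toward 1/3, and the estimate shrinks toward zero.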



Engineering Alternative Polyadenylation with Deep Generative Neural Networks
by Johannes Linder

June 23, 2020

Engineering gene and protein sequences with defined functional properties is a major goal of synthetic biology. Rational design of gene enhancers, splice sites, 3’-end regulatory sequences and more has the potential to greatly accelerate the fields of nanotechnology and medical therapeutics. Deep neural network models, together with gradient ascent optimization, show promise for sequence design. The optimized sequences can, however, get stuck in local optima, have low diversity and be computationally very costly to generate at scale. In the first part of this talk, I will present our work on using gradient-based methods to design regulatory sequences of Alternative Polyadenylation (APA), a post-transcriptional mechanism where multiple polyadenylation signals (PAS) in the mRNA compete for cleavage. Given a deep neural network trained on a massively parallel reporter assay of APA variants, we forward-engineer new functional polyadenylation signals with precisely defined cleavage and isoform distributions. In the second part of this talk, I will discuss how we extend this design framework using a class of generative neural networks called deep exploration networks (DENs). By penalizing any two generated patterns based on similarity, DENs learn to jointly maximize fitness and diversity. DENs can be used to design transcription factor binding sites, splice sequences and functional proteins. In the context of APA, we used DENs to engineer PAS with more than 10-fold higher selection odds than the best gradient ascent-generated patterns.



At Home with Covid-19
By Brian Shoichet

June 26, 2020

The urgency of the coronavirus pandemic has motivated investigators worldwide to seek approved drugs or investigational new drugs as a way to rapidly advance therapeutics into clinical trials to treat the disease. I will describe a large collaboration, hosted by the UCSF Quantitative Biology Institute, to do that in a mechanistically focused way. Using AP-MS, a host-pathogen network of viral and human proteins was created, and drugs were sought targeting the human partner. From among 322 high-confidence human proteins associated with 26 viral proteins emerged 63 that were druggable. Against those, 69 drugs were tested for efficacy, and from these 10 drugs in two broad classes emerged: those targeting protein biogenesis, and those acting against the Sigma1 and Sigma2 receptors. The activities of these drugs, and the chemoinformatics infrastructure that supported their selection, will be discussed. The mechanism-based repurposing strategy will be compared to a complementary effort that targets viral proteins and seeks novel chemical matter, using structure-based ultra-large library docking.



Global surveillance of COVID-19 by mining news media using a multi-source dynamic embedded topic model
By Yue Li and David Buckeridge

June 30, 2020

As the COVID-19 pandemic continues to unfold, understanding the global impact of non-pharmacological interventions (NPI) is important for formulating effective intervention strategies, particularly as many countries prepare for future waves. We used a machine learning approach to distill latent topics related to NPI from large-scale international news media. We hypothesize that these topics are informative about the timing and nature of implemented NPI, dependent on the source of the information (e.g., local news versus official government announcements) and the target countries. Given a set of latent topics associated with NPI (e.g., self-quarantine, social distancing, online education, etc.), we assume that countries and media sources have different prior distributions over these topics, which are sampled to generate the news articles. To model the source-specific topic priors, we developed a semi-supervised, multi-source, dynamic, embedded topic model. Our model is able to simultaneously infer latent topics and learn a linear classifier to predict NPI labels using the topic mixtures as input for each news article. To learn these models, we developed an efficient end-to-end amortized variational inference algorithm. We applied our models to news data collected and labelled by the World Health Organization (WHO) and the Global Public Health Intelligence Network (GPHIN). Through comprehensive experiments, we observed superior topic quality and intervention prediction accuracy, compared to the baseline embedded topic models, which ignore information on media source and intervention labels. The inferred latent topics reveal distinct policies and media framing in different countries and media sources, and also characterize reactions to COVID-19 and NPI in a semantically meaningful manner.



Genetic Basis Of De Novo Appearance Of Carotenoid Ornamentation In Bare-Parts Of Canaries
by Malgorzata Gazda

July 7, 2020

Unlike wild and domestic canaries (Serinus canaria), or any of the three dozen species of finches in genus Serinus, the domestic urucum breed of canaries exhibits bright red bills and legs. This novel trait offers a unique opportunity to understand the mechanisms of bare-part coloration in birds. To identify the mutation producing the colorful phenotype, we resequenced the genome of urucum canaries and performed a range of analyses to search for genotype-to-phenotype associations across the genome. We identified a nonsynonymous mutation in the gene BCO2 (beta-carotene oxygenase 2, also known as BCDO2), an enzyme involved in the cleavage and breakdown of full-length carotenoids into short apocarotenoids. Protein structural models and in vitro functional assays indicate that the urucum mutation abrogates the carotenoid-cleavage activity of BCO2. Consistent with the predicted loss of carotenoid-cleavage activity, urucum canaries tended to have increased levels of full-length carotenoid pigments in bill tissue and reduced levels of carotenoid-cleavage products (apocarotenoids) in retinal tissue compared with other breeds of canaries. We hypothesize that carotenoid-based bare-part coloration might be readily gained, modified, or lost through simple switches in the enzymatic activity or regulation of BCO2 and this gene may be an important mediator in the evolution of bare-part coloration among bird species.



Pooled CRISPR screens with imaging on microRaft arrays reveals stress granule-regulatory factors
by Emily Wheeler

July 21, 2020

Genetic screens using pooled CRISPR-based approaches are scalable and inexpensive, but restricted to standard readouts including survival, proliferation and sortable markers. However, many biologically relevant cell states involve cellular and subcellular changes that are only accessible by microscopic visualization, and are currently impossible to screen with pooled methods. Here we combine pooled CRISPR/Cas9 screening with microRaft array technology and high-content imaging to screen image-based phenotypes (CRaft-ID; CRISPR-based microRaft, followed by gRNA Identification). By isolating microRafts that contain genetic clones harboring individual guide RNAs, we identify RNA binding proteins (RBPs) that influence the formation of stress granules, punctate protein-RNA assemblies that form during stress. To automate hit identification, we developed a machine-learning model trained on nuclear morphology to remove unhealthy cells or imaging artifacts. In doing so, we identified and validated previously uncharacterized RBPs that modulate stress granule abundance, highlighting the applicability of our approach to facilitate image-based pooled CRISPR screens.



Southern African Human Population Structure - an Opportunity to Expand Genomics Research Worldwide
by Caitlin Uren

July 30, 2020

Human genetic diversity in southern Africa is vast, complex and unique. Identifying and characterizing population structure in this region is not a trivial task, but when performed correctly, it allows this information to be included in numerous genomic analyses, such as studies investigating a population's demographic and genetic history and the association between this history and both Mendelian and complex diseases. I will discuss results from our population genetic and demographic studies and how these relate to various phenotypes (with a focus on tuberculosis susceptibility), and discuss various aspects of genomics that in my opinion are greatly lacking in southern Africa. I will conclude by discussing how populations worldwide will benefit from genomics research in this region.

Click here to watch



Protein Function Prediction using Graph Convolutional Networks with Language Model Features
by Vladimir Gligorijevic

August 11, 2020 at 11:00 AM EDT

With the maturing of de novo structure prediction methods and the rise of deep learning techniques, it now becomes possible to generate high-throughput structure and function predictions for many unannotated proteins.

We will first introduce deepFRI (deep functional residue identification), our recently proposed deep learning Graph Convolutional Network (GCN) for predicting protein functions by leveraging protein contact maps representing protein structures and residue-level features from a pre-trained language model. Our model learns general structure-function relationships by robustly predicting Gene Ontology (GO) terms of proteins with < 30% sequence identity to the training set. We show that our GCN architecture predicts functions more accurately than Convolutional Neural Networks trained on sequence data alone and previous competing methods. deepFRI not only improves predictions of GO terms from protein sequences and predicted 3D structures, but also brings residue-level saliency mapping. The mapping provides insight into putative functional sites allowing for biological interpretation, hypothesis generation or the design of targeted validation experiments.
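To make the idea of a graph convolution over a protein contact map concrete, here is a minimal sketch in plain Python. It is not the deepFRI implementation: the distance cutoff, the symmetric normalization, the toy coordinates, the two-dimensional "language model" features and the weight matrix are all assumptions chosen for illustration.

```python
import math

def contact_map(coords, cutoff=10.0):
    """Binary residue-residue contact map: 1.0 if two residues are closer than `cutoff`."""
    n = len(coords)
    return [[1.0 if i != j and math.dist(coords[i], coords[j]) < cutoff else 0.0
             for j in range(n)] for i in range(n)]

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, with self-loops added."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(norm_adj, features, weights):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, v) for v in row] for row in out]

def mean_pool(node_features):
    """Average residue-level features into one protein-level vector."""
    n = len(node_features)
    return [sum(col) / n for col in zip(*node_features)]

# Hypothetical inputs: four residue positions (in angstroms) and toy per-residue features
# standing in for language model embeddings.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
weights = [[0.5, -1.0, 0.2], [0.3, 0.8, -0.4]]  # illustrative, untrained

contacts = contact_map(coords)
hidden = gcn_layer(normalize_adjacency(contacts), features, weights)
pooled = mean_pool(hidden)
```

Each residue's new representation mixes its own features with those of its spatial neighbours, which is how structure enters the prediction. A real model would stack several such layers, learn the weights, and feed the pooled vector to a classifier over GO terms.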

Click here to watch



Unravelling the mystery of orphan genes to understand the origins of genetic novelty
by Nikos Vakirlis

August 24, 2020

What explains the presence of a gene only in the genome of one species and not in any other?

Species-specific protein-coding genes, also known as orphans, can arise "from scratch" from previously non-genic loci, through a process known as de novo gene emergence. How exactly the evolutionary transition from non-gene to functional gene unfolds is unclear. Can such de novo emerging genes increase an organism's fitness, and if so, how? Orphan genes can also result from extensive sequence divergence of ancestral genes, which can eventually erase all similarity of a gene to its homologues in other species, a process even less well understood than de novo emergence. I will present novel findings which advance our understanding of both of these evolutionary mechanisms and bring us a small step closer to a complete picture of the origins of genetic novelty.

Click here to watch



Encyclopedia of DNA Elements (ENCODE) Phase III
by Zhiping Weng

September 16, 2020

The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.

Click here to watch



RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference
by Alexey Kozlov and Alexandros Stamatakis

September 30, 2020

Phylogenies are important for fundamental biological research, but also have numerous applications in biotechnology, agriculture and medicine. Finding the optimal tree under the popular maximum likelihood (ML) criterion is known to be NP-hard. Thus, highly optimized and scalable codes are needed to analyze constantly growing empirical datasets.

We present RAxML-NG, a from-scratch re-implementation of the established greedy tree search algorithm of RAxML/ExaML. RAxML-NG offers improved accuracy, flexibility, speed, scalability, and usability compared with RAxML/ExaML. On taxon-rich datasets, RAxML-NG typically finds higher-scoring trees than IQ-Tree, an increasingly popular recent tool for ML-based phylogenetic inference (although IQ-Tree shows better stability). Finally, RAxML-NG introduces several new features, such as the detection of terraces in tree space and the recently introduced transfer bootstrap support metric.

The code is available under GNU GPL at https://github.com/amkozlov/raxml-ng. RAxML-NG web service (maintained by Vital-IT) is available at https://raxml-ng.vital-it.ch/.

Click here to watch



Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic
by Philippe Lemey

October 2, 2020

There are outstanding evolutionary questions on the recent emergence of human coronavirus SARS-CoV-2 including the role of reservoir species, the role of recombination and its time of divergence from animal viruses. We find that the sarbecoviruses—the viral subgenus containing SARS-CoV and SARS-CoV-2—undergo frequent recombination and exhibit spatially structured genetic diversity on a regional scale in China. SARS-CoV-2 itself is not a recombinant of any sarbecoviruses detected to date, and its receptor-binding motif, important for specificity to human ACE2 receptors, appears to be an ancestral trait shared with bat viruses and not one acquired recently via recombination. To employ phylogenetic dating methods, recombinant regions of a 68-genome sarbecovirus alignment were removed with three independent methods. Bayesian evolutionary rate and divergence date estimates were shown to be consistent for these three approaches and for two different prior specifications of evolutionary rates based on HCoV-OC43 and MERS-CoV. Divergence dates between SARS-CoV-2 and the bat sarbecovirus reservoir were estimated as 1948 (95% highest posterior density (HPD): 1879–1999), 1969 (95% HPD: 1930–2000) and 1982 (95% HPD: 1948–2009), indicating that the lineage giving rise to SARS-CoV-2 has been circulating unnoticed in bats for decades.

 



Indigenous Voices in Computational Biology: An Introduction to Ethical Genomic Research with Indigenous People
by Rene Begay

October 8, 2020

Indigenous communities throughout the world have distinct languages, cultures, political structures, and ways of knowing. For too long, these communities have been exploited for material goods, land, and, more recently, biospecimens. It is important to note that Indigenous people are not anti-science but rather support science that includes their intrinsic perspectives and expertise. Indigenous scientists are emerging across the world, bridging science, policy, technology, and Indigenous ways of knowing to determine how their communities can benefit from genomic and clinical health research. The Indigenous Voices in Computational Biology series from the ISCB Academy will highlight the work conducted by Indigenous researchers in the United States, New Zealand, and other countries. Topics will include genomic data sharing, ethical engagement with Indigenous peoples in paleogenomics, and how to responsibly conduct research on Indigenous ancestors (ancient DNA). Indigenous scientists have also developed their own Native biobank and hosted an international Indigenous genomics conference to discuss ethical concerns within their communities and present community-based genomic research that integrates Indigenous knowledge. This presentation will introduce the series' overarching themes and provide a framework that encourages ethical engagement with Indigenous communities in genomic research.

Click here to watch



Altered RNA Splicing by Mutant p53 Activates Oncogenic RAS Signaling in Pancreatic Cancer
by Luisa Escobar-Hoyos

October 15, 2020

Pancreatic ductal adenocarcinoma (PDAC) is driven by co-existing mutations in KRAS and TP53. However, how these mutations collaborate to promote this cancer is unknown. Here, we uncover sequence-specific changes in RNA splicing enforced by mutant p53 which enhance KRAS activity. Mutant p53 increases expression of splicing regulator hnRNPK to promote inclusion of cytosine-rich exons within GTPase-activating proteins (GAPs), negative regulators of RAS family members. Mutant p53-enforced GAP isoforms lose cell membrane association, leading to heightened KRAS activity. Preventing cytosine-rich exon inclusion in mutant KRAS/p53 PDACs decreases tumor growth. Moreover, mutant p53 PDACs are sensitized to inhibition of splicing via spliceosome inhibitors. These data provide insight into co-enrichment of KRAS and p53 mutations and therapeutics targeting this mechanism in PDAC.

Click here to watch



The Illusion of Inclusion — The “All of Us” Research Program and Indigenous Peoples’ DNA 
by Keolu Fox

November 12, 2020 at 11:00 AM EST

Raw data, including digital sequence information derived from human genomes, have in recent years emerged as a top global commodity. This shift is so new that experts are still evaluating what such information is worth in a global market. In 2018, the direct-to-consumer genetic-testing company 23andMe sold access to its database containing digital sequence information from approximately 5 million people to GlaxoSmithKline for $300 million. Earlier this year, 23andMe partnered with Almirall, a Spanish drug company that is using the information to develop a new antiinflammatory drug for autoimmune disorders. This move marks the first time that 23andMe has signed a deal to license a drug for development.

Eighty-eight percent of people included in large-scale studies of human genetic variation are of European ancestry, as are the majority of participants in clinical trials. Corporations such as Geisinger Health System, Regeneron Pharmaceuticals, AncestryDNA, and 23andMe have already mined genomic databases for the strongest genotype–phenotype associations. For the field to advance, a new approach is needed. There are many potential ways to improve existing databases, including “deep phenotyping,” which involves collecting precise measurements from blood panels, questionnaires, cognitive surveys, and other tests administered to research participants. But this approach is costly and physiologically and mentally burdensome for participants. Another approach is to expand existing biobanks by adding genetic information from populations whose genomes have not yet been sequenced — information that may offer opportunities for discovering globally rare but locally common population-specific variants, which could be useful for identifying new potential drug targets.

Click here to watch



“Open Access” Data and the Continued Bio-Exploitation of Indigenous Genomes
by Krystal Tsosie

December 10, 2020 at 11:00 AM EST

While the field of genomics has certainly advanced technologically in the past 20 years, what (if anything) has actually changed in how scientists engage Indigenous people?

Global Indigenous groups expressed concerns about the biocommercial exploitation of Indigenous-derived genomic data at the start of large-scale diversity projects such as the Human Genome Diversity Project, Genographic Project, and 1000 Genomes. Open accessibility of these data was meant to “democratize” the field of genomics to advance technology and bridge health inequities, but for whom? Health benefits have yet to reach those Indigenous communities from whom DNA was questionably procured, yet companies continue to build intellectual property from openly sourced Indigenous genomes.

The presentation will highlight individual versus group consent issues and the myth of de-identification of DNA for small, underrepresented groups in genomics. In addition, Indigenous genomic data sovereignty and the importance of Indigenous-led biological and data repositories (or ‘biobanks’) will be discussed as means of centering Indigenous forms of data governance.

Click here to watch


Upcoming ISCBacademy Webinars



To view previous webinars use the links below

2020 Webinars | 2021 Webinars | 2022 Webinars


ISCB, in collaboration with our Communities of Special Interest (COSIs), is pleased to announce the ISCBacademy COSI Webinar Series. Mark your calendars for Tuesdays at 11 AM Eastern Time to participate in a COSI-themed webinar.

Upcoming Webinars (check back regularly for speaker and registration details):
February 8, 2022 - VarI
February 15, 2022 - 3DSIG
February 22, 2022 - BOSC/OBF
March 1, 2022 - Bio-Ontologies
March 8, 2022 - BIOINFO-CORE
March 15, 2022 - BioVis
March 22, 2022 - CAMDA
March 29, 2022 - CompMS
April 5, 2022 - Education
April 12, 2022 - EvolCompGen
April 19, 2022 - Function
April 26, 2022 - HiTSeq
May 3, 2022 - iRNA
May 10, 2022 - MICROBIOME
May 17, 2022 - MLCSB
May 24, 2022 - NetBio
May 31, 2022 - RegSys
June 7, 2022 - SysMod
June 14, 2022 - Text Mining
June 21, 2022 - TransMed
June 28, 2022 - VarI


Join us for our upcoming ISCBacademy Webinars. Check back regularly for updates.

To propose a talk for an ISCBacademy Webinar click here.


COVID-19 Disease Map: building a computational repository of SARS-CoV-2 virus-host interaction mechanisms
by Marek Ostaszewski

February 1, 2022 at 11:00 AM EST

Disease Maps are computational and visual knowledge repositories constructed to catalogue, standardise, and model disease-related mechanisms. They help bridge the knowledge gap between biomedical experts and computational biologists, enabling contextualised data analysis and modelling of a given pathophysiology. Disease Maps are built using graphical and computational Systems Biology standards and can be used as interactive knowledge repositories, platforms for visual analytics of omics datasets, or integrated into large-scale computational workflows. With the global impact of COVID-19, we organised a community effort to develop a COVID-19 Disease Map to help researchers worldwide study the mechanisms of SARS-CoV-2 – host interactions. Our effort engaged over 250 members, contributing as domain experts, diagram curators, analysts, and modellers. This talk will discuss the challenges of community biocuration and integration of a plethora of resources, from Systems Biology diagrams, through interaction databases and text mining results, to modelling pipelines of varying granularity.

Click here to register


Critical Assessment of Computational Hit-finding Experiments (CACHE): An Initiative to Guide Future Computational Drug Design
by Matthieu Schapira

February 15, 2022 at 11:00 AM EST

Computational methods used to facilitate small molecule drug discovery are currently witnessing a revived optimism, fueled by continuous leaps in computational power, increased accessibility to commercial compounds, improved physics-based methods, and the emerging potential of generative models and newer machine learning approaches. It is fair to say that the question is not whether in silico design will transform the early phase of drug discovery, but how profoundly and how fast. But there is currently no metric to systematically evaluate and compare these approaches, and no mechanism to highlight the most promising methods and identify the fastest route to success. CACHE is a prospective hit-finding competition where compounds selected by virtual screening or invented by generative models are procured and tested experimentally. Hit rate, diversity, potency and drug-likeness are used to evaluate and compare methods. All data and method descriptions are publicly released. We expect that CACHE will define the state of the art as computational hit-finding evolves over the years, and will act as an accelerator in the field.

Click here to register



Tidy Transcriptomics for Single-cell RNA Sequencing Analyses
by Stefano Mangiola and Maria Doyle

February 18, 2022 at 4:00 PM CET

Description:

This tutorial will present how to perform analysis of single-cell RNA sequencing data following the tidy data paradigm. The tidy data paradigm provides a standard way to organise data values within a dataset, where each variable is a column, each observation is a row, and data is manipulated using an easy-to-understand vocabulary. Most importantly, the data structure remains consistent across manipulation and analysis functions. This can be achieved with the integration of packages present in the R CRAN and Bioconductor ecosystem, including tidyseurat, tidySingleCellExperiment, and tidyverse. These packages are part of the tidytranscriptomics suite that introduces a tidy approach to RNA sequencing data representation and analysis.
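The tutorial itself uses R packages, but the tidy data idea is language-independent. As a rough illustration only, the Python sketch below reshapes a small gene-by-cell count matrix into tidy rows (one observation per row, one variable per column) and applies a filter and a grouped summary to the same structure. The gene names, cell labels, counts and helper functions are all hypothetical and are not part of tidyseurat, tidySingleCellExperiment or tidyverse.

```python
from collections import defaultdict

# Hypothetical wide count matrix: one entry per gene, one nested key per cell.
wide_counts = {
    "CD3E": {"cell_1": 5, "cell_2": 0, "cell_3": 2},
    "MS4A1": {"cell_1": 0, "cell_2": 7, "cell_3": 1},
}
# Hypothetical per-cell annotation.
cell_annotation = {"cell_1": "T cell", "cell_2": "B cell", "cell_3": "T cell"}

def to_tidy(wide, annotation):
    """Reshape to tidy rows: each row is one (cell, gene) observation."""
    rows = []
    for gene, per_cell in wide.items():
        for cell, count in per_cell.items():
            rows.append({"cell": cell, "gene": gene, "count": count,
                         "cell_type": annotation[cell]})
    return rows

def filter_rows(rows, predicate):
    """Keep rows for which the predicate holds; the row layout is unchanged."""
    return [row for row in rows if predicate(row)]

def summarise_mean(rows, by, value):
    """Group rows by the `by` columns and report the mean of `value` per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in by)].append(row[value])
    return [dict(zip(by, key), **{f"mean_{value}": sum(vals) / len(vals)})
            for key, vals in groups.items()]

tidy = to_tidy(wide_counts, cell_annotation)
expressed = filter_rows(tidy, lambda row: row["count"] > 0)
summary = summarise_mean(tidy, by=("gene", "cell_type"), value="count")
```

Because every step takes and returns the same row-per-observation layout, the operations can be chained in any order, which is the consistency the paragraph above refers to.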

Instructors:

Dr. Stefano Mangiola is a postdoctoral researcher in the laboratory of Prof. Tony Papenfuss. His background spans biotechnology, bioinformatics and biostatistics. His research focuses on the prostate and breast tumour microenvironment, the development of statistical models for the analysis of RNA sequencing data, and data analysis and visualisation interfaces.

Dr. Maria Doyle is the Application and Training Specialist for Research Computing at the Peter MacCallum Cancer Centre in Melbourne, Australia. She has a PhD in Molecular Biology and currently works in bioinformatics and data science education and training. She is passionate about supporting researchers, reproducible research, open source and tidy data.

Recommended Prerequisites:

  • Basic knowledge of single cell transcriptomic analyses
  • Basic knowledge of tidyverse

Click here to register



Growing open source communities with internships
by Yo Yehudi

February 22, 2022 at 11:00 AM EST

Building communities for your open source computational tooling requires more than just technical expertise, and often isn't as straightforward as building the tool itself. Having a community of contributors and users can make a big difference in many ways - additional community members will spot opportunities and bugs in your code that previously you didn't notice, and may be able to offer unique skillsets to your team.

One effective way to grow your community can be via internships. Programs such as Google Summer of Code and Outreachy offer the chance to work with interns for 6-12 weeks; the interns work on individual supervised projects whilst getting paid for their work.

This webinar will cover the ins-and-outs of participating in internship programs like this, from the perspective of a mentoring organisation. Topics will include:

  1. Getting started with internship programs - finding mentors and defining a set of projects
  2. Time commitments for mentors, before the application period and after interns are selected.
  3. Funding for internship programs! (it's not as tricky as you may fear - others handle this bit!)
  4. Keeping interns engaged during the program and bringing them in as long-term contributors afterwards.

Click here to register



Spinning a semantic web of protein information
by Monique Zahn

March 18, 2022 at 11:00 AM UTC

Description:

Life science is the most demanding research field in terms of data quantity and complexity, with many relevant reference databases. To generate knowledge, heterogeneous data from various sources must often be combined. Semantic Web technologies, and in particular RDF and its companion query language SPARQL, provide a common framework allowing data to be shared and reused between resources. Many life science databases have recently turned to RDF to model their data, developed SPARQL endpoints and joined the Linked (Open) Data cloud. This tutorial will introduce neXtProt (www.nextprot.org/), one of the major public knowledge bases on human proteins, its comprehensive RDF data model, and its large collection of reusable example queries, including federated queries to other resources.

At the end of the course, the participants are expected to:
•    Describe the neXtProt data model
•    Run example queries that answer biological questions
•    Search for data by modifying existing SPARQL queries
•    Understand how federated queries are constructed
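For readers new to RDF, the following Python sketch imitates what a SPARQL engine does when it matches triple patterns, including a pattern evaluated against a second data source, which is roughly the role of the `SERVICE` clause in a federated query. It is a toy model under stated assumptions: it is not SPARQL syntax, it does not use the neXtProt data model, and every identifier with the `ex:` prefix is made up for illustration.

```python
# Hypothetical triples (subject, predicate, object); not real neXtProt terms.
protein_triples = {
    ("ex:protein_A", "ex:encodedBy", "ex:gene_A"),
    ("ex:protein_A", "ex:locatedIn", "ex:nucleus"),
    ("ex:protein_B", "ex:encodedBy", "ex:gene_B"),
    ("ex:protein_B", "ex:locatedIn", "ex:cytoplasm"),
}
# A second, separate source standing in for a remote endpoint in a federated query.
disease_triples = {
    ("ex:gene_A", "ex:associatedWith", "ex:disease_X"),
}

def is_var(term):
    """Variables are written with a leading question mark, as in SPARQL."""
    return term.startswith("?")

def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check consistency with an earlier binding.
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended

def solve(patterns):
    """Join a list of (graph, pattern) pairs into variable bindings."""
    bindings = [{}]
    for graph, pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings

# "Which nuclear proteins are encoded by a gene associated with a disease?"
results = solve([
    (protein_triples, ("?protein", "ex:locatedIn", "ex:nucleus")),
    (protein_triples, ("?protein", "ex:encodedBy", "?gene")),
    (disease_triples, ("?gene", "ex:associatedWith", "?disease")),
])
```

Modifying an existing query, as the learning objectives describe, amounts to changing a constant in one of the patterns (for example the location) or adding another pattern that shares a variable.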

Instructor:

Monique Zahn is the Quality Manager of the CALIPHO group which develops neXtProt. She is responsible for testing user interfaces and the contents of each release. She has established quality control procedures involving SPARQL queries carried out at each data release. She has taught biology in undergraduate degree programs in Switzerland and is also Training Manager at the SIB.

Click here to register

