Join us for our upcoming ISCBacademy Webinars. Check back regularly for updates.
RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference
by Alexey Kozlov and Alexandros Stamatakis
September 30, 2020 at 11:00AM EDT!
Phylogenies are important for fundamental biological research, but also have numerous applications in biotechnology, agriculture and medicine. Finding the optimal tree under the popular maximum likelihood (ML) criterion is known to be NP-hard. Thus, highly optimized and scalable codes are needed to analyze constantly growing empirical datasets.
We present RAxML-NG, a from-scratch re-implementation of the established greedy tree search algorithm of RAxML/ExaML. RAxML-NG offers improved accuracy, flexibility, speed, scalability, and usability compared with RAxML/ExaML. On taxon-rich datasets, RAxML-NG typically finds higher-scoring trees than IQ-TREE, an increasingly popular recent tool for ML-based phylogenetic inference (although IQ-TREE shows better stability). Finally, RAxML-NG introduces several new features, such as the detection of terraces in tree space and the recently introduced transfer bootstrap support metric.
October 2, 2020 at 11:00AM EDT!
There are outstanding evolutionary questions on the recent emergence of human coronavirus SARS-CoV-2 including the role of reservoir species, the role of recombination and its time of divergence from animal viruses. We find that the sarbecoviruses—the viral subgenus containing SARS-CoV and SARS-CoV-2—undergo frequent recombination and exhibit spatially structured genetic diversity on a regional scale in China. SARS-CoV-2 itself is not a recombinant of any sarbecoviruses detected to date, and its receptor-binding motif, important for specificity to human ACE2 receptors, appears to be an ancestral trait shared with bat viruses and not one acquired recently via recombination. To employ phylogenetic dating methods, recombinant regions of a 68-genome sarbecovirus alignment were removed with three independent methods. Bayesian evolutionary rate and divergence date estimates were shown to be consistent for these three approaches and for two different prior specifications of evolutionary rates based on HCoV-OC43 and MERS-CoV. Divergence dates between SARS-CoV-2 and the bat sarbecovirus reservoir were estimated as 1948 (95% highest posterior density (HPD): 1879–1999), 1969 (95% HPD: 1930–2000) and 1982 (95% HPD: 1948–2009), indicating that the lineage giving rise to SARS-CoV-2 has been circulating unnoticed in bats for decades.
Indigenous Voices in Computational Biology: An Introduction to Ethical Genomic Research with Indigenous People
by Rene Begay
October 8, 2020 at 1:00PM EDT!
Indigenous communities throughout the world have distinct languages, cultures, political structures, and ways of knowing. For too long, these communities have been exploited for material goods, land, and more recently for biospecimens. It is important to note that Indigenous people are not anti-science but rather support science that includes their intrinsic perspectives and expertise. Indigenous scientists are emerging across the world, bridging science, policy, technology, and Indigenous ways of knowing to determine how their communities can benefit from genomic and clinical health research. The Indigenous Voices in Computational Biology series from the ISCB Academy will highlight the work conducted by Indigenous researchers in the United States, New Zealand, and other countries. Topics will include genomic data sharing, ethical engagement with Indigenous peoples in paleogenomics, and how to responsibly conduct research on Indigenous ancestors (ancient DNA). As a result, Indigenous scientists have developed their own Native biobank and hosted an international Indigenous genomics conference to discuss ethical concerns within their communities and present community-based genomic research that integrates Indigenous knowledge. This presentation will introduce the series' overarching themes and provide the framework that encourages ethical engagement with Indigenous communities in genomic research.
Altered RNA Splicing by Mutant p53 Activates Oncogenic RAS Signaling in Pancreatic Cancer
by Luisa Escobar-Hoyos
October 15, 2020 at 11:00AM EDT!
Pancreatic ductal adenocarcinoma (PDAC) is driven by co-existing mutations in KRAS and TP53. However, how these mutations collaborate to promote this cancer is unknown. Here, we uncover sequence-specific changes in RNA splicing enforced by mutant p53 which enhance KRAS activity. Mutant p53 increases expression of splicing regulator hnRNPK to promote inclusion of cytosine-rich exons within GTPase-activating proteins (GAPs), negative regulators of RAS family members. Mutant p53-enforced GAP isoforms lose cell membrane association, leading to heightened KRAS activity. Preventing cytosine-rich exon inclusion in mutant KRAS/p53 PDACs decreases tumor growth. Moreover, mutant p53 PDACs are sensitized to inhibition of splicing via spliceosome inhibitors. These data provide insight into co-enrichment of KRAS and p53 mutations and therapeutics targeting this mechanism in PDAC.
The Illusion of Inclusion — The “All of Us” Research Program and Indigenous Peoples’ DNA
by Keolu Fox
November 12, 2020 at 11:00AM EST!
Raw data, including digital sequence information derived from human genomes, have in recent years emerged as a top global commodity. This shift is so new that experts are still evaluating what such information is worth in a global market. In 2018, the direct-to-consumer genetic-testing company 23andMe sold access to its database containing digital sequence information from approximately 5 million people to GlaxoSmithKline for $300 million. Earlier this year, 23andMe partnered with Almirall, a Spanish drug company that is using the information to develop a new anti-inflammatory drug for autoimmune disorders. This move marks the first time that 23andMe has signed a deal to license a drug for development.
Eighty-eight percent of people included in large-scale studies of human genetic variation are of European ancestry, as are the majority of participants in clinical trials. Corporations such as Geisinger Health System, Regeneron Pharmaceuticals, AncestryDNA, and 23andMe have already mined genomic databases for the strongest genotype–phenotype associations. For the field to advance, a new approach is needed. There are many potential ways to improve existing databases, including “deep phenotyping,” which involves collecting precise measurements from blood panels, questionnaires, cognitive surveys, and other tests administered to research participants. But this approach is costly and physiologically and mentally burdensome for participants. Another approach is to expand existing biobanks by adding genetic information from populations whose genomes have not yet been sequenced — information that may offer opportunities for discovering globally rare but locally common population-specific variants, which could be useful for identifying new potential drug targets.