ISCB-LA SoIBio BioNetMX 2022

Attention Presenters - please review the Speaker Information Page available here
Schedule subject to change
All times listed are in CST
Thursday, November 3rd
8:45-9:00
Welcome to ISCB-LA SoIBio BioNetMX
Room: UNAM
Format: Live from venue

  • Alejandra Medina-Rivera
9:00-10:00
Keynote Presentation: Heads or Tails? Systems Biology of Wnt Signaling in Stem Cell Differentiation
Room: UNAM
Format: Live from venue

Moderator(s): Alejandra Medina-Rivera

  • Terry Gaasterland, University of California San Diego, USA


Presentation Overview:

Stem cells balance the maintenance of pluripotency and readiness to differentiate. Upon initiation of differentiation, they must be ready to proceed along a multiplicity of fates. Conflicting signals between states must be regulated during transitions, and this regulation may be critical to completing the transition. Wnt signaling governs both maintenance of pluripotency (at low doses) and differentiation (at higher doses) in stem cells. Our discovery of an essential regulator of the Wnt signaling pathway provided a missing piece in how stem cells initiate differentiation and then organize into cells that will become “head” vs “tail”. Time series RNA-seq data of WNT3a-treated human embryonic stem cells revealed a new transcription factor that regulates Wnt-response by displacing a ubiquitous transcription factor to repress gene expression. ChIP-seq data for each transcription factor revealed specific genes targeted by this negative feedback system. Genome-editing disabled the new transcription factor and confirmed the negative feedback loop. This talk will (1) trace how the regulatory loop emerged from the multiple 'omics data, and also cover two related studies that used the interpretation of gene expression patterns through RNA-seq to explore (2) Wnt-pathway regulation of "head-tail" organization during differentiation and (3) dose-dependent Wnt-regulation of lineage specificity. The latter study engineered and characterized a new line of intermediate mesodermal progenitor cells positioned to incorporate and contribute to kidney.

10:30-10:45
A deep learning algorithm for high-resolution spatial transcriptomics cell type identification by integrating gene expression and spatial information.
Room: UNAM
Format: Live from venue

Moderator(s): Alejandra Medina-Rivera, Yalbi I. Balderas-Martínez

  • Karla Paniagua, University of Texas at San Antonio, United States
  • Mario Flores, University of Texas at San Antonio, United States
  • Yufang Jin, University of Texas at San Antonio, United States


Presentation Overview:

Spatial transcriptomics is emerging as a tool for the discovery of novel insights into disease. At present it provides a picture of the transcription of hundreds of genes while preserving the spatial location of transcripts within the cells and tissue of origin. Accurately mapping transcripts to cells and assigning cell types is critical for understanding the spatial organization of transcription and its dysregulation during disease. Currently, the most common methods for cell type identification rely on unsupervised clustering, where a cell type is assigned to each cluster. However, this approach assigns cell types to clusters without considering that a cluster may contain a mix of cell sub-types. Other methods use statistical models to estimate the expected cell types in a tissue from an annotated scRNA-seq reference. However, these methods do not consider the spatial information of cells, where specific cell types can be located together or in the context of functionally associated cell types. In this work, we are developing a deep learning algorithm that combines graph convolutional networks and transformers to predict cell types from the integration of gene signatures, gene expression, and spatial information. The proposed method is evaluated on a high-resolution spatial transcriptomics dataset of non-small cell lung cancer. The model achieves promising results by accurately predicting sub-types for individual cells. Spatial cell type identification with the proposed method assigns a cell type to each cell considering its spatial context, uncovering the spatial organization of the biological tissue that underlies cell-cell communication, organ function, and pathology.

10:45-11:00
Unveiling functional heterogeneity in breast cancer multicellular tumor spheroids through single-cell RNA-seq
Room: UNAM
Format: Live from venue

Moderator(s): Alejandra Medina-Rivera, Yalbi I. Balderas-Martínez

  • Aarón Vázquez-Jiménez, INMEGEN, Mexico
  • Osbaldo Resendis-Antonio, INMEGEN, Mexico


Presentation Overview:

Heterogeneity is an intrinsic characteristic of cancer. Cell populations exhibit differential cellular programs that support malignancy and decrease treatment efficiency even in isogenic tumors. This study investigated the functional relationship among cell subtypes and how this interdependency can promote tumor development in a cancer cell line. To do so, we performed single-cell RNA-seq of MCF7 Multicellular Tumor Spheroids as a tumor model. Analysis of single-cell transcriptomes at two time points of spheroid growth allowed us to dissect their functional relationship. As a result, three major robust cellular clusters, with a non-redundant complementary composition, were found. One cluster promotes proliferation, while the others mainly activate mechanisms to invade other tissues and serve as a reservoir population conserved over time. Our results provide evidence for viewing cancer as a systemic unit whose cell populations stratify tasks with the ultimate goal of preserving the hallmarks of tumors.

11:00-11:15
Are fibrils that different – the reaction of neuroblastoma cells to fibrillar amyloid beta
Room: UNAM
Format: Live from venue

Moderator(s): Alejandra Medina-Rivera, Yalbi I. Balderas-Martínez

  • Vladimir Jovanovic, Freie Universität Berlin, Germany
  • Branislava Rankovic, Laboratory of Molecular Neuroscience, German Center for Neurodegenerative Diseases (DZNE), Berlin, Germany
  • Roberto Sanseverino, Laboratory of Molecular Neuroscience, German Center for Neurodegenerative Diseases (DZNE), Berlin, Germany
  • Antonela Condric, Laboratory of Molecular Neuroscience, German Center for Neurodegenerative Diseases (DZNE), Berlin, Germany
  • Dragomir Milovanovic, Laboratory of Molecular Neuroscience, German Center for Neurodegenerative Diseases (DZNE), Berlin, Germany
  • Katja Nowick, Freie Universität Berlin, Germany


Presentation Overview:

Alzheimer's disease (AD) is a progressive neurodegenerative disorder and the most common form of dementia in humans. During its progression, neurons in the brain die, the brain shrinks, and problems with memory, speech and orientation occur. The causes of AD onset are still insufficiently understood, but both genetic and environmental factors play a role. One of the molecular pathways leading to damage of neurons and activation of surrounding microglia is the aberrant cleavage and polymerization of amyloid beta (Aβ) peptides outside neurons, from soluble oligomers to insoluble fibrils. We set out to determine whether a cell reacts differently to two forms of this peptide, oligomeric and fibrillar, especially as these are en route to the formation of amyloid plaques, a hallmark phenotype in AD patients. To this end, we investigated differential gene expression in human neuroblastoma cells exposed to different Aβ forms, up to one day after exposure. The differences between the two study groups were first seen after six hours, whilst no difference between the control and the cells exposed to oligomeric Aβ was established. Sixteen genes were repeatedly differentially expressed upon treatment with fibrillar Aβ. The genes that were overexpressed in this experimental group have a role in the regulation of transcription, while the underexpressed genes take part in lipid metabolism. Our results suggest that the immediate cellular response to the build-up of fibrillar Aβ outside the membrane includes both a regulatory part (regulating the future expression of genes, potentially part of a long-term reaction to the Aβ stressor) and a metabolic part (changes in mitochondrial and lipid metabolism).

11:15-11:30
Integrative gene regulatory analysis of Ikaros tumor suppression in acute lymphoblastic leukemia
Room: UNAM
Format: Live from venue

Moderator(s): Alejandra Medina-Rivera, Yalbi I. Balderas-Martínez

  • Alyssa Richman, University of Vermont, United States
  • Hana Paculova, University of Vermont, United States
  • Joseph Boyd, University of Vermont, United States
  • Princess Rodriguez, University of Vermont, United States
  • Hilde Schjerven, University of California San Francisco, United States
  • Seth Frietze, University of Vermont, United States


Presentation Overview:

Deregulation of chromatin structure is a characteristic of hematologic malignancies and promotes tumorigenesis by activation of oncogenes and silencing of tumor suppressor genes. Ikaros (encoded by the IKZF1 gene) is a hematopoietic transcription factor (TF) that regulates chromatin structure and is frequently mutated in B cell acute lymphoblastic leukemia (B-ALL). In this study, we used inducible expression of wild-type Ikaros (IK1) in patient-derived B-ALL cell lines that harbor a heterozygous IKZF1 deletion. IK1 induction in cells resulted in a robust growth arrest with cell cycle exit supporting the function of Ikaros as a tumor suppressor. Integrated omic analysis (ChIP-seq, RNA-seq and ATAC-seq) provided evidence of chromatin mechanisms associated with IK1-mediated growth suppression in B-ALL, including the regulation of genes that function downstream of the pre-B cell receptor, as well as genes that limit the metabolic capacity of leukemia cells. Interestingly, we found that induction of IK1 results in corresponding changes in chromatin accessibility and DNA methylation at Ikaros binding sites located at proximal and distal gene regulatory elements. Finally, we performed Paired Expression and Chromatin Accessibility Analysis to study the gene regulatory networks of IKZF1-mutated B-ALL. Overall, we present a high-resolution map of the deregulated gene regulatory landscape in IKZF1-deleted B-ALL and insight into Ikaros tumor suppression mechanisms. This work helps to identify characteristic epigenetic features including putative oncogenic TFs that contribute to leukemogenic growth programs that may represent therapeutic targets for B-ALL.

11:30-11:45
An RNA-Seq Pan Cancer Panel Approach for Molecular Subtyping of Pediatric B-cell Acute Lymphoblastic Leukemia (B-ALL)
Room: UNAM
Format: Live from venue

Moderator(s): Alejandra Medina-Rivera, Yalbi I. Balderas-Martínez

  • Alejandra Cervera Taboada, INMEGEN, Mexico
  • Carmen Alaez Verson, INMEGEN, Mexico
  • Karol Carrillo Sánchez, INMEGEN, Mexico
  • Carolina Molina Garay, INMEGEN, Mexico
  • Beatriz Villegas Torres, INMEGEN, Mexico
  • Anallely Muñoz Rivas, INMEGEN, Mexico
  • Marco Jinmenes Olivares, INMEGEN, Mexico


Presentation Overview:

B-ALL is the most common childhood cancer, affecting 20-35 children per year worldwide, while in Mexico the estimated incidence is 49.5 cases per year. Although 5-year survival for pediatric B-ALL has been improving over the past few years, getting close to 90% in some countries, in Mexico it remains around 60%. The difference in survival can be attributed to many reasons, among them a higher proportion of high-risk patients at diagnosis, incomplete diagnosis, and toxicity from treatment. Proper diagnosis of pediatric B-ALL requires either DNA or RNA sequencing to identify all genetic alterations known to be associated with disease risk and treatment. In Mexico, whole genome or transcriptome sequencing is seldom performed for B-ALL subtyping, and in best-case scenarios karyotyping, FISH and RT-PCR are used to identify the most common alterations. In this project we explore the classification of samples into the different molecular subtypes using a panel of 1385 genes for RNA-seq. We use standard RNA-seq processing and analysis methods combined with machine learning approaches to classify the samples. We have analyzed 113 samples, which allowed us to corroborate that the proportion of Mexican children with B-ALL carrying a high-risk alteration is indeed higher than in other populations. We also show that we have been able to identify fusion genes with only one of the genes in the panel, in particular IGH-CRLF2, in which CRLF2 fuses to the promoter of IGH, resulting in CRLF2 over-expression. Alterations of CRLF2 that cause over-expression of this gene are one of the hallmarks of the Ph-like subtype, which is associated with poor prognosis. In addition to the classification of Ph-like samples through the identification of several fusion genes, we have also identified a set of samples with an amplification in chromosome X that has not been previously described and that does not overlap with other known molecular subtypes. To determine the clinical implications of this alteration we are gathering more samples for analysis, as well as samples from public repositories, but preliminary data show that patients with this alteration may have a better survival rate than other patients from the unclassified group. The use of an RNA-seq cancer panel in combination with a low-cost bioinformatic solution aimed at clinicians can be a useful tool for properly classifying children with B-ALL into the molecular subtypes that can determine the appropriate treatment.

11:45-12:00
Elucidating the transcriptional landscape of metastatic Medulloblastoma in Patient Derived Xenografts
Room: UNAM
Format: Live-stream

Moderator(s): Alejandra Medina-Rivera, Yalbi I. Balderas-Martínez

  • Ana Isabel Castillo Orozco, McGill University - Research Institute of the McGill University Health Center, Canada
  • Masoomeh Aghababazadeh, McGill University - Research Institute of the McGill University Health Center, Canada
  • Niusha Khazaei, McGill University - Research Institute of the McGill University Health Center, Canada
  • Livia Garzia, McGill University - Research Institute of the McGill University Health Center, Canada
  • Geoffroy Danieau, McGill University - Research Institute of the McGill University Health Center, Canada


Presentation Overview:

Medulloblastoma (MB), the most common pediatric brain tumor, is highly aggressive and arises mainly in the cerebellum. MB can metastasize to the leptomeningeal space, which is known as Leptomeningeal Disease (LMD). Although LMD represents a main clinical challenge, it is a vastly understudied field, and its molecular mechanisms are poorly characterized. Accordingly, there is an urgent need to develop strategies to study metastatic Medulloblastoma. We hypothesize that an in-depth knowledge of the molecular events driving subclones of the primary tumor to metastasize will offer therapeutic targets for effective therapies to treat or prevent LMD. To test this hypothesis, we have established metastatic Patient-Derived Xenografts (PDXs) that faithfully recapitulate LMD features. We have focused our efforts on bulk RNA-seq of PDX models to profile LMD intertumoral heterogeneity and to identify genetic drivers and pathways that sustain this compartment. Using ssGSEA, we found that PDX models retain neoplastic subpopulations previously identified in MB single-cell sequencing studies, with slight changes between primary and leptomeningeal compartments. Furthermore, we observe profound differences in gene expression between primary and LMD. Our results show various signaling pathways enriched across LMD models. We have also identified differentially expressed genes (DEGs), among which Fc Fragment of IgG Binding Protein (FCGBP) was significantly differentially expressed in all Group 3 LMD models. This finding was concordant with differential expression analysis (DEA) results and a single-cell atlas from Gene Expression Omnibus (GEO) datasets for breast and lung cancer metastatic to the leptomeninges. Our results support the notion that primary and LMD retain the subpopulation clusters present in MB Group 3 tumors with slight changes. We show that primary and LMD are transcriptionally different and have identified FCGBP as a DEG across Group 3 LMD models. Further work will be required to determine the role of this gene in the development of LMD.

12:00-12:15
Differential analysis of gene expression and identification of pathways related to hippocampal function after caloric restriction diet.
Room: UNAM
Format: Live-stream

Moderator(s): Alejandra Medina-Rivera, Yalbi I. Balderas-Martínez

  • Francisco Javier Martínez-Rodríguez, Universidad Autónoma Metropolitana, Mexico
  • Pamela Salcedo-Tello, Universidad Autónoma Metropolitana, Mexico
  • Rodrigo González-Barrios, Instituto Nacional de Cancerología, Mexico
  • Kioko Gúzman-Ramos, Universidad Autónoma Metropolitana, Mexico


Presentation Overview:

Caloric restriction (CR), defined as a decrease in daily total calorie intake without causing malnutrition, is a dietary regimen that has been proposed as a non-pharmacological intervention to prevent aging-related impairment of certain cognitive domains such as memory. However, its impact on brain function in healthy adults is poorly understood at the genetic and molecular levels. Therefore, we systematically searched large RNA-seq databases and performed differential gene expression analysis, as well as structural and functional annotation of the genes obtained, to find possible mechanisms by which CR might impact hippocampal function. We identified 51 differentially expressed genes, of which Trpv4, Arhgap6, GNG4, Aqp1, and Sostdc1 are the most likely to be implicated in the alteration of hippocampal cytoarchitectonic dynamics and cell signaling, given their pleiotropic functions. The data obtained suggest that CR may have detrimental effects on hippocampal function in middle-aged adults, affecting the morpho-functional dynamics that support and promote hippocampal functions such as synaptic plasticity.

12:15-12:30
Genetic variants and allelic frequencies related to hyperlipidemias from Costa Rican genomes
Room: UNAM
Format: Live-stream

Moderator(s): Alejandra Medina-Rivera, Yalbi I. Balderas-Martínez

  • Juan Carlos Valverde-Hernández, Universidad de Costa Rica, Costa Rica
  • Gabriela Chavarría-Soley, Universidad de Costa Rica, Costa Rica
  • Sandra Silva de la Fuente, Universidad de Costa Rica, Costa Rica
  • Rebeca Campos-Sánchez, Universidad de Costa Rica, Costa Rica


Presentation Overview:

Hyperlipidemias are risk factors in diseases of significant importance to public health such as acute pancreatitis and atherosclerosis, conditions that contribute to the development of pancreatic cancer and cardiovascular diseases, respectively. Unhealthy lifestyles, pre-existing diseases, and the accumulation of genetic variants in some loci participate in the development of hyperlipidemias. The genetic causality behind this disease has been studied mostly in populations with a large European ancestry. Few studies have explored this topic in Costa Rica, and none of them have focused on identifying variants with the potential to alter blood lipid levels and quantifying their frequency. To fill this gap, this study focused on identifying variants located in genes and promoters involved in lipid metabolism in genomes of Costa Ricans from the Central Valley, contrasting their allelic frequencies with those of groups reported in the 1000 Genomes Project, and identifying potential variants that could affect the development of dyslipidemias. We identified ~17500 variants in the evaluated regions, the vast majority of which had not been reported in Costa Ricans. A set of 29 variants with the potential to alter the performance of these genes was detected in their coding and promoter regions using predictive bioinformatics tools. Some of them have been linked to changes in blood lipid levels in other studies. Several of the variants previously reported in locally conducted studies were not identified. This study has an impact on future applications of genomics in the country.
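
The allele-frequency comparison described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how frequencies were computed, so the class name, function names, and genotype counts below are hypothetical; the sketch only uses the standard definition of alternate-allele frequency for a biallelic, diploid variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenotypeCounts:
    """Genotype counts for one biallelic variant in one group (hypothetical container)."""
    hom_ref: int  # individuals with two reference alleles
    het: int      # individuals with one alternate allele
    hom_alt: int  # individuals with two alternate alleles

    def alt_allele_frequency(self) -> float:
        """Alternate-allele frequency: alternate alleles / total alleles (diploid)."""
        n_individuals = self.hom_ref + self.het + self.hom_alt
        if n_individuals == 0:
            raise ValueError("no genotyped individuals")
        return (self.het + 2 * self.hom_alt) / (2 * n_individuals)


def frequency_difference(cohort: GenotypeCounts, reference: GenotypeCounts) -> float:
    """Cohort frequency minus reference-population frequency."""
    return cohort.alt_allele_frequency() - reference.alt_allele_frequency()


if __name__ == "__main__":
    # Made-up counts for illustration only; not data from the study.
    cohort = GenotypeCounts(hom_ref=40, het=8, hom_alt=2)
    reference = GenotypeCounts(hom_ref=90, het=10, hom_alt=0)
    print(f"cohort AF    = {cohort.alt_allele_frequency():.3f}")
    print(f"reference AF = {reference.alt_allele_frequency():.3f}")
    print(f"difference   = {frequency_difference(cohort, reference):+.3f}")
```

A real comparison would also need a statistical test and correction for multiple variants, which this sketch leaves out.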

13:45-14:00
Regulation of Calcium Homeostasis in the Trans-Golgi Network of HeLa Cells
Room: UNAM
Format: Live from venue

Moderator(s): Maribel Hernández-Rosales, Fabien Plisson

  • Norma Pérez-Rosas, Purdue University, United States
  • Ursula Kummer, Heidelberg University, Germany


Presentation Overview:

Calcium ions (Ca2+) play an important role as second messengers in the cell. In the trans-Golgi Network (TGN), they are decisive for the sorting of proteins. Measurements of Ca2+ during the development of the TGN show that Ca2+ levels change and that specific levels are maintained depending on the maturation state of the TGN. However, an integrative understanding of how Ca2+ homeostasis is kept in the TGN is still lacking. Here, we propose a quantitative computational model that integrates experimental knowledge to understand the homeostasis of Ca2+ in the TGN of HeLa cells. The model supports the idea that Cab45 (a calcium-binding protein localized in the lumen of the Golgi complex) regulates a Ca2+ release channel in the membrane of the TGN whose existence has been debated in the literature.

14:00-14:15
Virtual Screening, Molecular Docking, and In Vitro Evaluation of Potential Enolase Inhibitors from Entamoeba histolytica
Room: UNAM
Format: Live from venue

Moderator(s): Maribel Hernández-Rosales, Fabien Plisson

  • Karla Araceli León García, Instituto Politécnico Nacional, Mexico
  • Christian Adalid Martínez Rebollar, Instituto Politécnico Nacional, Mexico
  • Marisol López Hidalgo, Instituto Politécnico Nacional, Mexico
  • Darinka Pamela Durán Gutiérrez, Instituto Politécnico Nacional, Mexico
  • César Augusto Sandino Reyes López, Instituto Politécnico Nacional, Mexico


Presentation Overview:

Amebiasis is a major health problem in developing countries. It is caused by the protozoan parasite Entamoeba histolytica. Resistance of this parasite to the treatment of choice has been reported, and there is also evidence of important side effects. Therefore, it is necessary to search for new therapeutic targets that allow novel treatments. We performed a virtual screening for available compounds that could bind to a positively charged region present in loop 3 of E. histolytica enolase that differs from the homologous human protein. The binding of compounds in this zone could hinder the interaction between loop 3 and loop 2 of the protein, an important event for catalytic activity. Compounds were evaluated using a combination of computational tools, such as molecular dynamics and molecular docking, and catalytic activity assays. The results obtained from AutoDock, AutoDock Vina, and MOE showed high-affinity interactions between E. histolytica loop 3 residues and the compounds identified as 1598 and 1602. We also carried out molecular docking with non-neuronal human and Saccharomyces cerevisiae enolase, and the results showed that the compounds do not have high-affinity interactions with these proteins. However, enzyme assays show that the compounds could completely inhibit the enzymatic activity of amoeba enolase and yeast enolase, suggesting that the compounds could bind to other regions relevant for catalysis.

14:15-14:30
Identification of potential new inhibitors of Mycobacterium tuberculosis dihydrofolate reductase using an in silico approach
Room: UNAM
Format: Live-stream

Moderator(s): Maribel Hernández-Rosales, Fabien Plisson

  • Jose Antonio Jimenez, Grupo de Investigacion en Bioinformatica y Biologia Estructural. Universidad Nacional Mayor de San Marcos, Peru
  • Gustavo Sandoval, Grupo de Investigacion en Bioinformatica y Biologia Estructural. Universidad Nacional Mayor de San Marcos, Peru


Presentation Overview:

Mycobacterium tuberculosis dihydrofolate reductase (MtDHFR) catalyzes the conversion of dihydrofolate to tetrahydrofolate, which is used in DNA and protein synthesis. Therefore, MtDHFR is an attractive target for new antitubercular drugs. This study aimed to identify inhibitory compounds that bind with high affinity to this enzyme and form stable complexes. Compounds were screened against a crystal structure of MtDHFR bound to diaveridine (PDB ID: 6NNE) with a combination of pharmacophore features from a 2,4-diaminopyrimidine group present in the inhibitor. Thus, an initial library of 314 compounds from the ChEMBL, ZINC and COCONUT databases was obtained and then filtered based on physicochemical properties and toxicity. After that, virtual screening and molecular docking were performed with the minimized structure of the enzyme. Furthermore, docking was performed between the MtDHFR enzyme and diaveridine from the crystal structure. Molecular dynamics simulations (MDS) of 10 ns were carried out to assess the conformational stability of the selected complex systems, followed by absorption, distribution, metabolism, excretion and toxicity (ADMET) predictions. As a result, compounds 2, 3 and 4 showed higher affinity towards MtDHFR and lower inhibitory constant (Ki) values than diaveridine and were selected for further analysis. Molecular dynamics simulations revealed that the conformation of complexes containing compounds 2 and 4 remained stable, with mean RMSD values lower than 0.35 nm. Regarding ADMET predictions, compounds 2 and 4 were lipophilic, water-soluble and had a high gastrointestinal (GI) absorption rate. However, both compounds were predicted to be hepatotoxic and positive in the Ames mutagenicity assay. In conclusion, the lead compounds selected in this study showed promising inhibitory activity towards MtDHFR and can be optimized using in silico tools to reduce their toxicity. Also, their inhibitory potential can be evaluated in further studies involving both in vitro and in vivo models. Financial Support: VRIP-UNMSM and PROCIENCIA-CONCYTEC (Contrato Nº 390-2019-FONDECYT).
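
The link between docking affinity and the inhibitory constant mentioned in this abstract can be sketched in Python. The abstract does not state how Ki was obtained; the sketch assumes the common thermodynamic estimate Ki = exp(ΔG / RT) that docking programs often report, at an assumed temperature of 298.15 K. The function names and the example energies are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def ki_from_delta_g(delta_g_kcal_mol: float, temperature_k: float = 298.15) -> float:
    """Estimated inhibition constant (mol/L) from a binding free energy in kcal/mol.

    Assumes Ki = exp(dG / (R*T)); more negative energies give smaller Ki.
    """
    return math.exp(delta_g_kcal_mol / (R_KCAL_PER_MOL_K * temperature_k))


def delta_g_from_ki(ki_molar: float, temperature_k: float = 298.15) -> float:
    """Inverse relation: binding free energy (kcal/mol) from Ki in mol/L."""
    if ki_molar <= 0:
        raise ValueError("Ki must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(ki_molar)


if __name__ == "__main__":
    # Hypothetical docking energies for illustration; not values from the study.
    for delta_g in (-7.0, -8.0, -9.0):
        ki_micromolar = ki_from_delta_g(delta_g) * 1e6
        print(f"dG = {delta_g:5.1f} kcal/mol -> estimated Ki = {ki_micromolar:.3f} uM")
```

Under this assumption, each additional 1.36 kcal/mol of binding energy at room temperature lowers the estimated Ki roughly tenfold, which is why a modest gain in docking score can correspond to a much lower Ki.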

14:30-14:45
In silico screening of potential antiviral compounds from some selected Nigerian medicinal plants against 3CLpro, ACE2, and PLpro targets of SARS-CoV-2 virus
Room: UNAM
Format: Live from venue

Moderator(s): Maribel Hernández-Rosales, Fabien Plisson

  • Raymond Ibeh, Federal University of Technology Owerri, Nigeria
  • Monsurat Lawal, University of KwaZulu-Natal, South Africa
  • Gavin Ikechukwu, Michael Okpara University of Agriculture, Nigeria


Presentation Overview:

Aims: This study aims to identify potential inhibitors of the SARS-CoV-2 virus.
Background: The contemporary economic situation and the challenging public health burden, on both the national and global scale, have led to a search for COVID-19 therapies. Presently, there is no cure for COVID-19 infection, but there is significant progress in vaccine formulation for prophylaxis. Some of the acknowledged conventional vaccines approved include AstraZeneca, Covax, Johnson & Johnson, and Pfizer. Because of the potential adverse effects that usually accrue from synthetic drugs, there are ongoing investigations of natural antiviral plants with active components as therapeutic agents against COVID-19.
Objective: Apply computational and theoretical methods to identify potent phytocompounds as anti-COVID-19 agents that inhibit identified SARS-CoV-2 targets.
Method: We use virtual screening, molecular docking, and molecular dynamics (MD) simulations to investigate our research aim.
Result: The predicted docking score (ΔG) values range from -5.5 to -9.4 kcal/mol, denoting favored binding of these compounds to the SARS-CoV-2 proteins and presenting a multitarget inhibition for COVID-19. Some phytocompounds interact favorably at non-active sites of the enzymes, and refinement with MD simulation shows that these regions are stable sites and possibly, allosteric regions of 3CLpro, ACE2, and PLpro for inhibitor binding and modulation.
Conclusion: These phytocompounds could be developed into effective therapy against COVID-19 and probed as potential multitarget-directed ligands and drug candidates against the SARS-CoV-2 virus.
Implications: This study features drug repurposing, allosteric site targeting, and multitarget-directed ligands in one piece. These concepts are three distinct approaches in the drug design and discovery pipeline.

14:45-15:00
IN SILICO DESIGN OF QUERCETIN DERIVATIVES WITH POTENTIAL DUAL INHIBITORY ACTIVITY AGAINST GSK3B AND CDK5/p25 FOR THE TREATMENT OF ALZHEIMER'S DISEASE.
Room: UNAM
Format: Live-stream

Moderator(s): Maribel Hernández-Rosales, Fabien Plisson

  • Jaime Tamayo, Universidad Nacional Mayor de San Marcos - CSSR, Peru
  • Alessandra Latorre, Chemistry Student Society for Research (CSSR), Peru
  • Giulliano Nájera, Universidad Nacional Mayor de San Marcos - CSSR, Peru
  • Juan Zavaleta, Universidad Privada Antenor Orrego, Peru
  • Kevin Ore, Universidad Nacional Mayor de San Marcos - CSSR, Peru


Presentation Overview:

Among existing strategies for compounds targeting Alzheimer's disease (AD), a current approach is the in silico design of multi-target drugs covering diverse pathways related to AD pathophysiology. Tau protein hyperphosphorylation, associated with β-amyloid accumulation, is a critical step in the pathogenesis and progression of AD and is related to activation of the kinases GSK3B and CDK5/p25. In this sense, the search for dual inhibitors is a novel therapeutic strategy for the treatment of AD. The flavonoid scaffold has been the basis for the design of new derivatives as drug candidates against AD. Although studies are scarce, derivatives substituted at the C8 position have been shown to be strong inhibitors of AD targets. Herein, we describe the in silico design of quercetin derivatives substituted at the C8 position as dual inhibitors of GSK3B and CDK5/p25 through an integrative approach combining physicochemical and toxicological properties with structural bioinformatics. Based on this integrative workflow, 613 bioisosteres were designed using the SwissBioisostere platform. The first screening was based on criteria of kinase inhibitory activity, toxicity, Lipinski-Veber rules, and pharmacokinetics, using the Way2Drug, DataWarrior, and SwissADME platforms. The 22 best bioisosteres were optimized with Avogadro, protein preparation was carried out in PyMOL and Chimera, and finally site-specific docking was performed three times in the active site of the enzymes using MTiAutoDock. The 22 bioisosteres showed better affinity energies (Eaff) for GSK3B and CDK5/p25 than the controls quercetin (-9.77; -10.61 kcal/mol), roscovitine (-8.50; -9.04 kcal/mol), and flavopiridol (-8.89; -10.96 kcal/mol), particularly compounds QT94 (-12.41; -13.94 kcal/mol) and QT115 (-12.76; -13.79 kcal/mol). Moreover, all of them showed many hydrophobic interactions with important protein residues from the active site of GSK3B (Ile62, Val70, Ala83, Leu188, Leu132, Asp133, Tyr134, Val135 and Pro136) and from CDK5/p25 (Ile10, Phe80 and Lys133). Likewise, the bioisosteres had many polar interactions with both GSK3B (Val135 and Asp133) and CDK5/p25 (Cys83 and Asp86). Additionally, the interaction of the evaluated compounds with residues Leu132, Tyr134, Arg141 and Cys199 could indicate a higher binding selectivity for GSK3B compared with its isoform GSK3α, since these are specific residues. In conclusion, C8-substituted quercetin derivatives were shown to be potential dual inhibitors of GSK3B and CDK5/p25 for the treatment of AD.
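
The Lipinski-Veber screening step named in this abstract can be illustrated with a small Python sketch. It uses the widely cited thresholds (Lipinski: molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen-bond donors and 10 acceptors; Veber: at most 10 rotatable bonds and polar surface area ≤ 140 Å²). How the authors applied these rules in their platforms is not stated, so the class, the function names, the zero-violation default, and the descriptor values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors for one compound (hypothetical container)."""
    name: str
    mol_weight: float       # g/mol
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int
    rotatable_bonds: int
    tpsa: float             # topological polar surface area, in square angstroms


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four criteria the compound violates."""
    return sum((
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ))


def passes_veber(d: Descriptors) -> bool:
    """Veber criteria: limited flexibility and polar surface area."""
    return d.rotatable_bonds <= 10 and d.tpsa <= 140


def passes_lipinski_veber(d: Descriptors, max_violations: int = 0) -> bool:
    """Combined filter; max_violations=0 is an assumed, strict setting."""
    return lipinski_violations(d) <= max_violations and passes_veber(d)


if __name__ == "__main__":
    # Made-up descriptor values for illustration; not compounds from the study.
    candidates = [
        Descriptors("hypothetical_A", 316.3, 1.8, 4, 7, 2, 120.0),
        Descriptors("hypothetical_B", 612.7, 5.6, 6, 12, 14, 180.0),
    ]
    kept = [c.name for c in candidates if passes_lipinski_veber(c)]
    print("kept:", kept)
```

In practice the descriptors would come from a cheminformatics tool rather than being typed in by hand, and some workflows tolerate one Lipinski violation, which the `max_violations` parameter allows for.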

15:00-15:15
Determination of the microRNAs-target genes network in idiopathic pulmonary fibrosis
Room: UNAM
Format: Live-stream

Moderator(s): Maribel Hernández-Rosales, Fabien Plisson

  • Jose A Ovando-Ricardez, División Académica Multidisciplinaria de Jalpa de Méndez, Universidad Juárez Autónoma de Tabasco, Mexico
  • Yazmin Hernandez-Diaz, División Académica Multidisciplinaria de Jalpa de Méndez, Universidad Juárez Autónoma de Tabasco, Mexico
  • Yalbi I Balderas-Martinez, Laboratorio de Biologica Computacional, Instituto Nacional de Enfermedades Respiratorias Ismael Cosio Villegas, Mexico


Presentation Overview: Show

Introduction. Idiopathic Pulmonary Fibrosis (IPF) is a very aggressive and irreversible respiratory disease characterized by progressive scarring of lung tissue, resulting in respiratory failure and death. The pathogenesis of the disease remains unclear. In recent decades, studies have reported that microRNAs (miRNAs) play an important role in the development of lung diseases, including IPF. Methods. We analyzed 304 IPF lung biopsies and 200 controls obtained from the Gene Expression Omnibus database, considering eight microarray datasets (GSE35145, GSE72073, GSE48149, GSE53845, GSE110147, GSE24206, GSE32538, and GSE27430) and two tissue RNA-seq datasets (GSE150910 and GSE52463). Each experiment was analyzed according to the type of platform (Affymetrix, Agilent, or Illumina) with R v4.1.1 and Bioconductor v3.14. The differential expression analysis was performed with limma v3.15 for microarrays and DESeq2 v3.14 for RNA-seq experiments, using a |log2FoldChange| threshold of 0.8 and a Benjamini-Hochberg adjusted p-value < 0.05. The genes considered for subsequent analyses were those present in a minimum of four experiments. miRNA-target gene interactions and enrichment were established using miRNet, and we then performed a network and module analysis with the Walktrap algorithm in Cytoscape. Results. We found 149 aberrantly expressed genes (89 up-regulated and 60 down-regulated) in more than four of the experiments analyzed. Likewise, 56 miRNAs previously reported experimentally in lung tissue interact with 137 of our 149 aberrantly expressed genes. In addition, we found five main modules within our network, related to epithelial development, cellular growth, extracellular matrix organization, negative regulation of cell signaling, and finally, a module associated with key genes in the process of lung fibrosis. Conclusions. In this work, we integrated all the transcriptomics experiments on lung biopsies from IPF patients to find the gene signature and related miRNAs. This allowed us to generate hypotheses about the inhibitory effect of two miRNAs (miR-126-3p and miR-335-5p) associated with a module related to lung fibrosis. These miRNAs seem to regulate essential genes for the development of fibrosis and key processes such as epithelial-mesenchymal transition and the secretion of molecules in charge of extracellular matrix remodeling.
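
The gene-selection step described in the Methods (a fold-change cutoff, an adjusted p-value cutoff, and recurrence across a minimum of four experiments) can be illustrated with a minimal Python sketch. This is not the authors' R/Bioconductor code: the function names and the data layout are hypothetical, the use of ">=" for the fold-change cutoff is an assumption because the abstract does not state the comparison operator, and requiring a consistent direction of change across experiments is also an assumption.

```python
from collections import Counter

LFC_CUTOFF = 0.8      # |log2 fold change| cutoff quoted in the abstract
PADJ_CUTOFF = 0.05    # Benjamini-Hochberg adjusted p-value cutoff
MIN_EXPERIMENTS = 4   # a gene must pass in at least this many experiments


def significant_genes(results):
    """Return {gene: direction} for genes passing both cutoffs in one experiment.

    `results` maps gene -> (log2_fold_change, adjusted_p_value).
    Using >= for the fold-change cutoff is an assumption.
    """
    hits = {}
    for gene, (lfc, padj) in results.items():
        if abs(lfc) >= LFC_CUTOFF and padj < PADJ_CUTOFF:
            hits[gene] = "up" if lfc > 0 else "down"
    return hits


def recurrent_genes(experiments, min_experiments=MIN_EXPERIMENTS):
    """Keep genes that change in the same direction in enough experiments.

    A gene that passed in both directions often enough would appear once,
    with whichever direction is seen last; the toy data avoids that case.
    """
    counts = Counter()
    for results in experiments:
        for gene, direction in significant_genes(results).items():
            counts[(gene, direction)] += 1
    return {gene: direction
            for (gene, direction), n in counts.items()
            if n >= min_experiments}


# Toy data with hypothetical gene names; not taken from the study.
strong = {"GENE_A": (1.2, 0.01), "GENE_B": (-0.9, 0.03), "GENE_C": (0.5, 0.001)}
weak = {"GENE_A": (1.0, 0.02), "GENE_B": (-0.9, 0.20), "GENE_C": (0.4, 0.01)}
print(recurrent_genes([strong, strong, strong, weak, weak]))
```

In the toy data only GENE_A passes both cutoffs in at least four experiments; GENE_B passes in three and GENE_C never reaches the fold-change cutoff.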

15:15-15:30
Bioinformatic analyses to identify identical inverted and direct repeats that mediate the potential formation of complex genomic rearrangements
Room: UNAM
Format: Live from venue

Moderator(s): Maribel Hernández-Rosales, Fabien Plisson

  • Luis Fernandez-Luna, International Laboratory for Human Genome Research, Mexico
  • Carlos Aguilar-Perez, International Laboratory for Human Genome Research, Mexico
  • Michele Mehaffey, Pacific Northwest Research Institute, United States
  • Claudia Carvalho, Pacific Northwest Research Institute, United States
  • Claudia Gonzaga-Jauregui, International Laboratory for Human Genome Research, Mexico


Presentation Overview: Show

Repeated sequences spread throughout the genome play a role in genomic stability, structural variant complexity, and the generation of complex genomic rearrangements contributing to disease burden. Bioinformatic analyses of the key features of these repeated sequences such as orientation, size, similarity, and distribution can help predict the possible rearrangements, type of structural variants, and major mechanisms occurring at a specific genomic region. Furthermore, the study of these genomic features can help identify regions that are prone to result in genomic disorders.

We performed bioinformatic analyses through self-alignment of three available human genome assemblies, GRCh37, GRCh38, and the most recent telomere-to-telomere alternate assembly (T2T-CHM13), to identify pairs of direct and inverted repeats with a minimum of 80% identity between pairs and a length of at least 200 base pairs. Next, we developed an algorithm to combine and collapse overlapping or nearby clusters of repeats in the same orientation where sequence identity would not fall below 80%. Overall, we found 640,167, 646,689, and 669,755 repeated sequence pairs in direct orientation in the GRCh37, GRCh38, and T2T-CHM13 assemblies, respectively, while 620,756, 621,651, and 638,405 inverted repeat pairs were similarly found. Of these, 15,508, 22,416, and 21,795 overlap genes in the respective assemblies, while 2,305, 5,508, and 3,476 flank genes, potentially leading to genomic disorders when rearranged.

As expected, we found more repeats in the T2T-CHM13 assembly than in the two human genome reference assemblies analyzed, consistent with an enrichment of repeated elements in the regions not previously covered by the reference assembly, such as the short arms of acrocentric chromosomes. Similarly, we observed a larger fraction of repeat pairs overlapping segmental duplications in T2T compared to the reference. Other types of repeats such as LINEs, SINEs and Alus were similarly distributed across the genome in all three assemblies.

Identifying inverted and direct repeats that act as substrates for recombination can reveal the genomic regions that are most susceptible to genomic instability. It can also provide insight into how the genomic architecture of these repeats, such as their homology, size, and distance, leads to the structural variation and complex genomic rearrangements associated with disease, into the mechanisms that underlie rearrangement formation, and even into the breakpoints involved.
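
The step of combining and collapsing overlapping or nearby repeats in the same orientation can be pictured as an interval merge. The sketch below is an illustration only, not the authors' algorithm: the function name, the tuple layout, and the `max_gap` parameter are hypothetical, it works on single intervals rather than repeat pairs, and it omits the check that sequence identity stays at or above 80% after merging.

```python
def merge_repeats(repeats, max_gap=0):
    """Collapse overlapping or nearby repeats on the same chromosome and strand.

    `repeats` is an iterable of (chrom, start, end, orientation) tuples with
    half-open coordinates. `max_gap` is a hypothetical distance in bp: two
    intervals separated by at most this many bases are merged.
    """
    merged = []
    # Sort so that intervals sharing chromosome and orientation are adjacent
    # and ordered by start coordinate.
    ordered = sorted(repeats, key=lambda r: (r[0], r[3], r[1], r[2]))
    for chrom, start, end, orient in ordered:
        if merged:
            p_chrom, p_start, p_end, p_orient = merged[-1]
            same_group = (chrom, orient) == (p_chrom, p_orient)
            if same_group and start - p_end <= max_gap:
                # Extend the previous interval instead of starting a new one.
                merged[-1] = (p_chrom, p_start, max(p_end, end), p_orient)
                continue
        merged.append((chrom, start, end, orient))
    return merged


# Toy coordinates; not taken from any assembly.
toy = [
    ("chr1", 100, 400, "+"),
    ("chr1", 350, 700, "+"),
    ("chr1", 750, 1000, "+"),
    ("chr1", 380, 600, "-"),
    ("chr2", 100, 400, "+"),
]
print(merge_repeats(toy, max_gap=100))
```

With `max_gap=100` the three "+" intervals on chr1 collapse into one, while the "-" interval and the chr2 interval stay separate.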

16:00-17:00
Powering single-cell transcriptomics with long-read Nanopore sequencing
Room: UNAM
Format: Live from venue

  • Morgane Thomas-Chollier


Presentation Overview: Show

Single-cell RNA-seq (scRNA-seq) has stimulated the understanding of complex processes (e.g. cell differentiation, tumorigenesis) and their underlying cell heterogeneity at an unprecedentedly high resolution. By providing gene expression at single-cell resolution, this technique allows for the identification of new cell subtypes and their corresponding markers. The most common approach targets the 3’ region of the transcripts, which is then sequenced with short-read technology. On the one hand, this restriction to the 3’ region causes a loss of signal in poorly annotated genomes. On the other hand, it prevents the signal from being associated with specific isoforms. We show how long-read Nanopore sequencing, either in bulk RNA-seq or directly in scRNA-seq, can be used to improve short-read scRNA-seq analyses, for applications in poorly annotated genomes or to refine the resolution of the analyses to the level of isoforms.

Friday, November 4th
8:45-9:00
Morning Welcome
Room: UNAM
Format: Live from venue

9:00-10:00
Keynote Presentation: The population history of Mexico: tales from ancient and modern genomes
Room: UNAM
Format: Live from venue

  • María Ávila-Arcos


Presentation Overview: Show

Our genomes harbor information about our population history. By studying numerous ancient and modern human genomes, we have unveiled several aspects of our journey expanding throughout the world. In this talk I will share some findings pertaining to the genetic history of the populations that have inhabited the territory we today call Mexico. We will explore how the study of genomes has helped us understand who the first peoples to enter the region were, how they diverged and adapted as they encountered new environments, and what the genetic impact of European colonization was.

10:30-10:45
Therapeutic capability of selected medicinal plants' bioactive constituents against the mutant ovarian TP53 gene; A computational approach.
Room: UNAM
Format: Live-stream

Moderator(s): Nelly Sélem Mojica, Alejandro Pereira

  • Kayode Raheem, Adekunle Ajasin University, Akungba, Nigeria
  • Praise Fawehinmi, Institution University of Ilorin Molecular Diagnostic and Research Laboratory, Nigeria
  • Solomon Olorundare, University of Lagos, Akoka, Lagos, Nigeria


Presentation Overview: Show

Background
The pivotal role of mutant P53 protein in ovarian cancer and the efficacy of natural compounds in cancer treatment motivated the current study, which aimed to identify novel mutant P53 modulators from medicinal plants. Homology modelling was deployed to assemble the 3-D structure of the mutant P53 protein from its amino acid sequence, while FINDSITEcomb2.0 was used to predict the active binding site of the mutant P53 protein model. The bioactive constituents obtained from seven plants were used as ligands and docked against the binding pocket of mutant P53 protein. AutoDock Tools, PyRx, and Discovery Studio were used to prepare the protein, dock the ligands, and visualize the complexes, respectively. Thiotepa and Gemcitabine were used as reference drugs. The hit compounds were selected for their high binding affinity and further analyzed to identify their pharmacokinetic properties and acute rat toxicity using SwissADME and GUSAR, with their electronic properties calculated using the density functional theory (DFT) method.
Results
Screening of 50 bioactive phytochemicals showed that 15 leads had better binding energies to mutant P53 than the standard FDA-approved drugs (Thiotepa and Gemcitabine, with binding scores of -3.5 and -5.4, respectively). After considering their drug-likeness, pharmacokinetic properties, and predicted acute toxicity, four major hits (Morusin, Irinotecan, Rubitecan, and 10-hydroxycamptothecin) were predicted to have minimal toxicity and to be safe to use. The DFT calculations showed regions of the molecules prone to electrophilic and nucleophilic attacks.
Conclusions
The current study revealed drug-like compounds that can serve as potential modulators of mutant P53 in ovarian cancer treatment.

10:45-11:00
Evolutionary conservation of secondary structures in the lncRNAs of plants
Room: UNAM
Format: Live from venue

Moderator(s): Nelly Sélem Mojica, Alejandro Pereira

  • Jose Antonio Corona-Gomez, Unidad de Genómica Avanzada, Langebio, Cinvestav, Mexico
  • Peter F. Stadler, Bioinformatics Group, Department of Computer Science, University Leipzig, Germany
  • Selene L. Fernandez-Valverde, Unidad de Genómica Avanzada, Langebio, Cinvestav, Mexico


Presentation Overview: Show

LncRNAs are essential regulators of eukaryotic gene expression. lncRNAs exert their gene regulatory functions primarily through their interaction with DNA, RNA, and protein. These functions are thought to be partially associated with their capacity to fold into complex three-dimensional structures, similar to ribosomal or transfer RNAs or the human lncRNAs HOTAIR and Cyrano. Although fewer lncRNAs have been functionally characterized in plants than in animals, they display similar functions, and their structure is also considered paramount to their function. Here, we computationally identified conserved lncRNA structures in plants, specifically in the Brassicaceae family. To find the lncRNAs with structural conservation, we focused our analysis on lncRNAs with position and splicing conservation (previously identified by our group), using the whole-genome alignment of 16 Brassicaceae species. We found that 44.2% (1925 of 4354) of the intergenic lncRNAs (lincRNAs) and 75.1% (1549 of 2060) of the natural antisense transcripts (NATs) of Arabidopsis thaliana have conserved structural motifs in at least 2 of the 16 species. We found 3612 lncRNAs that have conserved structural motifs in multiple species; of these, 2264 are tissue-specific and 841 can be associated with a function by a co-expression network in A. thaliana. Indeed, we found several previously functionally characterized lncRNAs, such as COOLAIR, previously shown to have a conserved structure, and the lncRNAs lncCOBRA1, FLORE, IPS1, and ELENA1. In summary, we have identified lncRNAs with conserved structures in Brassicaceae. These warrant future experiments to test whether the identified structures form in vivo and whether they are essential to the function of these lncRNAs.

11:00-11:15
Comprehensive structural variants detection in plant genomes using whole genome alignment
Room: UNAM
Format: Live-stream

Moderator(s): Nelly Sélem Mojica, Alejandro Pereira

  • Clarence Todd, University of Saskatchewan, Canada
  • Lingling Jin, University of Saskatchewan, Canada


Presentation Overview: Show

Structural variants (SVs) are rearrangements within a species' genetic sequence that play a significant role in the diversity and evolution of a species and have been linked to many important biological traits. While the importance of SVs has been demonstrated in multiple studies, there remains no single best method with which SVs can be accurately identified on real data. Further, SV caller results can often vary significantly, especially with plant genomes that are recently or historically polyploidized. To reduce this variability and improve the accuracy of results, this study explores the use of multiple types of sequence data and SV callers to obtain a more comprehensive and accurate set of SVs. We created a four-stage pipeline that accepts raw read data to complete all necessary steps to generate a set of SVs. The pipeline is validated using Brassica nigra. A comprehensive SV set is generated from whole genome comparisons of high-quality assemblies to assess the amount of variability and consistency between the selected SV callers and to complement the results. A validation set was also generated using Arabidopsis thaliana genotypes to compare our algorithm to results from other whole genome comparison-based SV tools.

11:15-11:30
WITCH-NG: Efficient and Accurate Alignment of Datasets with Sequence Length Heterogeneity
Room: UNAM
Format: Live-stream

Moderator(s): Nelly Sélem Mojica, Alejandro Pereira

  • Baqiao Liu, University of Illinois at Urbana-Champaign, United States
  • Tandy Warnow, University of Illinois at Urbana-Champaign, United States


Presentation Overview: Show

Multiple sequence alignment (MSA) is a basic part of many bioinformatics pipelines, including in phylogeny estimation, prediction of structure for both RNAs and proteins, and metagenomic sequence analysis. Yet many sequence datasets exhibit substantial sequence length heterogeneity, because of both large insertions and deletions (indels) in the evolutionary history of the sequences and the inclusion of unassembled reads or incompletely assembled sequences in the input. A few methods have been developed that can be highly accurate in aligning datasets with sequence length heterogeneity, with UPP (Nguyen et al., 2015) one of the first methods to achieve good accuracy, and WITCH (Shen et al., Bioinformatics 2021) improving on UPP in accuracy. In this paper, we show how we can speed up WITCH. Our improvement includes replacing a critical step in WITCH (currently performed using a heuristic search) with a polynomial-time exact algorithm using Smith-Waterman. Our new method, WITCH-NG (i.e., "next generation WITCH", pronounced "witching"), achieves the same accuracy but is substantially faster. WITCH-NG is available in open source form at https://github.com/RuneBlaze/WITCH-NG. Supplementary materials are available at https://doi.org/10.1101/2022.08.08.503232.
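
For readers unfamiliar with Smith-Waterman, the following is a minimal textbook sketch of the dynamic program for the local alignment score of two strings. It is not taken from WITCH-NG, and the abstract does not describe how WITCH-NG applies the algorithm; the function name and the scoring parameters (match, mismatch, and a linear gap penalty) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty).

    Runs in O(len(a) * len(b)) time and keeps only two rows of the matrix.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

The floor at zero is what makes the alignment local: poorly matching flanks are dropped rather than penalized, so the score reflects only the best-matching region.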

11:30-11:45
Resolving sub-lineages of haplogroup Q in admixed South Americans using targeted capture massively parallel sequencing
Room: UNAM
Format: Live from venue

Moderator(s): Nelly Sélem Mojica, Alejandro Pereira

  • Zehra Köksal, Section of Forensic Genetics, Department of Forensic Medicine, University of Copenhagen, Denmark
  • Graciela Bailliet, Instituto Multidisciplinario de Biología Celular, Universidad Nacional de La Plata, Argentina
  • Germán Burgos, Escuela de Medicina, Facultad de Ciencias de la Salud, Universidad de Las Américas (UDLA), Ecuador
  • Elizeu Carvalho, DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Brazil
  • Andrea Casas-Vargas, Instituto de Genética, Universidad Nacional de Colombia, Colombia
  • Adriana Castillo, Department of Basic Sciences, Universidad Industrial de Santander (UIS), Colombia
  • Beatriz Martínez, Institution of Investigative Immunology, Universidad de Cartagena, Colombia
  • Humberto Ossa, Laboratório de Genética y Biología Molecular, Colombia
  • María Laura Parolin, Instituto de Diversidad y Evolución Austral (IDEAus), Centro Nacional Patagónico, CONICET, Argentina
  • Alfredo Quiroz, Instituto de Previsión Social, Paraguay
  • Ulises Toscanini, Primer Centro Argentino de Inmunogenética (PRICAI), Fundación Favaloro, Argentina
  • William Usaquén, Instituto de Genética, Universidad Nacional de Colombia, Colombia
  • Carlos Vullo, DNA Forensic Laboratory, Argentine Forensic Anthropology Team (EAAF), Argentina
  • Claus Børsting, Section of Forensic Genetics, Department of Forensic Medicine, University of Copenhagen, Denmark
  • Leonor Gusmão, DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Brazil
  • Vania Pereira, Section of Forensic Genetics, Department of Forensic Medicine, University of Copenhagen, Denmark


Presentation Overview: Show

Y-chromosomal SNPs that define Y-haplogroups enable the study of Y-chromosome geographical distribution throughout human history and the reconstruction of male migration patterns. However, the distribution of Native American Y chromosomes within South America is not entirely resolved. Haplogroup Q is assumed to have been carried by the first settlers into South America. Within this haplogroup, the sub-lineage Q-M3 is the most common, but limited haplogroup resolution prevents the precise analysis of genetic patterns and migration routes within the continent.

The aim of the study is to identify and characterize novel population-specific variants within haplogroup Q to increase the resolution within the haplogroup. A total of 64 genetically diverse and admixed modern-day South American male individuals with haplogroup Q were investigated.

Comprehensive sequencing of sections of the non-recombining portion of the Y chromosome (NRY) was carried out. Our approach relied on targeted capture and massively parallel sequencing. A total of 359,954 probes, corresponding to 7.5 Mb of the NRY, were custom-designed using Agilent’s SureDesign software. The probes were hybridized to libraries generated by random fragmentation of the input DNA. The complementary library fragments were retained, washed, amplified, and finally sequenced on the Illumina NovaSeq 6000 System.

Thorough bioinformatic analysis included adapter trimming and alignment to a reference genome. PCR product deduplication using additional barcodes was conducted in order to minimize sequencing errors. Variants were called and supplemented with additional variants from more than 100 publicly available genome sequences of South and Central Americans with haplogroup Q.

We found a combination of novel and already reported variants of haplogroup Q, which contributed to resolving sub-branches of the haplogroup, in particular within the lineage Q-M3.

11:45-12:00
Integration of a panamerican whole-genome dataset from publicly available data
Room: UNAM
Format: Live-stream

Moderator(s): Nelly Sélem Mojica, Alejandro Pereira

  • Josué Guzmán-Linares, BUAP, Mexico
  • María Fernanda Mirón-Toruño, BUAP, Mexico
  • Carlos Alberto Contreras-Paredes, BUAP, Mexico
  • Enrique Morett, UNAM, Mexico
  • Israel Aguilar-Ordóñez, INMEGEN, Mexico


Presentation Overview: Show

Native American (NatAm) groups are underrepresented in collective genomic knowledge. Whole-genome sequencing explores the genome in greater depth for population studies, but the number of NatAm whole genomes available is small; analyzed together, however, they can provide more information. We recently published a review of 56 scientific papers in which genomic data were generated from NatAm populations, which allowed us to locate publicly available data (https://doi.org/10.3390/d14080647).

In the present work we describe the data from 93 publicly available contemporary NatAm genomes from the following home countries: Mexico, Bolivia, Brazil, Peru, Colombia and Argentina; and representing the following ethnic groups: Aymara (25 genomes), Maya (21), Pima (14), Karitiana (12), Surui (8), Quechua (3), Mixe (3), Mixteco (2), Zapoteco (2), Piapoco (2), Chane (1).

The integration of these genomes in a single bioinformatic resource will be an important source of information for genomic projects seeking to study the evolutionary history of the continent.

12:00-12:15
DISCO+QR: Rooting Species Trees in the Presence of GDL and ILS
Room: UNAM
Format: Live-stream

Moderator(s): Nelly Sélem Mojica, Alejandro Pereira

  • James Willson, University of Illinois at Urbana-Champaign, United States
  • Yasamin Tabatabaee, University of Illinois at Urbana-Champaign, United States
  • Baqiao Liu, University of Illinois at Urbana-Champaign, United States
  • Tandy Warnow, University of Illinois at Urbana-Champaign, United States


Presentation Overview: Show

Genes evolve under processes such as gene duplication and loss (GDL), which makes gene family trees multi-copy, and incomplete lineage sorting (ILS); both processes produce gene trees that differ from the species tree. The estimation of species trees from sets of gene family trees is challenging, and the estimation of rooted species trees presents additional analytical challenges. Two of the methods developed for this problem are STRIDE (Emms and Kelly, MBE 2017), which roots species trees by considering GDL events, and Quintet Rooting (Tabatabaee et al., ISMB 2022 and Bioinformatics 2022), which roots species trees by considering ILS. We present DISCO+QR, a new method for rooting species trees in the presence of both GDL and ILS. DISCO+QR operates by decomposing the input gene family trees into single-copy trees using DISCO (Willson et al., Systematic Biology 2022) and then rooting the given species tree with Quintet Rooting (QR), using the information in the single-copy gene trees. We show that the relative accuracy of STRIDE and DISCO+QR depends on properties of the dataset (number of species, genes, rate of gene duplication, degree of ILS, and gene tree estimation error), and that each provides advantages over the other under some conditions. Availability: DISCO and QR are available on GitHub. The supplementary materials are available at http://tandy.cs.illinois.edu/discoqr-suppl.pdf.

12:15-12:30
Predicting Phenotypes From Novel Genomic Markers Using Deep Learning
Room: UNAM
Format: Live-stream

Moderator(s): Nelly Sélem Mojica, Alejandro Pereira

  • Shivani Sehrawat, Department of Computer Science, University of Saskatchewan, Canada
  • Keyhan Najafian, Department of Computer Science, University of Saskatchewan, Canada
  • Lingling Jin, Department of Computer Science, University of Saskatchewan, Canada


Presentation Overview: Show

Previous studies have used Single Nucleotide Polymorphism (SNP) markers to predict phenotypes using conventional statistical or deep learning models. However, these predictive models face challenges due to the high dimensionality of genome-wide SNP marker data. The study of novel genomic variants such as Structural Variations (SVs) and Transposable Elements (TEs) in plants is becoming increasingly prevalent, thanks to recent breakthroughs in DNA long-read sequencing and decreased cost. In this paper, we develop a deep convolutional neural network model, NovGMDeep, for predicting the phenotypes of Arabidopsis thaliana and Oryza sativa using SVs and TEs, respectively. The proposed model is trained and tested on different samples of phenotypes of A. thaliana and O. sativa using k-fold cross-validation. The prediction accuracy is evaluated using Pearson’s Correlation Coefficient (PCC), Mean Absolute Error (MAE), and Standard Deviation (SD) of MAE. The predicted results showed higher correlation values for the model when trained on SV and TE data than on SNP data. Also, NovGMDeep outperforms conventional statistical models. This work aims to shed light on the unrecognized function of SVs and TEs in genotype-to-phenotype associations, as well as their extensive significance and value in crop development.
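
The three evaluation measures named above (PCC, MAE, and SD of MAE) can be computed per cross-validation fold as in the minimal Python sketch below. It is not part of NovGMDeep: the function names and the toy values are hypothetical, and reading "SD of MAE" as the sample standard deviation of the per-fold MAE values is an assumption.

```python
import math
import statistics


def pearson(y_true, y_pred):
    """Pearson correlation coefficient; undefined if either input is constant."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def summarize_folds(folds):
    """Summarize k-fold results.

    `folds` is a list of (y_true, y_pred) pairs, one per held-out fold;
    at least two folds are needed for the standard deviation.
    """
    pccs = [pearson(t, p) for t, p in folds]
    maes = [mean_absolute_error(t, p) for t, p in folds]
    return {
        "mean_pcc": statistics.mean(pccs),
        "mean_mae": statistics.mean(maes),
        "sd_mae": statistics.stdev(maes),  # sample SD across folds (assumption)
    }


# Toy phenotype values for two folds; not taken from the study.
toy_folds = [
    ([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]),
    ([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]),
]
print(summarize_folds(toy_folds))
```

Reporting the spread of MAE across folds alongside its mean indicates how stable the model's error is from one held-out subset to the next.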

13:30-13:45
An in-silico approach to analyze and characterize the interaction of Cdc6 with Clb2
Room: UNAM
Format: Live from venue

Moderator(s): Abraham Avelar-Rivas, Yesid Cuesta Astroz

  • Andriele Silva, CUNY, United States
  • Shaneen Singh, CUNY, United States
  • Amy Ikui, CUNY Brooklyn College, United States
  • Jasmin Philip, CUNY, United States


Presentation Overview: Show

Cell division cycle 6, Cdc6, is a key part of the pre-replicative complex. During mitosis, the phosphorylated N-terminus of Cdc6 is targeted by mitotic cyclin Clb2, which binds to the 45PEKLQF49 motif at the Cdc6 N-terminus; this binding is enhanced by a 126FQSLP130 motif located in the mid-region of Cdc6. The accurate timing of the Cdc6-Clb2 protein interaction ensures that DNA replication takes place only once per cell cycle to maintain genomic integrity. How each Cdc6 phosphorylation site regulates protein binding has been well studied; however, little is known about the mechanism of Cdc6-Clb2 binding. The interaction modes of Clb2 with its various protein partners also remain unexplored, due to a lack of full-length structural information for many of its protein partners. It is known that Clb2 specifically interacts with other proteins, such as Bud3, a protein necessary for bud site selection. Bud3 also contains an LxF motif (1630PEKLKF1635), similar to that of Cdc6. Such similarity can provide us with clues to the Cdc6-Clb2 interaction. Here, we present a robust computational approach to understand the molecular mechanism of the Cdc6-Clb2 interaction, which includes homology and ab initio modeling of the full-length structures, docking predictions of interaction scenarios, and molecular dynamics simulations to estimate the free energies of the predicted complexes through umbrella sampling. We also provide a comparison between Cdc6-Clb2 binding and Bud3-Clb2 binding to assess the conservation of the binding mechanism for different protein partners of Clb2. Our results show that Clb2’s K270, within its hydrophobic patch, binds to E45 of Cdc6 in a stable manner, and the mechanism of the Cdc6-Clb2 interaction seems to be conserved for the Clb2-Bud3 interaction as well, with similar residues and motifs being involved in the interaction. These details provide initial clues towards uncovering the structural dynamics of Cdc6 for protein recruitment and the protein binding modes of Clb2.

13:45-14:00
Kinase-Substrate Annotation using Network of Phosphosites and Kinases
Room: UNAM
Format: Live from venue

Moderator(s): Abraham Avelar-Rivas, Yesid Cuesta Astroz

  • Marzieh Ayati, University of Texas Rio Grande Valley, United States
  • Serhan Yilmaz, Case Western Reserve University, United States
  • Filipa Blasco Tavares Pereira Lopes, Case Western Reserve University, United States
  • Mark Chance, Case Western Reserve University, United States
  • Mehmet Koyuturk, Case Western Reserve University, United States


Presentation Overview: Show

Protein phosphorylation is a key post-translational modification that plays a central role in many cellular processes. Recent advances in mass spectrometry (MS) based technologies have drastically enhanced the accuracy and coverage of phosphosite identification and quantification. However, most identified phosphosites do not have kinase annotations, and large-scale and reliable prediction of which kinase can phosphorylate which phosphosites remains challenging. In the last decade, several computational methods have been developed to predict kinase-substrate associations (KSA). The earlier KSA prediction methods focused mainly on sequence motifs recognized by the active sites of kinases. Later methods integrated other contextual information such as protein structure and physical interactions to improve the accuracy of prediction methods. In parallel, machine learning algorithms that utilize network models have gained significant traction in computational biology. Inspired by these developments, we here develop a comprehensive framework for integrating broad functional information on kinases and phosphoproteins to build machine learning models for predicting kinase-substrate associations. Our framework uses heterogeneous network models to represent the functional relationships between phosphorylation sites, as well as kinases. To construct a phosphosite-phosphosite association network, we use sequence similarity, shared biological pathways, co-evolution, co-occurrence, and co-phosphorylation of phosphosites across different biological states. To construct a kinase-kinase association network, we integrate protein-protein interactions, shared biological pathways, and membership in common kinase families. We use node embeddings computed from these heterogeneous networks to train machine learning models for predicting kinase-substrate associations.
Our systematic computational experiments using the PhosphoSitePlus database show that the resulting algorithm outperforms state-of-the-art algorithms and resources in reliably predicting KSAs. We also developed a strategy of kinase stratification in order to identify the substrates of kinases that are poorly studied and to improve the performance of all the methods that predict kinase-substrate associations.

14:00-14:15
Comparative investigation of ZEB2 target genes and its regulatory networks in primates reveals a contribution of ZEB2 to human brain evolution
Room: UNAM
Format: Live from venue

Moderator(s): Abraham Avelar-Rivas, Yesid Cuesta Astroz

  • Jeong-Eun Lee, Freie Universitaet Berlin, Germany
  • Vladimir Jovanovic, Freie Universitaet Berlin, Germany
  • Katja Nowick, Freie Universitaet Berlin, Germany


Presentation Overview: Show

Humans differ from other primates in various aspects, among them a larger brain and distinct cognitive abilities. Evolutionary changes in gene regulation can drive such differences. We and others discovered that the transcription factor ZEB2 is an excellent candidate for contributing to the evolution of human cognition. It is crucial for proper brain development by regulating neuronal differentiation and other processes. Mutations in ZEB2 cause Mowat-Wilson syndrome, which includes microcephaly and intellectual disability. Moreover, ZEB2 is more highly expressed in the human brain than in the chimpanzee brain and acts as a hub in a co-expression network of the human but not the chimpanzee brain. A delayed expression of ZEB2 in humans compared to other apes during development might play a role in the evolution of the larger human brain. To determine binding sites and target genes of ZEB2 in a comparative way, we used cell lines of three different primates. First, using B-lymphoblastoid cells from three human, three chimpanzee and three orangutan individuals, we performed ChIP-Seq and discovered differences in ZEB2 binding sites among the three species. Second, treatment of the same cell lines with siRNA against ZEB2 and RNA-Seq analysis allowed us to determine expression changes related to ZEB2 knockdown. Combining our results, we identified human-specific ZEB2 target genes that are involved in neuronal development, such as neuronal precursor cell proliferation. We thus investigated ZEB2 target genes further using published transcriptome data of brain organoids developed from induced pluripotent stem cells of different great apes. While in gorilla ZEB2 is correlated with genes involved in neuronal development from early time points on, in humans it is correlated with cell cycle genes during these early time points and only later with neuronal developmental genes.
This relative delay in humans might allow the developing human brain to spend more time increasing its pool of neuronal progenitor cells, ultimately leading to a higher number of neurons in the human brain.

14:15-14:30
Dimensional reduction to represent multi-layer epigenomic data
Room: UNAM
Format: Live from venue

Moderator(s): Abraham Avelar-Rivas, Yesid Cuesta Astroz

  • Seth Frietze, University of Vermont, United States
  • Joseph Boyd, University of Vermont, United States


Presentation Overview: Show

The epigenome is comprised of multiple layers of epigenetic modifications that occur in cell-type specific patterns. Epigenomic profiling methods have been developed to provide one-dimensional genomic maps of distinct chromatin marks in different cell types, but due to data interpretation and visualization challenges, the central question of how combinatorial patterns of different chromatin marks superimpose across the genome remains difficult to interpret. Here we present ChIP-tsne as a visualization tool for the discovery and understanding of varied epigenomic patterns from chromatin state maps. ChIP-tsne is an R-based chromatin pattern comparison method that applies dimensionality reduction, namely t-Distributed Stochastic Neighbor Embedding (t-SNE), to multi-layer epigenomic data. We used ChIP-tsne to explore the patterns of interaction among multiple epigenetic modifications between different cell types. We find that distinct sets of epigenomic features superimpose to provide cell type specific chromatin maps, revealing distinct enriched pathways and gene expression differences. ChIP-tsne should be broadly applicable for epigenomic comparisons and provides a powerful new tool for studying multidimensional chromatin differences at the genome scale.

14:30-14:45
DiffHiChIP: Identifying differential chromatin contacts from HiChIP data
Room: UNAM
Format: Live from venue

Moderator(s): Abraham Avelar-Rivas, Yesid Cuesta Astroz

  • Daniela Salgado Figueroa, LJI, United States
  • Sourya Bhattacharyya, LJI, United States
  • Ferhat Ay, LJI, United States


Presentation Overview: Show

Chromatin loops from high-resolution HiChIP/PLAC-seq data help us identify regulatory interactions between enhancers and promoters as well as structural interactions involving well-characterized regulators of chromatin organization such as CTCF and cohesin. Differential chromatin loops between two conditions (e.g., different cell types, before/after perturbations) have been used to identify condition-specific activities of genes, regulatory elements and genetic variants. Existing differential HiChIP loop callers such as FitHiChIP, diffloop and HiC-DC+ employ count-based tests from edgeR (default exact test model) or DESeq2 that are mainly used for RNA-seq analysis. These methods cannot model the exponential distance decay of HiChIP contacts and fail to detect differences in longer range loops (>500Kb). To counter the distance decay, stratification approaches perform equal-distance binning of contacts by their genomic distance and process individual bins separately, but still exhibit lower statistical power in detecting longer range loops. We implemented DiffHiChIP, the first comprehensive framework to assess these differential HiChIP loop calling models. DiffHiChIP incorporates both DESeq2 and edgeR with false discovery rate (FDR) and independent hypothesis weighting (IHW)-corrected p-values, and implements stratification by equal-occupancy binning (similar numbers of contacts per bin) to assign higher statistical power to longer range loops. DiffHiChIP further incorporates edgeR with generalized linear models (GLM), and defines four additional models employing the quasi-likelihood F-test, the likelihood ratio test, or fold-change specific thresholds (TREAT). The GLM-based models show higher precision in recovering short and long-range differential contacts, which are highly supported by the respective Hi-C backgrounds and their association with differentially expressed genes. Specifically, IHW-corrected FDR showed a better performance in recovering validated long-range differential loops compared to the conventional BH-corrected FDR. DiffHiChIP further categorizes differential loops by the underlying ChIP-seq differences of the interacting bins, and the significance of loops in individual samples. Overall, DiffHiChIP provides a comprehensive benchmark for differential HiChIP loop analysis.
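
The contrast between equal-distance and equal-occupancy binning can be illustrated with a minimal Python sketch. This is not DiffHiChIP code: the function names and the toy distances (in Kb) are assumptions made for illustration only. Because contacts are far more frequent at short range, equal-width bins leave the long-range bins nearly empty, whereas rank-based bins hold similar numbers of contacts.

```python
from collections import Counter


def equal_distance_bins(distances, n_bins):
    """Assign each contact to one of n_bins bins of equal genomic width."""
    lo, hi = min(distances), max(distances)
    width = (hi - lo) / n_bins or 1  # avoid a zero width if all distances are equal
    return [min(int((d - lo) / width), n_bins - 1) for d in distances]


def equal_occupancy_bins(distances, n_bins):
    """Assign each contact to one of n_bins bins holding similar numbers of contacts."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    labels = [0] * len(distances)
    for rank, i in enumerate(order):
        # The rank, not the distance itself, decides the bin.
        labels[i] = rank * n_bins // len(distances)
    return labels


# Hypothetical loop distances in Kb, skewed towards short range.
distances = [10, 20, 20, 30, 40, 50, 60, 80, 100, 200, 500, 1000]

by_distance = Counter(equal_distance_bins(distances, 3))
by_occupancy = Counter(equal_occupancy_bins(distances, 3))
print("equal-distance :", sorted(by_distance.items()))
print("equal-occupancy:", sorted(by_occupancy.items()))
```

In this simple version, contacts with identical distances may fall on either side of a bin boundary; a production implementation would need to decide how to handle such ties.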

14:45-15:00
A Pipeline to Extract the Metatranscriptome from Human Blood RNAseq Data
Room: UNAM
Format: Live from venue

Moderator(s): Abraham Avelar-Rivas, Yesid Cuesta Astroz

  • An Dinh Duy Nguyen, Case Western Reserve University, United States
  • Thomas LaFramboise, Case Western Reserve University, United States


Presentation Overview: Show

In recent years, various studies have investigated the microbiome at tumor sites and found that these sites can have microbial signatures that distinguish tumors from normal tissues of the same type. Acute myeloid leukemia (AML) starts in the bone marrow, the tissue that makes blood cells, causing abnormal development of these cells. Our previous work characterized the microbial landscape of leukemia patients and compared it with that of healthy controls. By analyzing DNA sequencing data from the blood and bone marrow of 1870 individuals, we also found that microbial content differed among leukemia subtypes. Here we follow up this result by analyzing the RNA sequencing data from a separate AML cohort. We present our efforts to design and optimize a pipeline to analyze the metatranscriptome from patient blood and bone marrow using standard RNA sequencing data. Prior pipelines have been optimized for shotgun DNA sequencing. We sought to adapt a workflow for RNA sequencing performed on human samples. The project whose data we analyzed involved 11 academic medical centers which collectively accrued a cohort of 950 AML patient specimens, along with individual healthy bone marrow samples and 12 technical replicates.
To extract the microbial signal from our data, we selected sequences not mapped to the human genome and performed taxonomic classification using Kraken2. The 12 technical replicates allow distinction between batch artifacts and true signal. We applied a variety of ordination techniques for multivariate analysis to visualize the data. We will discuss how the results heavily depend upon how the input data are acquired (in terms of both laboratory and sequencing protocols), and how different techniques lead to different possible interpretations of the results. For instance, normalizing each sample’s raw read counts by total microbial reads (thereby rendering the data compositional) leads to batch effects that are not present when normalizing by total (human + microbial) reads. In addition, we will discuss approaches to deal with sequencing artifacts and to fine-tune Kraken2 parameters, which play a pivotal role in microbiome analysis. Importantly, we will also present key findings describing the metatranscriptomic landscape of AML and the microbial taxa that distinguish disease from normal blood and bone marrow.
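
The two normalizations mentioned above can be compared in a short Python sketch. The taxon names, read counts and the `normalize` helper are hypothetical and are not taken from the study; the sketch only shows one way in which compositional normalization can shift a taxon's value even though its raw count is unchanged.

```python
def normalize(microbial_counts, human_reads=0):
    """Divide each taxon's read count by a per-sample total.

    With human_reads=0 the denominator is the microbial total, so the
    values sum to 1 (compositional data). Passing the human read count
    normalizes by all (human + microbial) reads instead.
    """
    total = sum(microbial_counts.values()) + human_reads
    return {taxon: n / total for taxon, n in microbial_counts.items()}


# Hypothetical samples: taxon_x has the same raw count in both, while
# taxon_y is inflated in sample B (for example by a batch contaminant).
sample_a = {"taxon_x": 100, "taxon_y": 100}
sample_b = {"taxon_x": 100, "taxon_y": 900}
HUMAN_READS = 1_000_000  # assumed equal in both samples for simplicity

comp_a, comp_b = normalize(sample_a), normalize(sample_b)
tot_a = normalize(sample_a, HUMAN_READS)
tot_b = normalize(sample_b, HUMAN_READS)

print("by microbial total:", comp_a["taxon_x"], comp_b["taxon_x"])
print("by all reads      :", tot_a["taxon_x"], tot_b["taxon_x"])
```

Under the microbial-total normalization, taxon_x appears to drop five-fold between the two samples; under the all-reads normalization it is nearly constant.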

15:00-15:15
Discovery of novel metabolic pathways in uncultured bacteria through metagenomics
Room: UNAM
Format: Live from venue

Moderator(s): Abraham Avelar-Rivas, Yesid Cuesta Astroz

  • Mirna Vazquez Rosas Landa, The University of Texas at Austin, United States
  • Valerie De Anda, The University of Texas at Austin, United States
  • Georgia Waldram, Heriot-Watt University, United Kingdom
  • Robin Rohwer, The University of Texas at Austin, United States
  • Angelina Angelova, Heriot-Watt University, United Kingdom
  • Tony Gutierrez, Heriot-Watt University, United Kingdom
  • Brett Baker, The University of Texas at Austin, United States


Presentation Overview: Show

Microbes can use petroleum as a source of carbon and energy. As a result, they play an active role in oil spill remediation, but little is known about the baseline hydrocarbon-degrading communities before a spill occurs or the diversity of metabolic mechanisms responsible for degradation. The Faroe Shetland Channel (FSC) is a region of the North Atlantic Ocean with prominent oil production and a diverse microbial community associated with the degradation of petroleum compounds. We characterized the baseline hydrocarbon-degrading communities of the FSC and identified potential novel molecular mechanisms for petrochemical degradation. We obtained 42 metagenome-assembled genomes (MAGs) from bacteria actively utilizing a major compound in oil, n-hexadecane, via stable isotope probing (SIP). Phylogenomics revealed that they belong to 19 genera, including two not previously shown to degrade hydrocarbons: Lentibacter (Alphaproteobacteria) and Dokdonia (Bacteroidetes). Diversity surveys indicated that Lentibacter were dominant members, constituting up to 17% of these communities. 42% of the SIP-enriched MAGs encoded a complete alkane oxidation pathway containing alkane monooxygenase (AlkB), rubredoxin reductase (AlkT), and rubredoxin-2 (AlkG). However, 40% of the Alphaproteobacteria lacked AlkG for electron transfer in alkane hydroxylation. Instead, they encoded novel disulfide isomerases with iron-binding cysteine motifs conserved across rubredoxins. Dokdonia lacked AlkT and AlkG; however, their central alkane-degradation catabolic pathways were complete. This study describes new bacteria capable of hydrocarbon degradation, including the dominant genus Lentibacter, and novel putative hydrocarbon degradation enzymes. These bacteria may continuously purge hydrocarbons released from industrial activities. This study advances our understanding of the diversity and physiologies of alkane degradation in industrially impacted oceans and provides evidence of new mechanisms used to metabolize alkanes.

15:15-15:30
The Bioinformatics education scenario in Latin America and the Caribbean: current perspectives
Room: UNAM
Format: Live-stream

Moderator(s): Abraham Avelar-Rivas, Yesid Cuesta Astroz

  • Patricia Carvajal-López, EMBL-EBI, United Kingdom
  • Sebastián Ayala-Ruano, Maastricht University, Netherlands
  • Yesid Cuesta-Astroz, Instituto Colombiano de Medicina Tropical, Colombia
  • Gonzalo Parra, Barcelona Supercomputing Center (BSC), Spanish National Bioinformatics Institute (INB/ELIXIR-ES), Spain
  • Margereta Boege, Escuela Nacional de Estudios Superiores Unidad Juriquilla - UNAM, Mexico
  • Irma Martínez-Flores, Centro de Ciencias Genómicas - UNAM, Mexico
  • Nicolás Palopoli, MetaDocencia & Universidad Nacional de Quilmes, Argentina
  • Vinicius Maracaja-Coutinho, Laboratory of Integrative Bioinformatics, Universidad de Chile, Chile


Presentation Overview: Show

Different communities, institutions and initiatives have contributed to the development of academic programs and networks within Latin America and the Caribbean (LAC) to take bioinformatics education to a wider audience and cover different needs of the scientific community. The impact of these initiatives can be seen in the increased number of publications in the field coming from different countries in the region. Despite the nearly two decades of development in bioinformatics, including the creation of several academic programs throughout LAC, engagement is still poor, and wide geographical distances and centralised systems have been an obstacle for reaching more universities and institutions. This results in hotspot cities, regions and countries with academic programs devoted to bioinformatics, curbing the development of academic and industrial areas that are key for the region and would benefit from the transversality of bioinformatics and computational biology. The regional community needs to solve several challenges in order to consolidate the bioinformatics environment in LAC. Multiple aspects related to decentralisation, equity, diversity and inclusion, language barriers, and entrepreneurship, among others, should be addressed jointly by institutions and professional groups. Several multi-country initiatives have started addressing these challenges; nonetheless, we are focusing on consolidating a consortium of bioinformatics groups and institutions located mainly in LAC to provide coordination, planning, technical support and dissemination of activities to take bioinformatics education to the next level.

16:00-17:00
Keynote Presentation: Plant life at the extremes in the Atacama Desert
Room: UNAM
Format: Live-stream

  • Rodrigo Gutierrez


Presentation Overview: Show

Throughout evolution, plants adapted to flourish in a variety of ecosystems, including extreme deserts. In the current changing climate scenario, it is essential to identify the underlying molecular mechanisms that enable plant resilience to extreme conditions. The Atacama Desert, the driest non-polar desert in the world, offers a unique opportunity to explore plant adaptations to extreme environmental conditions. We characterized three pristine and extreme ecosystems along a natural altitudinal gradient of environmental parameters on the western Andes slopes in the Atacama Desert. We recorded low and unpredictable precipitation patterns, large daily temperature oscillations, low humidity, extremely high radiation levels, as well as soils with consistently low nitrogen levels. Despite these harsh conditions, a diversity of plant species coexist. We sequenced the transcriptome of the 32 most important plant species, representing 14 plant families with diverse phylogenetic origins. Using phylogenomics, we compared the protein-coding sequences of these 32 Atacama species to their 32 closest available sequenced species, and found 265 genes under positive selection in Atacama plants versus their non-adapted “sister” species. These genes are involved in various developmental, regulatory and metabolic processes associated with environmental adaptation. We chose a set of positively selected genes and, based on the available functional characterization of their Arabidopsis orthologs, we exemplify their potential role in the adaptation of plants to the extreme Atacama Desert. Our study provides new insights into plant abiotic stress tolerance, and improves our understanding of the highly unique, undisturbed Atacama Desert ecosystem.

Saturday, November 5th
8:45-9:00
RIABIO Session Welcome
Room: UNAM
Format: Live from venue

  • Javier Tapia


Presentation Overview: Show

One of the pillars of precision medicine is found in the association of genomic structure and variation with the environment and phenotype. Ongoing studies are mainly exploring deleterious variation in gene encoding, and the possible association of epigenetic factors with certain traits. However, large projects about human genetic variation have shown a significant need to define logical and quantifiable patterns on the organization of genetic information in the context of its constituent elements, which go beyond the polymorphisms; that is, elements such as structural variations (SV), DNA repetitions, binding sites to ncRNA/TF, among others, must be considered and integrated on the phenotypic traits of interest. Most of the genomic associations described to date correspond to genes or are dependent on highly heterogeneous events between human populations, that is, gene-dependent genotype-phenotype associations have been essential to recognize direct markers of diagnosis, prognosis, and treatment of different diseases, mainly cancer; but they have also shown great limitations in the study of complex or even rare diseases where the exploration of non-coding regions is necessary. Therefore, the articulation of the largest number of genomic elements in clear structural patterns and with significant function or impact is a major challenge in human genomics and precision medicine. Consequently, our group has been working in the last decade on the exploration of key genomic elements for the understanding of non-coding sequences, genome organization, the association of gene variation, and for the recognition of these elements in the susceptibility to different diseases. 
Our work has focused on the description of the human genome as a highly entropic, configurable system with measurable emergent properties. Specifically, the organization and configuration of DNA repeats has been one of our great contributions to the understanding and application of the human genome, because we have presented evidence of how configurations between Transposable Repeats are significant for cell function and genomic susceptibility to some neurodegenerative diseases or even cancer. In consequence, the present work will present our project: implementation and evaluation of a predictive model of genomic association based on configurations of DNA repeats and structural variants for the study of rare diseases, which is currently part of the research initiatives on precision medicine in the Colombian Government.

9:00-9:15
Implementation and evaluation of a predictive genome association model for rare diseases based on configuration of DNA repeats and structural variants.
Room: UNAM
Format: Live from venue

Moderator(s): Javier De Las Rivas, Elizabeth Tapia

  • Fabian Tobar-Tosse, Pontificia Universidad Javeriana Cali, Colombia
  • Elizabeth Londoño, Pontificia Universidad Javeriana Cali, Colombia
  • Andres Zuñiga, Pontificia Universidad Javeriana Cali, Colombia
  • Jose Guillermo Ortega, Pontificia Universidad Javeriana Cali, Colombia
  • Valentina Corchuelo, Pontificia Universidad Javeriana Cali, Colombia
  • Patricia E. Velez, Universidad del Cauca, Colombia
  • Pedro A. Moreno, Universidad del Valle, Colombia


Presentation Overview: Show

A significant fact in precision medicine is the association of the genomic structure and variation with the environment and phenotype. Ongoing studies are mainly exploring deleterious variation in gene encoding, and the possible association of epigenetic factors with certain traits. However, large projects about human genetic variation have shown a significant need to define logical and quantifiable patterns on the organization of genetic information in the context of its constituent elements, which go beyond the polymorphisms; that is, elements such as structural variations (SV), Repeats, Binding-sites to ncRNA or Regulatory Proteins, among others, must be considered and integrated on the phenotypic traits of interest. Most of the genomic associations described to date correspond to genes or are dependent on highly heterogeneous events between human populations; that is, genotype-phenotype associations based on genes have been essential to recognize direct markers of diagnosis, prognosis, and treatment of different diseases, mainly cancer, but they have also shown great limitations in the study of complex or even rare diseases, where the exploration of non-coding regions is necessary. Therefore, the integration of the largest number of genomic elements in clear structural patterns and with significant functional meaning is a major challenge in human genomics and precision medicine. Consequently, our group has been working in the last decade on the exploration of key genomic elements for the understanding of non-coding sequences, genome organization, the association of gene variation, and for the recognition of structural patterns in the susceptibility to different diseases. Our work has focused on the description of the human genome as a highly entropic and configurable system with measurable emergent properties. 
Specifically, the organization and configuration of DNA repeats has been one of our great contributions to the understanding and application of the human genome, because we have presented evidence of how configurations between Transposable elements and other Repeats are significant for cell function and genomic susceptibility to some neurodegenerative diseases or even cancer. In consequence, the present work will present our project: implementation and evaluation of a predictive model of genomic association based on configurations of DNA repeats and structural variants for the study of rare diseases, which is currently part of the research initiatives on precision medicine in the Colombian Government.

9:15-9:30
Identifying potential driver genes by multi-omics approaches: a deep insight into the complex heterogeneity of cancer diseases
Room: UNAM
Format: Live from venue

Moderator(s): Javier De Las Rivas, Elizabeth Tapia

  • Katia Avina-Padilla, University of Illinois, United States
  • Carla Angulo-Rojo, CIASAP UAS, Mexico
  • Octavio Zambada Moreno, Cinvestav IPN Unidad Irapuato - Irapuato Leon, Mexico
  • Jose Antonio Ramirez-Rafael, Cinvestav IPN Unidad Irapuato - Irapuato Leon, Mexico
  • Maribel Hernandez-Rosales, Cinvestav IPN Unidad Irapuato - Irapuato Leon, Mexico


Presentation Overview: Show

Cancer is a complex disease that relies on progressive uncontrolled cell division linked with multiple dysfunctional biological processes. Oncology practice has incorporated genes in key molecular events that drive tumorigenesis as biomarkers to guide diagnosis and design patient therapy. However, tumor heterogeneity remains the most challenging feature in diagnosing and treating cancer. In this context, we focus on studying the significant heterogeneity of aggressive tumors at the genomic, transcriptional, and interactome levels, emphasizing the potential clinical relevance of driver genes. For this purpose, we have used integrated data from multiple databases and developed bioinformatics strategies and network approaches to contribute to understanding the biological and molecular processes underlying cancer initiation and progression. Our studies have analyzed prevalent cancers, such as Breast Invasive Carcinoma (BRCA), Colon Adenocarcinoma (COAD), Lung Adenocarcinoma (LUAD), Prostate Adenocarcinoma (PRAD), along with the four most aggressive with high intrinsic heterogeneity, namely Bladder Urothelial Carcinoma (BLCA), Esophageal Carcinoma (ESCA), Glioblastoma Multiforme (GBM), and Kidney Renal Clear Cell Carcinoma (KIRC). As a result, we have identified transcriptional profile patterns in GBM fitting with the stem cell model of ontogenesis. A unique distribution of somatic mutations was found for young and adult populations, particularly for DNA repair and chromatin remodeling genes. Our results also revealed that highly lesioned genes undergo differential regulation with biological pathways for young patients. Moreover, we detected a combination of 4 biomarkers with potential relevance to determine the GBM molecular subtype. We have also highlighted the potential regulatory role of differentially expressed (DE) human intronless genes (IGs) across cancer types, as well as their implication in specific PPI networks for GBM, ESCA, and LUAD tumors. The aim is to identify their unique expression profiles and interactome that may act as functional signatures across eight different cancers. We identified 940 protein-coding IGs in the human genome, of which about 35% were differentially expressed across the analyzed cancer datasets. Specifically, 78% of DE-IGs underwent transcriptional reprogramming with elevated expression in tumor cells. Remarkably, in all the studied tumors, we observed a highly conserved induction of a group of deacetylase-histones located in a region of chromosome 6 enriched in chromatin condensation processes. IGs are essential in the tumor phenotype at transcriptional and post-transcriptional levels, notably in important mechanisms such as interactomics rewiring in BRCA. Our multi-omic approaches could help delineate future strategies for using the predictive molecular markers for clinical decision-making in the medical routine.

9:30-9:45
Identification of cancer-drug-gene resistance network modules by analysis of genome-wide expression and drug activity correlations in cancer cells
Room: UNAM
Format: Live-stream

Moderator(s): Javier De Las Rivas, Elizabeth Tapia

  • Monica M Arroyo, Department of Chemistry, Pontifical Catholic University of Puerto Rico (PUCPR), Puerto Rico, Puerto Rico
  • Alberto Berral-Gonzalez, Cancer Research Center (IBMCC, CSIC/USAL & IBSAL), CSIC and University of Salamanca, Salamanca, Spain
  • Diego Alonso-Lopez, Cancer Research Center (IBMCC, CSIC/USAL), CSIC and University of Salamanca, Salamanca, Spain
  • Jose M Sanchez-Santos, Department of Statistics, University of Salamanca (USAL) and Cancer Research Center (IBMCC, CSIC/USAL), Spain
  • Javier De Las Rivas, Cancer Research Center (IBMCC, CSIC/USAL & IBSAL), CSIC and University of Salamanca, Salamanca, Spain


Presentation Overview: Show

Drug resistance is a major hurdle in the treatment of cancer patients. Several molecular mechanisms have been identified that contribute to drug resistance and disease relapse, threatening cancer therapies, patient healing and survival. Multiple drug resistance limits treatment options and patient outcome. This leads to chemotherapeutic options that may have more serious side effects, less effective therapies, or no treatment alternatives at all. We examined publicly available databases (which provide omics information on anticancer drug activity) and developed an updated version of a web-based cancer drug resistance resource, which provides bipartite drug-protein networks and allows the identification of clusters or modules of resistance (with the identification of putative protein-coding genes that may be involved in the resistance process). In this framework, we calculated Pearson and Spearman correlations between 733 cancer genes and 24,360 drugs. Upon filtering for significant negative correlations, which indicate resistance, and FDA-approved drugs, we identified 1552 resistant pairs between 137 drugs and 374 genes. Heatmaps were generated to establish resistance clusters by cell lines of different tissues, as well as gene-drug bipartite resistance networks. The results showed the identification of known resistance gene-drug pairs. We also found new plausible resistant gene-cancer FDA-approved drug pairs and genes that may be involved in multi-drug resistance (MDR).
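
The correlation-and-filter step described above can be sketched in plain Python. This is an illustrative sketch, not the authors' pipeline: the gene and drug names, the toy values across five cell lines and the cutoff of -0.8 are all assumptions, and a simple threshold stands in for the significance testing mentioned in the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical (gene expression, drug activity) profiles across five cell lines.
pairs = {
    ("GENE_A", "drug_1"): ([1, 2, 3, 4, 5], [9, 7, 6, 3, 1]),
    ("GENE_B", "drug_1"): ([1, 2, 3, 4, 5], [2, 3, 5, 6, 9]),
}
CUTOFF = -0.8  # assumed threshold, for illustration only

resistant = [
    pair for pair, (expr, act) in pairs.items()
    if pearson(expr, act) < CUTOFF and spearman(expr, act) < CUTOFF
]
print(resistant)
```

Only the pair whose drug activity falls as gene expression rises passes the filter, which matches the abstract's reading of negative correlation as a sign of resistance.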

9:45-10:00
Application of bioinformatic methods for the deconvolution of cell mixtures to blood and immune cell-types and comparison of generated gene signatures
Room: UNAM
Format: Live-stream

Moderator(s): Javier De Las Rivas, Elizabeth Tapia

  • Natalia Alonso-Moreda, Cancer Research Center (IBMCC, CSIC/USAL & IBSAL), CSIC and University of Salamanca, Salamanca, Spain
  • Alberto Berral-Gonzalez, Cancer Research Center (IBMCC, CSIC/USAL & IBSAL), CSIC and University of Salamanca, Salamanca, Spain
  • Enrique De La Rosa, Cancer Research Center (IBMCC, CSIC/USAL & IBSAL), CSIC and University of Salamanca, Salamanca, Spain
  • Jose M Sanchez-Santos, Department of Statistics, University of Salamanca (USAL) and Cancer Research Center (IBMCC, CSIC/USAL), Spain
  • Javier De Las Rivas, Cancer Research Center (IBMCC, CSIC/USAL & IBSAL), CSIC and University of Salamanca, Salamanca, Spain


Presentation Overview: Show

Recent omic studies of the variability between different cell populations provide a deep identification of the activity of cell specific genes and determine how the changes produced in complex cell mixtures are driven by different genes in each specific cell-type. Despite this accurate identification of genome-wide changes provided by omic technologies, the studies on cell mixtures or on bulk samples with different cell-types only provide average global signatures. In the last decade, computational techniques have been developed to solve this problem by applying Cell Deconvolution methods, which are designed to decompose a cell mixture (consisting of various cell-types working together in a tissue, organ or biological system) into its component cells and to calculate the proportion corresponding to each cell-type. Some of these methods only calculate the proportions of cell-types in the mixture (supervised methods), while other deconvolution algorithms can also identify gene expression signatures specific for each cell-type (unsupervised methods). In this work, five deconvolution methods (DECONICA, LINSEED, CIBERSORT, FARDEEP and ABIS) were implemented and compared with the aim of evaluating their accuracy and determining the best performance in the identification of different blood and immune cell-types. To assess these methods, we used several bulk expression datasets from peripheral blood samples obtained using both high density microarray data and RNA sequencing (RNA-seq) data. The analyses showed that FARDEEP, CIBERSORT and LINSEED provided a more accurate estimate of the abundance of different cell populations in the mixture. Our comparative analysis also showed that the most efficient algorithm to identify gene signatures is LINSEED.

10:00-10:15
Assignment of gene cell markers to hematological and immune cell-types based on single-cell proteo-transcriptomic data using machine learning approaches
Room: UNAM
Format: Live-stream

Moderator(s): Javier De Las Rivas, Elizabeth Tapia

  • Enrique De La Rosa, Cancer Research Center (IBMCC, CSIC/USAL & IBSAL), CSIC and University of Salamanca, Salamanca, Spain
  • Elena Sanchez-Luis, Cancer Research Center (IBMCC, CSIC/USAL & IBSAL), CSIC and University of Salamanca, Salamanca, Spain
  • Natalia Alonso-Moreda, Cancer Research Center (IBMCC, CSIC/USAL & IBSAL), CSIC and University of Salamanca, Salamanca, Spain
  • Jose M Sanchez-Santos, Department of Statistics, University of Salamanca (USAL) and Cancer Research Center (IBMCC, CSIC/USAL), Spain
  • Javier De Las Rivas, Cancer Research Center (IBMCC, CSIC/USAL & IBSAL), CSIC and University of Salamanca, Salamanca, Spain


Presentation Overview: Show

The human hematopoietic system is composed of highly specialized cells with unique and essential functions. Due to the relevance of its activity, we have studied the cell types and subtypes that form this complex system using single-cell RNA-seq (scRNA-seq) technique, to unravel the different and heterogeneous cell populations. Specifically, we looked for gene expression signatures that would allow us to identify specific cell types robustly and at different levels (i.e., markers for all cells, markers for specific lineages, for cell types or cell subtypes). Different immune cells are often used for different studies, so their specific isolation is crucial. In our search, we attempted to look for CD marker-based gene signatures first, then genes encoding membrane proteins, and lastly any other protein-encoding gene. In the methodological part of this work, we first tried to determine the best computational workflow to analyse large scRNA-seq datasets. Once an analytical workflow was set up, we used it for determining different gene signatures of bone marrow (BM) and peripheral blood (PB) cells based on single-cell resolution power. We analysed three different scRNA-seq datasets, and we detected a reference list of 369 human CDs in the genes provided (19,813, 33,660 and 36,601 transcripts, respectively). These 369 CD genes were used as a guide to perform all the comparative bioinformatic analyses. We also screened for genes encoding membrane proteins (located on the surface of cells) as potential cell-specific markers. With this, in addition to identifying the best CDs that mark the main cell types (i.e., monocytes, B lymphocytes, natural killer cells, etc), we could check the ability of new gene signatures to distinguish each cell type. To do so, we applied Random Forest (i.e., a machine learning method) to test the accuracy of different gene signatures in identifying different haematological and immune cell types.

10:30-10:45
Automatic identification of GO-Terms related to ripening fruit of tomato based on machine learning and its application in a breeding program
Room: UNAM
Format: Live from venue

  • Paolo Caccharelli, IICAR, Argentina
  • Flavio Spetale, Cifasis-Conicet, Argentina
  • Guillermo Pratta, IICAR-CONICET, Argentina
  • Elizabeth Tapia, CIFASIS-CONICET, Argentina


Presentation Overview:

Background: Tomato quality is one of the most important factors in ensuring consistent marketing of tomato fruit. The genome sequencing of tomato has provided powerful insights into the molecular changes in fruit ripening, a complex developmental process that is highly coordinated and includes changes in color, texture, taste and flavor. The automatic identification of the biological functions of genes during the ripening process is important because it yields a list of GO-Terms and genes related to maturation. Given that gene expression measured by massive RNA-seq is a quantitative measurement, estimating gene action by Quantitative Genetics analysis provides a direct way to identify promising parents and hybrids, taking the additive and non-additive genetic effects on RNA levels as selection criteria in breeding programs.
Results: Global gene expression was analyzed by massive RNA sequencing (RNA-seq) in the parent genotypes and their hybrid. The goal was to identify GO-Terms of biological processes related to fruit ripening in which differentially expressed genes are involved, and to analyze their gene action. Transcriptional profiles were obtained at three main ripening stages (Breaker, Mature Green and Red Ripe), and an enrichment analysis was then carried out with the AgriGO tool to characterize and identify genes involved in these developmental stages. This enrichment analysis also allows exploring the biological processes related to fruit ripening in which differentially expressed transcripts participate. Comparing the biological process domain across the three genotypes yielded a consensus list of four GO-Terms (GO:0044699, GO:0008152, GO:0044710 and GO:0005975). In addition, 20 genes were randomly selected from the total of 2,744 genes to estimate the additive and non-additive genetic effects. Of these, 17 genes showed negative overdominance relative to the parent with the lowest value, and the others showed partial dominance towards lower values.
Conclusions: This first approach was proposed to identify GO-Terms related to a development stage in the fruit of tomato and detect potential genes of interest to continue with the breeding program and to obtain new varieties of tomatoes.
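
The additive and non-additive effects mentioned in the Results can be illustrated with the textbook mid-parent decomposition. The sketch below is an assumption about how such effects are commonly estimated, not the authors' pipeline: the function name `gene_action`, the tolerance, and the expression values are hypothetical.

```python
def gene_action(parent_a, parent_b, hybrid, tol=1e-9):
    """Classify gene action from mean expression of two parents and their hybrid.

    Textbook decomposition (assumed here, not taken from the abstract):
    additive effect a = (high - low) / 2, dominance deviation d = F1 - mid-parent.
    """
    low, high = sorted((parent_a, parent_b))
    mid_parent = (low + high) / 2      # expected hybrid value under pure additivity
    additive = (high - low) / 2        # additive effect a
    dominance = hybrid - mid_parent    # dominance deviation d
    if additive < tol:
        return additive, dominance, "undefined (parents do not differ)"
    ratio = dominance / additive       # degree of dominance d/a
    if abs(ratio) < tol:
        label = "additive"
    elif ratio < -1 - tol:
        label = "negative overdominance"          # hybrid below the lower parent
    elif ratio > 1 + tol:
        label = "positive overdominance"          # hybrid above the higher parent
    elif abs(abs(ratio) - 1) <= tol:
        label = "complete dominance"
    elif ratio < 0:
        label = "partial dominance towards the lower parent"
    else:
        label = "partial dominance towards the higher parent"
    return additive, dominance, label


# Hypothetical expression values (arbitrary units) for one gene.
print(gene_action(10.0, 6.0, 4.0))
print(gene_action(10.0, 6.0, 7.0))
```

Under this convention, a hybrid that falls below the lower parent is labelled negative overdominance, and a hybrid between the mid-parent and the lower parent is labelled partial dominance towards lower values.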

10:45-11:00
Genome-wide bisulfite sequencing analysis of cultivated and wild rice species reveals epigenome variation in response to aluminum stress
Room: UNAM
Format: Live from venue

  • Jenny Johana Gallo-Franco, Pontificia Universidad Javeriana-Cali, Colombia
  • Thaura Ghneim-Herrera, Universidad Icesi, Colombia
  • Fabian Tobar-Tosse, Pontificia Universidad Javeriana-Cali, Colombia
  • Mauricio Quimbaya, Pontificia Universidad Javeriana-Cali, Colombia


Presentation Overview:

DNA methylation has been defined as the most studied epigenetic modification involved in several biological processes, such as plant genome stability, developmental regulation, and environmental responses. However, little is known about the potential role of DNA methylation in response to aluminum (Al) stress in rice. To determine the dynamics of DNA methylation variation associated with Al exposure in rice plants, we analyzed single-base resolution methylome maps for two genotypes of Oryza sativa, a cultivated species, with contrasting response to Al-stress conditions (Azucena-Tolerant and BGI9311-Susceptible). We also analyzed the methylome of two genotypes of O. glumaepatula, a wild species, with contrasting response to Al-exposure (Og131-Tolerant and Og131-Susceptible). Our results showed that, under control conditions, genome-wide methylation profiles are mainly conserved between both species. Nevertheless, there are several differentially methylated regions (DMRs) with species-specific methylation patterns. In addition, we identified a large number of DMRs for tolerant and susceptible genotypes for both species between control and stress conditions. Several of these DMRs are related to genes previously reported as Al-responsive genes, suggesting a possible role of rice DNA methylation in regulating the Al stress response. Likewise, we analyzed the association of identified DNA methylation marks with Al-tolerance levels of the genotypes studied within each rice species, as well as variation in the methylome of different rice species in response to Al-exposure. Our findings provide novel insights into genome-wide DNA methylation profiles of wild and cultivated rice genotypes and their possible role in regulating plant responses to stress.

11:00-11:15
Logical model of the tolerization of Dendritic cells integrates novel players
Room: UNAM
Format: Live from venue

  • Karen Nuñez-Reza, International Laboratory for Human Genome Research, Mexico
  • Isaac Lozano-Jiménez, International Laboratory for Human Genome Research, Mexico
  • Leslie Martínez-Hernández, International Laboratory for Human Genome Research, Mexico
  • Alejandra Medina-Rivera, Universidad Nacional Autónoma de México, Mexico


Presentation Overview:

Tolerogenic dendritic cells (tolDC) play an essential role in regulating the immune response by inducing an effective Treg response, especially those obtained with an IL10-based protocol. Recently, tolDC have become a relevant subject due to their ability to regulate the immune response under different conditions, such as autoimmune diseases and food allergies.
In order to characterize tolDC that exhibit a robust Treg induction, we built a logical model of their tolerization using IL10, by integrating published knowledge, transcriptome data, and identification of transcription factor binding sites (TFBS).
We first performed a literature search in PubMed and found several papers that were included in our model. We then performed a gene expression analysis of the available transcriptome data in GEO (GSE117946). Based on this analysis, we identified five differentially expressed transcription factors (TFs) in tolDC, IRF8, TCF7L2, GAL4, CEBPB, and TFCP2L1, that had not previously been related to tolDC generation. Using JASPAR (Castro-Mondragon et al. 2022) matrices for these TFs, we searched for TFBS in the upstream regions of genes related to the tolerogenic phenotype in tolDC obtained with the IL10 protocol. Once we had predicted TFBS in the genes of interest, the newly predicted regulations were used to complete our model.
Our logical model integrates differential gene expression, predicted transcription factor binding sites, and current knowledge about IL10 signaling in monocyte-derived dendritic cells (moDC). With the completed model we simulated in silico mutants of the TFs involved (STAT3, STAT6, IRF8, TCF7L2, GAL4, CEBPB, and TFCP2L1). Consistent with current knowledge, the STAT6, CEBPB, and IRF8 mutants, in the presence of IL10, still allow the expression of tolerogenic-specific gene markers. In contrast, the STAT3, TCF7L2, and TFCP2L1 mutants, in the presence of IL10, abolish the expression of these markers.
The novelty of our study is the identification of the role of TFs that had not been described as involved in the tolerization of dendritic cells. Our model helps to understand the differences in gene expression when IL10 is used to obtain tolDC, and could be used as a base to integrate other tolerogenic protocols and to contrast the basal behavior with immune diseases.
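
The idea of an in silico mutant in a logical (Boolean) model can be sketched as follows. This is a toy example only: the wiring between `IL10`, `STAT3`, `IRF8` and the hypothetical `Marker` node is invented for illustration and is not the authors' model, and `simulate` is a hypothetical helper.

```python
def simulate(rules, inputs, knockouts=(), max_steps=50):
    """Synchronously update a Boolean network until it reaches a fixed point.

    rules:     dict mapping node name -> function(state) -> bool
    inputs:    dict of nodes that start as True/False (all others start False)
    knockouts: nodes forced to False at every step (in silico mutants)
    """
    state = {node: False for node in rules}
    state.update(inputs)
    for node in knockouts:
        state[node] = False
    for _ in range(max_steps):
        new_state = {
            node: False if node in knockouts else bool(rule(state))
            for node, rule in rules.items()
        }
        if new_state == state:         # fixed point: nothing changes any more
            return new_state
        state = new_state
    raise RuntimeError("no fixed point reached (the network may oscillate)")


# Toy wiring, invented for illustration; NOT the published model.
RULES = {
    "IL10": lambda s: s["IL10"],                        # external input, held constant
    "STAT3": lambda s: s["IL10"],                       # activated by IL10
    "IRF8": lambda s: not s["STAT3"],                   # repressed by STAT3
    "Marker": lambda s: s["STAT3"] and not s["IRF8"],   # tolerogenic marker readout
}

for mutant in ((), ("STAT3",), ("IRF8",)):
    result = simulate(RULES, {"IL10": True}, knockouts=mutant)
    print(mutant or "wild type", "-> Marker:", result["Marker"])
```

A knockout is implemented by clamping a node to False at every update; comparing the resulting fixed point with the wild-type one shows whether the marker depends on that node.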

11:15-11:30
Using Graph Convolutional Networks (GCNs) for Molecular Property Prediction from Natural Products
Room: UNAM
Format: Live from venue

  • Naicolette Agudo, Universidad Tecnologica de Panama, Panama
  • José Luis López, University of Salamanca, Spain
  • Grimaldo Ureña, Universidad Tecnologica de Panama, Panama
  • Javier E. Sanchez-Galan, Universidad Tecnologica de Panama, Panama


Presentation Overview:

Machine learning has been applied extensively to molecular characterization, specifically to predict properties of molecules for the pharmaceutical industry. It has also been used to predict the biochemical and physiological effects of natural products. Neural networks have been used extensively for this task, and lately Graph Convolutional Networks (GCNs) have been adopted as well, exploiting the fact that graphs readily represent the structure of molecules and keep it intact during the analysis. Property prediction behaves in a non-linear fashion and requires taking into consideration a substantial number of parameters (the feature space) to predict a single output characteristic. For natural products, these predictions can be even more fruitful since they could help elucidate the use of complex compounds.


This project will be based on NAPROC-13, an NMR-based database of natural products in SMILES format. Access will be provided by our collaborators at the University of Salamanca. The main objective of this project is to apply GCNs to the molecules in NAPROC-13 to predict chemical properties. The properties to be predicted are those recorded in this database, which will serve as validation. A second objective is to study molecules that have been found via bioprospecting in the Panamanian territory.

The GCNs will take molecular graphs as input. Different network architectures will be tested, in particular those known to be suitable for this task and that we consider most widely applicable: Graph Attention Networks (GAT), Mixture Model Networks (MoNet) and Chebyshev Networks (ChebNet). The molecular information supplied will be used as targets. Optimization of the GCN architecture using grid search strategies is also planned.
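
To make the notion of a graph convolution on a molecular graph concrete, the sketch below applies one propagation step of the widely used form H' = ReLU(Â H W), where Â is the adjacency matrix with self-loops, symmetrically normalized by node degree. It is a minimal pure-Python illustration under stated assumptions: the three-atom graph (the heavy atoms of ethanol, C-C-O), the one-hot atom features and the identity weight matrix are hypothetical, and none of the architectures listed above (GAT, MoNet, ChebNet) is implemented here.

```python
import math


def normalized_adjacency(edges, n_nodes):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph as nested lists."""
    a = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        a[i][i] = 1.0                      # self-loops keep each atom's own features
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0            # bonds are undirected
    degree = [sum(row) for row in a]
    return [
        [a[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n_nodes)]
        for i in range(n_nodes)
    ]


def matmul(x, y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def gcn_layer(a_hat, features, weights):
    """One graph-convolution step: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, value) for value in row] for row in z]


# Hypothetical example: heavy atoms of ethanol, bonded as C0-C1-O2.
edges = [(0, 1), (1, 2)]
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # one-hot atom type: [carbon, oxygen]
weights = [[1.0, 0.0], [0.0, 1.0]]                # identity, standing in for learned weights

a_hat = normalized_adjacency(edges, 3)
hidden = gcn_layer(a_hat, features, weights)
for atom, row in enumerate(hidden):
    print(atom, [round(v, 3) for v in row])
```

After one step each atom's representation mixes its own features with those of its bonded neighbours; stacking several such layers and pooling over atoms gives a molecule-level vector that a regression or classification head can use.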

This analysis will provide a basis on which further work on structural analysis can be done. The algorithms used and optimized will have the potential to be applied to other molecules in the database with relatively unknown characteristics, as well as to other databases. This opens the opportunity to improve the interfaces for the structure of new compounds. It will be possible to characterize the components and properties of natural products such as coffee, a known export of the Republic of Panama.

12:00-12:15
Prediction of bacterial interactions using metabolic network features
Room: UNAM
Format: Live-stream

  • Claudia Silva-Andrade, Universidad Mayor, Chile
  • Daniel Garrido, Laboratorio de Microbiología de Sistemas, Escuela de Ingeniería, Pontificia Universidad Católica de Chile, Chile
  • Maria Rodriguez-Fernandez, Institute for Biological and Medical Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile
  • Alberto J. Martin, Laboratorio de Redes Biológicas, Centro Científico y Tecnológico de Excelencia Ciencia & Vida, Universidad San Sebastián, Chile


Presentation Overview:

Understanding the interactions between microorganisms and how these relationships affect bacterial behavior at the community level is a key research topic in microbiology. Microbial consortia engineering has been established as a scientific discipline and its main objective is the creation of consortia with a particular behavior, either by increasing the productivity of specific metabolites or by modifying the metabolic functionality to obtain stable communities over time.
There are different methods to study interactions between bacteria based on experimental or mathematical approaches. Mathematical approaches use mathematical models to represent various properties of bacteria and can be classified as static or dynamic depending on whether or not they contain information on how community members interact over time.
Metabolic networks describe the interactions between metabolic pathways, mapping the enzymes (represented as edges in the network) and metabolites (represented as nodes in the network), which can be used to understand different metabolic processes in a microorganism in order to optimize the production of a particular metabolite. Metabolic networks are also being used to create stoichiometric models of genome-scale metabolism in consortia to understand interactions between pairs of bacteria, highlighting the relevance of this approach to characterizing bacteria.
In this work, we describe a new method that aims to reduce the number of experimental trials needed to design bacterial consortia with a particular behavior. For that, the representation of bacteria in terms of their metabolic networks was used to build a mathematical model able to predict cross-feeding interactions or competition between pairs of bacteria. We first used the simplest supervised classifier, K-Nearest Neighbors, to choose among several ways of encoding the metabolisms of two bacteria, test different parameter values, and implement various data curation approaches to reduce the biological bias associated with our dataset. Next, we tested different classification algorithms and performed rigorous cross-validation experiments to select the best one for our dataset. The top performing supervised machine learning algorithm obtained an overall rate of correctly classified pairs of bacteria between 92% and 96%. Our method will surely prove useful to improve our understanding of community behavior and, at the same time, aid in rational consortia design approaches by reducing the number of experiments required to identify beneficial interactions among bacteria.
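
The K-Nearest Neighbors step can be sketched as follows. This is a minimal illustration under assumptions that are not in the abstract: each bacterium is encoded as a set of hypothetical reaction identifiers, the distance between two pairs is the sum of the Jaccard distances between their members, and the training pairs and labels are toy data.

```python
from collections import Counter


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two sets of reaction identifiers."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def pair_distance(p, q):
    """Distance between two bacterial pairs (order-sensitive for simplicity)."""
    return jaccard_distance(p[0], q[0]) + jaccard_distance(p[1], q[1])


def knn_predict(train, query, k=3):
    """Predict the interaction label of `query` by majority vote of its k nearest pairs."""
    nearest = sorted(train, key=lambda item: pair_distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training data: ((reactions of bacterium 1, reactions of bacterium 2), label).
TRAIN = [
    (({"r1", "r2"}, {"r3", "r4"}), "cross-feeding"),
    (({"r1", "r2", "r5"}, {"r3", "r4"}), "cross-feeding"),
    (({"r1", "r2"}, {"r1", "r2"}), "competition"),
    (({"r3", "r4"}, {"r3", "r4"}), "competition"),
]

print(knn_predict(TRAIN, ({"r1", "r2"}, {"r3"}), k=3))
```

In this toy setting, pairs with overlapping metabolisms are labelled as competing and pairs with complementary ones as cross-feeding; a real encoding would be chosen and validated by cross-validation, as the abstract describes.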

12:15-12:30
PPIntegrator: Semantic integrative system for protein-protein interaction and application for Host-Pathogen datasets
Room: UNAM
Format: Live-stream

  • Yasmmin Côrtes Martins, National Laboratory for Scientific Computing, Brazil
  • Artur Ziviani, National Laboratory for Scientific Computing, Brazil
  • Maiana de Oliveira Cerqueira E Costa, National Laboratory for Scientific Computing, Brazil
  • Maria Claudia Cavalcanti, Military Institute of Engineering, Brazil
  • Marisa Fabiana Nicolás, National Laboratory for Scientific Computing, Brazil
  • Ana Tereza Vasconcelos, National Laboratory for Scientific Computing, Brazil


Presentation Overview:

Semantic web standards have shown their importance over the last 20 years in promoting data formalization and interlinking between existing knowledge graphs. In this context, several ontologies and data integration initiatives have emerged in recent years in the biological domain, such as the widely used Gene Ontology initiative, which contains metadata to annotate gene function and subcellular location. Another important subject in biology is protein-protein interactions (PPIs), which have many applications, such as protein function inference. Current PPI databases have heterogeneous export methods that challenge their integration and analysis. Presently, several ontology initiatives covering some concepts of the PPI domain are available to promote interoperability across datasets. However, efforts to establish guidelines for automatic semantic data integration and analysis of PPIs in these datasets are limited. Here, we present PPIntegrator, a system that semantically describes data related to protein interactions. We also introduce an enrichment pipeline to generate, predict and validate new potential Host-Pathogen datasets by transitivity analysis. PPIntegrator contains two modules: i) a data preparation module to organize the data of three reference databases and ii) a triplification and data fusion module to describe the provenance information, protein annotations when they exist, and the scores separated by detection method. This work provides an overview of the PPIntegrator system applied to integrate and compare Host-Pathogen PPI datasets from four bacterial species using our proposed transitivity analysis pipeline. We also demonstrate some critical queries to analyze this kind of data and highlight the importance and usage of the semantic data generated by our system.

12:30-12:45
An integrative approach identifies host-microbe interactions in the blood of septic patients
Room: UNAM
Format: Live-stream

  • Ícaro Maia Santos de Castro, University of Sao Paulo, Brazil
  • Marielton Passos Cunha, Scientific Platform Pasteur USP, Brazil
  • Youvika Singh, University of São Paulo, Brazil
  • Paulo Amaral, Insper Institute of Education and Research, Brazil
  • Helder Nakaya, Hospital Israelita Albert Einstein, Brazil


Presentation Overview:

In recent years, the role of the human microbiome in health and disease has been widely studied. These microbial communities not only contribute to the host's local defense against infections but also modulate immune system responses in different locations and tissues. Nevertheless, under conditions that compromise the host's defense barrier, some species can invade tissues and cause infections. RNA-Seq has become the leading high-throughput sequencing technology for understanding the immune response in infectious diseases. Most human transcriptomic studies evaluate gene expression exclusively in the human host, and during the read alignment process most pipelines end up discarding reads that do not map to the human genome. These unmapped reads can provide valuable information about non-host RNA transcripts derived from microorganisms that may be present in the samples. Here, we propose a bioinformatic approach to identify potential microbial-derived transcripts from these unmapped reads and to assess their impact on host gene expression during infectious diseases. Applying our computational approach to 3 different transcriptomic studies of sepsis, we were able to identify several microbes in the blood of septic patients. Furthermore, we identified common nosocomial pathogens already described in sepsis studies, suggesting possible bacteremia events. We also identified transcriptional pathways and immune modules altered by the presence of those pathogens. Considering the increasing amount of publicly available transcriptomic data, our approach may show researchers the potential of analyzing otherwise discarded biological information to generate new insights into microbe-host interactions that can affect the host immune response in infectious diseases.
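
Recovering the reads that an aligner could not place on the human genome can be sketched as below. The example relies on the SAM format convention that bit 0x4 of the FLAG field marks an unmapped read; the function names and the toy records are hypothetical, and the downstream microbial classification described in the abstract is not shown.

```python
def unmapped_reads(sam_lines):
    """Yield (name, sequence, quality) for SAM records whose FLAG has bit 0x4 set."""
    for line in sam_lines:
        if line.startswith("@"):           # header lines carry no reads
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        sequence, quality = fields[9], fields[10]
        if flag & 0x4:                     # 0x4 = read unmapped
            yield name, sequence, quality


def to_fastq(records):
    """Format (name, sequence, quality) tuples as FASTQ text."""
    return "".join(f"@{name}\n{seq}\n+\n{qual}\n" for name, seq, qual in records)


# Toy SAM content: one mapped read (flag 0) and two unmapped reads (flags 4 and 77).
SAM = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
    "r3\t77\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
]

print(to_fastq(unmapped_reads(SAM)), end="")
```

The resulting FASTQ records could then be passed to a taxonomic classifier or aligned against microbial reference genomes.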

12:45-13:00
Modeling and structural characterization of FAZ10 protein regions: understanding their function in Trypanosoma brucei
Room: UNAM
Format: Live from venue

  • Cleidy Mirela Osorio Mogollón, University of Sao Paulo, Brazil
  • Diego Leonardo Cabrejos, University of Sao Paulo, Brazil
  • Munira Muhammad Abdel Baqui, University of Sao Paulo, Brazil


Presentation Overview:

Trypanosoma brucei is the etiological agent of Sleeping Sickness, a neglected tropical disease endemic to sub-Saharan Africa. Trypanosomatids have a complex region called the Flagellum Attachment Zone (FAZ), which connects the single flagellum to the cell body. High molecular mass proteins localized at the FAZ have helped to understand the maintenance of cellular morphology, cytokinesis, and survival. One of these proteins is FAZ10, recently described by our group in T. brucei. We showed that FAZ10 is required for determining cleavage furrow positioning, FAZ organization, and correct cytokinesis. However, little is known about the molecular structure of this protein. Here, we report the in silico structural characterization of FAZ10 regions. We used AlphaFold2, an artificial intelligence program that predicts 3D models of protein structures; MARCOIL and LOGICOIL to identify and analyze coiled-coil motifs in FAZ10; IUPRED2A, a server that predicts disordered regions; and PyMOL to visualize the models. The FAZ10 protein has several coiled-coil motifs along its amino acid sequence, including the region of the N-terminal and C-terminal domains followed by the two disordered domains. The central domain is folded and may be involved in protein interaction.
Furthermore, FAZ10 has the potential to form a protein dimer. Knowledge of the three-dimensional arrangement of this protein allows us to understand the function of FAZ10 and its interaction with other proteins present in the FAZ region. These studies will provide a better understanding of the complexity of the FAZ in T. brucei and greater knowledge of this parasite, which is of public health importance to a continent.

13:00-13:15
Genomic profiling of bacteria for the prediction of synthetic community assembly under antagonistic interactions
Room: UNAM
Format: Live from venue

  • Marisol Navarro-Miranda, Cinvestav, Irapuato, Mexico
  • Maribel Hernandez-Rosales, Cinvestav, Irapuato, Mexico
  • Gabriela Olmedo-Alvarez, Cinvestav, Irapuato, Mexico


Presentation Overview:

Microbial communities play critical roles in a wide range of natural processes, from biogeochemical cycles to microbiomes. A fundamental problem in ecology is community assembly, which seeks to understand how deterministic and stochastic processes give rise to observed patterns in species abundances over space and time. The use of well-studied environmental strains to construct synthetic communities under controlled conditions therefore provides robust models to monitor their assembly and pattern formation and to test particular hypotheses. We study 78 strains of the phylum Bacillota for which we have obtained phenotypic data on their pairwise interactions. The strains are part of a collection of natural sediment communities isolated from a lagoon in Cuatro Cienegas, Coahuila, Mexico. We sequenced the genomes of the isolates, performed a pangenomic analysis, identified gene clusters encoding secondary metabolites, and inferred a core genome-based phylogeny. A subset of three strains from this community has been previously studied and established as the so-called BARS synthetic community model. In paired interactions, each strain exhibits a different ecological role: antagonism, resistance, or sensitivity. As a community, they exhibit high-order properties of complex natural communities, with an emergent property where antagonism is not observed in the presence of the resistant strain. With the genomic characterization of the 78 strains, we aim to explain the features that allow the three BARS strains to cohabit, to predict other candidate strains that can be substituted in the BARS interaction, and even to increase the number of strains that can assemble into a stable community. These predictions will be tested in further assembly dynamics experiments. If we assume that synthetic communities can capture some properties of natural ones, this could help us to understand environmental problems where microbes are the main actors.

13:15-14:15
Hunting for species and genera in metagenome datasets
Room: UNAM
Format: Live from venue

  • João Setubal

14:15-14:30
ISCB-LA SoIBio BioNetMX Closing Ceremonies
Room: UNAM
Format: Live from venue

  • Diane Rivas