The Netherlands Neurogenetics Database: Revealing clinical, neuropathological and genetic heterogeneity of brain disorders
Confirmed Presenter: Inge Holtman, University Medical Center Groningen, Netherlands
Room: 524ab
Format: In Person
Authors List:
- Inge Holtman, University Medical Center Groningen, Netherlands
- Bart Eggen, University Medical Center Groningen, Netherlands
- Inge Huitinga, Netherlands Institute for Neurosciences, Netherlands
Presentation Overview:
The brain is susceptible to a wide range of neurodegenerative disorders that share pathophysiological mechanisms and genetic risk factors and are frequently clinically misdiagnosed[1]. Hence, there is a clear need for a data-driven delineation of the pathophysiological mechanisms of brain disorders to improve diagnosis and prognosis. To this end, we established the Netherlands Neurogenetics Database (http://nnd.app.rug.nl/), which aims to integrate the extensive clinical, neuropathological, and genetic data of a large collection of brain donors (approximately 3000) from the Netherlands Brain Bank. We recently implemented Large Language Models (LLMs) to convert medical record summaries into clinical disease trajectories[2]. These trajectories included many known and novel disease-specific symptoms and were used for disease prediction and disease subtyping, which resulted in the identification of clinical subtypes of disease. Currently, we are implementing LLMs to process neuropathological examinations, giving us unprecedented insight into neuropathological state, which we will relate back to the clinical heterogeneity. In addition, we are analysing common genetic variants (Illumina GSA array) to refine current GWAS for neurodegenerative disorders, which typically include a considerable fraction of misdiagnosed individuals. We are also calculating polygenic risk scores (PRS) to identify genetic features associated with clinical and neuropathological subtypes and features. Together, these studies aim to give new data-driven insights into shared and unique features of neurodegenerative disorders.
1. Revealing clinical heterogeneity in a large brain bank cohort. N.J. Mekkes, I.R. Holtman, Nat Med, 2024.
2. Identification of clinical disease trajectories in neurodegenerative disorders with natural language processing. N.J. Mekkes, …, B.J.L. Eggen, I. Huitinga, I.R. Holtman, Nat Med, 2024.
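As an illustration of the polygenic risk scores mentioned in the overview, a PRS is commonly computed as a weighted sum of risk-allele dosages, with weights taken from GWAS effect sizes. The sketch below is a minimal example under that assumption and is not the authors' pipeline; the variant identifiers, weights, and the function name `polygenic_risk_score` are hypothetical.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of allele dosages over variants present in both inputs.

    dosages: variant id -> number of effect alleles carried (0, 1 or 2).
    weights: variant id -> per-allele effect size (hypothetical values here).
    """
    shared = dosages.keys() & weights.keys()  # ignore variants missing on either side
    return sum(weights[v] * dosages[v] for v in shared)


# Hypothetical effect sizes and one donor's genotype dosages.
weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 0.125}
donor = {"rs_a": 2, "rs_b": 1, "rs_d": 0}

score = polygenic_risk_score(donor, weights)
print(score)
```

Only variants that appear in both the genotype data and the weight table contribute, so `rs_c` and `rs_d` are skipped in this toy example.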
GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery
Confirmed Presenter: Zhiyong Lu, National Institutes of Health (NIH), United States
Room: 524ab
Format: In Person
Authors List:
- Zhizheng Wang, National Institutes of Health (NIH), United States
- Qiao Jin, National Institutes of Health (NIH), United States
- Chih-Hsuan Wei, National Institutes of Health (NIH), United States
- Shubo Tian, National Institutes of Health (NIH), United States
- Po-Ting Lai, National Institutes of Health (NIH), United States
- Qingqing Zhu, National Institutes of Health (NIH), United States
- Xiuying Chen, King Abdullah University of Science and Technology, Saudi Arabia
- Chi-Ping Day, National Institutes of Health (NIH), United States
- Christina Ross, National Institutes of Health (NIH), United States
- M.G. Hirsch, National Institutes of Health (NIH), United States
- Teresa Przytycka, National Institutes of Health (NIH), United States
- Zhiyong Lu, National Institutes of Health (NIH), United States
Presentation Overview:
Genomics has long been a research interest of molecular biologists. Recent studies have shown promising results by harnessing instruction learning in Large Language Models (LLMs). Nonetheless, these methods did not explore LLMs in depth to accurately identify the biological functions of gene sets, and they are hindered by hallucinations. In response, we present GeneAgent, a first-of-its-kind language agent equipped with a self-verification capability to autonomously interact with domain-specific databases. GeneAgent comprises four stages (generation, self-verification, modification, and summarization): it creates the process name and analytical narratives for the input gene set and activates the self-verification agent to verify each of them in turn. Different stages of self-verification are cascaded through the modification module. After self-verification, GeneAgent produces the final response for the given gene set based on the verification report. Benchmarked on multiple gene sets from Gene Ontology, NeST, and MSigDB, GeneAgent achieves higher accuracy than standard GPT-4 by a significant margin. Notably, for 15 gene sets (1.4%), GeneAgent accurately predicted the reference terms with 100% precision, compared with only 3 cases (0.3%) for GPT-4. Additionally, our enriched term tests demonstrate that GeneAgent can provide a targeted gene synopsis that summarizes multiple biological terms in alignment with conventional enrichment analyses. Detailed case studies demonstrate that GeneAgent can effectively reduce the hallucination issues of GPT-4 and generate reliable analytical narratives for gene functions. As such, GeneAgent stands as a robust solution for gene set knowledge discovery and can provide reliable insights for future research endeavors.
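The four stages named in the overview (generation, self-verification, modification, summarization) can be pictured with a toy control flow. This is a hedged sketch, not GeneAgent's implementation: the LLM is replaced by a hard-coded draft, the domain database by a small dictionary, verification is done in a single pass rather than cascaded, and all names (`TOY_DB`, `generate`, `verify`, `modify`, `summarize`, `run`) and gene labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Toy stand-in for a curated domain database (hypothetical contents).
TOY_DB: Dict[str, Set[str]] = {
    "GENE_A": {"dna repair"},
    "GENE_B": {"dna repair", "cell cycle"},
}

Claim = Tuple[str, str]  # (gene, proposed function)


def generate(genes: List[str]) -> List[Claim]:
    # Stage 1: stand-in for an LLM draft; deliberately includes one unsupported claim.
    return [(g, "dna repair") for g in genes] + [(genes[0], "lipid transport")]


def verify(claim: Claim) -> bool:
    # Stage 2: check a single claim against the toy database.
    gene, function = claim
    return function in TOY_DB.get(gene, set())


def modify(claims: List[Claim], report: List[bool]) -> List[Claim]:
    # Stage 3: keep only the claims that the verification report supports.
    return [claim for claim, ok in zip(claims, report) if ok]


def summarize(claims: List[Claim]) -> str:
    # Stage 4: collapse the surviving claims into a short final answer.
    functions = sorted({function for _, function in claims})
    return "; ".join(functions) if functions else "no supported function"


def run(genes: List[str]) -> str:
    draft = generate(genes)
    report = [verify(claim) for claim in draft]
    revised = modify(draft, report)
    return summarize(revised)


result = run(["GENE_A", "GENE_B"])
print(result)
```

In this toy run the unsupported "lipid transport" claim is dropped during modification, which mirrors the idea of filtering hallucinated statements before the final summary is written.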