The brain is susceptible to a wide range of neurodegenerative disorders that share pathophysiological mechanisms and genetic risk factors and are frequently clinically misdiagnosed [1]. Hence, there is a clear need for a data-driven delineation of the pathophysiological mechanisms of brain disorders to improve diagnosis and prognosis. To this end, we established the Netherlands Neurogenetics Database (http://nnd.app.rug.nl/), which aims to integrate the extensive clinical, neuropathological, and genetic data of a large collection of brain donors (approximately 3000) from the Netherlands Brain Bank. We recently implemented Large Language Models (LLMs) to convert medical record summaries into clinical disease trajectories [2]. These trajectories included many known and novel disease-specific symptoms and were used for disease prediction and subtyping, which led to the identification of clinical subtypes of disease. We are currently implementing LLMs to process neuropathological examinations, giving us unprecedented insight into neuropathological state, which we will relate back to the clinical heterogeneity. In addition, we are analysing common genetic variants (Illumina GSA array) to refine current GWAS for neurodegenerative disorders, which typically include a considerable fraction of misdiagnosed individuals. We are also calculating polygenic risk scores (PRS) to identify genetic features associated with clinical and neuropathological subtypes and features. Together, these studies aim to provide new data-driven insights into the shared and unique features of neurodegenerative disorders.
1. Revealing clinical heterogeneity in a large brain bank cohort. N.J. Mekkes, I.R. Holtman, Nat Med, 2024.
2. Identification of clinical disease trajectories in neurodegenerative disorders with natural language processing. N.J. Mekkes, …, B.J.L. Eggen, I. Huitinga, I.R. Holtman, Nat Med, 2024.