More than 350 million individuals globally suffer from approximately 10,000 known rare diseases. Despite genomic advances, patients today experience prolonged diagnostic journeys, averaging six years, and roughly 50% remain undiagnosed. This delay often results in inappropriate care, irreversible disease progression, and increased medical costs. To address these challenges, we developed GENIUS, a comprehensive framework targeting patient identification, variant prioritization, and continuous genomic data reanalysis to accelerate diagnoses and improve patient outcomes.
GENIUS integrates three machine learning algorithms. NeoGX identifies undiagnosed patients from phenotypic features extracted from electronic health records via natural language processing (NLP), facilitating timely referrals for genetic testing, particularly in neonatal intensive care unit (NICU) settings. CAVaLRi uses a likelihood-ratio framework that incorporates variant pathogenicity, phenotype overlap, parental genotypes, and segregation data, allowing it to prioritize diagnostic genetic variants even when phenotype data are noisy. PARADIGM automates genomic data reanalysis, notifying clinicians as new gene-disease associations emerge.
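To make the likelihood-ratio idea concrete, the following is a minimal sketch of how independent lines of evidence could be combined into a posterior probability for each candidate variant and then used for ranking. It is not CAVaLRi's actual model: the function names, the prior, the assumption that the evidence sources are conditionally independent, and all numeric likelihood ratios are hypothetical and chosen only for illustration.

```python
import math


def posterior_probability(prior, likelihood_ratios):
    """Combine a prior with likelihood ratios, assuming independent evidence.

    Posterior odds = prior odds * product of likelihood ratios.
    The sum is accumulated in log space for numerical stability.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for lr in likelihood_ratios:
        if lr <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(lr)
    # Numerically stable logistic transform from log-odds to probability.
    if log_odds >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


def rank_variants(candidates, prior):
    """Return (variant, posterior) pairs sorted from most to least likely."""
    scored = [
        (name, posterior_probability(prior, evidence.values()))
        for name, evidence in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical likelihood ratios per evidence source (illustrative values only).
candidates = {
    "variant_a": {"pathogenicity": 8.0, "phenotype_overlap": 4.0,
                  "parental_genotypes": 1.0, "segregation": 2.0},
    "variant_b": {"pathogenicity": 12.0, "phenotype_overlap": 0.5,
                  "parental_genotypes": 1.0, "segregation": 1.0},
}

ranked = rank_variants(candidates, prior=0.01)
for name, probability in ranked:
    print(f"{name}: {probability:.3f}")
```

Under this sketch, a likelihood ratio below 1 (such as a poor phenotype match) lowers a variant's posterior without eliminating it, which is one way a ranking can remain usable when individual evidence sources are noisy.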
In evaluation, NeoGX predicted the need for genetic testing with a ROC AUC of 0.855 and halved the time to testing initiation, from 62 to 31 days. CAVaLRi significantly outperformed existing methods (PR AUC = 0.701), ranking the diagnostic variant first in over 70% of cases. PARADIGM achieved a 40% diagnostic yield, substantially surpassing conventional methods.
GENIUS exemplifies a scalable computational framework integrating predictive analytics, precise variant prioritization, and dynamic genomic reanalysis. By automating complex diagnostic workflows, GENIUS substantially accelerates diagnosis, optimizes clinical decision-making, and demonstrates the transformative potential of machine learning to advance personalized genomic medicine in rare genetic disorders.