Numerous diseases co-occur more often than expected by chance, likely due to a combination of genetic and environmental factors. However, the extent to which these influences shape disease relationships remains unclear. Here, we integrate large-scale RNA-seq data and heritability measures from human diseases with genomic data from the UK Biobank to disentangle the genetic and non-genetic origins of disease co-occurrences (DCs).
Our findings show that gene expression not only recovers but also expands upon genomically explained DCs, capturing disease relationships beyond genetic variation. Approximately 60% of transcriptomically inferred DCs have a detectable genomic component, whereas the remaining 40% are not explained by known genomic layers, suggesting contributions from regulatory or environmental mechanisms. Consistent with this interpretation, the relative contributions of transcriptomics and genomics reconstruct disease etiology and correlate with comorbidity burden, revealing key aspects of disease mechanisms. Complex diseases with strong genetic predispositions tend to be captured by both omics layers, whereas those primarily influenced by non-genetic factors are better explained by transcriptomics. Additionally, we find that diseases do not generally co-occur based on their heritability, except when they share SNPs. However, highly heritable diseases tend to have genetically driven co-occurrences, even with diseases of low heritability. In contrast, transcriptomics explains DCs regardless of heritability, at least partly through non-heritable mechanisms, such as regulatory or environmental influences. Integrating transcriptomic and genomic data provides near-complete coverage of DCs among the analyzed diseases, with a considerable portion likely rooted in factors beyond DNA sequence and, therefore, potentially modifiable.