With the growing volume of human omics data, there is an urgent need for adequate data-sharing resources and for standardized, harmonized data processing. Within the federated European Genome-Phenome Archive (EGA), the German Human Genome-Phenome Archive (GHGA) strives to provide (i) the necessary secure IT infrastructure for Germany, (ii) an ethico-legal framework for handling omics data in a manner that is data-protection-compliant yet open and FAIR, (iii) harmonized metadata schemas, and (iv) standardized workflows to process incoming omics data uniformly.
GHGA aims to be more than an archive. It will build on cloud computing infrastructures managed within a network of data generators. Researchers will have controlled access to raw sequence data and to data processed with recognized, GA4GH-compliant NGS workflows. To this end, GHGA is working with the nf-core community to co-develop and standardize bioinformatics workflows for data analysis, benchmarking, statistical analysis, and visualization. In addition to continuous integration and deployment for testing and benchmarking workflows, synthetic and experimental datasets will be used to ensure high workflow quality. Finally, by delivering on the aforementioned goals (secure IT infrastructure, an ethico-legal framework, harmonized metadata schemas, and standardized, reproducible workflows), GHGA will enable cross-project analysis and promote new collaborations and research projects.