The generation of metagenome-assembled genomes (MAGs) has become a routine method for studying microbiomes. With the growing availability of MAGs in public repositories, MGnify—a free platform for metagenomic data assembly, analysis, and archiving—has introduced MGnify Genomes. This resource serves as a hub for systematically organising and annotating publicly available MAGs and isolate genomes into non-redundant, biome-specific catalogues.
The resource includes over half a million genomes and has recently expanded to incorporate eukaryotic genomes in addition to prokaryotic ones. These genomes are sourced from a wide range of biomes, spanning both host-associated and environmental contexts. Within each biome, genomes are organised into species-level clusters, and the highest-quality genome in each cluster is selected as the representative, with isolate genomes prioritised over MAGs. Each representative genome is annotated with functional information, including antimicrobial resistance. Additional annotations cover biosynthetic gene clusters, carbohydrate metabolism (including polysaccharide utilisation loci), non-coding RNAs, CRISPR, phage sequences, plasmids, and integrative mobile elements.
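The representative-selection rule described above can be illustrated with a short sketch. It is not the pipeline's actual code: the `Genome` fields, the function names, and in particular the quality score (completeness minus five times contamination, a heuristic often used for MAGs) are assumptions made for illustration, since the exact criteria are not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    # All field names are hypothetical, chosen for this illustration.
    accession: str
    cluster_id: str        # species-level cluster the genome belongs to
    is_isolate: bool       # True for isolate genomes, False for MAGs
    completeness: float    # estimated completeness, in percent
    contamination: float   # estimated contamination, in percent


def quality_score(genome: Genome) -> float:
    """Assumed heuristic: completeness penalised by 5x contamination."""
    return genome.completeness - 5.0 * genome.contamination


def selection_key(genome: Genome) -> tuple:
    """Isolates outrank MAGs; quality breaks ties within each group."""
    return (genome.is_isolate, quality_score(genome))


def pick_representatives(genomes: list) -> dict:
    """Return one representative genome per species-level cluster."""
    best = {}
    for genome in genomes:
        current = best.get(genome.cluster_id)
        if current is None or selection_key(genome) > selection_key(current):
            best[genome.cluster_id] = genome
    return best


# Toy data: accessions and values are invented for the example.
genomes = [
    Genome("MAG_1", "cluster_A", False, 99.0, 0.5),
    Genome("ISO_1", "cluster_A", True, 95.0, 1.0),
    Genome("MAG_2", "cluster_B", False, 90.0, 2.0),
    Genome("MAG_3", "cluster_B", False, 85.0, 0.5),
]
representatives = pick_representatives(genomes)
```

In this toy data, the isolate is chosen for `cluster_A` even though a MAG has a higher score, while `cluster_B`, which contains only MAGs, is decided by the score alone.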
An open-source Nextflow pipeline is maintained for generating new catalogues and updating existing ones. The platform offers multiple ways to utilise these references: each biome-specific catalogue is accompanied by Kraken2, protein, and gene databases. A fast, k-mer-based search tool is available on the MGnify Genomes website, allowing users to quickly compare their genomes against the reference catalogues. The resource supports a wide range of applications, including the identification of novel genomes, analysis of species-level adaptation across environments, and research in agricultural, environmental, and health and disease fields.
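To give a sense of what a k-mer-based comparison involves, the sketch below ranks reference sequences by the fraction of a query's k-mers that they contain. It is a deliberately simplified illustration and not the implementation behind the website search: the function names, the toy sequences, and the choice of exact k-mer sets with a containment score are all assumptions. Production tools typically use much longer k-mers, canonical (strand-independent) k-mers, and compact indexes.

```python
def kmers(sequence: str, k: int) -> set:
    """Return the set of all k-length substrings of a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def containment(query: str, reference: str, k: int = 4) -> float:
    """Fraction of the query's distinct k-mers that occur in the reference."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        return 0.0
    return len(query_kmers & kmers(reference, k)) / len(query_kmers)


def rank_references(query: str, references: dict, k: int = 4) -> list:
    """Sort (name, score) pairs from best to worst match."""
    scores = [(name, containment(query, seq, k)) for name, seq in references.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy sequences, invented for the example; reverse complements are ignored.
references = {
    "ref_a": "ACGTACGTTT",
    "ref_b": "TTTTTTTT",
}
ranking = rank_references("ACGTACGT", references)
```

A containment score, rather than a symmetric similarity, is used here because a query genome may be much smaller or less complete than the reference it matches.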