Human and Mouse Genome Comparison Using Genome-Wide Unique Sequences

Ben-Yang Liao [1], Yu-Jung Chang [2], Jan-Ming Ho and Ming-Jing Hwang
[1] liaoby@gate.sinica.edu.tw, Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan; [2] yjchang@iis.sinica.edu.tw, Institute of Information Science, Academia Sinica, Taipei, Taiwan

The genomes of multichromosomal organisms are often shuffled by chromosomal rearrangements during evolution, but the local order of genes within genomic segments conserved between two related species often remains unchanged. Identifying these homologous regions can provide a basis for studies of genome organization and evolutionary genomics. As genomic data grow, more sensitive and faster detection of conserved segments between species is required. We have developed an alignment-free method for whole-genome comparison. In this method, we first identify 16-mer sequences that are unique genome-wide and present in both the human and mouse genomes, and then use their patterns of distribution to recognize homologous regions between the two mammalian species. The resulting human-mouse syntenic map revealed more than 400 conserved segments, each 100 kb or larger, that together cover more than 90% of both genomes. Comparisons showed very good agreement with several alignment-based maps, such as the one produced by the Mouse Genome Sequencing Consortium. Notably, our approach is much faster: it can compare two entire mammalian genomes within hours on a single Pentium IV personal computer.
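The core idea of using shared unique k-mers as anchors can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (unique_kmers, shared_unique_anchors, collinear_runs) and the max_gap parameter are hypothetical, the grouping rule is a simple stand-in for the distribution-pattern analysis described above, only the forward strand is considered, and the in-memory counting shown here would not scale to whole mammalian genomes with k = 16.

```python
from collections import Counter


def unique_kmers(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    starts = range(len(seq) - k + 1)
    counts = Counter(seq[i:i + k] for i in starts)
    return {seq[i:i + k]: i for i in starts if counts[seq[i:i + k]] == 1}


def shared_unique_anchors(seq_a, seq_b, k):
    """Return (pos_a, pos_b) pairs for k-mers unique in both sequences,
    sorted by position in seq_a."""
    ua = unique_kmers(seq_a, k)
    ub = unique_kmers(seq_b, k)
    return sorted((ua[m], ub[m]) for m in ua.keys() & ub.keys())


def collinear_runs(anchors, max_gap):
    """Group anchors (sorted by pos_a) into runs in which pos_b also
    increases and neither coordinate jumps by more than max_gap.
    Assumption: this simple rule stands in for conserved-segment detection."""
    runs = []
    for a, b in anchors:
        if runs:
            prev_a, prev_b = runs[-1][-1]
            if a - prev_a <= max_gap and 0 < b - prev_b <= max_gap:
                runs[-1].append((a, b))
                continue
        runs.append([(a, b)])
    return runs


if __name__ == "__main__":
    # Toy sequences with k = 4; the method described above uses k = 16.
    anchors = shared_unique_anchors("AACCGGTT", "TTAACCGGAT", 4)
    print(anchors)
    print(collinear_runs(anchors, max_gap=5))
```

In this toy setting, each run of collinear anchors plays the role of a candidate conserved segment. No alignment is computed at any point: only exact k-mer matches and their coordinates are used.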