In search of genomic clusters of human co-expressed genes using microarray gene expression data

Johannes Olson1, Per Broberg2, Krzysztof Pawlowski
1Johannes.EXT.Olson@astrazeneca.com, AstraZeneca; 2Per.Broberg@astrazeneca.com, AstraZeneca

A number of recent reports indicate that genomic clusters of co-expressed genes can be found in higher eukaryotes, including the fruit fly and humans. Here, an attempt was made to identify such clusters of human co-expressed genes using the GeneLogic library of gene expression data measured with Affymetrix HG_U133 microarrays. The results were compared to previous studies that used EST and SAGE libraries.
Co-expression was analyzed using the average and maximum Pearson correlation of expression profiles for all pairs of genes within sliding genomic windows, both for each strand separately and for the two strands considered together. Several fixed-length window sizes were tested, and a size of 200 kbp was used. To account for both possible tissue-specific co-expression and biological-process-specific co-expression, two GeneLogic expression datasets were used: the first containing samples from normal human tissues, and the second containing samples from both normal and diseased tissues. The statistical significance of the average and maximum expression correlation within sliding windows was estimated by sampling randomly selected gene sets whose sizes matched those of the gene sets found within the genomic windows.
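The windowed correlation analysis and its sampling-based significance estimate can be illustrated with a minimal Python sketch. It is not the implementation used in this study. The function names, the 50 kbp window step, the number of random samples, the add-one correction in the empirical p-value, and the treatment of flat profiles as uncorrelated are all assumptions made for illustration. Strand-specific analysis would correspond to filtering the gene list by strand before calling `sliding_windows`.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return sxy / sqrt(sxx * syy)


def window_scores(profiles):
    """Average and maximum pairwise correlation over a set of profiles."""
    r = [pearson(a, b) for a, b in combinations(profiles, 2)]
    return sum(r) / len(r), max(r)


def sliding_windows(genes, size=200_000, step=50_000):
    """Yield (start, gene_ids) for fixed-length windows holding >= 2 genes.

    genes: list of (gene_id, position) pairs on one chromosome.
    The step value is hypothetical; the abstract states only the window size.
    """
    if not genes:
        return
    last = max(pos for _, pos in genes)
    start = 0
    while start <= last:
        ids = [g for g, pos in genes if start <= pos < start + size]
        if len(ids) >= 2:
            yield start, ids
        start += step


def empirical_p(observed_avg, k, expression, n_samples=1000, rng=None):
    """Share of random k-gene sets with average correlation >= observed_avg.

    expression: dict mapping gene_id to its expression profile.
    """
    rng = rng or random.Random(0)
    all_ids = list(expression)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(all_ids, k)
        avg, _max_r = window_scores([expression[g] for g in sample])
        if avg >= observed_avg:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_samples + 1)
```

For each window, `window_scores` would be applied to the profiles of the genes it contains, and `empirical_p` would then compare the observed average against random gene sets of the same size; the same sampling scheme applies to the maximum correlation.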
The identified genomic clusters of neighbouring genes showing significant similarity in expression profiles were analyzed for several features. First, evidence of gene duplication was checked. Second, common subcellular localization and common functional themes were verified using Gene Ontology (GO). Third, co-expressed genomic clusters were analyzed for evidence of conservation between human and mouse. The results were discussed in the context of previous work that characterized genomic clusters as consisting of highly expressed genes and housekeeping genes.