Non-metric analysis of temporal patterns captured in microarray data

Y-h. Taguchi1, Y. Oono2
1tag@granular.com, Department of Physics, Chuo University; 2y-oono@uiuc.edu, Department of Physics, UIUC

The gene activities in the transcriptional response of cell cycle-synchronized human fibroblasts to serum [Iyer et al. Science 283, 83-87 (1999)] are analyzed with our novel nonmetric multidimensional scaling (nMDS) algorithm. Although principal component analysis of the microarray data cannot clearly unravel the temporal order in the data, our intrinsically nonlinear method unambiguously gives a ring-like arrangement of the genes, along which the gene activity peak rotates in time in an orderly fashion. Although our method is fully unsupervised, the results obtained are comparable to those that would be obtained by detailed Fourier analysis (cf. http://www.granular.com/MDS/fig_ISMB2003.pdf ). Thus, our result emphasizes that data mining is intrinsically a nonlinear analysis, for which nonmetric MDS could be a powerful tool. In particular, our novel nMDS algorithm is maximally nonmetric and is designed for large-scale data sets, so it could be a useful data mining method for bioinformatics data as well as for other biological data sets, such as ecological, phylogenetic, and biochemical ones.
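
To illustrate what "nonmetric" means in this setting, the following Python sketch may help. It is not the nMDS algorithm of this abstract, only a toy demonstration of the input such a method relies on: the rank order of pairwise dissimilarities. Everything in it is an assumption made for illustration: the synthetic cosine profiles with gene-specific phases, the choice of 1 minus the Pearson correlation as the dissimilarity, and the function names. It checks two things: that the rank order is unchanged by a monotone transformation of the dissimilarities, and that the cyclic (ring-like) neighbor structure of the genes can be read from that rank order alone.

```python
import math


def profiles(n_genes=12, n_times=12):
    """Synthetic profiles: one cosine period per gene, each with its own phase."""
    data = []
    for g in range(n_genes):
        # Hypothetical irregular phase (as a fraction of one cycle). The jitter
        # keeps the genes in order around the cycle but avoids exact ties.
        phase = (g + 0.3 * math.sin(g)) / n_genes
        data.append([math.cos(2 * math.pi * (t / n_times - phase))
                     for t in range(n_times)])
    return data


def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def dissimilarities(data):
    """Map each gene pair (i, j), i < j, to 1 - correlation (an assumed choice)."""
    n = len(data)
    return {(i, j): 1.0 - correlation(data[i], data[j])
            for i in range(n) for j in range(i + 1, n)}


def rank_order(d):
    """Gene pairs sorted from most similar to least similar."""
    return sorted(d, key=d.get)


def nearest_neighbor(d, g, n_genes):
    """Index of the gene with the smallest dissimilarity to gene g."""
    others = [j for j in range(n_genes) if j != g]
    return min(others, key=lambda j: d[(min(g, j), max(g, j))])


if __name__ == "__main__":
    n = 12
    d = dissimilarities(profiles(n_genes=n))
    # A monotone transform changes the values but not their ordering.
    transformed = {pair: math.exp(value) for pair, value in d.items()}
    print("rank order preserved:", rank_order(d) == rank_order(transformed))
    print("nearest neighbors:", [nearest_neighbor(d, g, n) for g in range(n)])
```

In this toy setting the nearest neighbor of each gene is always one of its two neighbors in phase, so an embedding that respects only the rank order already has the information needed to place the genes around a ring.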