Comparing Patterns in Gene Expression in Longitudinal Array Experiments Using a Novel Algorithm, TAPiR: Time-course Algorithm for Pattern Recognition

Catherine Campbell1, Raj Lingam2, Yang Fann
1campbelc@ninds.nih.gov, NIH-NINDS; 2lingamr@ninds.nih.gov, NIH-NINDS

Time-course experiments pose a unique challenge to bioinformatics. Over the course of an experiment, complex patterns in gene expression can arise that, taken as a whole, are more striking than any individual pairwise comparison of time points. We seek to provide a fast, easy, yet powerful approach to identifying, clustering, and visualizing the complex gene expression patterns unique to longitudinal experiments. We have developed and implemented a novel algorithm, TAPiR (Time-course Algorithm for Pattern Recognition), to detect patterns in gene expression in longitudinal microarray experiments. At each time point, each gene is classified as up-regulated (U), down-regulated (D), or showing no change (N), either relative to a single reference time point or in a cross-comparison of all time points. The letters generated at each time point are then arranged in time-course order, so that a letter sequence, or “word”, represents the expression pattern of the gene over time. These words are “monosyllabic” when every time point is compared to a single reference time point, and “multi-syllabic” when all time points in an experiment are cross-compared. The words can be sorted alphabetically to generate lists of genes with similar expression patterns, as well as lists of genes with opposing expression patterns over time. Changes in gene expression in TAPiR can be called either by setting a fold-change threshold (for experiments with one sample per time point) or by calculating t-test p-values and setting a p-value threshold (for experiments with at least three samples per time point). TAPiR handles time-course data in a more robust manner than traditional hierarchical or k-means clustering because it gives a better view of subtle patterns in gene expression over time and lets the user see small as well as large clusters. Algorithms and code are available from the corresponding author upon request. Algorithms will be available through the NINDS website at http://itp.ninds.nih.gov/TAPiR/.
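The word-building idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the default two-fold threshold, the use of the first time point as reference, the hyphen used to separate syllables, and the toy expression values are all assumptions made for illustration. Only the fold-change mode (one sample per time point) is shown; the t-test mode is omitted.

```python
from collections import defaultdict
from math import log2


def call_change(reference, value, fold_threshold=2.0):
    """Return 'U', 'D', or 'N' for value relative to reference.

    Assumes positive, linear-scale expression values. The 2-fold
    default is an illustrative choice, not a TAPiR setting.
    """
    log_ratio = log2(value / reference)
    cutoff = log2(fold_threshold)
    if log_ratio >= cutoff:
        return "U"
    if log_ratio <= -cutoff:
        return "D"
    return "N"


def monosyllabic_word(series, ref_index=0, fold_threshold=2.0):
    """One letter per non-reference time point, in time order."""
    reference = series[ref_index]
    return "".join(
        call_change(reference, value, fold_threshold)
        for i, value in enumerate(series)
        if i != ref_index
    )


def multisyllabic_word(series, fold_threshold=2.0):
    """One syllable per time point, each comparing it to all later ones.

    Joining syllables with '-' is an assumed convention.
    """
    syllables = []
    for i in range(len(series) - 1):
        syllables.append(
            "".join(
                call_change(series[i], series[j], fold_threshold)
                for j in range(i + 1, len(series))
            )
        )
    return "-".join(syllables)


def opposite_word(word):
    """Swap U and D to find the word of the opposing pattern."""
    return word.translate(str.maketrans("UD", "DU"))


def group_by_word(expression, word_fn=monosyllabic_word):
    """Group gene names by word, with words in alphabetical order."""
    groups = defaultdict(list)
    for gene, series in expression.items():
        groups[word_fn(series)].append(gene)
    return dict(sorted(groups.items()))


# Hypothetical toy data: one expression value per time point.
expression = {
    "geneA": [1.0, 2.5, 4.0, 1.1],
    "geneB": [4.0, 1.5, 1.0, 3.9],
    "geneC": [1.0, 1.1, 0.9, 1.0],
}
groups = group_by_word(expression)
```

With this toy data, geneA and geneB receive words that are mirror images of each other under `opposite_word`, which is how lists of opposing patterns could be looked up once genes are grouped by word. Because every distinct word forms its own group, a pattern shared by a single gene is kept alongside patterns shared by many.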