Reliable feature extraction from mechanically spotted two-color microarrays.

Yuching Lai¹, Greg Tyrelle², Daniel Di Giusto, Garry C. King
¹yuching@kinglab.unsw.edu.au, UNSW; ²greg@kinglab.unsw.edu.au, UNSW

Spotted DNA microarrays offer small labs a cost-effective entry into expression profiling, but reliable data extraction remains a significant problem. Spot inhomogeneity (varying probe density at a microscopic level), a characteristic feature of mechanically spotted arrays, produces distinct pixel-by-pixel correlations that can be exploited for more reliable extraction of red/green expression ratios. Our Python-scripted feature processing pipeline consists of five stages: (1) culling saturated pixels from spots; (2) culling unreliable spots by Pearson product-moment correlation coefficient; (3) linear and non-linear pixel fits to extract red/green ratios and identify dye-selective quenching; (4) slide-based normalization; (5) merging of equivalent data from alternative scanner settings. We first present the experimental and theoretical justification for this approach, then illustrate its robustness using data from 197 replicated G3PDH housekeeping control spots (approx. 10,000 pixels) and numerous other control spots per slide over 14 slides. Performance in terms of ratio reliability and false positive/negative spot assignments is compared with other methods, including conventional spot-based background subtraction and spot ratio variability (SRV) approaches.
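
As an illustration only, the first three stages might be sketched for a single spot as follows. This is not the authors' code: the function name spot_ratio, the 16-bit saturation ceiling, the correlation cutoff of 0.8, and the minimum pixel count are all assumptions made for the example, and only the linear fit is shown. The red/green ratio is taken as the slope of a least-squares line through the pixel pairs, with the intercept absorbing any constant background offset between channels.

```python
from math import sqrt

SATURATION = 65535    # assumption: 16-bit scanner ceiling
MIN_PEARSON_R = 0.8   # hypothetical cutoff, not taken from the abstract
MIN_PIXELS = 10       # hypothetical minimum number of usable pixels


def spot_ratio(red, green, saturation=SATURATION, min_r=MIN_PEARSON_R):
    """Return (ratio, intercept, pearson_r) for one spot, or None if culled.

    red and green are equal-length sequences of pixel intensities
    taken from the same pixel positions within the spot.
    """
    # Stage 1: drop pixels that are saturated in either channel.
    pairs = [(g, r) for r, g in zip(red, green)
             if r < saturation and g < saturation]
    n = len(pairs)
    if n < MIN_PIXELS:
        return None

    mean_g = sum(g for g, _ in pairs) / n
    mean_r = sum(r for _, r in pairs) / n
    s_gg = sum((g - mean_g) ** 2 for g, _ in pairs)
    s_rr = sum((r - mean_r) ** 2 for _, r in pairs)
    s_gr = sum((g - mean_g) * (r - mean_r) for g, r in pairs)
    if s_gg == 0 or s_rr == 0:
        return None  # a flat channel carries no correlation information

    # Stage 2: cull the spot if red and green pixels are poorly correlated.
    pearson_r = s_gr / sqrt(s_gg * s_rr)
    if pearson_r < min_r:
        return None

    # Stage 3 (linear case): slope of red on green is the expression ratio.
    slope = s_gr / s_gg
    intercept = mean_r - slope * mean_g
    return slope, intercept, pearson_r


# Synthetic spot: red is twice green plus an offset, with one saturated pixel.
green = [500 + 100 * i for i in range(50)] + [65535]
red = [2 * g + 300 for g in green[:50]] + [65535]
result = spot_ratio(red, green)
```

Because the ratio comes from the slope rather than from a quotient of mean intensities, a constant background in either channel shifts the intercept without changing the estimate. Slide-based normalization and merging of scans taken at different scanner settings (stages 4 and 5) are not covered by this sketch.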