PrepMS: Mass Spectrometry Graphical Preprocessing Tool
Yuliya V. Karpievitch (Biostatistics and Applied Math Dept., M.D. Anderson Cancer Center, UTH)
Elizabeth G. Hill (Biostatistics, Bioinformatics and Epidemiology Dept., Medical University of South Carolina)
Jeffrey S. Morris (Biostatistics and Applied Math Dept., M.D. Anderson Cancer Center, UTH)
Kevin R. Coombes (Biostatistics and Applied Math Dept., M.D. Anderson Cancer Center, UTH)
Keith Baggerly (Biostatistics and Applied Math Dept., M.D. Anderson Cancer Center, UTH)
Jonas S. Almeida (Biostatistics and Applied Math Dept., M.D. Anderson Cancer Center, UTH)
Short Abstract: PrepMS is a stand-alone graphical application for easy feature extraction (peak detection) from mass spectrometry data. The algorithm denoises the spectra with translation-invariant wavelet transforms and detects peaks on the mean spectrum. Mass spectrogram views let users inspect individual spectra, the mean spectrum and the detected peaks.
Mass spectrometry (MS) data are widely used to discover disease-related biomarkers in easily obtainable bodily fluids such as urine, saliva or serum. The reliability and reproducibility of biomarker identification depend heavily on the preprocessing of the data obtained from the MS instruments. Preprocessing mass spectra requires the following basic steps: calibration, denoising, baseline correction, normalization and peak detection. Calibration aligns multiple spectra by interpolating intensity values onto a specified m/z interval. The need to calibrate arises because the mass spectrometer produces spectra of varying length on different runs, even when the m/z range is the same. Denoising removes the electrical and chemical white noise. Baseline correction removes the baseline artifact characteristic of the matrix molecules. Normalization enables the comparison of peaks across spectra by removing differences in the total amount of protein desorbed from the plate. Peak detection, the final step, extracts the features used for analysis.
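The calibration and normalization steps described above can be sketched as follows. This is a minimal illustration, not PrepMS code: the function names are hypothetical, linear interpolation is assumed as the interpolation scheme, and scaling by total intensity is assumed as the normalization rule.

```python
from bisect import bisect_left


def interpolate_spectrum(mz, intensity, grid):
    """Linearly interpolate one spectrum onto a shared m/z grid.

    Assumes mz is sorted in strictly ascending order. Grid points
    outside [mz[0], mz[-1]] are clamped to the nearest end value.
    """
    out = []
    for g in grid:
        if g <= mz[0]:
            out.append(intensity[0])
        elif g >= mz[-1]:
            out.append(intensity[-1])
        else:
            j = bisect_left(mz, g)  # first index with mz[j] >= g
            t = (g - mz[j - 1]) / (mz[j] - mz[j - 1])
            out.append(intensity[j - 1] + t * (intensity[j] - intensity[j - 1]))
    return out


def normalize_total_intensity(intensity):
    """Scale a spectrum so that its intensities sum to 1.

    Assumed normalization rule; an all-zero spectrum is returned unchanged.
    """
    total = sum(intensity)
    if total == 0:
        return list(intensity)
    return [v / total for v in intensity]


# Two runs over the same m/z range but with different lengths
# end up on the same grid and become directly comparable.
grid = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]
run_a = interpolate_spectrum([1000.0, 2000.0, 3000.0], [0.0, 10.0, 20.0], grid)
run_b = interpolate_spectrum([1000.0, 3000.0], [4.0, 8.0], grid)
```

After interpolation both runs have one intensity value per grid point, so they can be averaged, normalized or displayed side by side.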
A simple-to-use graphical tool is needed to let researchers easily prepare mass spectrometry data for analysis. We present PrepMS, a graphical program for easy feature extraction (peak detection) from mass spectrometry data. The algorithm denoises the spectra with translation-invariant wavelet transforms and detects peaks on the mean spectrum. For researchers unfamiliar with wavelet transforms, the graphical executable provides default parameter settings that were determined experimentally to work well in most situations and that the user can change if desired. PrepMS is a freely available, open-source stand-alone application released under a public license for unrestricted use and modification. Its graphical user interface and default parameter settings allow PrepMS to be used effectively either as a preprocessing tool or as a simple visual inspection tool for assessing data quality and eliminating spectra below a user-defined threshold. A heat map of the spectra provides detailed insight into the calibration. A mass spectrogram view displays two plots chosen from the list of mean, denoised, and denoised and baseline-corrected spectra. Two result plots show, in the first, the baseline overlaid on the mean, denoised, or denoised and baseline-corrected spectrum and, in the second, the detected peaks overlaid. The identified peaks are stored in the text file peaks.txt and the corresponding m/z values in the mz.txt file (the user can specify other file names).
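The idea of denoising with a translation-invariant wavelet transform and then detecting peaks on the mean spectrum can be sketched as below. This is a simplified stand-in under stated assumptions, not the PrepMS implementation: it uses a single-level undecimated Haar transform with hard thresholding and circular boundaries, a plain local-maximum rule for peaks, and hypothetical function names and parameters (threshold, min_intensity).

```python
def ti_haar_denoise(x, threshold):
    """One-level undecimated (translation-invariant) Haar denoising.

    Detail coefficients with magnitude <= threshold are set to zero
    (hard thresholding). The signal is treated as circular.
    """
    n = len(x)
    approx = [(x[i] + x[(i + 1) % n]) / 2 for i in range(n)]
    detail = [(x[i] - x[(i + 1) % n]) / 2 for i in range(n)]
    detail = [d if abs(d) > threshold else 0.0 for d in detail]
    # Each sample can be rebuilt from two shifted coefficient pairs;
    # averaging the two makes the result independent of the shift.
    return [
        ((approx[i] + detail[i]) + (approx[i - 1] - detail[i - 1])) / 2
        for i in range(n)
    ]


def mean_spectrum(spectra):
    """Pointwise mean of spectra that share the same m/z grid."""
    count = len(spectra)
    return [sum(column) / count for column in zip(*spectra)]


def detect_peaks(mz, intensity, min_intensity):
    """Return (m/z, intensity) pairs at local maxima above min_intensity."""
    peaks = []
    for i in range(1, len(intensity) - 1):
        is_local_max = (
            intensity[i] > intensity[i - 1] and intensity[i] >= intensity[i + 1]
        )
        if is_local_max and intensity[i] >= min_intensity:
            peaks.append((mz[i], intensity[i]))
    return peaks


# Peaks are located once, on the mean of all calibrated spectra.
mz = [1000.0 + 100.0 * i for i in range(9)]
spectra = [
    [0, 1, 5, 1, 0, 2, 8, 2, 0],
    [0, 1, 7, 1, 0, 2, 6, 2, 0],
]
mean = mean_spectrum(spectra)
peaks = detect_peaks(mz, mean, min_intensity=3.0)
```

Detecting peaks on the mean rather than on each spectrum yields one shared peak list, so every spectrum is later quantified at the same m/z locations.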
PrepMS is a user-friendly graphical preprocessing tool that implements a robust peak identification algorithm based on the mean spectrum. Because it provides sensible default parameters, users no longer need to know peak detection algorithms or the intricate details of the underlying wavelet denoising. In summary, PrepMS provides a straightforward, fully automated graphical user interface for mass spectrometry data preprocessing.