I will present novel methods that assist in identifying peptides from tandem mass spectrometry (MS/MS) data. MS/MS is a critical tool for determining the proteins within biological samples. It gives biologists information about the proteins present in different organisms, which they use in many kinds of biological experiments. Unfortunately, MS/MS data is noisy and imprecise, which makes identification extremely challenging.
I will first discuss two approaches to identifying peptides in MS/MS samples: de novo analysis (attempting to match the data to a theoretical model) and database identification (matching the data only against the peptides in a database). Both techniques are slow because of the amount of processing required per sample. We have therefore developed a method for accurately filtering bad samples out of the identification process. In particular, our experiments suggest that we can reduce identification time by more than 30% while preserving over 99.9% accuracy. Expert practitioners have deemed these results accurate enough for filtering general MS/MS data. We have also experimented with the tradeoffs between using a general filter for all the data and building a custom filter for each data set.
This is joint work with Kobus Barnard, Linda Breci, Paul Haynes, Josh Kittleson, and the Proteomics Laboratory at the University of Arizona.