Publication

Implications of Strain- and Species-Level Sequence Divergence for Community and Isolate Shotgun Proteomic Analysis...

by Vincent Denef, Manesh B Shah, Nathan C Verberkmoes, Robert L Hettich, Jillian Banfield
Publication Type
Journal
Journal Name
Journal of Proteome Research
Publication Date
Page Numbers
3152 to 3161
Volume
6
Issue
8

The recent surge in microbial genomic sequencing, combined with the development of high-throughput
liquid chromatography-mass-spectrometry-based (LC/LC-MS/MS) proteomics, has raised the question
of the extent to which genomic information of one strain or environmental sample can be used to
profile proteomes of related strains or samples. Even with decreasing sequencing costs, it remains
impractical to obtain genomic sequence for every strain or sample analyzed. Here, we evaluate how
shotgun proteomics is affected by amino acid divergence between the sample and the genomic database
using a probability-based model and a random mutation simulation model constrained by experimental
data. To assess the effects of nonrandom distribution of mutations, we also evaluated identification
levels using in silico peptide data from sequenced isolates with average amino acid identities (AAI)
varying between 76 and 98%. We compared the predictions to experimental protein identification levels
for a sample that was evaluated using a database that included genomic information for the dominant
organism and for a closely related variant (95% AAI). The range of models set the boundaries at
which half of the proteins in a proteomic experiment can be identified to be 77-92% AAI between
orthologs in the sample and database. Consistent with this prediction, experimental data indicated
loss of half the identifiable proteins at 90% AAI. Additional analysis indicated a 6.4% reduction of the
initial protein coverage per 1% amino acid divergence and total identification loss at 86% AAI.
Consequently, shotgun proteomics is capable of cross-strain identifications but avoids most cross-species
false positives.
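
The abstract does not give the form of the probability-based model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that substitutions fall independently at every residue, that a peptide can be matched only if it contains no substitution relative to the database, and that a protein counts as identified when at least two of its peptides match. The function names, the two-peptide threshold, and the peptide lengths in the usage example are all hypothetical choices made for illustration.

```python
from typing import Sequence


def peptide_intact_probability(identity: float, length: int) -> float:
    """Chance that a peptide of `length` residues carries no substitution.

    Assumes independent substitutions at each residue, with `identity`
    given as a fraction (0.95 for 95% AAI). This is an illustrative
    assumption, not the model from the paper.
    """
    return identity ** length


def protein_identifiable_probability(
    identity: float,
    peptide_lengths: Sequence[int],
    min_peptides: int = 2,  # hypothetical identification threshold
) -> float:
    """Chance that at least `min_peptides` peptides of a protein stay intact."""
    # dist[k] = probability that exactly k peptides seen so far are intact.
    dist = [1.0]
    for length in peptide_lengths:
        p = peptide_intact_probability(identity, length)
        updated = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            updated[k] += q * (1.0 - p)  # this peptide carries a substitution
            updated[k + 1] += q * p      # this peptide is unchanged
        dist = updated
    return sum(dist[min_peptides:])


if __name__ == "__main__":
    # Hypothetical protein with five detectable peptides of these lengths.
    lengths = [8, 10, 12, 15, 20]
    for aai in (0.98, 0.95, 0.90, 0.85, 0.76):
        prob = protein_identifiable_probability(aai, lengths)
        print(f"AAI {aai:.0%}: P(identified) = {prob:.3f}")
```

Under these assumptions, longer peptides are lost fastest as identity drops, since the chance of a peptide remaining unchanged decays geometrically with its length. A real analysis would also need to account for the nonrandom distribution of mutations that the abstract mentions, which this sketch ignores.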