Abstract
State-of-the-art experimental and computational proteomic approaches were integrated to obtain a comprehensive protein profile of Populus vascular tissue. The study featured: 1) a large sample set consisting of two genotypes grown under normal and tension stress conditions, 2) bioinformatic clustering to handle gene duplication effectively, and 3) an informatics approach to track and identify single amino acid polymorphisms (SAAPs). Applying a clustering algorithm to the Populus database reduced the number of protein entries from 64,689 proteins to 43,069 protein groups, which in turn reduced the 7,505 identified proteins to 4,226 protein groups, of which 2,016 were singletons. This reduction implies that ~50% of the measured proteins were clustered into groups that shared extensive sequence homology. Using conservative search criteria, we identified 1,354 peptides containing a SAAP and 201 peptides that become tryptic due to a K or R substitution. These newly identified peptides correspond to 502 proteins, including 97 proteins that were not previously identified. In total, integrating deep proteome measurements on an extensive sample set with protein clustering and peptide sequence variants provided an unprecedented level of proteome characterization for Populus, allowing us to spatially resolve the vascular tissue proteome.
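
To illustrate what it means for a peptide to "become tryptic due to a K or R substitution", the following minimal Python sketch digests a reference sequence and a single-residue variant in silico and reports the peptides that appear only in the variant. It is not the pipeline used in this study. The cleavage rule (cut after K or R unless the next residue is P) is a common simplification of trypsin specificity, and the function names and example sequence are hypothetical.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P.

    Simplified trypsin rule (an assumption); no missed cleavages are modeled.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def substitute(sequence, position, new_residue):
    """Return the sequence with the residue at a 1-based position replaced."""
    i = position - 1
    return sequence[:i] + new_residue + sequence[i + 1:]


def new_tryptic_peptides(reference, position, new_residue):
    """List peptides produced by the variant but absent from the reference digest."""
    variant = substitute(reference, position, new_residue)
    known = set(tryptic_digest(reference))
    return [p for p in tryptic_digest(variant) if p not in known]


# Hypothetical example sequence: a Q-to-K substitution at position 5
# introduces a new cleavage site and therefore two new tryptic peptides.
reference = "MAGTQLLSEQR"
print(new_tryptic_peptides(reference, 5, "K"))
```

In this toy case the reference yields a single tryptic peptide, whereas the variant yields two shorter ones, neither of which would be matched by a search against the reference sequence alone.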