The completion of the sequencing of the human genome and the concurrent, rapid development of high-throughput proteomic methods have resulted in an increasing need for automated approaches to archive proteomic data in a repository that enables the exchange of data among researchers and also accurate integration with genomic data. PeptideAtlas addresses these needs by identifying peptides by tandem mass spectrometry (MS/MS), statistically validating those identifications and then mapping identified sequences to the genomes of eukaryotic organisms. A meaningful comparison of data across different experiments generated by different groups using different types of instruments is enabled by the implementation of a uniform analytic process. This uniform statistical validation ensures a consistent and high-quality set of peptide and protein identifications. The raw data from many diverse proteomic experiments are made available in the associated PeptideAtlas repository in several formats. Here we present a summary of our process and details about the Human, Drosophila and Yeast PeptideAtlas builds.
A notable inefficiency of shotgun proteomics experiments is the repeated rediscovery of the same identifiable peptides by sequence database searching methods, which often are time-consuming and error-prone. A more precise and efficient method, in which previously observed and identified peptide MS/MS spectra are catalogued and condensed into searchable spectral libraries to allow new identifications by spectral matching, is seen as a promising alternative. To that end, an open-source, functionally complete, high-throughput and readily extensible MS/MS spectral searching tool, SpectraST, was developed. A high-quality spectral library was constructed by combining the high-confidence identifications of millions of spectra taken from various data repositories and searched using four sequence search engines. The resulting library consists of over 30 000 spectra for Saccharomyces cerevisiae. Using this library, SpectraST vastly outperforms the sequence search engine SEQUEST in terms of speed and the ability to discriminate good and bad hits. A unique advantage of SpectraST is its full integration into the popular Trans Proteomic Pipeline suite of software, which facilitates user adoption and provides important functionalities such as peptide and protein probability assignment, quantification, and data visualization. This method of spectral library searching is especially suited for targeted proteomics applications, offering superior performance to traditional sequence searching.
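The idea of identifying a spectrum by matching it against a library of previously identified spectra can be illustrated with a minimal sketch. It scores a query peak list against each library entry by a normalized dot product over binned m/z values. This is a generic similarity measure chosen for illustration, not SpectraST's actual scoring function; the function names, the dictionary layout of a library entry, the bin width, and the precursor tolerance are all assumptions.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins: {bin_index: intensity}."""
    binned = {}
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        binned[idx] = binned.get(idx, 0.0) + intensity
    return binned


def dot_score(query, reference, bin_width=1.0):
    """Normalized dot product (cosine similarity) of two peak lists, in [0, 1]."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    numerator = sum(q[i] * r[i] for i in q.keys() & r.keys())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in r.values())
    )
    return numerator / norm if norm else 0.0


def search_library(query, library, precursor_mz, tolerance=3.0):
    """Rank library entries whose precursor m/z is within `tolerance` of the query's.

    Each library entry is a hypothetical dict with the keys
    'peptide', 'precursor_mz' and 'peaks' (a list of (m/z, intensity) pairs).
    Returns (score, peptide) tuples, best match first.
    """
    hits = [
        (dot_score(query, entry["peaks"]), entry["peptide"])
        for entry in library
        if abs(entry["precursor_mz"] - precursor_mz) <= tolerance
    ]
    return sorted(hits, reverse=True)
```

Restricting candidates by precursor m/z before scoring keeps the number of comparisons small, which is one reason library searching can be faster than searching a full sequence database.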
In many studies, particularly in the field of systems biology, it is essential that identical protein sets are precisely quantified in multiple samples such as those representing differentially perturbed cell states. The high degree of reproducibility required for such experiments has not been achieved by classical mass spectrometry-based proteomics methods. In this study we describe the implementation of a targeted quantitative approach by which predetermined protein sets are first identified and subsequently quantified reliably and at high sensitivity in multiple samples. This approach consists of three steps. First, the proteome is extensively mapped out by multidimensional fractionation and tandem mass spectrometry, and the data generated are assembled in the PeptideAtlas database. Second, based on this proteome map, peptides uniquely identifying the proteins of interest, proteotypic peptides, are selected, and multiple reaction monitoring (MRM) transitions are established and validated by MS2 spectrum acquisition. This process of peptide selection, transition selection, and validation is supported by a suite of software tools, TIQAM (Targeted Identification for Quantitative Analysis by MRM), described in this study. Third, the selected target protein set is quantified in multiple samples by MRM. Applying this approach we were able to reliably quantify low abundance virulence factors from cultures of the human pathogen Streptococcus pyogenes exposed to increasing amounts of plasma. The resulting quantitative protein patterns enabled us to clearly define the subset of virulence proteins that is regulated upon plasma exposure. Molecular & Cellular Proteomics 7:1489-1500, 2008. A key element of the experimental framework for systems biology is the comprehensive, quantitative measurement of whole biological systems in differentially perturbed states (1).
Among the different types of measurements possible, protein quantification is particularly informative because proteins catalyze or control the majority of cellular functions. Currently the most widely applied quantitative proteome analysis technologies consist of the labeling of the samples by stable isotopes, the reproducible separation of complex peptide mixtures, usually by capillary LC, and the identification and quantification of selected peptides by tandem mass spectrometry and sequence database searching (2, 3). Relative quantitative values are generated by these methods if two or more samples are being compared, and absolute quantification can be achieved if suitable, calibrated reference samples are available (4). Using such shotgun methods, in each measurement only a fraction of the analytes present in a complex sample is identified and quantified. Peptide ions are selected by the mass spectrometer automatically based on precursor ion signal intensities. Due to a multitude of factors, including interference between analytes and variations in precursor ion spectra, the selection of peptides is not reproducible in consecutive runs in particular for peptides...
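The peptide-selection step of the targeted workflow described in this abstract, choosing proteotypic peptides that uniquely identify the proteins of interest, can be sketched as a simple filter over a peptide-to-protein mapping. This is a simplified illustration under stated assumptions, not the TIQAM implementation: the function name, the input layout, and the rule "observed for exactly one protein" are assumptions, and a real selection would also weigh detectability and transition quality.

```python
from collections import defaultdict


def proteotypic_peptides(peptide_to_proteins, targets):
    """Return {protein: sorted peptides that map to that protein only}.

    peptide_to_proteins: hypothetical dict of peptide sequence -> set of
        protein identifiers the peptide maps to in the proteome map.
    targets: set of protein identifiers chosen for quantification.
    """
    selected = defaultdict(list)
    for peptide, proteins in peptide_to_proteins.items():
        # Keep only peptides that point to a single protein.
        if len(proteins) == 1:
            (protein,) = proteins
            if protein in targets:
                selected[protein].append(peptide)
    return {protein: sorted(peps) for protein, peps in selected.items()}
```

A target protein with no unique peptide is simply absent from the result, which flags it as not quantifiable by this criterion.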
Peptides derived from protein tandem mass spectrometry data have been mapped to the human genome sequence forming an expandable resource for the proteomic data.
Adjustment of physiology in response to changes in oxygen availability is critical for the survival of all organisms. However, the chronology of events and the regulatory processes that determine how and when changes in environmental oxygen tension result in an appropriate cellular response are not well understood at a systems level. Therefore, transcriptome, proteome, ATP, and growth changes were analyzed in a halophilic archaeon to generate a temporal model that describes the cellular events that drive the transition between the organism's two opposing cell states of anoxic quiescence and aerobic growth. According to this model, upon oxygen influx, an initial burst of protein synthesis precedes ATP and transcription induction, rapidly driving the cell out of anoxic quiescence, culminating in the resumption of growth. This model also suggests that quiescent cells appear to remain actively poised for energy production from a variety of different sources. Dynamic temporal analysis of relationships between transcription and translation of key genes suggests several important mechanisms for cellular sustenance under anoxia as well as specific instances of post-transcriptional regulation.
Peptide identifications of high probability from 28 LC-MS/MS human serum and plasma experiments from eight different laboratories, carried out in the context of the HUPO Plasma Proteome Project, were combined and mapped to the EnsEMBL human genome. The 6929 distinct observed peptides were mapped to approximately 960 different proteins. The resulting compendium of peptides and their associated samples, proteins, and genes is made publicly available as a reference for future research on human plasma.
Cellular response to stress entails complex mRNA and protein abundance changes, which translate into physiological adjustments to maintain homeostasis as well as to repair and minimize damage to cellular components. We have characterized the response of the halophilic archaeon Halobacterium salinarum NRC-1 to ⁶⁰Co ionizing gamma radiation in an effort to understand the correlation between genetic information processing and physiological change. The physiological response model we have constructed is based on integrated analysis of temporal changes in global mRNA and protein abundance along with protein-DNA interactions and evolutionarily conserved functional associations. This systems view reveals cooperation among several cellular processes including DNA repair, increased protein turnover, apparent shifts in metabolism to favor nucleotide biosynthesis and an overall effort to repair oxidative damage. Further, we demonstrate the importance of the time dimension when correlating mRNA and protein levels and suggest that steady-state comparisons may be misleading when assessing the dynamics of genetic information processing across transcription and translation.
The relatively small numbers of proteins and fewer possible posttranslational modifications in microbes provide a unique opportunity to comprehensively characterize their dynamic proteomes. We have constructed a Peptide Atlas (PA) for 62.7% of the predicted proteome of the extremely halophilic archaeon Halobacterium salinarum NRC-1 by compiling approximately 636,000 tandem mass spectra from 497 mass spectrometry runs in 88 experiments. Analysis of the PA with respect to biophysical properties of constituent peptides, functional properties of parent proteins of detected peptides, and performance of different mass spectrometry approaches has helped highlight plausible strategies for improving proteome coverage and selecting signature peptides for targeted proteomics. Notably, discovery of a significant correlation between absolute abundances of mRNAs and proteins has helped identify low abundance of proteins as the major limitation in peptide detection. Furthermore, we have discovered that iTRAQ labeling for quantitative proteomic analysis introduces a significant bias in peptide detection by mass spectrometry. Therefore, despite identifying at least one proteotypic peptide for almost all proteins in the PA, a context-dependent selection of proteotypic peptides appears to be the most effective approach for targeted proteomics.