Comprehensive Database of Mechanisms of Peptide Fragmentation

A Key Step towards Confident Identification of Proteins

Robert Mistrik

HighChem, Ltd., Cajakova 18, 81105 Bratislava, Slovakia

Introduction

Protein identification and peptide sequencing ultimately require an understanding of the mechanisms of gas-phase fragmentation. The major difficulty in structural characterization arises from the complexity of the chemistry of gaseous ions and the phenomena of unimolecular decomposition, which are steadily becoming better understood through intensive research in the field of proteomics. The accumulated knowledge of fragmentation mechanisms has been systematized in a reaction database (Fragmentation Library™), which provides a framework for protein identification and peptide sequencing. Advanced algorithms use the database to predict protonation/deprotonation sites and to generate nontrivial fragmentation pathways from user-provided structures. The system can be used in computer programs that derive sequence information from spectra by comparing experimental spectra with theoretical spectra created from sequences.
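
The comparison of experimental and theoretical spectra mentioned above can be illustrated with a minimal sketch. It is not the scoring used by any particular program: the function names, the plain peak-counting score, and the default tolerance of 0.5 m/z units are assumptions chosen for illustration only.

```python
from bisect import bisect_left


def count_matches(experimental_mz, theoretical_mz, tol=0.5):
    """Count theoretical peaks that have an experimental peak within +/- tol.

    Both inputs are plain lists of m/z values; intensities are ignored.
    The tolerance value is an illustrative assumption.
    """
    exp = sorted(experimental_mz)
    matched = 0
    for mz in theoretical_mz:
        # First experimental peak at or above the lower tolerance bound.
        i = bisect_left(exp, mz - tol)
        if i < len(exp) and exp[i] <= mz + tol:
            matched += 1
    return matched


def match_fraction(experimental_mz, theoretical_mz, tol=0.5):
    """Fraction of theoretical peaks explained by the experimental spectrum."""
    if not theoretical_mz:
        return 0.0
    return count_matches(experimental_mz, theoretical_mz, tol) / len(theoretical_mz)
```

A candidate sequence whose theoretical spectrum has a higher match fraction would be ranked above one with a lower fraction. Real search programs use more elaborate scores that also weigh intensities and ion types.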

Unimolecular Fragmentation Reactions of Peptides in Mechanistic Context

The widely used fragmentation model, which assumes that peptides fragment in a uniform manner, is adequate for a broad range of peptides. However, a considerable number of routinely observed spectra do not show contiguous series of backbone-cleavage sequence ions (a/b, y) because the dissociation patterns of peptides vary greatly. If a peptide lacks a complete pattern of backbone cleavages, computer programs based on theoretical spectra do not yield sequence information. The goal of our work is to enhance the predictability of peptide fragmentation and to show new strategies for automated protein identification.
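
For reference, the uniform model discussed above can be sketched as a simple ladder calculation of singly protonated b and y ions from a sequence. The sketch below assumes unmodified residues, monoisotopic masses, and charge state 1+; the function name by_ions is hypothetical and the code is not part of Mass Frontier.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def by_ions(sequence):
    """Return (b, y) lists of singly charged m/z values.

    b[i] is the b(i+1) ion and y[i] is the y(i+1) ion, for fragments
    of length 1 .. len(sequence) - 1.
    """
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    total = sum(masses)
    b, y = [], []
    running = 0.0
    for i in range(len(masses) - 1):
        running += masses[i]
        b.append(running + PROTON)                   # N-terminal fragment
        y.append(total - running + WATER + PROTON)   # complementary C-terminal fragment
    y.reverse()  # order as y1, y2, ...
    return b, y
```

The a ions follow from the b ions by subtracting the mass of CO (27.99491 Da). A spectrum that lacks most of these ladder peaks is exactly the case in which this model alone gives no sequence information.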

To obtain better theoretical spectra, fragmentation pathways were predicted using a comprehensive database of fragmentation mechanisms (Fragmentation Library™). The database contains around 600 detailed fragmentation mechanisms of peptides and 18 000 supporting mechanisms of non-peptidic (small) molecules collected from the mass spectrometric literature. The large and diverse collection of database mechanisms is fully structure-oriented, allowing a computer expert system to selectively apply the database mechanisms that reflect the composition of the studied peptide.
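
The idea of selectively applying mechanisms according to peptide composition can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the Mechanism record, the required_residues field, and the three example entries are hypothetical and do not reflect the actual schema or content of the Fragmentation Library, which works on full chemical structures rather than one-letter codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mechanism:
    """Hypothetical, highly simplified mechanism record."""
    name: str
    required_residues: frozenset  # residues that must be present for the rule to apply


# Illustrative entries only; not taken from the Fragmentation Library.
TOY_LIBRARY = [
    Mechanism("generic backbone amide cleavage", frozenset()),
    Mechanism("enhanced cleavage N-terminal to proline", frozenset("P")),
    Mechanism("enhanced cleavage C-terminal to aspartic acid", frozenset("D")),
]


def applicable_mechanisms(library, sequence):
    """Keep only the mechanisms whose structural requirement is met by the peptide."""
    present = set(sequence)
    return [m for m in library if m.required_residues <= present]
```

For the sequence "GAPK" this toy selection keeps the generic rule and the proline rule and drops the aspartic acid rule. A real expert system would match substructures and charge sites rather than residue letters.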

Peptide Fragmentation Knowledge Base

Database mechanisms serve as the knowledge base for computer algorithms that generate fragmentation pathways from peptide structures. The database mechanisms of peptide fragmentation incorporate enhanced bond cleavages, random and skeletal ion rearrangements, five-membered ring formations, ring closures and contractions, the non-mobile proton model, mechanisms of peptides containing an internal basic residue, and fragmentation pathways of N-terminal, C-terminal, backbone, and side-chain derivatives. The algorithms selectively apply database mechanisms to produce fragmentation pathways for any user-supplied peptide structure. The system does not determine the plausibility of predicted ions, so human intervention is required to filter the generated ion structures according to their relevance. Considering the vast amount of knowledge accumulated in the Fragmentation Library, and the simplicity and variability of its application, this approach can be an important step towards automated identification of proteins. In addition, this fully searchable mechanistic database allows systematic classification of published mechanisms, which can reveal the differences in fragmentation behavior that are important for confident protein identification via database searching.

Figure 1. Software implementation (Mass Frontier™ 4.0) of the system for predicting fragmentation and rearrangement pathways using the Fragmentation Library™.

Conclusion

Fragmentation Library is a unique collection of several hundred mechanisms of peptide fragmentation. The mechanisms are stored in a knowledge base and can be automatically applied to any user-provided structure to generate fragmentation pathways at an advanced level. Fragmentation Library, together with predictive algorithms and data processing modules, is available in the commercial software Mass Frontier 4.0. Work on enlarging the data collection is planned to continue until 2007, by which time we expect to have completed the screening of all major sources of fragmentation mechanisms.
