PASS: De novo assembler for short peptide sequences
Abstract
The ability to characterize proteins at sequence-level resolution is vital to biological research. Currently, the leading method for protein sequencing is liquid chromatography mass spectrometry (LC-MS), in which proteins are reduced to their constituent peptides by enzymatic digestion and subsequently analyzed on an LC-MS instrument. The short peptide sequences that result from this analysis are used to characterize the original protein content of the sample. Here we present PASS, a de novo assembler for short peptide sequences that can be used to reconstruct large portions of protein targets, a step that can facilitate downstream sample characterization efforts. We show that, with adequate peptide sequence coverage and little-to-no additional sequence processing, PASS reconstructs protein sequences into relatively large contigs (100 amino acids or longer) with high sequence identity (93.1-99.1%) to reference antibody light and heavy chain proteins. Availability: PASS is released under the GNU General Public License Version 3 (GPLv3) and is publicly available from https://github.com/warrenlr/PASS
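The abstract does not describe how PASS builds its contigs, so the sketch below is only a generic illustration of the idea behind de novo assembly from overlapping peptides. It is not PASS's implementation. The function name `extend_right`, the `min_overlap` parameter, and the toy peptides are all assumptions made for this example. It greedily extends a seed peptide to the right whenever another peptide's prefix matches the end of the growing contig.

```python
def extend_right(seed, peptides, min_overlap=4):
    """Greedily extend `seed` rightward using suffix/prefix overlaps.

    Hypothetical helper for illustration only; not taken from PASS.
    """
    contig = seed
    unused = [p for p in peptides if p != seed]
    while True:
        best = None  # (overlap length, peptide) with the longest overlap so far
        for p in unused:
            # The overlap must leave at least one new residue to append.
            max_k = min(len(contig), len(p) - 1)
            for k in range(max_k, min_overlap - 1, -1):
                if contig.endswith(p[:k]):
                    if best is None or k > best[0]:
                        best = (k, p)
                    break  # longest overlap for this peptide found
        if best is None:
            return contig  # no peptide extends the contig any further
        k, p = best
        contig += p[k:]   # append only the non-overlapping tail
        unused.remove(p)  # each peptide is used at most once


# Toy peptides (made up for this example) that tile a short sequence.
toy_peptides = ["MKTAYIAKQ", "YIAKQRQISF", "QISFVKSHFSRQ"]
contig = extend_right("MKTAYIAKQ", toy_peptides)
print(contig)
```

A stricter `min_overlap` makes spurious joins less likely but needs deeper peptide coverage to extend at all, which is consistent with the abstract's remark that adequate peptide sequence coverage is required.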
- Publication:
- arXiv e-prints
- Pub Date:
- August 2022
- DOI:
- 10.48550/arXiv.2208.05598
- arXiv:
- arXiv:2208.05598
- Bibcode:
- 2022arXiv220805598W
- Keywords:
- Quantitative Biology - Genomics;
- Quantitative Biology - Biomolecules
- E-Print:
- 4 pages, 1 table