Improving Tandem Mass Spectra Analysis with Hierarchical Learning

Zhengcong Fei

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Special track on AI for CompSust and Human well-being. Pages 4345-4351. https://doi.org/10.24963/ijcai.2020/599

Tandem mass spectrometry is the most widely used technology for identifying proteins in a complex biological sample; it produces a large number of spectra, each representative of a protein subsequence called a peptide. In this paper, we propose a hierarchical multi-stage framework, referred to as DeepTag, to identify the peptide sequence for each given spectrum. Compared with traditional one-stage generation, our sequencing model starts inference from a selected high-confidence guiding tag and produces the complete sequence based on this tag. In addition, we introduce a cross-modality refining module to help the decoder focus on effective peaks, and we fine-tune the model with a reinforcement learning technique. Experiments on different public datasets demonstrate that our method achieves new state-of-the-art performance on the peptide identification task, with a marked improvement in both precision and recall.
Keywords:
Computer Vision: Biomedical Image Understanding
Natural Language Processing: NLP Applications and Tools