GLPocket: A Multi-Scale Representation Learning Approach for Protein Binding Site Prediction

Peiying Li, Yongchang Liu, Shikui Tu, Lei Xu

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 4821-4828. https://doi.org/10.24963/ijcai.2023/536

Protein binding site prediction is an important prerequisite for the discovery of new drugs. A plain 3D U-Net is usually adopted as the standard site prediction framework, performing per-voxel binary mask classification. However, this scheme extracts features from samples at a single scale only, which can lose global or local information and lead to incomplete predictions, artifacts, or even missed sites. To tackle this issue, we propose GLPocket, a network based on the Lmser (least mean square error reconstruction) network that uses multi-scale representations to predict binding sites. First, GLPocket uses a Target Cropping Block (TCB) for targeted prediction. The TCB selects the local features of interest from the global representation to perform concentrated prediction, and reduces the volume of the feature maps to be computed by 82% without adding parameters. It integrates global distribution information into local regions, making prediction more concentrated in the decoding stage. Second, GLPocket uses a Transformer Block (TB) to establish long-range relationships among patches within the local region, enriching local contextual semantic information. Experiments show that GLPocket improves DCA Top-n prediction by 0.5%-4% over previous state-of-the-art methods on four datasets. Our code has been released at https://github.com/CMACH508/GLPocket.
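
The abstract reports results as DCA Top-n without defining the metric. As a rough illustration only, the sketch below assumes the common convention from earlier binding site benchmarks: DCA is the distance from a predicted pocket center to the closest ligand atom, a prediction counts as a success when that distance is within a threshold (4 Å is assumed here), and Top-n considers the n highest-ranked predictions, where n is the number of annotated sites. The function names, the `extra` parameter, and the toy coordinates are hypothetical and are not taken from the GLPocket code.

```python
import math


def dca(center, ligand_atoms):
    """Distance from a predicted pocket center to the closest ligand atom."""
    return min(math.dist(center, atom) for atom in ligand_atoms)


def dca_top_n_success(ranked_centers, ligand_atoms, n_sites=1, extra=0,
                      threshold=4.0):
    """Return True if any of the top (n_sites + extra) ranked centers lies
    within `threshold` angstroms of some ligand atom.

    The 4.0 default and the `extra` slack are assumptions based on common
    benchmark practice, not values stated in the abstract.
    """
    top = ranked_centers[: n_sites + extra]
    return any(dca(c, ligand_atoms) <= threshold for c in top)


# Toy data (hypothetical): one ligand with two atoms, two ranked predictions.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
predicted = [(10.0, 10.0, 10.0), (1.5, 3.0, 0.0)]

top_n = dca_top_n_success(predicted, ligand, n_sites=1)
top_n_plus_2 = dca_top_n_success(predicted, ligand, n_sites=1, extra=2)
```

In this toy case the best-ranked center is far from the ligand while the second one is 3 Å from the nearest atom, so the site is missed under Top-n but found once two extra candidates are allowed.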
Keywords:
Multidisciplinary Topics and Applications: MDA: Bioinformatics
Computer Vision: CV: Biomedical image analysis