POMP: Pathology-omics Multimodal Pre-training Framework for Cancer Survival Prediction
Suixue Wang, Shilin Zhang, Huiyuan Lai, Weiliang Huo, Qingchen Zhang
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 7813-7821.
https://doi.org/10.24963/ijcai.2025/869
Cancer survival prediction is an important direction in precision medicine, aiming to help clinicians tailor treatment regimens for patients. With the rapid development of high-throughput sequencing and computational pathology technologies, survival prediction has shifted from clinical features to joint modeling of multi-omics data and pathology images. However, existing multimodal learning methods struggle to learn pathology-omics interactions effectively because the multimodal data are not properly aligned before fusion. In this paper, we propose POMP, a pathology-omics multimodal pre-training framework that is trained jointly on three tasks and integrates pathological images with omics data for cancer survival prediction. To improve cross-modal learning, we introduce a pathology-omics contrastive learning method that aligns the pathology and omics information. POMP builds on the pre-training paradigm and exploits the benefit of aligning multimodal information from the same patient, achieving state-of-the-art results on six cancer datasets from The Cancer Genome Atlas (TCGA). We also show that our contrastive learning method allows the cosine similarity between pathological images and omics data to serve as the survival risk score, which further improves prediction performance compared with other commonly used methods. The code is available at https://github.com/SuixueWang/POMP.
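As a rough illustration of the alignment idea described in the abstract, the sketch below computes a cosine similarity between a pathology embedding and an omics embedding, and a symmetric InfoNCE-style contrastive loss over a small batch of patients. It is not taken from the POMP code base: the function names `cosine_similarity` and `info_nce`, the temperature value, and the choice of InfoNCE as the contrastive objective are all assumptions made for illustration, and the abstract does not specify how the similarity is mapped to a risk score.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(path_embs, omics_embs, temperature=0.1):
    """Symmetric InfoNCE-style loss (an assumed stand-in objective).

    path_embs[i] and omics_embs[i] are hypothetical embeddings of the
    same patient; every other pairing in the batch is a negative.
    """
    n = len(path_embs)
    # Temperature-scaled similarity matrix: rows are pathology, columns are omics.
    sims = [
        [cosine_similarity(p, o) / temperature for o in omics_embs]
        for p in path_embs
    ]

    def cross_entropy(logits, target):
        # Numerically stable log-sum-exp minus the logit of the true pair.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        return log_sum - logits[target]

    # Pathology-to-omics direction uses rows, omics-to-pathology uses columns.
    p2o = sum(cross_entropy(sims[i], i) for i in range(n)) / n
    o2p = sum(
        cross_entropy([sims[j][i] for j in range(n)], i) for i in range(n)
    ) / n
    return 0.5 * (p2o + o2p)


# Toy batch of two patients with 2-dimensional embeddings (made-up values).
pathology = [[1.0, 0.0], [0.0, 1.0]]
omics_aligned = [[1.0, 0.0], [0.0, 1.0]]
omics_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(pathology, omics_aligned)
loss_swapped = info_nce(pathology, omics_swapped)
```

In this toy batch the loss is lower when each patient's pathology and omics embeddings point in the same direction than when the pairs are swapped, which is the behavior a contrastive alignment objective is meant to reward.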
Keywords:
Multidisciplinary Topics and Applications: MTA: Bioinformatics
Computer Vision: CV: Biomedical image analysis
Computer Vision: CV: Multimodal learning
Machine Learning: ML: Self-supervised Learning
