End-to-end Multi-source Visual Prompt Tuning for Survival Analysis in Whole Slide Images

doi:10.48550/arXiv.2409.03804

Survival analysis using pathology images poses a considerable challenge, as it requires the localization of relevant information from the multitude of tiles within whole slide images (WSIs). Current methods typically resort to a two-stage approach, where a pre-trained network extracts features from tiles, which are then used by survival models. This process, however, does not optimize the survival models in an end-to-end manner, and the pre-extracted features may not be ideally suited for survival prediction. To address this limitation, we present a novel end-to-end Visual Prompt Tuning framework for survival analysis, named VPTSurv. VPTSurv refines feature embeddings through an efficient encoder-decoder framework. The encoder remains fixed while the framework introduces tunable visual prompts and adaptors, thus permitting end-to-end training specifically for survival prediction by optimizing only the lightweight adaptors and the decoder. Moreover, the versatile VPTSurv framework accommodates multi-source information as prompts, thereby enriching the survival model. VPTSurv achieves substantial increases of 8.7% and 12.5% in the C-index on two immunohistochemical pathology image datasets. These significant improvements highlight the transformative potential of the end-to-end VPT framework over traditional two-stage methods.
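The C-index (concordance index) cited above measures how well predicted risk scores rank patients by survival time: 1.0 is a perfect ranking and 0.5 is no better than chance. As a minimal sketch, assuming right-censored data and the common Harrell formulation, it could be computed as follows. The function name and argument layout are hypothetical and are not taken from the paper; pairs with tied times are simply skipped here, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell-style C-index for right-censored survival data (illustrative sketch).

    times  : observed time for each patient (event or censoring time)
    events : 1 if the event was observed, 0 if the patient was censored
    risks  : predicted risk score; higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Reorder so that patient i is the one with the shorter observed time.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the times differ and the shorter
        # time corresponds to an observed event (not a censoring).
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risks[i] > risks[j]:
            concordant += 1.0   # higher risk for the earlier event: correct order
        elif risks[i] == risks[j]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: patient 1 is censored, so pairs where it has the shorter time are skipped.
times = [2, 4, 6, 8]
events = [1, 0, 1, 1]
risks = [0.8, 0.3, 0.1, 0.6]
c_index = concordance_index(times, events, risks)
```

In this toy example four pairs are comparable and three of them are ranked correctly, so `c_index` is 0.75.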

Publication: arXiv e-prints
Pub Date: September 2024
DOI: 10.48550/arXiv.2409.03804
arXiv: arXiv:2409.03804
Bibcode: 2024arXiv240903804Q
Keywords: Electrical Engineering and Systems Science - Image and Video Processing
