FedMM: Federated Multi-Modal Learning with Modality Heterogeneity in Computational Pathology
Abstract
The fusion of complementary multimodal information is crucial in computational pathology for accurate diagnostics. However, existing multimodal learning approaches require access to users' raw data, posing substantial privacy risks. While Federated Learning (FL) serves as a privacy-preserving alternative, it falls short in addressing the challenges posed by heterogeneous (yet possibly overlapping) modality data across hospitals. To bridge this gap, we propose a Federated Multi-Modal (FedMM) learning framework that trains multiple single-modal feature extractors in a federated manner to enhance subsequent classification performance, in contrast to existing FL approaches, which aim to train a unified multimodal fusion model. Any participating hospital, even one with a small-scale dataset or limited devices, can leverage these federally trained extractors to perform local downstream tasks (e.g., classification) while ensuring data privacy. Through comprehensive evaluations on two publicly available datasets, we demonstrate that FedMM notably outperforms two baselines in accuracy and AUC metrics.
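The idea of training one feature extractor per modality, with each hospital contributing only to the modalities it actually holds, can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not state the aggregation rule, so the sample-weighted averaging below (in the style of FedAvg) is an assumption, and the modality names `image` and `tabular`, the function `aggregate_per_modality`, and the toy weight vectors are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical update format: for every modality a hospital holds, it reports
# its locally trained extractor weights (a flat list) and its sample count.
HospitalUpdate = Dict[str, Tuple[List[float], int]]


def aggregate_per_modality(updates: List[HospitalUpdate]) -> Dict[str, List[float]]:
    """Sample-weighted average of extractor weights, computed per modality.

    Assumption: FedAvg-style averaging; the abstract does not specify the rule.
    """
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for update in updates:
        for modality, (weights, num_samples) in update.items():
            if modality not in sums:
                sums[modality] = [0.0] * len(weights)
                counts[modality] = 0
            for i, w in enumerate(weights):
                sums[modality][i] += w * num_samples
            counts[modality] += num_samples
    # Each modality is normalised only by the samples of hospitals that hold it.
    return {m: [s / counts[m] for s in sums[m]] for m in sums}


# Toy example with heterogeneous, partially overlapping modalities.
updates = [
    {"image": ([1.0, 2.0], 10), "tabular": ([0.0, 4.0], 10)},  # hospital A: both
    {"image": ([3.0, 4.0], 30)},                               # hospital B: image only
    {"tabular": ([2.0, 0.0], 10)},                             # hospital C: tabular only
]
global_extractors = aggregate_per_modality(updates)
```

Under these assumptions, a hospital holding only one modality still benefits from every other hospital that shares it, and any hospital could then use the resulting per-modality extractors as inputs to its own local classifier.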
- Publication: arXiv e-prints
- Pub Date: February 2024
- DOI: 10.48550/arXiv.2402.15858
- arXiv: arXiv:2402.15858
- Bibcode: 2024arXiv240215858P
- Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Distributed, Parallel, and Cluster Computing
- E-Print: 2024 International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)