Deep Mamba Multi-modal Learning
Abstract
Inspired by the strong performance of Mamba networks, we propose Deep Mamba Multi-modal Learning (DMML), a novel method for fusing multi-modal features. We apply DMML to multimedia retrieval and propose Deep Mamba Multi-modal Hashing (DMMH), a method that combines high accuracy with fast inference. We validate DMMH on three public datasets, where it achieves state-of-the-art results.
- Publication:
- arXiv e-prints
- Pub Date:
- April 2024
- DOI:
- 10.48550/arXiv.2406.18007
- arXiv:
- arXiv:2406.18007
- Bibcode:
- 2024arXiv240618007Z
- Keywords:
- Computer Science - Multimedia
- E-Print:
- Deep Mamba Multi-modal Learning