Deep Mamba Multi-modal Learning

Multimedia 2024-06-27 v1

Abstract

Inspired by the strong performance of Mamba networks, we propose Deep Mamba Multi-modal Learning (DMML), a novel method for fusing multi-modal features. We apply DMML to multimedia retrieval and propose Deep Mamba Multi-modal Hashing (DMMH), a method that combines high accuracy with fast inference. We validate DMMH on three public datasets, where it achieves state-of-the-art results.

Cite

@article{arxiv.2406.18007,
  title  = {Deep Mamba Multi-modal Learning},
  author = {Jian Zhu and Xin Zou and Yu Cui and Zhangmin Huang and Chenshu Hu and Bo Lyu},
  journal= {arXiv preprint arXiv:2406.18007},
  year   = {2024}
}

Comments

Deep Mamba Multi-modal Learning; Deep Mamba Multi-modal Hashing
