Inspired by the strong performance of Mamba networks, we propose Deep Mamba Multi-modal Learning (DMML), a novel approach for fusing multi-modal features. We apply DMML to multimedia retrieval and propose Deep Mamba Multi-modal Hashing (DMMH), a method that combines high accuracy with fast inference. We validate the effectiveness of DMMH on three public datasets, where it achieves state-of-the-art results.
@article{arxiv.2406.18007,
title = {Deep Mamba Multi-modal Learning},
author = {Jian Zhu and Xin Zou and Yu Cui and Zhangmin Huang and Chenshu Hu and Bo Lyu},
journal = {arXiv preprint arXiv:2406.18007},
year = {2024}
}
Comments: Deep Mamba Multi-modal Learning; Deep Mamba Multi-modal Hashing