Related papers: Gradient-Guided Modality Decoupling for Missing-Mo…

200 papers

Multimodal learning has developed very fast in recent years. However, during the multimodal training process, the model tends to rely on only one modality based on which it could learn faster, thus leading to inadequate use of other…

Machine Learning · Computer Science 2024-11-05 Zirun Guo, Tao Jin, Jingyuan Chen, Zhou Zhao

Multimodal learning often encounters the under-optimized problem and may have worse performance than unimodal learning. Existing methods attribute this problem to the imbalanced learning between modalities and rebalance them through…

Computer Vision and Pattern Recognition · Computer Science 2025-07-15 Shicai Wei, Chunbo Luo, Yang Luo

Multimodal learning seeks to utilize data from multiple sources to improve the overall performance of downstream tasks. It is desirable for redundancies in the data to make multimodal systems robust to missing or corrupted observations in…

Computer Vision and Pattern Recognition · Computer Science 2024-10-14 Md Kaykobad Reza, Ashley Prater-Bennette, M. Salman Asif

Multimodal learning aims to leverage information from diverse data modalities to achieve more comprehensive performance. However, conventional multimodal models often suffer from modality imbalance, where one or a few modalities dominate…

Computer Vision and Pattern Recognition · Computer Science 2025-10-21 Mohammed Rakib, Arunkumar Bagavathi

Multimodal networks have demonstrated remarkable performance improvements over their unimodal counterparts. Existing multimodal networks are designed in a multi-branch fashion that, due to the reliance on fusion strategies, exhibit…

While the field of multi-modal learning keeps growing fast, the deficiency of the standard joint training paradigm has become clear through recent studies. They attribute the sub-optimal performance of the jointly trained model to the…

Computer Vision and Pattern Recognition · Computer Science 2023-08-16 Hong Li, Xingyu Li, Pengbo Hu, Yinuo Lei, Chunxiao Li, Yi Zhou

The missing modality problem poses a fundamental challenge in multimodal sentiment analysis, significantly degrading model accuracy and generalization in real world scenarios. Existing approaches primarily improve robustness through prompt…

Computer Vision and Pattern Recognition · Computer Science 2026-04-08 Rongfei Chen, Tingting Zhang, Xiaoyu Shen, Wei Zhang

Large-scale multimodal models have shown excellent performance over a series of tasks powered by the large corpus of paired multimodal training data. Generally, they are always assumed to receive modality-complete inputs. However, this…

Computer Vision and Pattern Recognition · Computer Science 2024-10-22 Lianyu Hu, Tongkai Shi, Wei Feng, Fanhua Shang, Liang Wan

In multi-modal learning, some modalities are more influential than others, and their absence can have a significant impact on classification/segmentation accuracy. Addressing this challenge, we propose a novel approach called Meta-learned…

Computer Vision and Pattern Recognition · Computer Science 2025-08-27 Hu Wang, Salma Hassan, Yuyuan Liu, Congbo Ma, Yuanhong Chen, Qing Li, Jiahui Geng, Bingjie Wang, Yu Tian, Yutong Xie, Jodie Avery, Louise Hull, Ian Reid, Mohammad Yaqub, Gustavo Carneiro

During multimodal model training and testing, certain data modalities may be absent due to sensor limitations, cost constraints, privacy concerns, or data loss, negatively affecting performance. Multimodal learning techniques designed to…

Computer Vision and Pattern Recognition · Computer Science 2026-02-05 Renjie Wu, Hu Wang, Hsiang-Ting Chen, Gustavo Carneiro

Multimodal deep learning, especially vision-language models, has gained significant traction in recent years, greatly improving performance on many downstream tasks, including content moderation and violence detection. However, standard…

Computer Vision and Pattern Recognition · Computer Science 2024-08-05 Zhuokai Zhao, Harish Palani, Tianyi Liu, Lena Evans, Ruth Toner

Multimodal object detection has attracted significant attention in both academia and industry for its enhanced robustness. Although numerous studies have focused on improving modality fusion strategies, most neglect fusion degradation, and…

Computer Vision and Pattern Recognition · Computer Science 2025-11-20 YiKang Shao, Tao Shi

Accurate medical image segmentation commonly requires effective learning of the complementary information from multimodal data. However, in clinical practice, we often encounter the problem of missing imaging modalities. We tackle this…

Computer Vision and Pattern Recognition · Computer Science 2020-02-25 Cheng Chen, Qi Dou, Yueming Jin, Hao Chen, Jing Qin, Pheng-Ann Heng

The problem of missing modalities is both critical and non-trivial to be handled in multi-modal models. It is common for multi-modal tasks that certain modalities contribute more compared to other modalities, and if those important…

Computer Vision and Pattern Recognition · Computer Science 2025-03-17 Hu Wang, Congbo Ma, Jianpeng Zhang, Yuan Zhang, Jodie Avery, Louise Hull, Gustavo Carneiro

In multimodal learning, dominant modalities often overshadow others, limiting generalization. We propose Modality-Aware Sharpness-Aware Minimization (M-SAM), a model-agnostic framework that applies to many modalities and supports early and…

Computer Vision and Pattern Recognition · Computer Science 2025-10-30 Hossein R. Nowdeh, Jie Ji, Xiaolong Ma, Fatemeh Afghah

Multimodal learning integrates diverse modalities but suffers from modality imbalance, where dominant modalities suppress weaker ones due to inconsistent convergence rates. Existing methods predominantly rely on static modulation or…

Machine Learning · Computer Science 2026-02-11 Zhaocheng Liu, Zhiwen Yu, Xiaoqing Liu

Multimodal learning leverages complementary information derived from different modalities, thereby enhancing performance in medical image segmentation. However, prevailing multimodal learning methods heavily rely on extensive well-annotated…

Computer Vision and Pattern Recognition · Computer Science 2024-09-05 Xiaogen Zhou, Yiyou Sun, Min Deng, Winnie Chiu Wing Chu, Qi Dou

Multimodal magnetic resonance imaging (MRI) is crucial for brain tumor segmentation, with many methods leveraging its four key modalities to capture complementary information for effective sub-region analysis. However, the absence of…

Artificial Intelligence · Computer Science 2026-05-19 Sha Tao, Jiao Pan, Yu Guo, Chao Yao

Multimodal learning seeks to combine data from multiple input sources to enhance the performance of different downstream tasks. In real-world scenarios, performance can degrade substantially if some input modalities are missing. Existing…

Machine Learning · Computer Science 2024-10-10 Niki Nezakati, Md Kaykobad Reza, Ameya Patil, Mashhour Solh, M. Salman Asif

Multimodal learning often outperforms its unimodal counterparts by exploiting unimodal contributions and cross-modal interactions. However, focusing only on integrating multimodal features into a unified comprehensive representation…

Machine Learning · Computer Science 2025-05-15 Sehwan Moon, Hyunju Lee