Related papers: Multimodal Negative Learning

200 papers

Multimodal learning (MML) is significantly constrained by modality imbalance, leading to suboptimal performance in practice. While existing approaches primarily focus on balancing the learning of different modalities to address this issue,…

Computer Vision and Pattern Recognition · Computer Science 2026-01-30 QingYuan Jiang, Longfei Huang, Yang Yang

To overcome the imbalanced multimodal learning problem, where models prefer the training of specific modalities, existing methods propose to control the training of uni-modal encoders from different perspectives, taking the inter-modal…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Yake Wei, Siwei Li, Ruoxuan Feng, Di Hu

Multimodal learning seeks to utilize data from multiple sources to improve the overall performance of downstream tasks. It is desirable for redundancies in the data to make multimodal systems robust to missing or corrupted observations in…

Computer Vision and Pattern Recognition · Computer Science 2024-10-14 Md Kaykobad Reza, Ashley Prater-Bennette, M. Salman Asif

Multimodal learning (MML) aims to jointly exploit the common priors of different modalities to compensate for their inherent limitations. However, existing MML methods often optimize a uniform objective for different modalities, leading to…

Machine Learning · Computer Science 2022-11-15 Yunfeng Fan, Wenchao Xu, Haozhao Wang, Junxiao Wang, Song Guo

Multimodal learning has significantly enhanced machine learning performance but still faces numerous challenges and limitations. Imbalanced multimodal learning is one of the problems extensively studied in recent works and is typically…

Computer Vision and Pattern Recognition · Computer Science 2025-11-04 Shu Shen, C. L. Philip Chen, Tong Zhang

Training multimodal networks requires a vast amount of data due to their larger parameter space compared to unimodal networks. Active learning is a widely used technique for reducing data annotation costs by selecting only those samples…

Multimedia · Computer Science 2023-08-22 Meng Shen, Yizheng Huang, Jianxiong Yin, Heqing Zou, Deepu Rajan, Simon See

Multimodal networks have demonstrated remarkable performance improvements over their unimodal counterparts. Existing multimodal networks are designed in a multi-branch fashion that, due to the reliance on fusion strategies, exhibit…

Traditional multimodal methods often assume static modality quality, which limits their adaptability in dynamic real-world scenarios. Thus, dynamical multimodal methods are proposed to assess modality quality and adjust their contribution…

Computer Vision and Pattern Recognition · Computer Science 2026-03-23 Shicai Wei, Kaijie Zhang, Luyi Chen, Tao He, Guiduo Duan

Multimodal representation learning aims to construct a shared embedding space in which heterogeneous modalities are semantically aligned. Despite strong empirical results, InfoNCE-based objectives introduce inherent conflicts that yield…

Machine Learning · Computer Science 2026-02-11 Wenzhe Yin, Pan Zhou, Zehao Xiao, Jie Liu, Shujian Yu, Jan-Jakob Sonke, Efstratios Gavves

Multimodal models often converge to a dominant-modality solution, in which a stronger, faster-converging modality overshadows weaker ones. This modality imbalance causes suboptimal performance. Existing methods attempt to balance different…

Multimedia · Computer Science 2026-03-19 Zechang Xiong, Da Li, Kexin Tang, Pengyuan Li, Wenkang Kong, Yulan Hu

The strength of multimodal learning lies in its ability to integrate information from various sources, providing rich and comprehensive insights. However, in real-world scenarios, multi-modal systems often face the challenge of dynamic…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Xiyuan Gao, Bing Cao, Pengfei Zhu, Nannan Wang, Qinghua Hu

Multimodal learning seeks to combine data from multiple input sources to enhance the performance of different downstream tasks. In real-world scenarios, performance can degrade substantially if some input modalities are missing. Existing…

Machine Learning · Computer Science 2024-10-10 Niki Nezakati, Md Kaykobad Reza, Ameya Patil, Mashhour Solh, M. Salman Asif

Recently, Multimodal Learning (MML) has gained significant interest as it compensates for single-modality limitations through comprehensive complementary information within multimodal data. However, traditional MML methods generally use the…

Computer Vision and Pattern Recognition · Computer Science 2024-07-30 Yunfeng Fan, Wenchao Xu, Haozhao Wang, Junhong Liu, Song Guo

Multimodal learning integrates information from different modalities to enhance model performance, yet it often suffers from modality imbalance, where dominant modalities overshadow weaker ones during joint optimization. This paper reveals…

Machine Learning · Computer Science 2025-10-17 Xiaoyu Ma, Hao Chen

To address the modality learning degeneration caused by modality imbalance, existing multimodal learning (MML) approaches primarily attempt to balance the optimization process of each modality from the perspective of model learning.…

Machine Learning · Computer Science 2025-03-07 Qingyuan Jiang, Zhouyang Chi, Xiao Ma, Qirong Mao, Yang Yang, Jinhui Tang

Continual learning aims to learn knowledge of tasks observed in sequential time steps while mitigating the forgetting of previously learned knowledge. Existing methods were designed to learn a single modality (e.g., image) over time, which…

Computer Vision and Pattern Recognition · Computer Science 2025-08-15 Hyundong Jin, Eunwoo Kim

Multimodal learning robust to missing modality has attracted increasing attention due to its practicality. Existing methods tend to address it by learning a common subspace representation for different modality combinations. However, we…

Computer Vision and Pattern Recognition · Computer Science 2024-07-08 Shicai Wei, Yang Luo, Yuji Wang, Chunbo Luo

Multimodal learning often outperforms its unimodal counterparts by exploiting unimodal contributions and cross-modal interactions. However, focusing only on integrating multimodal features into a unified comprehensive representation…

Machine Learning · Computer Science 2025-05-15 Sehwan Moon, Hyunju Lee

Multimodal deep learning systems which employ multiple modalities like text, image, audio, video, etc., are showing better performance in comparison with individual modalities (i.e., unimodal) systems. Multimodal machine learning involves…

Machine Learning · Computer Science 2022-01-19 Anil Rahate, Rahee Walambe, Sheela Ramanna, Ketan Kotecha

Multimodal learning has attracted increasing attention due to its practicality. However, it often suffers from insufficient optimization, where the multimodal model underperforms even compared to its unimodal counterparts. Existing methods…

Computer Vision and Pattern Recognition · Computer Science 2026-04-08 Shicai Wei, Chunbo Luo, Qiang Zhu, Yang Luo