Related papers: Multimodal Negative Learning

Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion

Multimodal learning (MML) is significantly constrained by modality imbalance, leading to suboptimal performance in practice. While existing approaches primarily focus on balancing the learning of different modalities to address this issue,…

Computer Vision and Pattern Recognition · Computer Science 2026-01-30 QingYuan Jiang, Longfei Huang, Yang Yang

Diagnosing and Re-learning for Balanced Multimodal Learning

To overcome the imbalanced multimodal learning problem, where models prefer the training of specific modalities, existing methods propose to control the training of uni-modal encoders from different perspectives, taking the inter-modal…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Yake Wei, Siwei Li, Ruoxuan Feng, Di Hu

Robust Multimodal Learning with Missing Modalities via Parameter-Efficient Adaptation

Multimodal learning seeks to utilize data from multiple sources to improve the overall performance of downstream tasks. It is desirable for redundancies in the data to make multimodal systems robust to missing or corrupted observations in…

Computer Vision and Pattern Recognition · Computer Science 2024-10-14 Md Kaykobad Reza, Ashley Prater-Bennette, M. Salman Asif

PMR: Prototypical Modal Rebalance for Multimodal Learning

Multimodal learning (MML) aims to jointly exploit the common priors of different modalities to compensate for their inherent limitations. However, existing MML methods often optimize a uniform objective for different modalities, leading to…

Machine Learning · Computer Science 2022-11-15 Yunfeng Fan, Wenchao Xu, Haozhao Wang, Junxiao Wang, Song Guo

AIM: Adaptive Intra-Network Modulation for Balanced Multimodal Learning

Multimodal learning has significantly enhanced machine learning performance but still faces numerous challenges and limitations. Imbalanced multimodal learning is one of the problems extensively studied in recent works and is typically…

Computer Vision and Pattern Recognition · Computer Science 2025-11-04 Shu Shen, C. L. Philip Chen, Tong Zhang

Towards Balanced Active Learning for Multimodal Classification

Training multimodal networks requires a vast amount of data due to their larger parameter space compared to unimodal networks. Active learning is a widely used technique for reducing data annotation costs by selecting only those samples…

Multimedia · Computer Science 2023-08-22 Meng Shen, Yizheng Huang, Jianxiong Yin, Heqing Zou, Deepu Rajan, Simon See

Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach

Multimodal networks have demonstrated remarkable performance improvements over their unimodal counterparts. Existing multimodal networks are designed in a multi-branch fashion that, due to the reliance on fusion strategies, exhibit…

Computer Vision and Pattern Recognition · Computer Science 2024-08-15 Muhammad Saad Saeed, Shah Nawaz, Muhammad Zaigham Zaheer, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf, Hassan Sajjad, Tom De Schepper, Markus Schedl

Unbiased Dynamic Multimodal Fusion

Traditional multimodal methods often assume static modality quality, which limits their adaptability in dynamic real-world scenarios. Thus, dynamical multimodal methods are proposed to assess modality quality and adjust their contribution…

Computer Vision and Pattern Recognition · Computer Science 2026-03-23 Shicai Wei, Kaijie Zhang, Luyi Chen, Tao He, Guiduo Duan

Towards Uniformity and Alignment for Multimodal Representation Learning

Multimodal representation learning aims to construct a shared embedding space in which heterogeneous modalities are semantically aligned. Despite strong empirical results, InfoNCE-based objectives introduce inherent conflicts that yield…

Machine Learning · Computer Science 2026-02-11 Wenzhe Yin, Pan Zhou, Zehao Xiao, Jie Liu, Shujian Yu, Jan-Jakob Sonke, Efstratios Gavves

Beyond Forced Modality Balance: Intrinsic Information Budgets for Multimodal Learning

Multimodal models often converge to a dominant-modality solution, in which a stronger, faster-converging modality overshadows weaker ones. This modality imbalance causes suboptimal performance. Existing methods attempt to balance different…

Multimedia · Computer Science 2026-03-19 Zechang Xiong, Da Li, Kexin Tang, Pengyuan Li, Wenkang Kong, Yulan Hu

Asymmetric Reinforcing against Multi-modal Representation Bias

The strength of multimodal learning lies in its ability to integrate information from various sources, providing rich and comprehensive insights. However, in real-world scenarios, multi-modal systems often face the challenge of dynamic…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Xiyuan Gao, Bing Cao, Pengfei Zhu, Nannan Wang, Qinghua Hu

MMP: Towards Robust Multi-Modal Learning with Masked Modality Projection

Multimodal learning seeks to combine data from multiple input sources to enhance the performance of different downstream tasks. In real-world scenarios, performance can degrade substantially if some input modalities are missing. Existing…

Machine Learning · Computer Science 2024-10-10 Niki Nezakati, Md Kaykobad Reza, Ameya Patil, Mashhour Solh, M. Salman Asif

Detached and Interactive Multimodal Learning

Recently, Multimodal Learning (MML) has gained significant interest as it compensates for single-modality limitations through comprehensive complementary information within multimodal data. However, traditional MML methods generally use the…

Computer Vision and Pattern Recognition · Computer Science 2024-07-30 Yunfeng Fan, Wenchao Xu, Haozhao Wang, Junhong Liu, Song Guo

Revisit Modality Imbalance at the Decision Layer

Multimodal learning integrates information from different modalities to enhance model performance, yet it often suffers from modality imbalance, where dominant modalities overshadow weaker ones during joint optimization. This paper reveals…

Machine Learning · Computer Science 2025-10-17 Xiaoyu Ma, Hao Chen

Rebalanced Multimodal Learning with Data-aware Unimodal Sampling

To address the modality learning degeneration caused by modality imbalance, existing multimodal learning (MML) approaches primarily attempt to balance the optimization process of each modality from the perspective of model learning.…

Machine Learning · Computer Science 2025-03-07 Qingyuan Jiang, Zhouyang Chi, Xiao Ma, Qirong Mao, Yang Yang, Jinhui Tang

Continual Learning for Multiple Modalities

Continual learning aims to learn knowledge of tasks observed in sequential time steps while mitigating the forgetting of previously learned knowledge. Existing methods were designed to learn a single modality (e.g., image) over time, which…

Computer Vision and Pattern Recognition · Computer Science 2025-08-15 Hyundong Jin, Eunwoo Kim

Robust Multimodal Learning via Representation Decoupling

Multimodal learning robust to missing modality has attracted increasing attention due to its practicality. Existing methods tend to address it by learning a common subspace representation for different modality combinations. However, we…

Computer Vision and Pattern Recognition · Computer Science 2024-07-08 Shicai Wei, Yang Luo, Yuji Wang, Chunbo Luo

Deep Metric Loss for Multimodal Learning

Multimodal learning often outperforms its unimodal counterparts by exploiting unimodal contributions and cross-modal interactions. However, focusing only on integrating multimodal features into a unified comprehensive representation…

Machine Learning · Computer Science 2025-05-15 Sehwan Moon, Hyunju Lee

Multimodal Co-learning: Challenges, Applications with Datasets, Recent Advances and Future Directions

Multimodal deep learning systems which employ multiple modalities like text, image, audio, video, etc., are showing better performance in comparison with individual modalities (i.e., unimodal) systems. Multimodal machine learning involves…

Machine Learning · Computer Science 2022-01-19 Anil Rahate, Rahee Walambe, Sheela Ramanna, Ketan Kotecha

PDMP: Rethinking Balanced Multimodal Learning via Performance-Dominant Modality Prioritization

Multimodal learning has attracted increasing attention due to its practicality. However, it often suffers from insufficient optimization, where the multimodal model underperforms even compared to its unimodal counterparts. Existing methods…

Computer Vision and Pattern Recognition · Computer Science 2026-04-08 Shicai Wei, Chunbo Luo, Qiang Zhu, Yang Luo