Related papers: Robust Multimodal Learning via Representation Deco…

Dual-Stream Cross-Modal Representation Learning via Residual Semantic Decorrelation

Cross-modal learning has become a fundamental paradigm for integrating heterogeneous information sources such as images, text, and structured attributes. However, multimodal representations often suffer from modality dominance, redundant…

Computer Vision and Pattern Recognition · Computer Science 2025-12-09 Xuecheng Li, Weikuan Jia, Alisher Kurbonaliev, Qurbonaliev Alisher, Khudzhamkulov Rustam, Ismoilov Shuhratjon, Eshmatov Javhariddin, Yuanjie Zheng

MMANet: Margin-aware Distillation and Modality-aware Regularization for Incomplete Multimodal Learning

Multimodal learning has shown great potential in numerous scenarios and has attracted increasing interest recently. However, it often encounters the problem of missing modality data and thus suffers severe performance degradation in practice. To…

Computer Vision and Pattern Recognition · Computer Science 2023-04-18 Shicai Wei, Yang Luo, Chunbo Luo

Towards Robust and Realible Multimodal Misinformation Recognition with Incomplete Modality

Multimodal Misinformation Recognition has become an urgent task with the emergence of huge multimodal fake content on social media platforms. Previous studies mainly focus on complex feature extraction and fusion to learn discriminative…

Multimedia · Computer Science 2025-10-15 Hengyang Zhou, Yiwei Wei, Jian Yang, Zhenyu Zhang

Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models

Multimodal learning typically relies on the assumption that all modalities are fully available during both the training and inference phases. However, in real-world scenarios, consistently acquiring complete multimodal data presents…

Computer Vision and Pattern Recognition · Computer Science 2024-07-18 Donggeun Kim, Taesup Kim

Dynamic Proximal Unrolling Network for Compressive Imaging

Compressive imaging aims to recover a latent image from under-sampled measurements, suffering from a serious ill-posed inverse problem. Recently, deep neural networks have been applied to this problem with superior results, owing to the…

Image and Video Processing · Electrical Eng. & Systems 2021-10-26 Yixiao Yang, Ran Tao, Kaixuan Wei, Ying Fu

Robust Multimodal Learning with Missing Modalities via Parameter-Efficient Adaptation

Multimodal learning seeks to utilize data from multiple sources to improve the overall performance of downstream tasks. It is desirable for redundancies in the data to make multimodal systems robust to missing or corrupted observations in…

Computer Vision and Pattern Recognition · Computer Science 2024-10-14 Md Kaykobad Reza, Ashley Prater-Bennette, M. Salman Asif

Multimodal Guidance Network for Missing-Modality Inference in Content Moderation

Multimodal deep learning, especially vision-language models, has gained significant traction in recent years, greatly improving performance on many downstream tasks, including content moderation and violence detection. However, standard…

Computer Vision and Pattern Recognition · Computer Science 2024-08-05 Zhuokai Zhao, Harish Palani, Tianyi Liu, Lena Evans, Ruth Toner

Manifold Regularized Deep Neural Networks using Adversarial Examples

Learning meaningful representations using deep neural networks involves designing efficient training schemes and well-structured networks. Currently, the method of stochastic gradient descent that has a momentum with dropout is one of the…

Machine Learning · Computer Science 2016-01-15 Taehoon Lee, Minsuk Choi, Sungroh Yoon

A unified representation network for segmentation with missing modalities

Over the last few years machine learning has demonstrated groundbreaking results in many areas of medical image analysis, including segmentation. A key assumption, however, is that the train- and test distributions match. We study a…

Computer Vision and Pattern Recognition · Computer Science 2019-08-20 Kenneth Lau, Jonas Adler, Jens Sjölund

Deep Representation Learning For Multimodal Brain Networks

Applying network science approaches to investigate the functions and anatomy of the human brain is prevalent in modern medical imaging analysis. Due to the complex network topology, for an individual brain, mining a discriminative network…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Wen Zhang, Liang Zhan, Paul Thompson, Yalin Wang

MurreNet: Modeling Holistic Multimodal Interactions Between Histopathology and Genomic Profiles for Survival Prediction

Cancer survival prediction requires integrating pathological Whole Slide Images (WSIs) and genomic profiles, a challenging task due to the inherent heterogeneity and the complexity of modeling both inter- and intra-modality interactions.…

Image and Video Processing · Electrical Eng. & Systems 2025-07-08 Mingxin Liu, Chengfei Cai, Jun Li, Pengbo Xu, Jinze Li, Jiquan Ma, Jun Xu

Towards Uniformity and Alignment for Multimodal Representation Learning

Multimodal representation learning aims to construct a shared embedding space in which heterogeneous modalities are semantically aligned. Despite strong empirical results, InfoNCE-based objectives introduce inherent conflicts that yield…

Machine Learning · Computer Science 2026-02-11 Wenzhe Yin, Pan Zhou, Zehao Xiao, Jie Liu, Shujian Yu, Jan-Jakob Sonke, Efstratios Gavves

Uncertainty-aware Multi-modal Learning via Cross-modal Random Network Prediction

Multi-modal learning focuses on training models by equally combining multiple input data modalities during the prediction process. However, this equal combination can be detrimental to the prediction accuracy because different modalities…

Computer Vision and Pattern Recognition · Computer Science 2022-07-25 Hu Wang, Jianpeng Zhang, Yuanhong Chen, Congbo Ma, Jodie Avery, Louise Hull, Gustavo Carneiro

Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach

Multimodal networks have demonstrated remarkable performance improvements over their unimodal counterparts. Existing multimodal networks are designed in a multi-branch fashion that, due to the reliance on fusion strategies, exhibit…

Computer Vision and Pattern Recognition · Computer Science 2024-08-15 Muhammad Saad Saeed, Shah Nawaz, Muhammad Zaigham Zaheer, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf, Hassan Sajjad, Tom De Schepper, Markus Schedl

DMCL: Distillation Multiple Choice Learning for Multimodal Action Recognition

In this work, we address the problem of learning an ensemble of specialist networks using multimodal data, while considering the realistic and challenging scenario of possible missing modalities at test time. Our goal is to leverage the…

Computer Vision and Pattern Recognition · Computer Science 2019-12-24 Nuno C. Garcia, Sarah Adel Bargal, Vitaly Ablavsky, Pietro Morerio, Vittorio Murino, Stan Sclaroff

DMMRL: Disentangled Multi-Modal Representation Learning via Variational Autoencoders for Molecular Property Prediction

Molecular property prediction constitutes a cornerstone of drug discovery and materials science, necessitating models capable of disentangling complex structure-property relationships across diverse molecular modalities. Existing approaches…

Machine Learning · Computer Science 2026-03-24 Long Xu, Junping Guo, Jianbo Zhao, Jianbo Lu, Yuzhong Peng

DeepSuM: Deep Sufficient Modality Learning Framework

Multimodal learning has become a pivotal approach in developing robust learning models with applications spanning multimedia, robotics, large language models, and healthcare. The efficiency of multimodal systems is a critical concern, given…

Machine Learning · Computer Science 2025-03-04 Zhe Gao, Jian Huang, Ting Li, Xueqin Wang

EmbraceNet: A robust deep learning architecture for multimodal classification

Classification using multimodal data arises in many machine learning applications. It is crucial not only to model cross-modal relationship effectively but also to ensure robustness against loss of part of data or modalities. In this paper,…

Machine Learning · Computer Science 2019-04-22 Jun-Ho Choi, Jong-Seok Lee

Understanding Robust Learning through the Lens of Representation Similarities

Representation learning, i.e. the generation of representations useful for downstream applications, is a task of fundamental importance that underlies much of the success of deep neural networks (DNNs). Recently, robustness to adversarial…

Machine Learning · Computer Science 2022-09-16 Christian Cianfarani, Arjun Nitin Bhagoji, Vikash Sehwag, Ben Y. Zhao, Prateek Mittal, Haitao Zheng

MMP: Towards Robust Multi-Modal Learning with Masked Modality Projection

Multimodal learning seeks to combine data from multiple input sources to enhance the performance of different downstream tasks. In real-world scenarios, performance can degrade substantially if some input modalities are missing. Existing…

Machine Learning · Computer Science 2024-10-10 Niki Nezakati, Md Kaykobad Reza, Ameya Patil, Mashhour Solh, M. Salman Asif