Related papers: Learning Modality-Specific Representations with Se…

Self-MI: Efficient Multimodal Fusion via Self-Supervised Multi-Task Learning with Auxiliary Mutual Information Maximization

Multimodal representation learning poses significant challenges in capturing informative and distinct features from multiple modalities. Existing methods often struggle to exploit the unique characteristics of each modality due to unified…

Computer Vision and Pattern Recognition · Computer Science 2023-11-08 Cam-Van Thi Nguyen, Ngoc-Hoa Thi Nguyen, Duc-Trong Le, Quang-Thuy Ha

Multimodal Interaction Modeling via Self-Supervised Multi-Task Learning for Review Helpfulness Prediction

In line with the latest research, the task of identifying helpful reviews from a vast pool of user-generated textual and visual data has become a prominent area of study. Effective modal representations are expected to possess two key…

Multimedia · Computer Science 2024-03-26 HongLin Gong, Mengzhao Jia, Liqiang Jing

A survey on Self Supervised learning approaches for improving Multimodal representation learning

Recently, self-supervised learning has seen explosive growth and use in a variety of machine learning tasks because of its ability to avoid the cost of annotating large-scale datasets. This paper gives an overview for best self supervised…

Machine Learning · Computer Science 2022-10-21 Naman Goyal

Meta-Learn Unimodal Signals with Weak Supervision for Multimodal Sentiment Analysis

Multimodal sentiment analysis aims to effectively integrate information from various sources to infer sentiment, where in many cases there are no annotations for unimodal labels. Therefore, most works rely on multimodal labels for training.…

Machine Learning · Computer Science 2024-09-16 Sijie Mai, Yu Zhao, Ying Zeng, Jianhua Yao, Haifeng Hu

Learning Factorized Multimodal Representations

Learning multimodal representations is a fundamentally complex research problem due to the presence of multiple heterogeneous sources of information. Although the presence of multiple modalities provides additional valuable information,…

Machine Learning · Computer Science 2019-05-15 Yao-Hung Hubert Tsai, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency, Ruslan Salakhutdinov

Self-Supervised Multimodal Learning: A Survey

Multimodal learning, which aims to understand and analyze information from multiple modalities, has achieved substantial progress in the supervised regime in recent years. However, the heavy dependence on data paired with expensive human…

Machine Learning · Computer Science 2024-08-19 Yongshuo Zong, Oisin Mac Aodha, Timothy Hospedales

Shared and Private Information Learning in Multimodal Sentiment Analysis with Deep Modal Alignment and Self-supervised Multi-Task Learning

Designing an effective representation learning method for multimodal sentiment analysis tasks is a crucial research direction. The challenge lies in learning both shared and private information in a complete modal representation, which is…

Computation and Language · Computer Science 2024-03-20 Songning Lai, Jiakang Li, Guinan Guo, Xifeng Hu, Yulong Li, Yuan Tan, Zichen Song, Yutong Liu, Zhaoxia Ren, Chun Wan, Danmin Miao, Zhi Liu

Improving the Modality Representation with Multi-View Contrastive Learning for Multimodal Sentiment Analysis

Modality representation learning is an important problem for multimodal sentiment analysis (MSA), since the highly distinguishable representations can contribute to improving the analysis effect. Previous works of MSA have usually focused…

Multimedia · Computer Science 2023-01-31 Peipei Liu, Xin Zheng, Hong Li, Jie Liu, Yimo Ren, Hongsong Zhu, Limin Sun

MENTOR: Multi-level Self-supervised Learning for Multimodal Recommendation

With the increasing multimedia information, multimodal recommendation has received extensive attention. It utilizes multimodal information to alleviate the data sparsity problem in recommendation systems, thus improving recommendation…

Information Retrieval · Computer Science 2024-03-01 Jinfeng Xu, Zheyu Chen, Shuo Yang, Jinze Li, Hewei Wang, Edith C.-H. Ngai

Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning

One of the key factors of enabling machine learning models to comprehend and solve real-world tasks is to leverage multimodal data. Unfortunately, annotation of multimodal data is challenging and expensive. Recently, self-supervised…

Computer Vision and Pattern Recognition · Computer Science 2020-12-11 Elad Amrani, Rami Ben-Ari, Daniel Rotman, Alex Bronstein

UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning

Multimodal learning aims to imitate human beings to acquire complementary information from multiple modalities for various downstream tasks. However, traditional aggregation-based multimodal fusion methods ignore the inter-modality…

Computer Vision and Pattern Recognition · Computer Science 2023-05-17 Heqing Zou, Meng Shen, Chen Chen, Yuchen Hu, Deepu Rajan, Eng Siong Chng

Multi-Modal Self-Supervised Learning for Recommendation

The online emergence of multi-modal sharing platforms (e.g., TikTok, YouTube) is powering personalized recommender systems to incorporate various modalities (e.g., visual, textual and acoustic) into the latent user representations. While…

Information Retrieval · Computer Science 2023-07-19 Wei Wei, Chao Huang, Lianghao Xia, Chuxu Zhang

Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis

Multimodal machine learning is a core research area spanning the language, visual and acoustic modalities. The central challenge in multimodal learning involves learning representations that can process and relate information from multiple…

Computation and Language · Computer Science 2018-08-07 Hai Pham, Thomas Manzini, Paul Pu Liang, Barnabas Poczos

Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning

Self-supervised learning is an efficient pre-training method for medical image analysis. However, current research is mostly confined to specific-modality data pre-training, consuming considerable time and resources without achieving…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Yiwen Ye, Yutong Xie, Jianpeng Zhang, Ziyang Chen, Qi Wu, Yong Xia

Multi-Modal Mutual Information (MuMMI) Training for Robust Self-Supervised Deep Reinforcement Learning

This work focuses on learning useful and robust deep world models using multiple, possibly unreliable, sensors. We find that current methods do not sufficiently encourage a shared representation between modalities; this can cause poor…

Machine Learning · Computer Science 2021-07-07 Kaiqi Chen, Yong Lee, Harold Soh

Tailor Versatile Multi-modal Learning for Multi-label Emotion Recognition

Multi-modal Multi-label Emotion Recognition (MMER) aims to identify various human emotions from heterogeneous visual, audio and text modalities. Previous methods mainly focus on projecting multiple modalities into a common latent space and…

Computer Vision and Pattern Recognition · Computer Science 2022-01-19 Yi Zhang, Mingyuan Chen, Jundong Shen, Chongjun Wang

Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications

In many machine learning systems that jointly learn from multiple modalities, a core research question is to understand the nature of multimodal interactions: how modalities combine to provide new task-relevant information that was not…

Machine Learning · Computer Science 2024-06-14 Paul Pu Liang, Chun Kai Ling, Yun Cheng, Alex Obolenskiy, Yudong Liu, Rohan Pandey, Alex Wilf, Louis-Philippe Morency, Ruslan Salakhutdinov

Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching

Automatic emotion recognition is an active research topic with a wide range of applications. Due to the high manual annotation cost and inevitable label ambiguity, the development of emotion recognition dataset is limited in both scale and…

Audio and Speech Processing · Electrical Eng. & Systems 2020-09-08 Jingjun Liang, Ruichen Li, Qin Jin

Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach

Multimodal networks have demonstrated remarkable performance improvements over their unimodal counterparts. Existing multimodal networks are designed in a multi-branch fashion that, due to the reliance on fusion strategies, exhibit…

Computer Vision and Pattern Recognition · Computer Science 2024-08-15 Muhammad Saad Saeed, Shah Nawaz, Muhammad Zaigham Zaheer, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf, Hassan Sajjad, Tom De Schepper, Markus Schedl

Uncertainty-Resilient Multimodal Learning via Consistency-Guided Cross-Modal Transfer

Multimodal learning systems often face substantial uncertainty due to noisy data, low-quality labels, and heterogeneous modality characteristics. These issues become especially critical in human-computer interaction settings, where data…

Artificial Intelligence · Computer Science 2025-11-21 Hyo-Jeong Jang