Related papers: Learning Factorized Multimodal Representations

Discriminative Multimodal Learning via Conditional Priors in Generative Models

Deep generative models with latent variables have been used lately to learn joint representations and generative processes from multi-modal data. These two learning mechanisms can, however, conflict with each other and representations can…

Machine Learning · Computer Science 2023-01-24 Rogelio A. Mancisidor, Michael Kampffmeyer, Kjersti Aas, Robert Jenssen

Multimodal Adversarially Learned Inference with Factorized Discriminators

Learning from multimodal data is an important research topic in machine learning, which has the potential to obtain better representations. In this work, we propose a novel approach to generative modeling of multimodal data based on…

Machine Learning · Computer Science 2021-12-21 Wenxue Chen, Jianke Zhu

FINE: Factorized multimodal sentiment analysis via mutual INformation Estimation

Multimodal sentiment analysis remains a challenging task due to the inherent heterogeneity across modalities. Such heterogeneity often manifests as asynchronous signals, imbalanced information between modalities, and interference from…

Multimedia · Computer Science 2025-11-26 Yadong Liu, Shangfei Wang

Integrative Factor Regression and Its Inference for Multimodal Data Analysis

Multimodal data, where different types of data are collected from the same subjects, are fast emerging in a large variety of scientific applications. Factor analysis is commonly used in integrative analysis of multimodal data, and is…

Statistics Theory · Mathematics 2021-03-31 Quefeng Li, Lexin Li

A survey of multimodal deep generative models

Multimodal learning is a framework for building models that make predictions based on different types of modalities. Important challenges in multimodal learning are the inference of shared representations from arbitrary modalities and…

Machine Learning · Computer Science 2022-07-06 Masahiro Suzuki, Yutaka Matsuo

Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis

Representation Learning is a significant and challenging task in multimodal learning. Effective modality representations should contain two parts of characteristics: the consistency and the difference. Due to the unified multimodal…

Computation and Language · Computer Science 2021-02-10 Wenmeng Yu, Hua Xu, Ziqi Yuan, Jiele Wu

A Survey of Inductive Biases for Factorial Representation-Learning

With the resurgence of interest in neural networks, representation learning has re-emerged as a central focus in artificial intelligence. Representation learning refers to the discovery of useful encodings of data that make domain-relevant…

Machine Learning · Computer Science 2016-12-19 Karl Ridgeway

Identifiability Results for Multimodal Contrastive Learning

Contrastive learning is a cornerstone underlying recent progress in multi-view and multimodal learning, e.g., in representation learning with image/caption pairs. While its effectiveness is not yet fully understood, a line of recent work…

Machine Learning · Computer Science 2023-03-17 Imant Daunhawer, Alice Bizeul, Emanuele Palumbo, Alexander Marx, Julia E. Vogt

Factorized Contrastive Learning: Going Beyond Multi-view Redundancy

In a wide range of multimodal tasks, contrastive learning has become a particularly appealing approach since it can successfully learn representations from abundant unlabeled data with only pairing information (e.g., image-caption or…

Machine Learning · Computer Science 2023-10-31 Paul Pu Liang, Zihao Deng, Martin Ma, James Zou, Louis-Philippe Morency, Ruslan Salakhutdinov

Causal Debiasing Medical Multimodal Representation Learning with Missing Modalities

Medical multimodal representation learning aims to integrate heterogeneous clinical data into unified patient representations to support predictive modeling, which remains an essential yet challenging task in the medical data mining…

Machine Learning · Computer Science 2025-09-09 Xiaoguang Zhu, Lianlong Sun, Yang Liu, Pengyi Jiang, Uma Srivatsa, Nipavan Chiamvimonvat, Vladimir Filkov

Factorized Multimodal Transformer for Multimodal Sequential Learning

The complex world around us is inherently multimodal and sequential (continuous). Information is scattered across different modalities and requires multiple continuous sensors to be captured. As machine learning leaps towards better…

Machine Learning · Computer Science 2019-11-25 Amir Zadeh, Chengfeng Mao, Kelly Shi, Yiwei Zhang, Paul Pu Liang, Soujanya Poria, Louis-Philippe Morency

Multimodal Representation Learning and Fusion

Multi-modal learning is a fast-growing area in artificial intelligence. It tries to help machines understand complex things by combining information from different sources, like images, text, and audio. By using the strengths of each…

Machine Learning · Computer Science 2025-12-22 Qihang Jin, Enze Ge, Yuhang Xie, Hongying Luo, Junhao Song, Ziqian Bi, Chia Xin Liang, Jibin Guan, Joe Yeong, Xinyuan Song, Junfeng Hao

Found in Translation: Learning Robust Joint Representations by Cyclic Translations Between Modalities

Multimodal sentiment analysis is a core research area that studies speaker sentiment expressed from the language, visual, and acoustic modalities. The central challenge in multimodal learning involves inferring joint representations that…

Machine Learning · Computer Science 2020-03-02 Hai Pham, Paul Pu Liang, Thomas Manzini, Louis-Philippe Morency, Barnabas Poczos

HyperLearn: A Distributed Approach for Representation Learning in Datasets With Many Modalities

Multimodal datasets contain an enormous amount of relational information, which grows exponentially with the introduction of new modalities. Learning representations in such a scenario is inherently complex due to the presence of multiple…

Machine Learning · Computer Science 2019-09-24 Devanshu Arya, Stevan Rudinac, Marcel Worring

Multimodal learning with graphs

Artificial intelligence for graphs has achieved remarkable success in modeling complex systems, ranging from dynamic networks in biology to interacting particle systems in physics. However, the increasingly heterogeneous graph datasets call…

Machine Learning · Computer Science 2023-01-25 Yasha Ektefaie, George Dasoulas, Ayush Noori, Maha Farhat, Marinka Zitnik

Semi-supervised Bayesian Deep Multi-modal Emotion Recognition

In emotion recognition, it is difficult to recognize human emotional states using just a single modality. Besides, the annotation of physiological emotional data is particularly expensive. These two aspects make the building of effective…

Artificial Intelligence · Computer Science 2017-04-26 Changde Du, Changying Du, Jinpeng Li, Wei-long Zheng, Bao-liang Lu, Huiguang He

Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models

Multimodal learning for generative models often refers to the learning of abstract concepts from the commonality of information in multiple modalities, such as vision and language. While it has proven effective for learning generalisable…

Machine Learning · Computer Science 2021-04-22 Yuge Shi, Brooks Paige, Philip H. S. Torr, N. Siddharth

Learning Actionable Representations with Goal-Conditioned Policies

Representation learning is a central challenge across a range of machine learning areas. In reinforcement learning, effective and functional representations have the potential to tremendously accelerate learning progress and solve more…

Machine Learning · Computer Science 2019-01-30 Dibya Ghosh, Abhishek Gupta, Sergey Levine

What Makes Multi-modal Learning Better than Single (Provably)

The world provides us with data of multiple modalities. Intuitively, models fusing data from different modalities outperform their uni-modal counterparts, since more information is aggregated. Recently, joining the success of deep learning,…

Machine Learning · Computer Science 2021-10-27 Yu Huang, Chenzhuang Du, Zihui Xue, Xuanyao Chen, Hang Zhao, Longbo Huang

Geometric Multimodal Contrastive Representation Learning

Learning representations of multimodal data that are both informative and robust to missing modalities at test time remains a challenging problem due to the inherent heterogeneity of data obtained from different channels. To address it, we…

Machine Learning · Computer Science 2022-11-21 Petra Poklukar, Miguel Vasco, Hang Yin, Francisco S. Melo, Ana Paiva, Danica Kragic