Related papers: Variational methods for Conditional Multimodal Dee…

Joint Multimodal Learning with Deep Generative Models

We investigate deep generative models that can exchange multiple modalities bi-directionally, e.g., generating images from corresponding texts and vice versa. Recently, some studies handle multiple modalities on deep generative models, such…

Machine Learning · Statistics 2016-11-08 Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo

Discriminative Multimodal Learning via Conditional Priors in Generative Models

Deep generative models with latent variables have been used lately to learn joint representations and generative processes from multi-modal data. These two learning mechanisms can, however, conflict with each other and representations can…

Machine Learning · Computer Science 2023-01-24 Rogelio A. Mancisidor, Michael Kampffmeyer, Kjersti Aas, Robert Jenssen

A survey of multimodal deep generative models

Multimodal learning is a framework for building models that make predictions based on different types of modalities. Important challenges in multimodal learning are the inference of shared representations from arbitrary modalities and…

Machine Learning · Computer Science 2022-07-06 Masahiro Suzuki, Yutaka Matsuo

Multi-modal Latent Diffusion

Multi-modal data-sets are ubiquitous in modern applications, and multi-modal Variational Autoencoders are a popular family of models that aim to learn a joint representation of the different modalities. However, existing approaches suffer…

Machine Learning · Computer Science 2023-12-19 Mustapha Bounoua, Giulio Franzese, Pietro Michiardi

Conditional Generative Modeling via Learning the Latent Space

Although deep learning has achieved appealing results on several machine learning tasks, most of the models are deterministic at inference, limiting their application to single-modal settings. We propose a novel general-purpose framework…

Machine Learning · Computer Science 2020-10-12 Sameera Ramasinghe, Kanchana Ranasinghe, Salman Khan, Nick Barnes, Stephen Gould

Learning Factorized Multimodal Representations

Learning multimodal representations is a fundamentally complex research problem due to the presence of multiple heterogeneous sources of information. Although the presence of multiple modalities provides additional valuable information,…

Machine Learning · Computer Science 2019-05-15 Yao-Hung Hubert Tsai, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency, Ruslan Salakhutdinov

Multimodal Representation Learning by Alternating Unimodal Adaptation

Multimodal learning, which integrates data from diverse sensory modes, plays a pivotal role in artificial intelligence. However, existing multimodal learning methods often struggle with challenges where some modalities appear more dominant…

Machine Learning · Computer Science 2024-04-02 Xiaohui Zhang, Jaehong Yoon, Mohit Bansal, Huaxiu Yao

Multimodal Generative Models for Compositional Representation Learning

As deep neural networks become more adept at traditional tasks, many of the most exciting new challenges concern multimodality: observations that combine diverse types, such as image and text. In this paper, we introduce a family of…

Machine Learning · Computer Science 2019-12-12 Mike Wu, Noah Goodman

Increasing the Generalisation Capacity of Conditional VAEs

We address the problem of one-to-many mappings in supervised learning, where a single instance has many different solutions of possibly equal cost. The framework of conditional variational autoencoders describes a class of methods to tackle…

Machine Learning · Statistics 2019-09-11 Alexej Klushyn, Nutan Chen, Botond Cseke, Justin Bayer, Patrick van der Smagt

Unity by Diversity: Improved Representation Learning in Multimodal VAEs

Variational Autoencoders for multimodal data hold promise for many tasks in data analysis, such as representation learning, conditional generation, and imputation. Current architectures either share the encoder output, decoder input, or…

Machine Learning · Computer Science 2025-01-08 Thomas M. Sutter, Yang Meng, Andrea Agostini, Daphné Chopard, Norbert Fortin, Julia E. Vogt, Babak Shahbaba, Stephan Mandt

Multimodal Generative Models for Scalable Weakly-Supervised Learning

Multiple modalities often co-occur when describing natural phenomena. Learning a joint representation of these modalities should yield deeper and more useful representations. Previous generative approaches to multi-modal input either do not…

Machine Learning · Computer Science 2018-11-13 Mike Wu, Noah Goodman

Learning more expressive joint distributions in multimodal variational methods

Data often are formed of multiple modalities, which jointly describe the observed phenomena. Modeling the joint distribution of multimodal data requires larger expressive power to capture high-level concepts and provide better data…

Machine Learning · Computer Science 2020-09-09 Sasho Nedelkoski, Mihail Bogojeski, Odej Kao

Controlling Structured Output Representations from Attributes using Conditional Generative Models

Structured output representation is a generative task explored in computer vision that often times requires the mapping of low dimensional features to high dimensional structured outputs. Losses in complex spatial information in…

Computer Vision and Pattern Recognition · Computer Science 2025-02-25 Mohamed Debbagh

Bridging the inference gap in Multimodal Variational Autoencoders

From medical diagnosis to autonomous vehicles, critical applications rely on the integration of multiple heterogeneous data modalities. Multimodal Variational Autoencoders offer versatile and scalable methods for generating unobserved…

Machine Learning · Computer Science 2025-02-07 Agathe Senellart, Stéphanie Allassonnière

Multimodal Shape Completion via Conditional Generative Adversarial Networks

Several deep learning methods have been proposed for completing partial data from shape acquisition setups, i.e., filling the regions that were missing in the shape. These methods, however, only complete the partial shape with a single…

Computer Vision and Pattern Recognition · Computer Science 2020-07-09 Rundi Wu, Xuelin Chen, Yixin Zhuang, Baoquan Chen

Improving Bi-directional Generation between Different Modalities with Variational Autoencoders

We investigate deep generative models that can exchange multiple modalities bi-directionally, e.g., generating images from corresponding texts and vice versa. A major approach to achieve this objective is to train a model that integrates…

Machine Learning · Statistics 2018-01-29 Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo

Improving the Modality Representation with Multi-View Contrastive Learning for Multimodal Sentiment Analysis

Modality representation learning is an important problem for multimodal sentiment analysis (MSA), since the highly distinguishable representations can contribute to improving the analysis effect. Previous works of MSA have usually focused…

Multimedia · Computer Science 2023-01-31 Peipei Liu, Xin Zheng, Hong Li, Jie Liu, Yimo Ren, Hongsong Zhu, Limin Sun

Conditional Meta-Learning of Linear Representations

Standard meta-learning for representation learning aims to find a common representation to be shared across multiple tasks. The effectiveness of these methods is often limited when the nuances of the tasks' distribution cannot be captured…

Machine Learning · Computer Science 2021-03-31 Giulia Denevi, Massimiliano Pontil, Carlo Ciliberto

Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models

Learning generative models that span multiple data modalities, such as vision and language, is often motivated by the desire to learn more useful, generalisable representations that faithfully capture common underlying factors between the…

Machine Learning · Statistics 2019-11-11 Yuge Shi, N. Siddharth, Brooks Paige, Philip H. S. Torr

Modal Uncertainty Estimation via Discrete Latent Representation

Many important problems in the real world don't have unique solutions. It is thus important for machine learning models to be capable of proposing different plausible solutions with meaningful probability measures. In this work we introduce…

Machine Learning · Computer Science 2020-07-28 Di Qiu, Lok Ming Lui