Related papers: Score-Based Multimodal Autoencoder

Quantum Variational Autoencoder

Variational autoencoders (VAEs) are powerful generative models with the salient ability to perform inference. Here, we introduce a quantum variational autoencoder (QVAE): a VAE whose latent generative process is implemented as a quantum…

Quantum Physics · Physics 2019-01-15 Amir Khoshaman, Walter Vinci, Brandon Denis, Evgeny Andriyash, Hossein Sadeghi, Mohammad H. Amin

On the Limitations of Multimodal VAEs

Multimodal variational autoencoders (VAEs) have shown promise as efficient generative models for weakly-supervised data. Yet, despite their advantage of weak supervision, they exhibit a gap in generative quality compared to unimodal VAEs,…

Machine Learning · Computer Science 2022-04-08 Imant Daunhawer, Thomas M. Sutter, Kieran Chin-Cheong, Emanuele Palumbo, Julia E. Vogt

Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives

Devising deep latent variable models for multi-modal data has been a long-standing theme in machine learning research. Multi-modal Variational Autoencoders (VAEs) have been a popular generative model class that learns latent representations…

Machine Learning · Statistics 2024-09-25 Marcel Hirt, Domenico Campolo, Victoria Leong, Juan-Pablo Ortega

Multimodal ELBO with Diffusion Decoders

Multimodal variational autoencoders have demonstrated their ability to learn the relationships between different modalities by mapping them into a latent representation. Their design and capacity to perform any-to-any conditional and…

Machine Learning · Computer Science 2025-02-04 Daniel Wesego, Pedram Rooshenas

Multi-modal Latent Diffusion

Multi-modal data-sets are ubiquitous in modern applications, and multi-modal Variational Autoencoders are a popular family of models that aim to learn a joint representation of the different modalities. However, existing approaches suffer…

Machine Learning · Computer Science 2023-12-19 Mustapha Bounoua, Giulio Franzese, Pietro Michiardi

Multimodal Generative Models for Scalable Weakly-Supervised Learning

Multiple modalities often co-occur when describing natural phenomena. Learning a joint representation of these modalities should yield deeper and more useful representations. Previous generative approaches to multi-modal input either do not…

Machine Learning · Computer Science 2018-11-13 Mike Wu, Noah Goodman

Enhancing Unimodal Latent Representations in Multimodal VAEs through Iterative Amortized Inference

Multimodal variational autoencoders (VAEs) aim to capture shared latent representations by integrating information from different data modalities. A significant challenge is accurately inferring representations from any subset of modalities…

Machine Learning · Computer Science 2024-10-16 Yuta Oshima, Masahiro Suzuki, Yutaka Matsuo

Learning Multimodal Energy-Based Model with Multimodal Variational Auto-Encoder via MCMC Revision

Energy-based models (EBMs) are a flexible class of deep generative models and are well-suited to capture complex dependencies in multimodal data. However, learning multimodal EBM by maximum likelihood requires Markov Chain Monte Carlo…

Machine Learning · Computer Science 2026-05-04 Jiali Cui, Zhiqiang Lao, Heather Yu

A survey on Variational Autoencoders from a GreenAI perspective

Variational AutoEncoders (VAEs) are powerful generative models that merge elements from statistics and information theory with the flexibility offered by deep neural networks to efficiently solve the generation problem for high dimensional…

Machine Learning · Computer Science 2021-03-02 A. Asperti, D. Evangelista, E. Loli Piccolomini

Unity by Diversity: Improved Representation Learning in Multimodal VAEs

Variational Autoencoders for multimodal data hold promise for many tasks in data analysis, such as representation learning, conditional generation, and imputation. Current architectures either share the encoder output, decoder input, or…

Machine Learning · Computer Science 2025-01-08 Thomas M. Sutter, Yang Meng, Andrea Agostini, Daphné Chopard, Norbert Fortin, Julia E. Vogt, Babak Shahbaba, Stephan Mandt

Joint Multimodal Learning with Deep Generative Models

We investigate deep generative models that can exchange multiple modalities bi-directionally, e.g., generating images from corresponding texts and vice versa. Recently, some studies handle multiple modalities on deep generative models, such…

Machine Learning · Statistics 2016-11-08 Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo

Parallelizing Autoregressive Generation with Variational State Space Models

Attention-based models such as Transformers and recurrent models like state space models (SSMs) have emerged as successful methods for autoregressive sequence modeling. Although both enable parallel training, none enable parallel generation…

Machine Learning · Computer Science 2024-07-12 Gaspard Lambrechts, Yann Claes, Pierre Geurts, Damien Ernst

Multimodal Variational Autoencoders have emerged as a popular tool to extract effective representations from rich multimodal data. However, such models rely on fusion strategies in latent space that destroy the joint statistical structure…

Machine Learning · Computer Science 2026-03-03 Federico Caretti, Guido Sanguinetti

Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit

Multimodal Variational Autoencoders (VAEs) have been the subject of intense research in the past years as they can integrate multiple modalities into a joint representation and can thus serve as a promising tool for both data classification…

Machine Learning · Computer Science 2024-09-18 Gabriela Sejnova, Michal Vavrecka, Karla Stepanova, Tadahiro Taniguchi

Learning Latent Subspaces in Variational Autoencoders

Variational autoencoders (VAEs) are widely used deep generative models capable of learning unsupervised latent representations of data. Such representations are often difficult to interpret or control. We consider the problem of…

Machine Learning · Computer Science 2018-12-18 Jack Klys, Jake Snell, Richard Zemel

Generalizing Multimodal Variational Methods to Sets

Making sense of multiple modalities can yield a more comprehensive description of real-world phenomena. However, learning the co-representation of diverse modalities is still a long-standing endeavor in emerging machine learning…

Artificial Intelligence · Computer Science 2022-12-21 Jinzhao Zhou, Yiqun Duan, Zhihong Chen, Yu-Cheng Chang, Chin-Teng Lin

Revisiting Structured Variational Autoencoders

Structured variational autoencoders (SVAEs) combine probabilistic graphical model priors on latent variables, deep neural networks to link latent variables to observed data, and structure-exploiting algorithms for approximate posterior…

Machine Learning · Statistics 2023-05-29 Yixiu Zhao, Scott W. Linderman

Predictive variational autoencoder for learning robust representations of time-series data

Variational autoencoders (VAEs) have been used extensively to discover low-dimensional latent factors governing neural activity and animal behavior. However, without careful model selection, the uncovered latent factors may reflect noise in…

Machine Learning · Computer Science 2023-12-13 Julia Huiming Wang, Dexter Tsin, Tatiana Engel

Multivariate Variational Autoencoder

Learning latent representations that are simultaneously expressive, geometrically well-structured, and reliably calibrated remains a central challenge for Variational Autoencoders (VAEs). Standard VAEs typically assume a diagonal Gaussian…

Machine Learning · Computer Science 2025-12-02 Mehmet Can Yavuz

Analyzing Multimodal Integration in the Variational Autoencoder from an Information-Theoretic Perspective

Human perception is inherently multimodal. We integrate, for instance, visual, proprioceptive and tactile information into one experience. Hence, multimodal learning is of importance for building robotic systems that aim at robustly…

Machine Learning · Computer Science 2024-11-04 Carlotta Langer, Yasmin Kim Georgie, Ilja Porohovoj, Verena Vanessa Hafner, Nihat Ay