English
Related papers

Related papers: Score-Based Multimodal Autoencoder

200 papers

Variational autoencoders (VAEs) are powerful generative models with the salient ability to perform inference. Here, we introduce a quantum variational autoencoder (QVAE): a VAE whose latent generative process is implemented as a quantum…

Multimodal variational autoencoders (VAEs) have shown promise as efficient generative models for weakly-supervised data. Yet, despite their advantage of weak supervision, they exhibit a gap in generative quality compared to unimodal VAEs,…

Machine Learning · Computer Science 2022-04-08 Imant Daunhawer , Thomas M. Sutter , Kieran Chin-Cheong , Emanuele Palumbo , Julia E. Vogt

Devising deep latent variable models for multi-modal data has been a long-standing theme in machine learning research. Multi-modal Variational Autoencoders (VAEs) have been a popular generative model class that learns latent representations…

Machine Learning · Statistics 2024-09-25 Marcel Hirt , Domenico Campolo , Victoria Leong , Juan-Pablo Ortega

Multimodal variational autoencoders have demonstrated their ability to learn the relationships between different modalities by mapping them into a latent representation. Their design and capacity to perform any-to-any conditional and…

Machine Learning · Computer Science 2025-02-04 Daniel Wesego , Pedram Rooshenas

Multi-modal data-sets are ubiquitous in modern applications, and multi-modal Variational Autoencoders are a popular family of models that aim to learn a joint representation of the different modalities. However, existing approaches suffer…

Machine Learning · Computer Science 2023-12-19 Mustapha Bounoua , Giulio Franzese , Pietro Michiardi

Multiple modalities often co-occur when describing natural phenomena. Learning a joint representation of these modalities should yield deeper and more useful representations. Previous generative approaches to multi-modal input either do not…

Machine Learning · Computer Science 2018-11-13 Mike Wu , Noah Goodman

Multimodal variational autoencoders (VAEs) aim to capture shared latent representations by integrating information from different data modalities. A significant challenge is accurately inferring representations from any subset of modalities…

Machine Learning · Computer Science 2024-10-16 Yuta Oshima , Masahiro Suzuki , Yutaka Matsuo

Energy-based models (EBMs) are a flexible class of deep generative models and are well-suited to capture complex dependencies in multimodal data. However, learning multimodal EBM by maximum likelihood requires Markov Chain Monte Carlo…

Machine Learning · Computer Science 2026-05-04 Jiali Cui , Zhiqiang Lao , Heather Yu

Variational AutoEncoders (VAEs) are powerful generative models that merge elements from statistics and information theory with the flexibility offered by deep neural networks to efficiently solve the generation problem for high dimensional…

Machine Learning · Computer Science 2021-03-02 A. Asperti , D. Evangelista , E. Loli Piccolomini

Variational Autoencoders for multimodal data hold promise for many tasks in data analysis, such as representation learning, conditional generation, and imputation. Current architectures either share the encoder output, decoder input, or…

We investigate deep generative models that can exchange multiple modalities bi-directionally, e.g., generating images from corresponding texts and vice versa. Recently, some studies handle multiple modalities on deep generative models, such…

Machine Learning · Statistics 2016-11-08 Masahiro Suzuki , Kotaro Nakayama , Yutaka Matsuo

Attention-based models such as Transformers and recurrent models like state space models (SSMs) have emerged as successful methods for autoregressive sequence modeling. Although both enable parallel training, none enable parallel generation…

Machine Learning · Computer Science 2024-07-12 Gaspard Lambrechts , Yann Claes , Pierre Geurts , Damien Ernst

Multimodal Variational Autoencoders have emerged as a popular tool to extract effective representations from rich multimodal data. However, such models rely on fusion strategies in latent space that destroy the joint statistical structure…

Machine Learning · Computer Science 2026-03-03 Federico Caretti , Guido Sanguinetti

Multimodal Variational Autoencoders (VAEs) have been the subject of intense research in the past years as they can integrate multiple modalities into a joint representation and can thus serve as a promising tool for both data classification…

Machine Learning · Computer Science 2024-09-18 Gabriela Sejnova , Michal Vavrecka , Karla Stepanova , Tadahiro Taniguchi

Variational autoencoders (VAEs) are widely used deep generative models capable of learning unsupervised latent representations of data. Such representations are often difficult to interpret or control. We consider the problem of…

Machine Learning · Computer Science 2018-12-18 Jack Klys , Jake Snell , Richard Zemel

Making sense of multiple modalities can yield a more comprehensive description of real-world phenomena. However, learning the co-representation of diverse modalities is still a long-standing endeavor in emerging machine learning…

Artificial Intelligence · Computer Science 2022-12-21 Jinzhao Zhou , Yiqun Duan , Zhihong Chen , Yu-Cheng Chang , Chin-Teng Lin

Structured variational autoencoders (SVAEs) combine probabilistic graphical model priors on latent variables, deep neural networks to link latent variables to observed data, and structure-exploiting algorithms for approximate posterior…

Machine Learning · Statistics 2023-05-29 Yixiu Zhao , Scott W. Linderman

Variational autoencoders (VAEs) have been used extensively to discover low-dimensional latent factors governing neural activity and animal behavior. However, without careful model selection, the uncovered latent factors may reflect noise in…

Machine Learning · Computer Science 2023-12-13 Julia Huiming Wang , Dexter Tsin , Tatiana Engel

Learning latent representations that are simultaneously expressive, geometrically well-structured, and reliably calibrated remains a central challenge for Variational Autoencoders (VAEs). Standard VAEs typically assume a diagonal Gaussian…

Machine Learning · Computer Science 2025-12-02 Mehmet Can Yavuz

Human perception is inherently multimodal. We integrate, for instance, visual, proprioceptive and tactile information into one experience. Hence, multimodal learning is of importance for building robotic systems that aim at robustly…

Machine Learning · Computer Science 2024-11-04 Carlotta Langer , Yasmin Kim Georgie , Ilja Porohovoj , Verena Vanessa Hafner , Nihat Ay
‹ Prev 1 2 3 10 Next ›