Related papers: Multimodal Generative Models for Scalable Weakly-S…

200 papers

Learning generative models that span multiple data modalities, such as vision and language, is often motivated by the desire to learn more useful, generalisable representations that faithfully capture common underlying factors between the…

Machine Learning · Statistics 2019-11-11 Yuge Shi, N. Siddharth, Brooks Paige, Philip H. S. Torr

We investigate deep generative models that can exchange multiple modalities bi-directionally, e.g., generating images from corresponding texts and vice versa. Recently, some studies handle multiple modalities on deep generative models, such…

Machine Learning · Statistics 2016-11-08 Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo

Making sense of multiple modalities can yield a more comprehensive description of real-world phenomena. However, learning the co-representation of diverse modalities is still a long-standing endeavor in emerging machine learning…

Artificial Intelligence · Computer Science 2022-12-21 Jinzhao Zhou, Yiqun Duan, Zhihong Chen, Yu-Cheng Chang, Chin-Teng Lin

Humans are able to create rich representations of their external reality. Their internal representations allow for cross-modality inference, where available perceptions can induce the perceptual experience of missing input modalities. In…

Machine Learning · Computer Science 2020-06-05 Miguel Vasco, Francisco S. Melo, Ana Paiva

As deep neural networks become more adept at traditional tasks, many of the most exciting new challenges concern multimodality: observations that combine diverse types, such as image and text. In this paper, we introduce a family of…

Machine Learning · Computer Science 2019-12-12 Mike Wu, Noah Goodman

To achieve high levels of autonomy, modern robots require the ability to detect and recover from anomalies and failures with minimal human supervision. Multi-modal sensor signals could provide more information for such anomaly detection…

Robotics · Computer Science 2020-12-17 Tianchen Ji, Sri Theja Vuppala, Girish Chowdhary, Katherine Driggs-Campbell

We investigate deep generative models that can exchange multiple modalities bi-directionally, e.g., generating images from corresponding texts and vice versa. A major approach to achieve this objective is to train a model that integrates…

Machine Learning · Statistics 2018-01-29 Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo

Multimodal variational autoencoders (VAEs) have shown promise as efficient generative models for weakly-supervised data. Yet, despite their advantage of weak supervision, they exhibit a gap in generative quality compared to unimodal VAEs,…

Machine Learning · Computer Science 2022-04-08 Imant Daunhawer, Thomas M. Sutter, Kieran Chin-Cheong, Emanuele Palumbo, Julia E. Vogt

Human perception is inherently multimodal. We integrate, for instance, visual, proprioceptive and tactile information into one experience. Hence, multimodal learning is of importance for building robotic systems that aim at robustly…

Machine Learning · Computer Science 2024-11-04 Carlotta Langer, Yasmin Kim Georgie, Ilja Porohovoj, Verena Vanessa Hafner, Nihat Ay

Multimodal generative models should be able to learn a meaningful latent representation that enables a coherent joint generation of all modalities (e.g., images and text). Many applications also require the ability to accurately sample…

Machine Learning · Computer Science 2021-08-02 Svetlana Kutuzova, Oswin Krause, Douglas McCloskey, Mads Nielsen, Christian Igel

Multimodal variational autoencoders have demonstrated their ability to learn the relationships between different modalities by mapping them into a latent representation. Their design and capacity to perform any-to-any conditional and…

Machine Learning · Computer Science 2025-02-04 Daniel Wesego, Pedram Rooshenas

By composing graphical models with deep learning architectures, we learn generative models with the strengths of both frameworks. The structured variational autoencoder (SVAE) inherits structure and interpretability from graphical models,…

Machine Learning · Computer Science 2023-11-15 Harry Bendekgey, Gabriel Hope, Erik B. Sudderth

Variational autoencoders (VAEs) are widely used deep generative models capable of learning unsupervised latent representations of data. Such representations are often difficult to interpret or control. We consider the problem of…

Machine Learning · Computer Science 2018-12-18 Jack Klys, Jake Snell, Richard Zemel

Multimodal variational autoencoders (VAEs) aim to capture shared latent representations by integrating information from different data modalities. A significant challenge is accurately inferring representations from any subset of modalities…

Machine Learning · Computer Science 2024-10-16 Yuta Oshima, Masahiro Suzuki, Yutaka Matsuo

Multimodal sensory data resembles the form of information perceived by humans for learning, and is easy to obtain in large quantities. Compared to unimodal data, synchronization of concepts between modalities in such data provides…

Machine Learning · Statistics 2018-05-30 Wei-Ning Hsu, James Glass

Multimodal Variational Autoencoders have emerged as a popular tool to extract effective representations from rich multimodal data. However, such models rely on fusion strategies in latent space that destroy the joint statistical structure…

Machine Learning · Computer Science 2026-03-03 Federico Caretti, Guido Sanguinetti

Multimodal learning is a framework for building models that make predictions based on different types of modalities. Important challenges in multimodal learning are the inference of shared representations from arbitrary modalities and…

Machine Learning · Computer Science 2022-07-06 Masahiro Suzuki, Yutaka Matsuo

With the ever-increasing amount of data, the central challenge in multimodal learning involves limitations of labelled samples. For the task of classification, techniques such as meta-learning, zero-shot learning, and few-shot learning…

Computer Vision and Pattern Recognition · Computer Science 2021-06-29 Nihar Bendre, Kevin Desai, Peyman Najafirad

Multimodal Variational Autoencoders (VAEs) have been the subject of intense research in the past years as they can integrate multiple modalities into a joint representation and can thus serve as a promising tool for both data classification…

Machine Learning · Computer Science 2024-09-18 Gabriela Sejnova, Michal Vavrecka, Karla Stepanova, Tadahiro Taniguchi

Multimodal VAEs seek to model the joint distribution over heterogeneous data (e.g. vision, language), whilst also capturing a shared representation across such modalities. Prior work has typically combined information from the modalities…

Machine Learning · Computer Science 2022-12-19 Tom Joy, Yuge Shi, Philip H. S. Torr, Tom Rainforth, Sebastian M. Schmon, N. Siddharth