Related papers: Cognitively Inspired Cross-Modal Data Generation U…

200 papers

Diffusion-based generative modeling has been achieving state-of-the-art results on various generation tasks. Most diffusion models, however, are limited to a single-generation modeling. Can we generalize diffusion models with the ability of…

Computer Vision and Pattern Recognition · Computer Science 2024-09-26 Changyou Chen, Han Ding, Bunyamin Sisman, Yi Xu, Ouye Xie, Benjamin Z. Yao, Son Dinh Tran, Belinda Zeng

In recent years, diffusion models have gained popularity for their ability to generate higher-quality images in comparison to GAN models. However, like any other large generative models, these models require a huge amount of data,…

Computer Vision and Pattern Recognition · Computer Science 2023-12-21 Rajesh Shrestha, Bowen Xie

Recent progress in image generation has sparked research into controlling these models through condition signals, with various methods addressing specific challenges in conditional generation. Instead of proposing another specialized…

Computer Vision and Pattern Recognition · Computer Science 2025-04-08 Xirui Li, Charles Herrmann, Kelvin C. K. Chan, Yinxiao Li, Deqing Sun, Chao Ma, Ming-Hsuan Yang

Creating large-scale datasets for training high-performance generative models is often prohibitively expensive, especially when associated attributes or annotations must be provided. As a result, merging existing datasets has become a…

Machine Learning · Statistics 2026-03-31 Yanfeng Yang, Kenji Fukumizu

Diffusion models arise as a powerful generative tool recently. Despite the great progress, existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is driven by only one modality of condition. To further…

Computer Vision and Pattern Recognition · Computer Science 2023-04-21 Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu

Diffusion models have made significant strides in language-driven and layout-driven image generation. However, most diffusion models are limited to visible RGB image generation. In fact, human perception of the world is enriched by diverse…

Computer Vision and Pattern Recognition · Computer Science 2024-10-22 Zeyu Wang, Jingyu Lin, Yifei Qian, Yi Huang, Shicen Tian, Bosong Chai, Juncan Deng, Qu Yang, Lan Du, Cunjian Chen, Kejie Huang

Diffusion models have demonstrated remarkable performance in generating unimodal data across various tasks, including image, video, and text generation. On the contrary, the joint generation of multimodal data through diffusion models is…

Machine Learning · Computer Science 2025-06-16 Kevin Rojas, Yuchen Zhu, Sichen Zhu, Felix X.-F. Ye, Molei Tao

Cross-Modal learning tasks have picked up pace in recent times. With plethora of applications in diverse areas, generation of novel content using multiple modalities of data has remained a challenging problem. To address the same, various…

Computer Vision and Pattern Recognition · Computer Science 2023-07-12 Nikhil Verma

Diffusion models have emerged as powerful tools for high-quality image generation and editing, but guiding these models to produce specific outputs remains a challenge. Conventional approaches rely on conditioning mechanisms, such as text…

Computer Vision and Pattern Recognition · Computer Science 2026-05-27 Nithesh Chandher Karthikeyan, Jonas Unger, Gabriel Eilertsen

Diffusion probabilistic models have achieved enormous success in the field of image generation and manipulation. In this paper, we explore a novel paradigm of using the diffusion model and classifier guidance in the latent semantic space…

Computer Vision and Pattern Recognition · Computer Science 2023-05-25 Changhao Shi, Haomiao Ni, Kai Li, Shaobo Han, Mingfu Liang, Martin Renqiang Min

Deep generative models with latent variables have been used lately to learn joint representations and generative processes from multi-modal data. These two learning mechanisms can, however, conflict with each other and representations can…

Machine Learning · Computer Science 2023-01-24 Rogelio A. Mancisidor, Michael Kampffmeyer, Kjersti Aas, Robert Jenssen

This paper demonstrates how to use generative models trained for image synthesis as tools for visual data mining. Our insight is that since contemporary generative models learn an accurate representation of their training data, we can use…

Computer Vision and Pattern Recognition · Computer Science 2024-08-07 Ioannis Siglidis, Aleksander Holynski, Alexei A. Efros, Mathieu Aubry, Shiry Ginosar

Multi-modal data-sets are ubiquitous in modern applications, and multi-modal Variational Autoencoders are a popular family of models that aim to learn a joint representation of the different modalities. However, existing approaches suffer…

Machine Learning · Computer Science 2023-12-19 Mustapha Bounoua, Giulio Franzese, Pietro Michiardi

We introduce Diffusion Active Learning, a novel approach that combines generative diffusion modeling with data-driven sequential experimental design to adaptively acquire data for inverse problems. Although broadly applicable, we focus on…

Machine Learning · Computer Science 2025-04-07 Luis Barba, Johannes Kirschner, Tomas Aidukas, Manuel Guizar-Sicairos, Benjamín Béjar

Humans continually expand their learned knowledge to new domains and learn new concepts without any interference with past learned experiences. In contrast, machine learning models perform poorly in a continual learning setting, where input…

Machine Learning · Computer Science 2023-04-24 Mohammad Rostami, Aram Galstyan

Diffusion models typically generate data through a fixed denoising trajectory that is shared across all samples. However, generation targets can differ in complexity, suggesting that a single pre-defined diffusion process may not be optimal…

Computer Vision and Pattern Recognition · Computer Science 2026-03-10 Yucheng Xing, Xiaodong Liu, Xin Wang

Recent improvements in conditional generative modeling have made it possible to generate high-quality images from language descriptions alone. We investigate whether these methods can directly address the problem of sequential…

Machine Learning · Computer Science 2023-07-11 Anurag Ajay, Yilun Du, Abhi Gupta, Joshua Tenenbaum, Tommi Jaakkola, Pulkit Agrawal

Cross-modality image segmentation aims to segment the target modalities using a method designed in the source modality. Deep generative models can translate the target modality images into the source modality, thus enabling cross-modality…

Image and Video Processing · Electrical Eng. & Systems 2024-04-11 Zihao Wang, Yingyu Yang, Yuzhou Chen, Tingting Yuan, Maxime Sermesant, Herve Delingette, Ona Wu

Generative models (GMs) have received increasing research interest for their remarkable capacity to achieve comprehensive understanding. However, their potential application in the domain of multi-modal tracking has remained relatively…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Zhangyong Tang, Tianyang Xu, Xuefeng Zhu, Xiao-Jun Wu, Josef Kittler

A unified diffusion framework for multi-modal generation and understanding has the transformative potential to achieve seamless and controllable image diffusion and other cross-modal tasks. In this paper, we introduce MMGen, a unified…

Computer Vision and Pattern Recognition · Computer Science 2025-03-27 Jiepeng Wang, Zhaoqing Wang, Hao Pan, Yuan Liu, Dongdong Yu, Changhu Wang, Wenping Wang