Related papers: Diffusion Models For Multi-Modal Generative Modeli…

200 papers

Deep generative models have garnered significant attention in low-level vision tasks due to their generative capabilities. Among them, diffusion model-based solutions, characterized by a forward diffusion process and a reverse denoising…

Computer Vision and Pattern Recognition · Computer Science 2025-02-26 Chunming He, Yuqi Shen, Chengyu Fang, Fengyang Xiao, Longxiang Tang, Yulun Zhang, Wangmeng Zuo, Zhenhua Guo, Xiu Li

Diffusion models have recently emerged as a powerful generative tool. Despite the great progress, existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is driven by only one modality of condition. To further…

Computer Vision and Pattern Recognition · Computer Science 2023-04-21 Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu

Diffusion models, a family of generative models based on deep learning, have become increasingly prominent in cutting-edge machine learning research. With a distinguished performance in generating samples that resemble the observed data,…

Machine Learning · Computer Science 2023-05-02 Lequan Lin, Zhengkun Li, Ruikun Li, Xuliang Li, Junbin Gao

Denoising diffusion models represent a recent emerging topic in computer vision, demonstrating remarkable results in the area of generative modeling. A diffusion model is a deep generative model that is based on two stages, a forward…

Computer Vision and Pattern Recognition · Computer Science 2025-01-17 Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, Mubarak Shah

Diffusion models have gained tremendous success in text-to-image generation, yet still lag behind with visual understanding tasks, an area dominated by autoregressive vision-language models. We propose a large-scale and fully end-to-end…

Computer Vision and Pattern Recognition · Computer Science 2025-04-03 Zijie Li, Henry Li, Yichun Shi, Amir Barati Farimani, Yuval Kluger, Linjie Yang, Peng Wang

Beyond high-fidelity image synthesis, diffusion models have recently exhibited promising results in dense visual perception tasks. However, most existing work treats diffusion models as a standalone component for perception tasks, employing…

Computer Vision and Pattern Recognition · Computer Science 2025-12-18 Shuhong Zheng, Zhipeng Bao, Ruoyu Zhao, Martial Hebert, Yu-Xiong Wang

Diffusion models typically generate data through a fixed denoising trajectory that is shared across all samples. However, generation targets can differ in complexity, suggesting that a single pre-defined diffusion process may not be optimal…

Computer Vision and Pattern Recognition · Computer Science 2026-03-10 Yucheng Xing, Xiaodong Liu, Xin Wang

Diffusion Models have become a cornerstone of modern generative AI for their exceptional generation quality and controllability. However, their inherent multi-step iterations and complex backbone networks lead to…

Diffusion models, a powerful and universal generative AI technology, have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology. In these applications, diffusion models provide flexible…

Machine Learning · Computer Science 2024-04-12 Minshuo Chen, Song Mei, Jianqing Fan, Mengdi Wang

Diffusion models offer stable training and state-of-the-art performance for deep generative modeling tasks. Here, we consider their use in the context of multivariate subsurface modeling and probabilistic inversion. We first demonstrate…

Computer Vision and Pattern Recognition · Computer Science 2026-01-28 Roberto Miele, Niklas Linde

The recently developed discrete diffusion models perform extraordinarily well in the text-to-image task, showing significant promise for handling the multi-modality signals. In this work, we harness these traits and present a unified…

Computer Vision and Pattern Recognition · Computer Science 2022-11-29 Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, Dacheng Tao, Ponnuthurai N. Suganthan

Most existing cross-modal generative methods based on diffusion models use guidance to provide control over the latent space to enable conditional generation across different modalities. Such methods focus on providing guidance through…

Machine Learning · Computer Science 2023-05-31 Zizhao Hu, Mohammad Rostami

Diffusion models have emerged as a powerful new family of deep generative models with record-breaking performance in many applications, including image synthesis, video generation, and molecule design. In this survey, we provide an overview…

Machine Learning · Computer Science 2025-09-30 Ling Yang, Zhilong Zhang, Yang Song, Shenda Hong, Runsheng Xu, Yue Zhao, Wentao Zhang, Bin Cui, Ming-Hsuan Yang

Denoising diffusion model, as a generative model, has received a lot of attention in the field of image generation recently, thanks to its powerful generation capability. However, diffusion models have not yet received sufficient research in…

Computer Vision and Pattern Recognition · Computer Science 2023-04-12 ZiHan Cao, ShiQi Cao, Xiao Wu, JunMing Hou, Ran Ran, Liang-Jian Deng

Diffusion models are a class of generative models that serve to establish a stochastic transport map between an empirically observed, yet unknown, target distribution and a known prior. Despite their remarkable success in real-world…

Machine Learning · Computer Science 2025-03-13 Puheng Li, Zhong Li, Huishuai Zhang, Jiang Bian

Cross-modal learning tasks have picked up pace in recent times. With a plethora of applications in diverse areas, generation of novel content using multiple modalities of data has remained a challenging problem. To address the same, various…

Computer Vision and Pattern Recognition · Computer Science 2023-07-12 Nikhil Verma

Denoising diffusion models, a class of generative models, have garnered immense interest lately in various deep-learning problems. A diffusion probabilistic model defines a forward diffusion stage where the input data is gradually perturbed…

Image and Video Processing · Electrical Eng. & Systems 2023-06-06 Amirhossein Kazerouni, Ehsan Khodapanah Aghdam, Moein Heidari, Reza Azad, Mohsen Fayyaz, Ilker Hacihaliloglu, Dorit Merhof

A unified diffusion framework for multi-modal generation and understanding has the transformative potential to achieve seamless and controllable image diffusion and other cross-modal tasks. In this paper, we introduce MMGen, a unified…

Computer Vision and Pattern Recognition · Computer Science 2025-03-27 Jiepeng Wang, Zhaoqing Wang, Hao Pan, Yuan Liu, Dongdong Yu, Changhu Wang, Wenping Wang

Previously, non-autoregressive models were widely perceived as being superior in generation efficiency but inferior in generation quality due to the difficulties of modeling multiple target modalities. To enhance the multi-modality modeling…

Computation and Language · Computer Science 2023-11-30 Lihua Qian, Mingxuan Wang, Yang Liu, Hao Zhou

Diffusion Models are popular generative modeling methods in various vision tasks, attracting significant attention. They can be considered a unique instance of self-supervised learning methods due to their independence from label…

Computer Vision and Pattern Recognition · Computer Science 2025-01-19 Michael Fuest, Pingchuan Ma, Ming Gui, Johannes Schusterbauer, Vincent Tao Hu, Bjorn Ommer