Related papers: DiffusionPID: Interpreting Diffusion via Partial I…

Interpretable Diffusion via Information Decomposition

Denoising diffusion models enable conditional generation and density modeling of complex relationships like images and text. However, the nature of the learned relationships is opaque, making it difficult to understand precisely what…

Machine Learning · Computer Science 2024-05-21 Xianghao Kong, Ollie Liu, Han Li, Dani Yogatama, Greg Ver Steeg

Unleashing Text-to-Image Diffusion Models for Visual Perception

Diffusion models (DMs) have become the new trend of generative models and have demonstrated a powerful ability of conditional synthesis. Among those, text-to-image diffusion models pre-trained on large-scale image-text pairs are highly…

Computer Vision and Pattern Recognition · Computer Science 2023-03-06 Wenliang Zhao, Yongming Rao, Zuyan Liu, Benlin Liu, Jie Zhou, Jiwen Lu

The Hidden Language of Diffusion Models

Text-to-image diffusion models have demonstrated an unparalleled ability to generate high-quality, diverse images from a textual prompt. However, the internal representations learned by these models remain an enigma. In this work, we…

Computer Vision and Pattern Recognition · Computer Science 2023-10-06 Hila Chefer, Oran Lang, Mor Geva, Volodymyr Polosukhin, Assaf Shocher, Michal Irani, Inbar Mosseri, Lior Wolf

Stimulating Diffusion Model for Image Denoising via Adaptive Embedding and Ensembling

Image denoising is a fundamental problem in computational photography, where achieving high perception with low distortion is highly demanding. Current methods either struggle with perceptual quality or suffer from significant distortion.…

Computer Vision and Pattern Recognition · Computer Science 2024-11-19 Tong Li, Hansen Feng, Lizhi Wang, Zhiwei Xiong, Hua Huang

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis. Starting from random noise, such text-to-image diffusion models gradually synthesize images in an iterative fashion…

Computer Vision and Pattern Recognition · Computer Science 2023-03-15 Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Qinsheng Zhang, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu

AID: Attention Interpolation of Text-to-Image Diffusion

Conditional diffusion models can create unseen images in various settings, aiding image interpolation. Interpolation in latent spaces is well-studied, but interpolation with specific conditions like text or poses is less understood. Simple…

Computer Vision and Pattern Recognition · Computer Science 2024-10-07 Qiyuan He, Jinghao Wang, Ziwei Liu, Angela Yao

PID: Physics-Informed Diffusion Model for Infrared Image Generation

Infrared imaging technology has gained significant attention for its reliable sensing ability in low visibility conditions, prompting many studies to convert the abundant RGB images to infrared images. However, most existing image…

Computer Vision and Pattern Recognition · Computer Science 2025-11-20 Fangyuan Mao, Jilin Mei, Shun Lu, Fuyang Liu, Liang Chen, Fangzhou Zhao, Yu Hu

Person Image Synthesis via Denoising Diffusion Model

The pose-guided person image generation task requires synthesizing photorealistic images of humans in arbitrary poses. The existing approaches use generative adversarial networks that do not necessarily maintain realistic textures or need…

Computer Vision and Pattern Recognition · Computer Science 2023-03-02 Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Jorma Laaksonen, Mubarak Shah, Fahad Shahbaz Khan

Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners

Diffusion models, such as Stable Diffusion, have shown incredible performance on text-to-image generation. Since text-to-image generation often requires models to generate visual concepts with fine-grained details and attributes specified…

Computer Vision and Pattern Recognition · Computer Science 2024-04-26 Xuehai He, Weixi Feng, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang

Diffusion Features to Bridge Domain Gap for Semantic Segmentation

Pre-trained diffusion models have demonstrated remarkable proficiency in synthesizing images across a wide range of scenarios with customizable prompts, indicating their effective capacity to capture universal features. Motivated by this,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-22 Yuxiang Ji, Boyong He, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu

Predicated Diffusion: Predicate Logic-Based Attention Guidance for Text-to-Image Diffusion Models

Diffusion models have achieved remarkable results in generating high-quality, diverse, and creative images. However, when it comes to text-based image generation, they often fail to capture the intended meaning presented in the text. For…

Computer Vision and Pattern Recognition · Computer Science 2024-03-20 Kota Sueyoshi, Takashi Matsubara

Exposing Text-Image Inconsistency Using Diffusion Models

In the battle against widespread online misinformation, a growing problem is text-image inconsistency, where images are misleadingly paired with texts with different intent or meaning. Existing classification-based methods for text-image…

Computer Vision and Pattern Recognition · Computer Science 2024-04-30 Mingzhen Huang, Shan Jia, Zhou Zhou, Yan Ju, Jialing Cai, Siwei Lyu

De-Diffusion Makes Text a Strong Cross-Modal Interface

We demonstrate text as a strong cross-modal interface. Rather than relying on deep embeddings to connect image and language as the interface representation, our approach represents an image as text, from which we enjoy the interpretability…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Chen Wei, Chenxi Liu, Siyuan Qiao, Zhishuai Zhang, Alan Yuille, Jiahui Yu

PICD: Versatile Perceptual Image Compression with Diffusion Rendering

Recently, perceptual image compression has achieved significant advancements, delivering high visual quality at low bitrates for natural images. However, for screen content, existing methods often produce noticeable artifacts when…

Computer Vision and Pattern Recognition · Computer Science 2025-05-12 Tongda Xu, Jiahao Li, Bin Li, Yan Wang, Ya-Qin Zhang, Yan Lu

Partial information decomposition for mixed discrete and continuous random variables

The framework of Partial Information Decomposition (PID) unveils complex nonlinear interactions in network systems by dissecting the mutual information (MI) between a target variable and several source variables. While PID measures have…

Data Analysis, Statistics and Probability · Physics 2024-09-23 Chiara Barà, Yuri Antonacci, Marta Iovino, Ivan Lazic, Luca Faes

Blended Diffusion for Text-driven Editing of Natural Images

Natural language offers a highly intuitive interface for image editing. In this paper, we introduce the first solution for performing local (region-based) edits in generic natural images, based on a natural language description along with…

Computer Vision and Pattern Recognition · Computer Science 2023-03-22 Omri Avrahami, Dani Lischinski, Ohad Fried

Dig2DIG: Dig into Diffusion Information Gains for Image Fusion

Image fusion integrates complementary information from multi-source images to generate more informative results. Recently, the diffusion model, which demonstrates unprecedented generative potential, has been explored in image fusion.…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Bing Cao, Baoshuo Cai, Changqing Zhang, Qinghua Hu

Diffusion Model-Based Image Editing: A Survey

Denoising diffusion models have emerged as a powerful tool for various image generation and editing tasks, facilitating the synthesis of visual content in an unconditional or input-conditional manner. The core idea behind them is learning…

Computer Vision and Pattern Recognition · Computer Science 2025-03-12 Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Liangliang Cao, Shifeng Chen

Depth-guided Texture Diffusion for Image Semantic Segmentation

Depth information provides valuable insights into the 3D structure, especially the outline of objects, which can be utilized to improve semantic segmentation tasks. However, a naive fusion of depth information can disrupt feature and…

Computer Vision and Pattern Recognition · Computer Science 2024-08-20 Wei Sun, Yuan Li, Qixiang Ye, Jianbin Jiao, Yanzhao Zhou

DiffCap: Exploring Continuous Diffusion on Image Captioning

Current image captioning works usually focus on generating descriptions in an autoregressive manner. However, there are limited works that focus on generating descriptions non-autoregressively, which brings more decoding diversity. Inspired…

Computer Vision and Pattern Recognition · Computer Science 2023-05-23 Yufeng He, Zefan Cai, Xu Gan, Baobao Chang