Related papers: Unsupervised Semantic Correspondence Using Stable …

Emergent Correspondence from Image Diffusion

Finding correspondences between images is a fundamental problem in computer vision. In this paper, we show that correspondence emerges in image diffusion models without any explicit supervision. We propose a simple strategy to extract this…

Computer Vision and Pattern Recognition · Computer Science 2023-12-08 Luming Tang, Menglin Jia, Qianqian Wang, Cheng Perng Phoo, Bharath Hariharan

Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence

Diffusion models have been shown to be capable of generating high-quality images, suggesting that they could contain meaningful internal representations. Unfortunately, the feature maps that encode a diffusion model's internal information…

Computer Vision and Pattern Recognition · Computer Science 2024-04-03 Grace Luo, Lisa Dunlap, Dong Huk Park, Aleksander Holynski, Trevor Darrell

Eye-for-an-eye: Appearance Transfer with Semantic Correspondence in Diffusion Models

As pre-trained text-to-image diffusion models have become a useful tool for image synthesis, people want to specify the results in various ways. This paper tackles training-free appearance transfer, which produces an image with the…

Computer Vision and Pattern Recognition · Computer Science 2025-10-21 Sooyeon Go, Kyungmook Choi, Minjung Shin, Youngjung Uh

A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence

Text-to-image diffusion models have made significant advances in generating and editing high-quality images. As a result, numerous approaches have explored the ability of diffusion model features to understand and process single images for…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Junyi Zhang, Charles Herrmann, Junhwa Hur, Luisa Polania Cabrera, Varun Jampani, Deqing Sun, Ming-Hsuan Yang

EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models

Diffusion models have recently received increasing research attention for their remarkable transfer abilities in semantic segmentation tasks. However, generating fine-grained segmentation masks with diffusion models often requires…

Computer Vision and Pattern Recognition · Computer Science 2024-01-23 Koichi Namekata, Amirmojtaba Sabour, Sanja Fidler, Seung Wook Kim

Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization

Text-to-image diffusion models have emerged as powerful tools for high-quality image generation and editing. Many existing approaches rely on text prompts as editing guidance. However, these methods are constrained by the need for manual…

Computer Vision and Pattern Recognition · Computer Science 2025-05-21 Yuanyuan Chang, Yinghua Yao, Tao Qin, Mengmeng Wang, Ivor Tsang, Guang Dai

Investigating Conceptual Blending of a Diffusion Model for Improving Nonword-to-Image Generation

Text-to-image diffusion models sometimes depict blended concepts in the generated images. One promising use case of this effect would be the nonword-to-image generation task which attempts to generate images intuitively imaginable from a…

Multimedia · Computer Science 2024-11-07 Chihaya Matsuhira, Marc A. Kastner, Takahiro Komamizu, Takatsugu Hirayama, Ichiro Ide

Local-Global Context-Aware and Structure-Preserving Image Super-Resolution

Diffusion models have recently achieved significant success in various image manipulation tasks, including image super-resolution and perceptual quality enhancement. Pretrained text-to-image models, such as Stable Diffusion, have exhibited…

Computer Vision and Pattern Recognition · Computer Science 2025-10-16 Sanchar Palit, Subhasis Chaudhuri, Biplab Banerjee

Unsupervised Keypoints from Pretrained Diffusion Models

Unsupervised learning of keypoints and landmarks has seen significant progress with the help of modern neural network architectures, but performance is yet to match the supervised counterpart, making their practicability questionable. We…

Computer Vision and Pattern Recognition · Computer Science 2024-05-24 Eric Hedlin, Gopal Sharma, Shweta Mahajan, Xingzhe He, Hossam Isack, Abhishek Kar, Helge Rhodin, Andrea Tagliasacchi, Kwang Moo Yi

Semantically Robust Unpaired Image Translation for Data with Unmatched Semantics Statistics

Many applications of unpaired image-to-image translation require the input contents to be preserved semantically during translations. Unaware of the inherently unmatched semantics distributions between source and target domains, existing…

Computer Vision and Pattern Recognition · Computer Science 2021-10-07 Zhiwei Jia, Bodi Yuan, Kangkang Wang, Hong Wu, David Clifford, Zhiqiang Yuan, Hao Su

CoCoDiff: Correspondence-Consistent Diffusion Model for Fine-grained Style Transfer

Transferring visual style between images while preserving semantic correspondence between similar objects remains a central challenge in computer vision. While existing methods have made great strides, most of them operate at global level…

Computer Vision and Pattern Recognition · Computer Science 2026-04-02 Wenbo Nie, Zixiang Li, Renshuai Tao, Bin Wu, Yunchao Wei, Yao Zhao

Leveraging Text-to-Image Diffusion Models for Unsupervised Visual Object Tracking

Unsupervised visual object tracking is a challenging task that requires following arbitrary targets in videos without training on ground-truth annotations. Despite considerable progress, existing state-of-the-art unsupervised trackers often…

Computer Vision and Pattern Recognition · Computer Science 2026-05-27 Zhengbo Zhang, Zhigang Tu, Junsong Yuan, De Wen Soh, Bo Du

Distillation of Diffusion Features for Semantic Correspondence

Semantic correspondence, the task of determining relationships between different parts of images, underpins various applications including 3D reconstruction, image-to-image translation, object tracking, and visual place recognition. Recent…

Computer Vision and Pattern Recognition · Computer Science 2024-12-05 Frank Fundel, Johannes Schusterbauer, Vincent Tao Hu, Björn Ommer

Seek for Incantations: Towards Accurate Text-to-Image Diffusion Synthesis through Prompt Engineering

Text-to-image synthesis by diffusion models has recently shown remarkable performance in generating high-quality images. Although they perform well on simple texts, the models may get confused when faced with complex texts that contain…

Computer Vision and Pattern Recognition · Computer Science 2024-01-15 Chang Yu, Junran Peng, Xiangyu Zhu, Zhaoxiang Zhang, Qi Tian, Zhen Lei

Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis

Existing text-to-image generation approaches have set high standards for photorealism and text-image correspondence, largely benefiting from web-scale text-image datasets, which can include up to 5 billion pairs. However, text-to-image…

Computer Vision and Pattern Recognition · Computer Science 2023-08-17 Minho Park, Jooyeol Yun, Seunghwan Choi, Jaegul Choo

Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners

Diffusion models, such as Stable Diffusion, have shown incredible performance on text-to-image generation. Since text-to-image generation often requires models to generate visual concepts with fine-grained details and attributes specified…

Computer Vision and Pattern Recognition · Computer Science 2024-04-26 Xuehai He, Weixi Feng, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang

Fine-Tuning Text-To-Image Diffusion Models for Class-Wise Spurious Feature Generation

We propose a method for generating spurious features by leveraging large-scale text-to-image diffusion models. Although the previous work detects spurious features in a large-scale dataset like ImageNet and introduces Spurious ImageNet, we…

Computer Vision and Pattern Recognition · Computer Science 2024-02-14 AprilPyone MaungMaung, Huy H. Nguyen, Hitoshi Kiya, Isao Echizen

Evaluating a Synthetic Image Dataset Generated with Stable Diffusion

We generate synthetic images with the "Stable Diffusion" image generation model using the WordNet taxonomy and the definitions of concepts it contains. This synthetic image database can be used as training data for data augmentation in…

Computer Vision and Pattern Recognition · Computer Science 2022-11-07 Andreas Stöckl

SEGA: Instructing Text-to-Image Models using Semantic Guidance

Text-to-image diffusion models have recently received a lot of interest for their astonishing ability to produce high-fidelity images from text only. However, achieving one-shot generation that aligns with the user's intent is nearly…

Computer Vision and Pattern Recognition · Computer Science 2023-11-06 Manuel Brack, Felix Friedrich, Dominik Hintersdorf, Lukas Struppek, Patrick Schramowski, Kristian Kersting

Diffusion Features to Bridge Domain Gap for Semantic Segmentation

Pre-trained diffusion models have demonstrated remarkable proficiency in synthesizing images across a wide range of scenarios with customizable prompts, indicating their effective capacity to capture universal features. Motivated by this,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-22 Yuxiang Ji, Boyong He, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu