Related papers: ReVersion: Diffusion-Based Relation Inversion from…

Reverse Stable Diffusion: What prompt was used to generate this image?

Text-to-image diffusion models have recently attracted the interest of many researchers, and inverting the diffusion process can play an important role in better understanding the generative process and how to engineer prompts in order to…

Computer Vision and Pattern Recognition · Computer Science 2024-10-22 Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, Mubarak Shah

MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World

Diffusion models have become central to various image editing tasks, yet they often fail to fully adhere to physical laws, particularly with effects like shadows, reflections, and occlusions. In this work, we address the challenge of…

Computer Vision and Pattern Recognition · Computer Science 2025-04-23 Ankit Dhiman, Manan Shah, R Venkatesh Babu

In-Context Learning Unlocked for Diffusion Models

We present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models. Given a pair of task-specific example images, such as depth from/to image and scribble from/to image, and a text guidance, our…

Computer Vision and Pattern Recognition · Computer Science 2023-10-20 Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou

EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models

Text-to-image generation models (e.g., Stable Diffusion) have achieved significant advancements, enabling the creation of high-quality and realistic images based on textual descriptions. Prompt inversion, the task of identifying the textual…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Mingzhe Li, Kejing Xia, Gehao Zhang, Zhenting Wang, Guanhong Tao, Siqi Pan, Juan Zhai, Shiqing Ma

RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers

Inspired by the in-context learning mechanism of large language models (LLMs), a new paradigm of generalizable visual prompt-based image editing is emerging. Existing single-reference methods typically focus on style or appearance…

Computer Vision and Pattern Recognition · Computer Science 2025-06-04 Yan Gong, Yiren Song, Yicheng Li, Chenglin Li, Yin Zhang

Context Diffusion: In-Context Aware Image Generation

We propose Context Diffusion, a diffusion-based framework that enables image generation models to learn from visual examples presented in context. Recent work tackles such in-context learning for image generation, where a query image is…

Computer Vision and Pattern Recognition · Computer Science 2025-07-24 Ivona Najdenkoska, Animesh Sinha, Abhimanyu Dubey, Dhruv Mahajan, Vignesh Ramanathan, Filip Radenovic

CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization

We propose CatVersion, an inversion-based method that learns the personalized concept through a handful of examples. Subsequently, users can utilize text prompts to generate images that embody the personalized concept, thereby achieving…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Ruoyu Zhao, Mingrui Zhu, Shiyin Dong, Nannan Wang, Xinbo Gao

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

The correct insertion of virtual objects in images of real-world scenes requires a deep understanding of the scene's lighting, geometry and materials, as well as the image formation process. While recent large-scale diffusion models have…

Computer Vision and Pattern Recognition · Computer Science 2024-08-20 Ruofan Liang, Zan Gojcic, Merlin Nimier-David, David Acuna, Nandita Vijaykumar, Sanja Fidler, Zian Wang

ReNoise: Real Image Inversion Through Iterative Noising

Recent advancements in text-guided diffusion models have unlocked powerful image manipulation capabilities. However, applying these methods to real images necessitates the inversion of the images into the domain of the pretrained diffusion…

Computer Vision and Pattern Recognition · Computer Science 2024-03-22 Daniel Garibi, Or Patashnik, Andrey Voynov, Hadar Averbuch-Elor, Daniel Cohen-Or

Image Inversion: A Survey from GANs to Diffusion and Beyond

Image inversion is a fundamental task in generative models, aiming to map images back to their latent representations to enable downstream applications such as editing, restoration, and style transfer. This paper provides a comprehensive…

Computer Vision and Pattern Recognition · Computer Science 2025-02-18 Yinan Chen, Jiangning Zhang, Yali Bi, Xiaobin Hu, Teng Hu, Zhucun Xue, Ran Yi, Yong Liu, Ying Tai

ReorientDiff: Diffusion Model based Reorientation for Object Manipulation

The ability to manipulate objects in desired configurations is a fundamental requirement for robots to complete various practical applications. While certain goals can be achieved by picking and placing the objects of interest directly,…

Robotics · Computer Science 2023-09-18 Utkarsh A. Mishra, Yongxin Chen

Unlocking Visual Secrets: Inverting Features with Diffusion Priors for Image Reconstruction

Inverting visual representations within deep neural networks (DNNs) presents a challenging and important problem in the field of security and privacy for deep learning. The main goal is to invert the features of an unidentified target image…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Sai Qian Zhang, Ziyun Li, Chuan Guo, Saeed Mahloujifar, Deeksha Dangwal, Edward Suh, Barbara De Salvo, Chiao Liu

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

A significant research effort is focused on exploiting the amazing capacities of pretrained diffusion models for the editing of images. They either finetune the model, or invert the image in the latent space of the pretrained model. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang, Ming-Ming Cheng

Leveraging Text-to-Image Diffusion Models for Unsupervised Visual Object Tracking

Unsupervised visual object tracking is a challenging task that requires following arbitrary targets in videos without training on ground-truth annotations. Despite considerable progress, existing state-of-the-art unsupervised trackers often…

Computer Vision and Pattern Recognition · Computer Science 2026-05-27 Zhengbo Zhang, Zhigang Tu, Junsong Yuan, De Wen Soh, Bo Du

Visual Style Prompt Learning Using Diffusion Models for Blind Face Restoration

Blind face restoration aims to recover high-quality facial images from various unidentified sources of degradation, posing significant challenges due to the minimal information retrievable from the degraded images. Prior knowledge-based…

Computer Vision and Pattern Recognition · Computer Science 2024-12-31 Wanglong Lu, Jikai Wang, Tao Wang, Kaihao Zhang, Xianta Jiang, Hanli Zhao

Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models

Recently large-scale language-image models (e.g., text-guided diffusion models) have considerably improved the image generation capabilities to generate photorealistic images in various domains. Based on this success, current image editing…

Computer Vision and Pattern Recognition · Computer Science 2023-05-09 Wenkai Dong, Song Xue, Xiaoyue Duan, Shumin Han

MirrorDiffusion: Stabilizing Diffusion Process in Zero-shot Image Translation by Prompts Redescription and Beyond

Recently, text-to-image diffusion models have become a new paradigm in image processing fields, including content generation, image restoration and image-to-image translation. Given a target prompt, Denoising Diffusion Probabilistic Models…

Computer Vision and Pattern Recognition · Computer Science 2024-01-09 Yupei Lin, Xiaoyu Xian, Yukai Shi, Liang Lin

ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation

Large-scale text-to-image models have demonstrated amazing ability to synthesize diverse and high-fidelity images. However, these models are often constrained by several limitations. Firstly, they require the user to provide precise and…

Computer Vision and Pattern Recognition · Computer Science 2023-05-09 Yupei Lin, Sen Zhang, Xiaojun Yang, Xiao Wang, Yukai Shi

FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion

Due to the high potential for abuse of GenAI systems, the task of detecting synthetic images has recently become of great interest to the research community. Unfortunately, existing image-space detectors quickly become obsolete as new…

Computer Vision and Pattern Recognition · Computer Science 2024-06-14 George Cazenavette, Avneesh Sud, Thomas Leung, Ben Usman

Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models

The quality of the prompts provided to text-to-image diffusion models determines how faithful the generated content is to the user's intent, often requiring 'prompt engineering'. To harness visual concepts from target images without prompt…

Computer Vision and Pattern Recognition · Computer Science 2023-12-20 Shweta Mahajan, Tanzila Rahman, Kwang Moo Yi, Leonid Sigal