Related papers: EditVal: Benchmarking Diffusion Based Text-Guided …

Diffusion Model-Based Image Editing: A Survey

Denoising diffusion models have emerged as a powerful tool for various image generation and editing tasks, facilitating the synthesis of visual content in an unconditional or input-conditional manner. The core idea behind them is learning…

Computer Vision and Pattern Recognition · Computer Science 2025-03-12 Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Liangliang Cao, Shifeng Chen

EdiVal-Agent: An Object-Centric Framework for Automated, Fine-Grained Evaluation of Multi-Turn Editing

Instruction-based image editing has advanced rapidly, yet reliable and interpretable evaluation remains a bottleneck. Current protocols either (i) depend on paired reference images, resulting in limited coverage and inheriting biases from…

Computer Vision and Pattern Recognition · Computer Science 2026-03-19 Tianyu Chen, Yasi Zhang, Zhi Zhang, Peiyu Yu, Shu Wang, Zhendong Wang, Kevin Lin, Xiaofei Wang, Zhengyuan Yang, Linjie Li, Chung-Ching Lin, Jianwen Xie, Oscar Leong, Lijuan Wang, Ying Nian Wu, Mingyuan Zhou

A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models

Image editing aims to edit a given synthetic or real image to meet the specific requirements of users. It has been widely studied in recent years as a promising and challenging field of Artificial Intelligence Generative Content (AIGC).…

Computer Vision and Pattern Recognition · Computer Science 2024-06-21 Xincheng Shuai, Henghui Ding, Xingjun Ma, Rongcheng Tu, Yu-Gang Jiang, Dacheng Tao

EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models

Text-to-image diffusion models have emerged as an evolutionary tool for producing creative content in image synthesis. Based on the impressive generation abilities of these models, instruction-guided diffusion models can edit images with simple…

Cryptography and Security · Computer Science 2024-08-21 Ruoxi Chen, Haibo Jin, Yixin Liu, Jinyin Chen, Haohan Wang, Lichao Sun

EditBoard: Towards a Comprehensive Evaluation Benchmark for Text-Based Video Editing Models

The rapid development of diffusion models has significantly advanced AI-generated content (AIGC), particularly in Text-to-Image (T2I) and Text-to-Video (T2V) generation. Text-based video editing, leveraging these generative capabilities,…

Computer Vision and Pattern Recognition · Computer Science 2025-01-22 Yupeng Chen, Penglin Chen, Xiaoyu Zhang, Yixian Huang, Qian Xie

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

Text-guided image editing can have a transformative impact in supporting creative applications. A key challenge is to generate edits that are faithful to input text prompts, while consistent with input images. We present Imagen Editor, a…

Computer Vision and Pattern Recognition · Computer Science 2023-04-14 Su Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, William Chan

SelfEval: Leveraging the discriminative nature of generative models for evaluation

We present an automated way to evaluate the text alignment of text-to-image generative diffusion models using standard image-text recognition datasets. Our method, called SelfEval, uses the generative model to compute the likelihood of real…

Computer Vision and Pattern Recognition · Computer Science 2024-11-28 Sai Saketh Rambhatla, Ishan Misra

Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models

Recent advances in diffusion models enable many powerful instruments for image editing. One of these instruments is text-driven image manipulations: editing semantic attributes of an image according to the provided text description.…

Computer Vision and Pattern Recognition · Computer Science 2023-04-11 Nikita Starodubcev, Dmitry Baranchuk, Valentin Khrulkov, Artem Babenko

Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code

Text-guided diffusion models have revolutionized image generation and editing, offering exceptional realism and diversity. Specifically, in the context of diffusion-based editing, where a source image is edited according to a target prompt,…

Computer Vision and Pattern Recognition · Computer Science 2023-10-20 Xuan Ju, Ailing Zeng, Yuxuan Bian, Shaoteng Liu, Qiang Xu

EditTransfer++: Toward Faithful and Efficient Visual-Prompt-Guided Image Editing

Visual-prompt-guided edit transfer aims to learn image transformations directly from example pairs, offering more precise and controllable editing than purely text-driven approaches. However, existing diffusion transformer-based methods…

Computer Vision and Pattern Recognition · Computer Science 2026-05-11 Lan Chen, Qi Mao, Yiren Song, Yuchao Gu, Siwei Ma

PixLens: A Novel Framework for Disentangled Evaluation in Diffusion-Based Image Editing with Object Detection + SAM

Evaluating diffusion-based image-editing models is a crucial task in the field of Generative AI. Specifically, it is imperative to assess their capacity to execute diverse editing tasks while preserving the image content and realism. While…

Computer Vision and Pattern Recognition · Computer Science 2024-10-10 Stefan Stefanache, Lluís Pastor Pérez, Julen Costa Watanabe, Ernesto Sanchez Tejedor, Thomas Hofmann, Enis Simsar

DreamWalk: Style Space Exploration using Diffusion Guidance

Text-conditioned diffusion models can generate impressive images, but fall short when it comes to fine-grained control. Unlike direct-editing tools like Photoshop, text conditioned models require the artist to perform "prompt engineering,"…

Computer Vision and Pattern Recognition · Computer Science 2024-04-05 Michelle Shu, Charles Herrmann, Richard Strong Bowen, Forrester Cole, Ramin Zabih

Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models

Evaluating the quality of automatically generated image descriptions is a complex task that requires metrics capturing various dimensions, such as grammaticality, coverage, accuracy, and truthfulness. Although human evaluation provides…

Computer Vision and Pattern Recognition · Computer Science 2024-11-11 Jia-Hong Huang, Hongyi Zhu, Yixian Shen, Stevan Rudinac, Evangelos Kanoulas

SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing

Diffusion models demonstrate impressive image generation performance with text guidance. Inspired by the learning process of diffusion, existing images can be edited according to text by DDIM inversion. However, the vanilla DDIM inversion…

Computer Vision and Pattern Recognition · Computer Science 2024-09-17 Qi Qian, Haiyang Xu, Ming Yan, Juhua Hu

FiVE: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models

Numerous text-to-video (T2V) editing methods have emerged recently, but the lack of a standardized benchmark for fair evaluation has led to inconsistent claims and an inability to assess model sensitivity to hyperparameters. Fine-grained…

Computer Vision and Pattern Recognition · Computer Science 2025-07-23 Minghan Li, Chenxi Xie, Yichen Wu, Lei Zhang, Mengyu Wang

EditInfinity: Image Editing with Binary-Quantized Generative Models

Adapting pretrained diffusion-based generative models for text-driven image editing with negligible tuning overhead has demonstrated remarkable potential. A classical adaptation paradigm, as followed by these methods, first infers the…

Computer Vision and Pattern Recognition · Computer Science 2025-11-10 Jiahuan Wang, Yuxin Chen, Jun Yu, Guangming Lu, Wenjie Pei

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models

Text-guided image editing has recently experienced rapid development. However, simultaneously performing multiple editing actions on a single image, such as background replacement and specific subject attribute changes, while maintaining…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li

TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control

Centred on content modification and style preservation, Scene Text Editing (STE) remains a challenging task despite considerable progress in text-to-image synthesis and text-driven image manipulation recently. GAN-based STE methods…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Weichao Zeng, Yan Shu, Zhenhang Li, Dongbao Yang, Yu Zhou

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique to trade off diversity for fidelity. We explore diffusion models for the problem of text-conditional image…

Computer Vision and Pattern Recognition · Computer Science 2022-03-09 Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, Mark Chen

Fine-grained Image Editing by Pixel-wise Guidance Using Diffusion Models

Our goal is to develop fine-grained real-image editing methods suitable for real-world applications. In this paper, we first summarize four requirements for these methods and propose a novel diffusion-based image editing framework with…

Computer Vision and Pattern Recognition · Computer Science 2023-06-01 Naoki Matsunaga, Masato Ishii, Akio Hayakawa, Kenji Suzuki, Takuya Narihira