Related papers: ParallelEdits: Efficient Multi-object Image Editin…

Unified Editing of Panorama, 3D Scenes, and Videos Through Disentangled Self-Attention Injection

While text-to-image models have achieved impressive capabilities in image generation and editing, their application across various modalities often necessitates training separate models. Inspired by existing method of single image editing…

Computer Vision and Pattern Recognition · Computer Science 2024-05-28 Gihyun Kwon, Jangho Park, Jong Chul Ye

LEDITS++: Limitless Image Editing using Text-to-Image Models

Text-to-image diffusion models have recently received increasing interest for their astonishing ability to produce high-fidelity images from solely text inputs. Subsequent research efforts aim to exploit and apply their capabilities to real…

Computer Vision and Pattern Recognition · Computer Science 2024-06-27 Manuel Brack, Felix Friedrich, Katharina Kornmeier, Linoy Tsaban, Patrick Schramowski, Kristian Kersting, Apolinário Passos

MultiEdit: Advancing Instruction-based Image Editing on Diverse and Challenging Tasks

Current instruction-based image editing (IBIE) methods struggle with challenging editing tasks, as both editing types and sample counts of existing datasets are limited. Moreover, traditional dataset construction often contains noisy…

Computer Vision and Pattern Recognition · Computer Science 2025-09-19 Mingsong Li, Lin Liu, Hongjun Wang, Haoxing Chen, Xijun Gu, Shizhan Liu, Dong Gong, Junbo Zhao, Zhenzhong Lan, Jianguo Li

IE-Bench: Advancing the Measurement of Text-Driven Image Editing for Human Perception Alignment

Recent advances in text-driven image editing have been significant, yet the task of accurately evaluating these edited images continues to pose a considerable challenge. Different from the assessment of text-driven image generation,…

Computer Vision and Pattern Recognition · Computer Science 2025-01-20 Shangkun Sun, Bowen Qu, Xiaoyu Liang, Songlin Fan, Wei Gao

DragText: Rethinking Text Embedding in Point-based Image Editing

Point-based image editing enables accurate and flexible control through content dragging. However, the role of text embedding during the editing process has not been thoroughly investigated. A significant aspect that remains unexplored is…

Computer Vision and Pattern Recognition · Computer Science 2024-12-05 Gayoon Choi, Taejin Jeong, Sujung Hong, Seong Jae Hwang

Enhancing Image Aesthetics with Dual-Conditioned Diffusion Models Guided by Multimodal Perception

Image aesthetic enhancement aims to perceive aesthetic deficiencies in images and perform corresponding editing operations, which is highly challenging and requires the model to possess creativity and aesthetic perception capabilities.…

Computer Vision and Pattern Recognition · Computer Science 2026-03-13 Xinyu Nan, Ning Wang, Yuyao Zhai, Mei Yang

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

Text-guided image editing can have a transformative impact in supporting creative applications. A key challenge is to generate edits that are faithful to input text prompts, while consistent with input images. We present Imagen Editor, a…

Computer Vision and Pattern Recognition · Computer Science 2023-04-14 Su Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, William Chan

TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models

Diffusion models have opened the path to a wide range of text-based image editing frameworks. However, these typically build on the multi-step nature of the diffusion backwards process, and adapting them to distilled, fast-sampling methods…

Computer Vision and Pattern Recognition · Computer Science 2024-08-02 Gilad Deutch, Rinon Gal, Daniel Garibi, Or Patashnik, Daniel Cohen-Or

DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting

Subject-driven image inpainting has recently gained prominence in image editing with the rapid advancement of diffusion models. Beyond image guidance, recent studies have explored incorporating text guidance to achieve identity-preserved…

Computer Vision and Pattern Recognition · Computer Science 2025-09-25 Yicheng Yang, Pengxiang Li, Lu Zhang, Liqian Ma, Ping Hu, Siyu Du, Yunzhi Zhuge, Xu Jia, Huchuan Lu

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models

Text-guided image editing has recently experienced rapid development. However, simultaneously performing multiple editing actions on a single image, such as background replacement and specific subject attribute changes, while maintaining…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li

Towards Small Object Editing: A Benchmark Dataset and A Training-Free Approach

A plethora of text-guided image editing methods has recently been developed by leveraging the impressive capabilities of large-scale diffusion-based generative models, especially Stable Diffusion. Despite the success of diffusion models in…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Qihe Pan, Zhen Zhao, Zicheng Wang, Sifan Long, Yiming Wu, Wei Ji, Haoran Liang, Ronghua Liang

An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control

Building on the success of text-to-image diffusion models (DPMs), image editing is an important application to enable human interaction with AI-generated content. Among various editing methods, editing within the prompt space gains more…

Computer Vision and Pattern Recognition · Computer Science 2025-01-28 Aosong Feng, Weikang Qiu, Jinbin Bai, Xiao Zhang, Zhen Dong, Kaicheng Zhou, Rex Ying, Leandros Tassiulas

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

A significant research effort is focused on exploiting the amazing capacities of pretrained diffusion models for the editing of images. They either finetune the model, or invert the image in the latent space of the pretrained model. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang, Ming-Ming Cheng

Unified Diffusion-Based Rigid and Non-Rigid Editing with Text and Image Guidance

Existing text-to-image editing methods tend to excel either in rigid or non-rigid editing but encounter challenges when combining both, resulting in misaligned outputs with the provided text prompts. In addition, integrating reference…

Computer Vision and Pattern Recognition · Computer Science 2024-01-05 Jiacheng Wang, Ping Liu, Wei Xu

LatentEdit: Adaptive Latent Control for Consistent Semantic Editing

Diffusion-based Image Editing has achieved significant success in recent years. However, it remains challenging to achieve high-quality image editing while maintaining the background similarity without sacrificing speed or memory…

Graphics · Computer Science 2025-09-03 Siyi Liu, Weiming Chen, Yushun Tang, Zhihai He

DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing

Large-scale Text-to-Image (T2I) diffusion models have revolutionized image generation over the last few years. Although owning diverse and high-quality generation capabilities, translating these abilities to fine-grained image editing…

Computer Vision and Pattern Recognition · Computer Science 2024-02-06 Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang

PixLens: A Novel Framework for Disentangled Evaluation in Diffusion-Based Image Editing with Object Detection + SAM

Evaluating diffusion-based image-editing models is a crucial task in the field of Generative AI. Specifically, it is imperative to assess their capacity to execute diverse editing tasks while preserving the image content and realism. While…

Computer Vision and Pattern Recognition · Computer Science 2024-10-10 Stefan Stefanache, Lluís Pastor Pérez, Julen Costa Watanabe, Ernesto Sanchez Tejedor, Thomas Hofmann, Enis Simsar

DiffUTE: Universal Text Editing Diffusion Model

Diffusion model based language-guided image editing has achieved great success recently. However, existing state-of-the-art diffusion models struggle with rendering correct text and text style during generation. To tackle this problem, we…

Computer Vision and Pattern Recognition · Computer Science 2023-10-19 Haoxing Chen, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Xing Zheng, Yaohui Li, Changhua Meng, Huijia Zhu, Weiqiang Wang

Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models

Text-to-image diffusion models can generate diverse, high-fidelity images based on user-provided text prompts. Recent research has extended these models to support text-guided image editing. While text guidance is an intuitive editing…

Computer Vision and Pattern Recognition · Computer Science 2023-05-26 Jooyoung Choi, Yunjey Choi, Yunji Kim, Junho Kim, Sungroh Yoon

A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models

Image editing aims to edit the given synthetic or real image to meet the specific requirements from users. It is widely studied in recent years as a promising and challenging field of Artificial Intelligence Generative Content (AIGC).…

Computer Vision and Pattern Recognition · Computer Science 2024-06-21 Xincheng Shuai, Henghui Ding, Xingjun Ma, Rongcheng Tu, Yu-Gang Jiang, Dacheng Tao