Related papers: FunEditor: Achieving Complex Image Edits via Funct…

DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing

Large-scale Text-to-Image (T2I) diffusion models have revolutionized image generation over the last few years. Although owning diverse and high-quality generation capabilities, translating these abilities to fine-grained image editing…

Computer Vision and Pattern Recognition · Computer Science 2024-02-06 Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang

Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models

Recent advances in diffusion models enable many powerful instruments for image editing. One of these instruments is text-driven image manipulations: editing semantic attributes of an image according to the provided text description.…

Computer Vision and Pattern Recognition · Computer Science 2023-04-11 Nikita Starodubcev, Dmitry Baranchuk, Valentin Khrulkov, Artem Babenko

Inverse-and-Edit: Effective and Fast Image Editing by Cycle Consistency Models

Recent advances in image editing with diffusion models have achieved impressive results, offering fine-grained control over the generation process. However, these methods are computationally intensive because of their iterative nature.…

Computer Vision and Pattern Recognition · Computer Science 2025-06-25 Ilia Beletskii, Andrey Kuznetsov, Aibek Alanov

FineEdit: Fine-Grained Image Edit with Bounding Box Guidance

Diffusion-based image editing models have achieved significant progress in real world applications. However, conventional models typically rely on natural language prompts, which often lack the precision required to localize target objects.…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Haohang Xu, Lin Liu, Zhibo Zhang, Rong Cong, Xiaopeng Zhang, Qi Tian

Pix2Video: Video Editing using Image Diffusion

Image diffusion models, trained on massive image collections, have emerged as the most versatile image generator model in terms of quality and diversity. They support inverting real images and conditional (e.g., text) generation, making…

Computer Vision and Pattern Recognition · Computer Science 2023-03-23 Duygu Ceylan, Chun-Hao Paul Huang, Niloy J. Mitra

Edit2Perceive: Image Editing Diffusion Models Are Strong Dense Perceivers

Recent advances in diffusion transformers have shown remarkable generalization in visual synthesis, yet most dense perception methods still rely on text-to-image (T2I) generators designed for stochastic generation. We revisit this paradigm…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Yiqing Shi, Yiren Song, Mike Zheng Shou

TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models

Diffusion models have opened the path to a wide range of text-based image editing frameworks. However, these typically build on the multi-step nature of the diffusion backwards process, and adapting them to distilled, fast-sampling methods…

Computer Vision and Pattern Recognition · Computer Science 2024-08-02 Gilad Deutch, Rinon Gal, Daniel Garibi, Or Patashnik, Daniel Cohen-Or

PartEdit: Fine-Grained Image Editing using Pre-Trained Diffusion Models

We present the first text-based image editing approach for object parts based on pre-trained diffusion models. Diffusion-based image editing approaches capitalized on the deep understanding of diffusion models of image semantics to perform…

Computer Vision and Pattern Recognition · Computer Science 2025-06-30 Aleksandar Cvejic, Abdelrahman Eldesokey, Peter Wonka

Dreamix: Video Diffusion Models are General Video Editors

Text-driven image and video diffusion models have recently achieved unprecedented generation realism. While diffusion models have been successfully applied for image editing, very few works have done so for video editing. We present the…

Computer Vision and Pattern Recognition · Computer Science 2023-02-03 Eyal Molad, Eliahu Horwitz, Dani Valevski, Alex Rav Acha, Yossi Matias, Yael Pritch, Yaniv Leviathan, Yedid Hoshen

Image Editing with Diffusion Models: A Survey

With deeper exploration of diffusion model, developments in the field of image generation have triggered a boom in image creation. As the quality of base-model generated images continues to improve, so does the demand for further…

Graphics · Computer Science 2025-04-21 Jia Wang, Jie Hu, Xiaoqi Ma, Hanghang Ma, Xiaoming Wei, Enhua Wu

DiffEdit: Diffusion-based semantic image editing with mask guidance

Image generation has recently seen tremendous advances, with diffusion models allowing to synthesize convincing images for a large variety of text prompts. In this article, we propose DiffEdit, a method to take advantage of text-conditioned…

Computer Vision and Pattern Recognition · Computer Science 2022-10-21 Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord

EffiVED: Efficient Video Editing via Text-instruction Diffusion Models

Large-scale text-to-video models have shown remarkable abilities, but their direct application in video editing remains challenging due to limited available datasets. Current video editing methods commonly require per-video fine-tuning of…

Computer Vision and Pattern Recognition · Computer Science 2024-06-06 Zhenghao Zhang, Zuozhuo Dai, Long Qin, Weizhi Wang

InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions

Recent works have explored text-guided image editing using diffusion models and generated edited images based on text prompts. However, the models struggle to accurately locate the regions to be edited and faithfully perform precise edits.…

Computer Vision and Pattern Recognition · Computer Science 2023-05-30 Qian Wang, Biao Zhang, Michael Birsak, Peter Wonka

MotionEditor: Editing Video Motion via Content-Aware Diffusion

Existing diffusion-based video editing models have made gorgeous advances for editing attributes of a source video over time but struggle to manipulate the motion information while preserving the original protagonist's appearance and…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Shuyuan Tu, Qi Dai, Zhi-Qi Cheng, Han Hu, Xintong Han, Zuxuan Wu, Yu-Gang Jiang

I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models

The remarkable generative capabilities of diffusion models have motivated extensive research in both image and video editing. Compared to video editing which faces additional challenges in the time dimension, image editing has witnessed the…

Computer Vision and Pattern Recognition · Computer Science 2024-05-28 Wenqi Ouyang, Yi Dong, Lei Yang, Jianlou Si, Xingang Pan

Geometric Image Editing via Effects-Sensitive In-Context Inpainting with Diffusion Transformers

Recent advances in diffusion models have significantly improved image editing. However, challenges persist in handling geometric transformations, such as translation, rotation, and scaling, particularly in complex scenes. Existing…

Computer Vision and Pattern Recognition · Computer Science 2026-02-10 Shuo Zhang, Wenzhuo Wu, Huayu Zhang, Jiarong Cheng, Xianghao Zang, Chao Ban, Hao Sun, Zhongjiang He, Tianwei Cao, Kongming Liang, Zhanyu Ma

3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models

Text-guided diffusion models have shown superior performance in image/video generation and editing. While few explorations have been performed in 3D scenarios. In this paper, we discuss three fundamental and interesting problems on this…

Computer Vision and Pattern Recognition · Computer Science 2023-10-13 Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng Tao

SeedEdit: Align Image Re-Generation to Image Editing

We introduce SeedEdit, a diffusion model that is able to revise a given image with any text prompt. In our perspective, the key to such a task is to obtain an optimal balance between maintaining the original image, i.e. image…

Computer Vision and Pattern Recognition · Computer Science 2024-11-12 Yichun Shi, Peng Wang, Weilin Huang

Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference

Due to the recent success of diffusion models, text-to-image generation is becoming increasingly popular and achieves a wide range of applications. Among them, text-to-image editing, or continuous text-to-image generation, attracts lots of…

Computer Vision and Pattern Recognition · Computer Science 2024-01-05 Zihao Yu, Haoyang Li, Fangcheng Fu, Xupeng Miao, Bin Cui

High-Fidelity Diffusion-based Image Editing

Diffusion models have attained remarkable success in the domains of image generation and editing. It is widely recognized that employing larger inversion and denoising steps in diffusion model leads to improved image reconstruction quality.…

Computer Vision and Pattern Recognition · Computer Science 2024-01-05 Chen Hou, Guoqiang Wei, Zhibo Chen