Related papers: DiffEditor: Boosting Accuracy and Flexibility on D…

DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models

Despite the ability of existing large-scale text-to-image (T2I) models to generate high-quality images from detailed textual descriptions, they often lack the ability to precisely edit the generated or real images. In this paper, we propose…

Computer Vision and Pattern Recognition · Computer Science 2023-11-21 Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang

DiffUTE: Universal Text Editing Diffusion Model

Diffusion model based language-guided image editing has achieved great success recently. However, existing state-of-the-art diffusion models struggle with rendering correct text and text style during generation. To tackle this problem, we…

Computer Vision and Pattern Recognition · Computer Science 2023-10-19 Haoxing Chen, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Xing Zheng, Yaohui Li, Changhua Meng, Huijia Zhu, Weiqiang Wang

Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks

Diffusion-based Image Editing (DIE) is an emerging research hot-spot, which often applies a semantic mask to control the target area for diffusion-based editing. However, most existing solutions obtain these masks via manual operations or…

Computer Vision and Pattern Recognition · Computer Science 2024-01-24 Siyu Zou, Jiji Tang, Yiyi Zhou, Jing He, Chaoyi Zhao, Rongsheng Zhang, Zhipeng Hu, Xiaoshuai Sun

FunEditor: Achieving Complex Image Edits via Function Aggregation with Diffusion Models

Diffusion models have demonstrated outstanding performance in generative tasks, making them ideal candidates for image editing. Recent studies highlight their ability to apply desired edits effectively by following textual instructions, yet…

Computer Vision and Pattern Recognition · Computer Science 2024-12-18 Mohammadreza Samadi, Fred X. Han, Mohammad Salameh, Hao Wu, Fengyu Sun, Chunhua Zhou, Di Niu

DiffEdit: Diffusion-based semantic image editing with mask guidance

Image generation has recently seen tremendous advances, with diffusion models allowing to synthesize convincing images for a large variety of text prompts. In this article, we propose DiffEdit, a method to take advantage of text-conditioned…

Computer Vision and Pattern Recognition · Computer Science 2022-10-21 Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord

DiffRetouch: Using Diffusion to Retouch on the Shoulder of Experts

Image retouching aims to enhance the visual quality of photos. Considering the different aesthetic preferences of users, the target of retouching is subjective. However, current retouching methods mostly adopt deterministic models, which…

Computer Vision and Pattern Recognition · Computer Science 2024-07-08 Zheng-Peng Duan, Jiawei Zhang, Zheng Lin, Xin Jin, Dongqing Zou, Chunle Guo, Chongyi Li

ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models

Modern Text-to-Image (T2I) Diffusion models have revolutionized image editing by enabling the generation of high-quality photorealistic images. While the de facto method for performing edits with T2I models is through text instructions,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-07 Ashutosh Srivastava, Tarun Ram Menta, Abhinav Java, Avadhoot Jadhav, Silky Singh, Surgan Jandial, Balaji Krishnamurthy

A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models

Image editing aims to edit the given synthetic or real image to meet the specific requirements from users. It is widely studied in recent years as a promising and challenging field of Artificial Intelligence Generative Content (AIGC).…

Computer Vision and Pattern Recognition · Computer Science 2024-06-21 Xincheng Shuai, Henghui Ding, Xingjun Ma, Rongcheng Tu, Yu-Gang Jiang, Dacheng Tao

Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models

Text-to-image diffusion models can generate diverse, high-fidelity images based on user-provided text prompts. Recent research has extended these models to support text-guided image editing. While text guidance is an intuitive editing…

Computer Vision and Pattern Recognition · Computer Science 2023-05-26 Jooyoung Choi, Yunjey Choi, Yunji Kim, Junho Kim, Sungroh Yoon

UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models

Text-to-Image (T2I) generation methods based on diffusion model have garnered significant attention in the last few years. Although these image synthesis methods produce visually appealing results, they frequently exhibit spelling errors…

Computer Vision and Pattern Recognition · Computer Science 2023-12-27 Yiming Zhao, Zhouhui Lian

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

A significant research effort is focused on exploiting the amazing capacities of pretrained diffusion models for the editing of images. They either finetune the model, or invert the image in the latent space of the pretrained model. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang, Ming-Ming Cheng

I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models

The remarkable generative capabilities of diffusion models have motivated extensive research in both image and video editing. Compared to video editing which faces additional challenges in the time dimension, image editing has witnessed the…

Computer Vision and Pattern Recognition · Computer Science 2024-05-28 Wenqi Ouyang, Yi Dong, Lei Yang, Jianlou Si, Xingang Pan

Edit2Perceive: Image Editing Diffusion Models Are Strong Dense Perceivers

Recent advances in diffusion transformers have shown remarkable generalization in visual synthesis, yet most dense perception methods still rely on text-to-image (T2I) generators designed for stochastic generation. We revisit this paradigm…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Yiqing Shi, Yiren Song, Mike Zheng Shou

Enhancing Text-to-Image Editing via Hybrid Mask-Informed Fusion

Recently, text-to-image (T2I) editing has been greatly pushed forward by applying diffusion models. Despite the visual promise of the generated images, inconsistencies with the expected textual prompt remain prevalent. This paper aims to…

Computer Vision and Pattern Recognition · Computer Science 2024-09-20 Aoxue Li, Mingyang Yi, Zhenguo Li

Pix2Video: Video Editing using Image Diffusion

Image diffusion models, trained on massive image collections, have emerged as the most versatile image generator model in terms of quality and diversity. They support inverting real images and conditional (e.g., text) generation, making…

Computer Vision and Pattern Recognition · Computer Science 2023-03-23 Duygu Ceylan, Chun-Hao Paul Huang, Niloy J. Mitra

TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models

Diffusion models have opened the path to a wide range of text-based image editing frameworks. However, these typically build on the multi-step nature of the diffusion backwards process, and adapting them to distilled, fast-sampling methods…

Computer Vision and Pattern Recognition · Computer Science 2024-08-02 Gilad Deutch, Rinon Gal, Daniel Garibi, Or Patashnik, Daniel Cohen-Or

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models

Text-guided image editing has recently experienced rapid development. However, simultaneously performing multiple editing actions on a single image, such as background replacement and specific subject attribute changes, while maintaining…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li

ColorEdit: Training-free Image-Guided Color editing with diffusion model

Text-to-image (T2I) diffusion models, with their impressive generative capabilities, have been adopted for image editing tasks, demonstrating remarkable efficacy. However, due to attention leakage and collision between the cross-attention…

Computer Vision and Pattern Recognition · Computer Science 2025-05-01 Xingxi Yin, Zhi Li, Jingfeng Zhang, Chenglin Li, Yin Zhang

EditCrafter: Tuning-free High-Resolution Image Editing via Pretrained Diffusion Model

We propose EditCrafter, a high-resolution image editing method that operates without tuning, leveraging pretrained text-to-image (T2I) diffusion models to process images at resolutions significantly exceeding those used during training.…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Kunho Kim, Sumin Seo, Yongjun Cho, Hyungjin Chung

Differential Diffusion: Giving Each Pixel Its Strength

Diffusion models have revolutionized image generation and editing, producing state-of-the-art results in conditioned and unconditioned image synthesis. While current techniques enable user control over the degree of change in an image edit,…

Computer Vision and Pattern Recognition · Computer Science 2024-03-01 Eran Levin, Ohad Fried