Iterative Multi-granular Image Editing using Diffusion Models

Subjects: Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning. Version: v2 (2023-10-31)

Abstract

Recent advances in text-guided image synthesis have dramatically changed how creative professionals generate artistic and aesthetically pleasing visual assets. To fully support such creative endeavors, the process should possess the ability to: 1) iteratively edit the generations and 2) control the spatial reach of desired changes (global, local or anything in between). We formalize this pragmatic problem setting as Iterative Multi-granular Editing. While there has been substantial progress with diffusion-based models for image synthesis and editing, they are all one-shot (i.e., no iterative editing capabilities) and do not naturally yield multi-granular control (i.e., covering the full spectrum of local-to-global edits). To overcome these drawbacks, we propose EMILIE: Iterative Multi-granular Image Editor. EMILIE introduces a novel latent iteration strategy, which re-purposes a pre-trained diffusion model to facilitate iterative editing. This is complemented by a gradient control operation for multi-granular control. We introduce a new benchmark dataset to evaluate our newly proposed setting. We conduct exhaustive quantitative and qualitative evaluations against recent state-of-the-art approaches adapted to our task, to bring out the mettle of EMILIE. We hope our work will attract attention to this newly identified, pragmatic problem setting.

Cite

@article{arxiv.2309.00613,
  title  = {Iterative Multi-granular Image Editing using Diffusion Models},
  author = {K J Joseph and Prateksha Udhayanan and Tripti Shukla and Aishwarya Agarwal and Srikrishna Karanam and Koustava Goswami and Balaji Vasan Srinivasan},
  journal= {arXiv preprint arXiv:2309.00613},
  year   = {2023}
}

Comments

Accepted to IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024
