DreamSteerer: Enhancing Source Image Conditioned Editability using Personalized Diffusion Models

Computer Vision and Pattern Recognition 2024-10-31 v2 Machine Learning

Abstract

Recent text-to-image (T2I) personalization methods have shown great promise in teaching a diffusion model user-specified concepts from a few images, so that the acquired concepts can be reused in novel contexts. With substantial effort devoted to personalized generation, a promising extension is personalized editing: editing an image using personalized concepts, which can provide a more precise guidance signal than traditional textual guidance. A straightforward solution is to combine a personalized diffusion model with a text-driven editing framework. However, such a solution often shows unsatisfactory editability on the source image. To address this, we propose DreamSteerer, a plug-in method for augmenting existing T2I personalization methods. Specifically, we enhance the source image conditioned editability of a personalized diffusion model via a novel Editability Driven Score Distillation (EDSD) objective. Moreover, we identify a mode trapping issue with EDSD and propose a mode shifting regularization with spatial feature guided sampling to avoid it. We further make two key modifications to the Delta Denoising Score framework that enable high-fidelity local editing with personalized concepts. Extensive experiments validate that DreamSteerer can significantly improve the editability of several T2I personalization baselines while being computationally efficient.

Cite

@article{arxiv.2410.11208,
  title  = {DreamSteerer: Enhancing Source Image Conditioned Editability using Personalized Diffusion Models},
  author = {Zhengyang Yu and Zhaoyuan Yang and Jing Zhang},
  journal= {arXiv preprint arXiv:2410.11208},
  year   = {2024}
}

Comments

Published as a conference paper at NeurIPS 2024
