Related papers: ScribbleSense: Generative Scribble-Based Texture E…

ScribbleEdit: Synthetic Data for Image Editing with Scribbles and Text

Recent progress in generative models has significantly advanced image editing capabilities, yet precise and intuitive user control remains difficult. Specifically, users often struggle to communicate both exact spatial layouts and specific…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-06 · Anya Ji, George Ma, Téa Wright, Yiming Zhang, David M. Chan, Alane Suhr, Somayeh Sojoudi

ScribbleGen: Generative Data Augmentation Improves Scribble-supervised Semantic Segmentation

Recent advances in generative models, such as diffusion models, have made generating high-quality synthetic images widely accessible. Prior works have shown that training on synthetic images improves many perception tasks, such as image…

Computer Vision and Pattern Recognition · Computer Science · 2024-04-18 · Jacob Schnell, Jieke Wang, Lu Qi, Vincent Tao Hu, Meng Tang

ScribbleSeg: Scribble-based Interactive Image Segmentation

Interactive segmentation enables users to extract masks by providing simple annotations to indicate the target, such as boxes, clicks, or scribbles. Among these interaction formats, scribbles are the most flexible as they can be of…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-21 · Xi Chen, Yau Shing Jonathan Cheung, Ser-Nam Lim, Hengshuang Zhao

DreamOmni3: Scribble-based Editing and Generation

Recently, unified generation and editing models have achieved remarkable success with their impressive performance. These models rely mainly on text prompts for instruction-based editing and generation, but language often fails to capture…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-30 · Bin Xia, Bohao Peng, Jiyang Liu, Sitong Wu, Jingyao Li, Junjia Huang, Xu Zhao, Yitong Wang, Ruihang Chu, Bei Yu, Jiaya Jia

Rethinking Scribble-Guided Image Editing: Generalization, Instruction Adherence, and Multi-Tasking

Scribble-guided image editing allows users to combine simple scribble annotations with text prompts to specify both where and how an image should be edited, enabling flexible interaction with precise spatial control. However, existing…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-26 · Mingyi Xu, Jinpeng Lin, Min Zhou, Tiezheng Ge, Ming Zeng

MIND-Edit: MLLM Insight-Driven Editing via Language-Vision Projection

Recent advances in AI-generated content (AIGC) have significantly accelerated image editing techniques, driving increasing demand for diverse and fine-grained edits. Despite these advances, existing image editing methods still face…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-27 · Shuyu Wang, Weiqi Li, Qian Wang, Shijie Zhao, Jian Zhang

ScribbleLight: Single Image Indoor Relighting with Scribbles

Image-based relighting of indoor rooms creates an immersive virtual understanding of the space, which is useful for interior design, virtual staging, and real estate. Relighting indoor rooms from a single image is especially challenging due…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-27 · Jun Myeong Choi, Annie Wang, Pieter Peers, Anand Bhattad, Roni Sengupta

Guiding Instruction-based Image Editing via Multimodal Large Language Models

Instruction-based image editing improves the controllability and flexibility of image manipulation via natural commands without elaborate descriptions or regional masks. However, human instructions are sometimes too brief for current…

Computer Vision and Pattern Recognition · Computer Science · 2024-02-06 · Tsu-Jui Fu, Wenze Hu, Xianzhi Du, William Yang Wang, Yinfei Yang, Zhe Gan

ClipFace: Text-guided Editing of Textured 3D Morphable Models

We propose ClipFace, a novel self-supervised approach for text-guided editing of textured 3D morphable model of faces. Specifically, we employ user-friendly language prompts to enable control of the expressions as well as appearance of 3D…

Computer Vision and Pattern Recognition · Computer Science · 2023-04-25 · Shivangi Aneja, Justus Thies, Angela Dai, Matthias Nießner

Language-Guided Multimodal Texture Authoring via Generative Models

Authoring realistic haptic textures typically requires low-level parameter tuning and repeated trial-and-error, limiting speed, transparency, and creative reach. We present a language-driven authoring system that turns natural-language…

Human-Computer Interaction · Computer Science · 2026-04-09 · Wanli Qian, Aiden Chang, Shihan Lu, Michael Gu, Heather Culbertson

BrushEdit: All-In-One Image Inpainting and Editing

Image editing has advanced significantly with the development of diffusion models using both inversion-based and instruction-based methods. However, current inversion-based approaches struggle with big modifications (e.g., adding or…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-06 · Yaowei Li, Yuxuan Bian, Xuan Ju, Zhaoyang Zhang, Junhao Zhuang, Ying Shan, Yuexian Zou, Qiang Xu

High-Fidelity Guided Image Synthesis with Latent Diffusion Models

Controllable image synthesis with user scribbles has gained huge public interest with the recent advent of text-conditioned latent diffusion models. The user scribbles control the color composition while the text prompt provides control…

Computer Vision and Pattern Recognition · Computer Science · 2022-12-01 · Jaskirat Singh, Stephen Gould, Liang Zheng

Understanding the Implicit User Intention via Reasoning with Large Language Model for Image Editing

Existing image editing methods can handle simple editing instructions very well. To deal with complex editing instructions, they often need to jointly fine-tune the large language models (LLMs) and diffusion models (DMs), which involves…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-03 · Yijia Wang, Yiqing Shen, Weiming Chen, Zhihai He

Scribble-Guided Diffusion for Training-free Text-to-Image Generation

Recent advancements in text-to-image diffusion models have demonstrated remarkable success, yet they often struggle to fully capture the user's intent. Existing approaches using textual inputs combined with bounding boxes or region masks…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-13 · Seonho Lee, Jiho Choi, Seohyun Lim, Jiwook Kim, Hyunjung Shim

Integrating Large Language Models into Text Animation: An Intelligent Editing System with Inline and Chat Interaction

Text animation, a foundational element in video creation, enables efficient and cost-effective communication, thriving in advertisements, journalism, and social media. However, traditional animation workflows present significant usability…

Human-Computer Interaction · Computer Science · 2025-06-13 · Bao Zhang, Zihan Li, Zhenglei Liu, Huanchen Wang, Yuxin Ma

Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing

Large-scale text-to-image generative models have been a ground-breaking development in generative AI, with diffusion models showing their astounding ability to synthesize convincing images following an input text prompt. The goal of image…

Computer Vision and Pattern Recognition · Computer Science · 2023-09-28 · Kai Wang, Fei Yang, Shiqi Yang, Muhammad Atif Butt, Joost van de Weijer

SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing

Scene graphs offer a structured, hierarchical representation of images, with nodes and edges symbolizing objects and the relationships among them. It can serve as a natural interface for image editing, dramatically improving precision and…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-16 · Zhiyuan Zhang, DongDong Chen, Jing Liao

ScribGen: Generating Scribble Art Through Metaheuristics

Art has long been a medium for individuals to engage with the world. Scribble art, a form of abstract visual expression, features spontaneous, gestural strokes made with pens or brushes. These dynamic and expressive compositions, created…

Graphics · Computer Science · 2024-11-14 · Soumyaratna Debnath, Ashish Tiwari, Shanmuganathan Raman

PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation

Preference-conditioned image generation seeks to adapt generative models to individual users, producing outputs that reflect personal aesthetic choices beyond the given textual prompt. Despite recent progress, existing approaches either…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-09 · Wenyi Mo, Tianyu Zhang, Yalong Bai, Ligong Han, Ying Ba, Dimitris N. Metaxas

TexSliders: Diffusion-Based Texture Editing in CLIP Space

Generative models have enabled intuitive image creation and manipulation using natural language. In particular, diffusion models have recently shown remarkable results for natural image editing. In this work, we propose to apply diffusion…

Graphics · Computer Science · 2024-05-02 · Julia Guerrero-Viu, Milos Hasan, Arthur Roullier, Midhun Harikumar, Yiwei Hu, Paul Guerrero, Diego Gutierrez, Belen Masia, Valentin Deschaintre