Related papers: On Discrete Prompt Optimization for Diffusion Mode…

Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models

The quality of the prompts provided to text-to-image diffusion models determines how faithful the generated content is to the user's intent, often requiring 'prompt engineering'. To harness visual concepts from target images without prompt…

Computer Vision and Pattern Recognition · Computer Science 2023-12-20 Shweta Mahajan, Tanzila Rahman, Kwang Moo Yi, Leonid Sigal

Batch-Instructed Gradient for Prompt Evolution: Systematic Prompt Optimization for Enhanced Text-to-Image Synthesis

Text-to-image models have shown remarkable progress in generating high-quality images from user-provided prompts. Despite this, the quality of these images varies due to the models' sensitivity to human language nuances. With advancements…

Artificial Intelligence · Computer Science 2024-06-14 Xinrui Yang, Zhuohan Wang, Anthony Hu

Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft"…

Machine Learning · Computer Science 2023-06-02 Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, Tom Goldstein

Seek for Incantations: Towards Accurate Text-to-Image Diffusion Synthesis through Prompt Engineering

The text-to-image synthesis by diffusion models has recently shown remarkable performance in generating high-quality images. Although they perform well for simple texts, the models may get confused when faced with complex texts that contain…

Computer Vision and Pattern Recognition · Computer Science 2024-01-15 Chang Yu, Junran Peng, Xiangyu Zhu, Zhaoxiang Zhang, Qi Tian, Zhen Lei

Prompt Optimization Via Diffusion Language Models

We propose a diffusion-based framework for prompt optimization that leverages Diffusion Language Models (DLMs) to iteratively refine system prompts through masked denoising. By conditioning on interaction traces, including user queries,…

Computation and Language · Computer Science 2026-02-24 Shiyu Wang, Haolin Chen, Liangwei Yang, Jielin Qiu, Rithesh Murthy, Ming Zhu, Zixiang Chen, Silvio Savarese, Caiming Xiong, Shelby Heinecke, Huan Wang

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

A significant research effort is focused on exploiting the amazing capacities of pretrained diffusion models for the editing of images. They either finetune the model, or invert the image in the latent space of the pretrained model. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang, Ming-Ming Cheng

NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation

Despite impressive recent advances in text-to-image diffusion models, obtaining high-quality images often requires prompt engineering by humans who have developed expertise in using them. In this work, we present NeuroPrompts, an adaptive…

Artificial Intelligence · Computer Science 2024-04-09 Shachar Rosenman, Vasudev Lal, Phillip Howard

Prompting Diffusion Representations for Cross-Domain Semantic Segmentation

While originally designed for image generation, diffusion models have recently been shown to provide excellent pretrained feature representations for semantic segmentation. Intrigued by this result, we set out to explore how well…

Computer Vision and Pattern Recognition · Computer Science 2023-07-06 Rui Gong, Martin Danelljan, Han Sun, Julio Delgado Mangas, Luc Van Gool

Manipulating Embeddings of Stable Diffusion Prompts

Prompt engineering is still the primary way for users of generative text-to-image models to manipulate generated images in a targeted way. Based on treating the model as a continuous function and by passing gradients between the image space…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Niklas Deckers, Julia Peters, Martin Potthast

User-friendly Image Editing with Minimal Text Input: Leveraging Captioning and Injection Techniques

Recent text-driven image editing in diffusion models has shown remarkable success. However, the existing methods assume that the user's description sufficiently grounds the contexts in the source image, such as objects, background, style,…

Computer Vision and Pattern Recognition · Computer Science 2023-06-06 Sunwoo Kim, Wooseok Jang, Hyunsu Kim, Junho Kim, Yunjey Choi, Seungryong Kim, Gayeong Lee

EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models

Text-to-image generation models (e.g., Stable Diffusion) have achieved significant advancements, enabling the creation of high-quality and realistic images based on textual descriptions. Prompt inversion, the task of identifying the textual…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Mingzhe Li, Kejing Xia, Gehao Zhang, Zhenting Wang, Guanhong Tao, Siqi Pan, Juan Zhai, Shiqing Ma

Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization

Text-to-image diffusion models have emerged as powerful tools for high-quality image generation and editing. Many existing approaches rely on text prompts as editing guidance. However, these methods are constrained by the need for manual…

Computer Vision and Pattern Recognition · Computer Science 2025-05-21 Yuanyuan Chang, Yinghua Yao, Tao Qin, Mengmeng Wang, Ivor Tsang, Guang Dai

Gradient Guidance for Diffusion Models: An Optimization Perspective

Diffusion models have demonstrated empirical successes in various applications and can be adapted to task-specific needs via guidance. This paper studies a form of gradient guidance for adapting a pre-trained diffusion model towards…

Machine Learning · Statistics 2024-10-17 Yingqing Guo, Hui Yuan, Yukang Yang, Minshuo Chen, Mengdi Wang

Optimizing Prompts for Text-to-Image Generation

Well-designed prompts can guide text-to-image models to generate amazing images. However, the performant prompts are often model-specific and misaligned with user input. Instead of laborious human engineering, we propose prompt adaptation,…

Computation and Language · Computer Science 2024-01-01 Yaru Hao, Zewen Chi, Li Dong, Furu Wei

Reusing Computation in Text-to-Image Diffusion for Efficient Generation of Image Sets

Text-to-image diffusion models enable high-quality image generation but are computationally expensive. While prior work optimizes per-inference efficiency, we explore an orthogonal approach: reducing redundancy across correlated prompts.…

Computer Vision and Pattern Recognition · Computer Science 2025-08-29 Dale Decatur, Thibault Groueix, Wang Yifan, Rana Hanocka, Vladimir Kim, Matheus Gadelha

InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization

Recent strides in the development of diffusion models, exemplified by advancements such as Stable Diffusion, have underscored their remarkable prowess in generating visually compelling images. However, the imperative of achieving a seamless…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Xiefan Guo, Jinlin Liu, Miaomiao Cui, Jiankai Li, Hongyu Yang, Di Huang

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

Recent advancements in text-to-image diffusion models have yielded impressive results in generating realistic and diverse images. However, these models still struggle with complex prompts, such as those that involve numeracy and spatial…

Computer Vision and Pattern Recognition · Computer Science 2024-03-05 Long Lian, Boyi Li, Adam Yala, Trevor Darrell

Prompt-tuning latent diffusion models for inverse problems

We propose a new method for solving imaging inverse problems using text-to-image latent diffusion models as general priors. Existing methods using latent diffusion models for inverse problems typically rely on simple null text prompts,…

Machine Learning · Computer Science 2023-10-03 Hyungjin Chung, Jong Chul Ye, Peyman Milanfar, Mauricio Delbracio

ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for…

Graphics · Computer Science 2023-12-08 Yuxin Zhang, Weiming Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu

IPGO: Indirect Prompt Gradient Optimization for Parameter-Efficient Prompt-level Fine-Tuning on Text-to-Image Models

Text-to-Image Diffusion models excel at generating images from text prompts but often exhibit suboptimal alignment with content semantics, aesthetics, and human preferences. To address these limitations, this study proposes a novel…

Machine Learning · Computer Science 2025-05-19 Jianping Ye, Michel Wedel, Kunpeng Zhang