Related papers: On Discrete Prompt Optimization for Diffusion Mode…

200 papers

The quality of the prompts provided to text-to-image diffusion models determines how faithful the generated content is to the user's intent, often requiring 'prompt engineering'. To harness visual concepts from target images without prompt…

Computer Vision and Pattern Recognition · Computer Science 2023-12-20 Shweta Mahajan, Tanzila Rahman, Kwang Moo Yi, Leonid Sigal

Text-to-image models have shown remarkable progress in generating high-quality images from user-provided prompts. Despite this, the quality of these images varies due to the models' sensitivity to human language nuances. With advancements…

Artificial Intelligence · Computer Science 2024-06-14 Xinrui Yang, Zhuohan Wang, Anthony Hu

The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft"…

Machine Learning · Computer Science 2023-06-02 Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, Tom Goldstein

Text-to-image synthesis with diffusion models has recently shown remarkable performance in generating high-quality images. Although they perform well on simple texts, the models may get confused when faced with complex texts that contain…

Computer Vision and Pattern Recognition · Computer Science 2024-01-15 Chang Yu, Junran Peng, Xiangyu Zhu, Zhaoxiang Zhang, Qi Tian, Zhen Lei

We propose a diffusion-based framework for prompt optimization that leverages Diffusion Language Models (DLMs) to iteratively refine system prompts through masked denoising. By conditioning on interaction traces, including user queries,…

Computation and Language · Computer Science 2026-02-24 Shiyu Wang, Haolin Chen, Liangwei Yang, Jielin Qiu, Rithesh Murthy, Ming Zhu, Zixiang Chen, Silvio Savarese, Caiming Xiong, Shelby Heinecke, Huan Wang

A significant research effort is focused on exploiting the capabilities of pretrained diffusion models for image editing. These approaches either finetune the model or invert the image in the latent space of the pretrained model. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang, Ming-Ming Cheng

Despite impressive recent advances in text-to-image diffusion models, obtaining high-quality images often requires prompt engineering by humans who have developed expertise in using them. In this work, we present NeuroPrompts, an adaptive…

Artificial Intelligence · Computer Science 2024-04-09 Shachar Rosenman, Vasudev Lal, Phillip Howard

While originally designed for image generation, diffusion models have recently been shown to provide excellent pretrained feature representations for semantic segmentation. Intrigued by this result, we set out to explore how well…

Computer Vision and Pattern Recognition · Computer Science 2023-07-06 Rui Gong, Martin Danelljan, Han Sun, Julio Delgado Mangas, Luc Van Gool

Prompt engineering is still the primary way for users of generative text-to-image models to manipulate generated images in a targeted way. Based on treating the model as a continuous function and by passing gradients between the image space…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Niklas Deckers, Julia Peters, Martin Potthast

Recent text-driven image editing in diffusion models has shown remarkable success. However, the existing methods assume that the user's description sufficiently grounds the contexts in the source image, such as objects, background, style,…

Computer Vision and Pattern Recognition · Computer Science 2023-06-06 Sunwoo Kim, Wooseok Jang, Hyunsu Kim, Junho Kim, Yunjey Choi, Seungryong Kim, Gayeong Lee

Text-to-image generation models (e.g., Stable Diffusion) have achieved significant advancements, enabling the creation of high-quality and realistic images based on textual descriptions. Prompt inversion, the task of identifying the textual…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Mingzhe Li, Kejing Xia, Gehao Zhang, Zhenting Wang, Guanhong Tao, Siqi Pan, Juan Zhai, Shiqing Ma

Text-to-image diffusion models have emerged as powerful tools for high-quality image generation and editing. Many existing approaches rely on text prompts as editing guidance. However, these methods are constrained by the need for manual…

Computer Vision and Pattern Recognition · Computer Science 2025-05-21 Yuanyuan Chang, Yinghua Yao, Tao Qin, Mengmeng Wang, Ivor Tsang, Guang Dai

Diffusion models have demonstrated empirical successes in various applications and can be adapted to task-specific needs via guidance. This paper studies a form of gradient guidance for adapting a pre-trained diffusion model towards…

Machine Learning · Statistics 2024-10-17 Yingqing Guo, Hui Yuan, Yukang Yang, Minshuo Chen, Mengdi Wang

Well-designed prompts can guide text-to-image models to generate amazing images. However, performant prompts are often model-specific and misaligned with user input. Instead of laborious human engineering, we propose prompt adaptation,…

Computation and Language · Computer Science 2024-01-01 Yaru Hao, Zewen Chi, Li Dong, Furu Wei

Text-to-image diffusion models enable high-quality image generation but are computationally expensive. While prior work optimizes per-inference efficiency, we explore an orthogonal approach: reducing redundancy across correlated prompts.…

Computer Vision and Pattern Recognition · Computer Science 2025-08-29 Dale Decatur, Thibault Groueix, Wang Yifan, Rana Hanocka, Vladimir Kim, Matheus Gadelha

Recent strides in the development of diffusion models, exemplified by advancements such as Stable Diffusion, have underscored their remarkable prowess in generating visually compelling images. However, the imperative of achieving a seamless…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Xiefan Guo, Jinlin Liu, Miaomiao Cui, Jiankai Li, Hongyu Yang, Di Huang

Recent advancements in text-to-image diffusion models have yielded impressive results in generating realistic and diverse images. However, these models still struggle with complex prompts, such as those that involve numeracy and spatial…

Computer Vision and Pattern Recognition · Computer Science 2024-03-05 Long Lian, Boyi Li, Adam Yala, Trevor Darrell

We propose a new method for solving imaging inverse problems using text-to-image latent diffusion models as general priors. Existing methods using latent diffusion models for inverse problems typically rely on simple null text prompts,…

Machine Learning · Computer Science 2023-10-03 Hyungjin Chung, Jong Chul Ye, Peyman Milanfar, Mauricio Delbracio

Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for…

Text-to-Image Diffusion models excel at generating images from text prompts but often exhibit suboptimal alignment with content semantics, aesthetics, and human preferences. To address these limitations, this study proposes a novel…

Machine Learning · Computer Science 2025-05-19 Jianping Ye, Michel Wedel, Kunpeng Zhang