Related papers: Controlled Training Data Generation with Diffusion…

ControlStyle: Text-Driven Stylized Image Generation Using Diffusion Priors

Recently, the multimedia community has witnessed the rise of diffusion models trained on large-scale multi-modal data for visual content creation, particularly in the field of text-to-image generation. In this paper, we propose a new task…

Computer Vision and Pattern Recognition · Computer Science 2023-11-10 Jingwen Chen, Yingwei Pan, Ting Yao, Tao Mei

Prompt-Based Safety Guidance Is Ineffective for Unlearned Text-to-Image Diffusion Models

Recent advances in text-to-image generative models have raised concerns about their potential to produce harmful content when provided with malicious input text prompts. To address this issue, two main approaches have emerged: (1)…

Machine Learning · Computer Science 2025-11-13 Jiwoo Shin, Byeonghu Na, Mina Kang, Wonhyeok Choi, Il-Chul Moon

Optimizing Prompts for Text-to-Image Generation

Well-designed prompts can guide text-to-image models to generate amazing images. However, the performant prompts are often model-specific and misaligned with user input. Instead of laborious human engineering, we propose prompt adaptation,…

Computation and Language · Computer Science 2024-01-01 Yaru Hao, Zewen Chi, Li Dong, Furu Wei

Censored Sampling of Diffusion Models Using 3 Minutes of Human Feedback

Diffusion models have recently shown remarkable success in high-quality image generation. Sometimes, however, a pre-trained diffusion model exhibits partial misalignment in the sense that the model can generate good images, but it sometimes…

Computer Vision and Pattern Recognition · Computer Science 2023-11-01 TaeHo Yoon, Kibeom Myoung, Keon Lee, Jaewoong Cho, Albert No, Ernest K. Ryu

Layout Control and Semantic Guidance with Attention Loss Backward for T2I Diffusion Model

Controllable image generation has always been one of the core demands in image generation, aiming to create images that are both creative and logical while satisfying additional specified conditions. In the post-AIGC era, controllable…

Computer Vision and Pattern Recognition · Computer Science 2024-11-12 Guandong Li

Batch-Instructed Gradient for Prompt Evolution: Systematic Prompt Optimization for Enhanced Text-to-Image Synthesis

Text-to-image models have shown remarkable progress in generating high-quality images from user-provided prompts. Despite this, the quality of these images varies due to the models' sensitivity to human language nuances. With advancements…

Artificial Intelligence · Computer Science 2024-06-14 Xinrui Yang, Zhuohan Wang, Anthony Hu

Controllable Generation from Pre-trained Language Models via Inverse Prompting

Large-scale pre-trained language models have demonstrated strong capabilities of generating realistic text. However, it remains challenging to control the generation results. Previous approaches such as prompting are far from sufficient,…

Computation and Language · Computer Science 2021-11-10 Xu Zou, Da Yin, Qingyang Zhong, Ming Ding, Hongxia Yang, Zhilin Yang, Jie Tang

Control-A-Video: Controllable Text-to-Video Diffusion Models with Motion Prior and Reward Feedback Learning

Recent advances in text-to-image (T2I) diffusion models have enabled impressive image generation capabilities guided by text prompts. However, extending these techniques to video generation remains challenging, with existing text-to-video…

Computer Vision and Pattern Recognition · Computer Science 2024-08-13 Weifeng Chen, Yatai Ji, Jie Wu, Hefeng Wu, Pan Xie, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin

Improving Controllable Generation: Faster Training and Better Performance via $x_0$-Supervision

Text-to-Image (T2I) diffusion/flow models have recently achieved remarkable progress in visual fidelity and text alignment. However, they remain limited when users need to precisely control image layouts, something that natural language…

Computer Vision and Pattern Recognition · Computer Science 2026-04-08 Amadou S. Sangare, Adrien Maglo, Mohamed Chaouch, Bertrand Luvison

Towards Adaptive Environment Generation for Training Embodied Agents

Embodied agents struggle to generalize to new environments, even when those environments share similar underlying structures to their training settings. Most current approaches to generating these training environments follow an open-loop…

Robotics · Computer Science 2026-02-09 Teresa Yeo, Dulaj Weerakoon, Dulanga Weerakoon, Archan Misra

Human-Guided Image Generation for Expanding Small-Scale Training Image Datasets

The performance of computer vision models in certain real-world applications (e.g., rare wildlife observation) is limited by the small number of available images. Expanding datasets using pre-trained generative models is an effective way to…

Computer Vision and Pattern Recognition · Computer Science 2024-12-25 Changjian Chen, Fei Lv, Yalong Guan, Pengcheng Wang, Shengjie Yu, Yifan Zhang, Zhuo Tang

ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems

The field of image synthesis has made tremendous strides forward in the last years. Besides defining the desired output image with text-prompts, an intuitive approach is to additionally use spatial guidance in form of an image, such as a…

Computer Vision and Pattern Recognition · Computer Science 2024-08-13 Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother

Controllable Text Generation in the Instruction-Tuning Era

While most research on controllable text generation has focused on steering base Language Models, the emerging instruction-tuning and prompting paradigm offers an alternate approach to controllability. We compile and release ConGenBench, a…

Computation and Language · Computer Science 2024-05-03 Dhananjay Ashok, Barnabas Poczos

Constrained Diffusion Models via Dual Training

Diffusion models have attained prominence for their ability to synthesize a probability distribution for a given dataset via a diffusion process, enabling the generation of new data points with high fidelity. However, diffusion processes…

Machine Learning · Computer Science 2024-11-25 Shervin Khalafi, Dongsheng Ding, Alejandro Ribeiro

BeautifulPrompt: Towards Automatic Prompt Engineering for Text-to-Image Synthesis

Recently, diffusion-based deep generative models (e.g., Stable Diffusion) have shown impressive results in text-to-image synthesis. However, current text-to-image models often require multiple passes of prompt engineering by humans in order…

Computation and Language · Computer Science 2023-11-14 Tingfeng Cao, Chengyu Wang, Bingyan Liu, Ziheng Wu, Jinhui Zhu, Jun Huang

Diffusion Guided Language Modeling

Current language models demonstrate remarkable proficiency in text generation. However, for many applications it is desirable to control attributes, such as sentiment, or toxicity, of the generated language -- ideally tailored towards each…

Computation and Language · Computer Science 2024-08-09 Justin Lovelace, Varsha Kishore, Yiwei Chen, Kilian Q. Weinberger

Towards Better Text-to-Image Generation Alignment via Attention Modulation

In text-to-image generation tasks, the advancements of diffusion models have facilitated the fidelity of generated results. However, these models encounter challenges when processing text prompts containing multiple entities and attributes.…

Computation and Language · Computer Science 2024-04-23 Yihang Wu, Xiao Cao, Kaixin Li, Zitan Chen, Haonan Wang, Lei Meng, Zhiyong Huang

Aligning Text-to-Image Models using Human Feedback

Deep generative models have shown impressive results in text-to-image synthesis. However, current text-to-image models often generate images that are inadequately aligned with text prompts. We propose a fine-tuning method for aligning such…

Machine Learning · Computer Science 2023-02-24 Kimin Lee, Hao Liu, Moonkyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Shixiang Shane Gu

An Invariant Learning Characterization of Controlled Text Generation

Controlled generation refers to the problem of creating text that contains stylistic or semantic attributes of interest. Many approaches reduce this problem to training a predictor of the desired attribute. For example, researchers hoping…

Computation and Language · Computer Science 2023-06-02 Carolina Zheng, Claudia Shi, Keyon Vafa, Amir Feder, David M. Blei

Can Diffusion Model Achieve Better Performance in Text Generation? Bridging the Gap between Training and Inference!

Diffusion models have been successfully adapted to text generation tasks by mapping the discrete text into the continuous space. However, there exist nonnegligible gaps between training and inference, owing to the absence of the forward…

Computation and Language · Computer Science 2023-05-09 Zecheng Tang, Pinzheng Wang, Keyan Zhou, Juntao Li, Ziqiang Cao, Min Zhang