Related papers: Controlled Training Data Generation with Diffusion…

200 papers

Recently, the multimedia community has witnessed the rise of diffusion models trained on large-scale multi-modal data for visual content creation, particularly in the field of text-to-image generation. In this paper, we propose a new task…

Computer Vision and Pattern Recognition · Computer Science 2023-11-10 Jingwen Chen, Yingwei Pan, Ting Yao, Tao Mei

Recent advances in text-to-image generative models have raised concerns about their potential to produce harmful content when provided with malicious input text prompts. To address this issue, two main approaches have emerged: (1)…

Machine Learning · Computer Science 2025-11-13 Jiwoo Shin, Byeonghu Na, Mina Kang, Wonhyeok Choi, Il-Chul Moon

Well-designed prompts can guide text-to-image models to generate amazing images. However, the performant prompts are often model-specific and misaligned with user input. Instead of laborious human engineering, we propose prompt adaptation,…

Computation and Language · Computer Science 2024-01-01 Yaru Hao, Zewen Chi, Li Dong, Furu Wei

Diffusion models have recently shown remarkable success in high-quality image generation. Sometimes, however, a pre-trained diffusion model exhibits partial misalignment in the sense that the model can generate good images, but it sometimes…

Computer Vision and Pattern Recognition · Computer Science 2023-11-01 TaeHo Yoon, Kibeom Myoung, Keon Lee, Jaewoong Cho, Albert No, Ernest K. Ryu

Controllable image generation has always been one of the core demands in image generation, aiming to create images that are both creative and logical while satisfying additional specified conditions. In the post-AIGC era, controllable…

Computer Vision and Pattern Recognition · Computer Science 2024-11-12 Guandong Li

Text-to-image models have shown remarkable progress in generating high-quality images from user-provided prompts. Despite this, the quality of these images varies due to the models' sensitivity to human language nuances. With advancements…

Artificial Intelligence · Computer Science 2024-06-14 Xinrui Yang, Zhuohan Wang, Anthony Hu

Large-scale pre-trained language models have demonstrated strong capabilities of generating realistic text. However, it remains challenging to control the generation results. Previous approaches such as prompting are far from sufficient,…

Computation and Language · Computer Science 2021-11-10 Xu Zou, Da Yin, Qingyang Zhong, Ming Ding, Hongxia Yang, Zhilin Yang, Jie Tang

Recent advances in text-to-image (T2I) diffusion models have enabled impressive image generation capabilities guided by text prompts. However, extending these techniques to video generation remains challenging, with existing text-to-video…

Computer Vision and Pattern Recognition · Computer Science 2024-08-13 Weifeng Chen, Yatai Ji, Jie Wu, Hefeng Wu, Pan Xie, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin

Text-to-Image (T2I) diffusion/flow models have recently achieved remarkable progress in visual fidelity and text alignment. However, they remain limited when users need to precisely control image layouts, something that natural language…

Computer Vision and Pattern Recognition · Computer Science 2026-04-08 Amadou S. Sangare, Adrien Maglo, Mohamed Chaouch, Bertrand Luvison

Embodied agents struggle to generalize to new environments, even when those environments share similar underlying structures to their training settings. Most current approaches to generating these training environments follow an open-loop…

Robotics · Computer Science 2026-02-09 Teresa Yeo, Dulaj Weerakoon, Dulanga Weerakoon, Archan Misra

The performance of computer vision models in certain real-world applications (e.g., rare wildlife observation) is limited by the small number of available images. Expanding datasets using pre-trained generative models is an effective way to…

Computer Vision and Pattern Recognition · Computer Science 2024-12-25 Changjian Chen, Fei Lv, Yalong Guan, Pengcheng Wang, Shengjie Yu, Yifan Zhang, Zhuo Tang

The field of image synthesis has made tremendous strides forward in recent years. Besides defining the desired output image with text prompts, an intuitive approach is to additionally use spatial guidance in the form of an image, such as a…

Computer Vision and Pattern Recognition · Computer Science 2024-08-13 Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother

While most research on controllable text generation has focused on steering base Language Models, the emerging instruction-tuning and prompting paradigm offers an alternate approach to controllability. We compile and release ConGenBench, a…

Computation and Language · Computer Science 2024-05-03 Dhananjay Ashok, Barnabas Poczos

Diffusion models have attained prominence for their ability to synthesize a probability distribution for a given dataset via a diffusion process, enabling the generation of new data points with high fidelity. However, diffusion processes…

Machine Learning · Computer Science 2024-11-25 Shervin Khalafi, Dongsheng Ding, Alejandro Ribeiro

Recently, diffusion-based deep generative models (e.g., Stable Diffusion) have shown impressive results in text-to-image synthesis. However, current text-to-image models often require multiple passes of prompt engineering by humans in order…

Computation and Language · Computer Science 2023-11-14 Tingfeng Cao, Chengyu Wang, Bingyan Liu, Ziheng Wu, Jinhui Zhu, Jun Huang

Current language models demonstrate remarkable proficiency in text generation. However, for many applications it is desirable to control attributes, such as sentiment, or toxicity, of the generated language -- ideally tailored towards each…

Computation and Language · Computer Science 2024-08-09 Justin Lovelace, Varsha Kishore, Yiwei Chen, Kilian Q. Weinberger

In text-to-image generation tasks, the advancements of diffusion models have facilitated the fidelity of generated results. However, these models encounter challenges when processing text prompts containing multiple entities and attributes.…

Computation and Language · Computer Science 2024-04-23 Yihang Wu, Xiao Cao, Kaixin Li, Zitan Chen, Haonan Wang, Lei Meng, Zhiyong Huang

Deep generative models have shown impressive results in text-to-image synthesis. However, current text-to-image models often generate images that are inadequately aligned with text prompts. We propose a fine-tuning method for aligning such…

Controlled generation refers to the problem of creating text that contains stylistic or semantic attributes of interest. Many approaches reduce this problem to training a predictor of the desired attribute. For example, researchers hoping…

Computation and Language · Computer Science 2023-06-02 Carolina Zheng, Claudia Shi, Keyon Vafa, Amir Feder, David M. Blei

Diffusion models have been successfully adapted to text generation tasks by mapping the discrete text into the continuous space. However, there exist nonnegligible gaps between training and inference, owing to the absence of the forward…

Computation and Language · Computer Science 2023-05-09 Zecheng Tang, Pinzheng Wang, Keyan Zhou, Juntao Li, Ziqiang Cao, Min Zhang