Related papers: Self-Guided Diffusion Models

200 papers

Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or classifier pretraining. That is why guidance was harnessed from self-supervised learning backbones, like…

Computer Vision and Pattern Recognition · Computer Science 2023-12-15 Vincent Tao Hu, Yunlu Chen, Mathilde Caron, Yuki M. Asano, Cees G. M. Snoek, Bjorn Ommer

Large-scale generative models are capable of producing high-quality images from detailed text descriptions. However, many aspects of an image are difficult or impossible to convey through text. We introduce self-guidance, a method that…

Computer Vision and Pattern Recognition · Computer Science 2023-06-13 Dave Epstein, Allan Jabri, Ben Poole, Alexei A. Efros, Aleksander Holynski

Generative models have recently advanced significantly thanks to diffusion models. The success of these models can often be attributed to their use of guidance techniques, such as classifier or classifier-free guidance, which…

Computer Vision and Pattern Recognition · Computer Science 2023-01-31 Gyeongnyeon Kim, Wooseok Jang, Gyuseong Lee, Susung Hong, Junyoung Seo, Seungryong Kim

We introduce Spectral Guidance, a framework for controlling diffusion models by leveraging the intrinsic geometry of the generative process. As data is progressively corrupted by noise, only a small number of features remain informative for…

Machine Learning · Computer Science 2026-05-29 Gabriel Moreira, Manuel Marques, João Paulo Costeira, Chenyan Xiong

Guidance in conditional diffusion generation is of great importance for sample quality and controllability. However, existing guidance schemes leave much to be desired. On one hand, mainstream methods such as classifier guidance and…

Machine Learning · Computer Science 2023-10-18 Jiajun Ma, Tianyang Hu, Wenjia Wang, Jiacheng Sun

The primary axes of interest in image-generating diffusion models are image quality, the amount of variation in the results, and how well the results align with a given condition, e.g., a class label or a text prompt. The popular…

Computer Vision and Pattern Recognition · Computer Science 2024-12-20 Tero Karras, Miika Aittala, Tuomas Kynkäänniemi, Jaakko Lehtinen, Timo Aila, Samuli Laine

Proper guidance strategies are essential to achieve high-quality generation results without retraining diffusion and flow-based text-to-image models. Existing guidance either requires specific training or strong inductive biases of…

Computer Vision and Pattern Recognition · Computer Science 2025-09-29 Tiancheng Li, Weijian Luo, Zhiyang Chen, Liyuan Ma, Guo-Jun Qi

Typical diffusion models are trained to accept a particular form of conditioning, most commonly text, and cannot be conditioned on other modalities without retraining. In this work, we propose a universal guidance algorithm that enables…

Computer Vision and Pattern Recognition · Computer Science 2023-02-15 Arpit Bansal, Hong-Min Chu, Avi Schwarzschild, Soumyadip Sengupta, Micah Goldblum, Jonas Geiping, Tom Goldstein

Diffusion models for continuous data gained widespread adoption owing to their high quality generation and control mechanisms. However, controllable diffusion on discrete data faces challenges given that continuous guidance methods do not…

Diffusion models have demonstrated superior performance across various generative tasks including images, videos, and audio. However, they encounter difficulties in directly generating high-resolution samples. Previously proposed solutions…

Computer Vision and Pattern Recognition · Computer Science 2024-04-03 Juno Hwang, Yong-Hyun Park, Junghyo Jo

Masked generative models (MGMs) have shown impressive generative ability while providing an order of magnitude more efficient sampling than continuous diffusion models. However, MGMs still underperform in image synthesis compared to…

Computer Vision and Pattern Recognition · Computer Science 2024-10-18 Jiwan Hur, Dong-Jae Lee, Gyojin Han, Jaehyun Choi, Yunho Jeon, Junmo Kim

Diffusion models excel in generating high-quality images. However, current diffusion models struggle to produce reliable images without guidance methods, such as classifier-free guidance (CFG). Are guidance methods truly necessary?…

Computer Vision and Pattern Recognition · Computer Science 2024-12-06 Donghoon Ahn, Jiwon Kang, Sanghyun Lee, Jaewon Min, Minjae Kim, Wooseok Jang, Hyoungwon Cho, Sayak Paul, SeonHwa Kim, Eunju Cha, Kyong Hwan Jin, Seungryong Kim

Personalizing text-to-image diffusion models is crucial for adapting the pre-trained models to specific target concepts, enabling diverse image generation. However, fine-tuning with few images introduces an inherent trade-off between…

Computer Vision and Pattern Recognition · Computer Science 2025-08-04 Sunghyun Park, Seokeon Choi, Hyoungwoo Park, Sungrack Yun

Classifier guidance is a recently introduced method to trade off mode coverage and sample fidelity in conditional diffusion models post training, in the same spirit as low temperature sampling or truncation in other types of generative…

Machine Learning · Computer Science 2022-07-27 Jonathan Ho, Tim Salimans

Controllable image synthesis models allow creation of diverse images based on text instructions or guidance from a reference image. Recently, denoising diffusion probabilistic models have been shown to generate more realistic imagery than…

Computer Vision and Pattern Recognition · Computer Science 2022-12-06 Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell

Diffusion models have emerged as a powerful framework for generative modeling, with guidance techniques playing a crucial role in enhancing sample quality. Despite their empirical success, a comprehensive theoretical understanding of the…

Machine Learning · Statistics 2025-05-05 Gen Li, Yuchen Jiao

Diffusion-based text-to-image generation models like GLIDE and DALLE-2 have gained wide success recently for their superior performance in turning complex text inputs into images of high quality and wide diversity. In particular, they are…

Computer Vision and Pattern Recognition · Computer Science 2022-11-16 Zhihong Pan, Xin Zhou, Hao Tian

Diffusion models have emerged as powerful tools for high-quality image generation and editing, but guiding these models to produce specific outputs remains a challenge. Conventional approaches rely on conditioning mechanisms, such as text…

Computer Vision and Pattern Recognition · Computer Science 2026-05-27 Nithesh Chandher Karthikeyan, Jonas Unger, Gabriel Eilertsen

Diffusion models for image generation have demonstrated outstanding learning capabilities, effectively capturing the full distribution of the training dataset. They are known to generate wide variations in sampled images, albeit with a…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Rahul Shenoy, Zhihong Pan, Kaushik Balakrishnan, Qisen Cheng, Yongmoon Jeon, Heejune Yang, Jaewon Kim

We introduce a novel, training-free approach for enhancing alignment in Transformer-based Text-Guided Diffusion Models (TGDMs). Existing TGDMs often struggle to generate semantically aligned images, particularly when dealing with complex…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Shulei Wang, Wang Lin, Hai Huang, Hanting Wang, Sihang Cai, WenKang Han, Tao Jin, Jingyuan Chen, Jiacheng Sun, Jieming Zhu, Zhou Zhao