Related papers: Guiding Token-Sparse Diffusion Models

Guiding a Diffusion Model by Swapping Its Tokens

Classifier-Free Guidance (CFG) is a widely used inference-time technique to boost the image quality of diffusion models. Yet, its reliance on text conditions prevents its use in unconditional generation. We propose a simple method to enable…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-10 · Weijia Zhang, Yuehao Liu, Shanyan Guan, Wu Ran, Yanhao Ge, Wei Li, Chao Ma

Improving Diffusion Generalization with Weak-to-Strong Segmented Guidance

Diffusion models generate synthetic images through an iterative refinement process. However, the misalignment between the simulation-free objective and the iterative process often causes accumulated gradient error along the sampling…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-24 · Liangyu Yuan, Yufei Huang, Mingkun Lei, Tong Zhao, Ruoyu Wang, Changxi Chi, Yiwei Wang, Chi Zhang

Gradient-Free Classifier Guidance for Diffusion Model Sampling

Diffusion models for image generation have demonstrated outstanding learning capabilities, effectively capturing the full distribution of the training dataset. They are known to generate wide variations in sampled images, albeit with a…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-26 · Rahul Shenoy, Zhihong Pan, Kaushik Balakrishnan, Qisen Cheng, Yongmoon Jeon, Heejune Yang, Jaewon Kim

Sparse-to-Sparse Training of Diffusion Models

Diffusion models (DMs) are a powerful type of generative model that has achieved state-of-the-art results in various image synthesis tasks and has shown potential in other domains, such as natural language processing and temporal data…

Machine Learning · Computer Science · 2026-02-05 · Inês Cardoso Oliveira, Decebal Constantin Mocanu, Luis A. Leiva

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Diffusion models have shown impressive results in generating high-quality conditional samples using guidance techniques such as Classifier-Free Guidance (CFG). However, existing methods often require additional training or neural function…

Machine Learning · Computer Science · 2025-07-22 · Kwanyoung Kim, Byeongsu Sim

Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models

Guidance is a crucial technique for extracting the best performance out of image-generating diffusion models. Traditionally, a constant guidance weight has been applied throughout the sampling chain of an image. We show that guidance is…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-07 · Tuomas Kynkäänniemi, Miika Aittala, Tero Karras, Samuli Laine, Timo Aila, Jaakko Lehtinen

Guiding a Diffusion Transformer with the Internal Dynamics of Itself

The diffusion model presents a powerful ability to capture the entire (conditional) data distribution. However, due to the lack of sufficient training and data to learn to cover low-probability areas, the model will be penalized for failing…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-01 · Xingyu Zhou, Qifan Li, Xiaobin Hu, Hai Chen, Shuhang Gu

DiffSparse: Accelerating Diffusion Transformers with Learned Token Sparsity

Diffusion models demonstrate outstanding performance in image generation, but their multi-step inference mechanism requires immense computational cost. Previous works accelerate inference by leveraging layer or token cache techniques to…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-07 · Haowei Zhu, Ji Liu, Ziqiong Liu, Dong Li, Junhai Yong, Bin Wang, Emad Barsoum

Token Perturbation Guidance for Diffusion Models

Classifier-free guidance (CFG) has become an essential component of modern diffusion models to enhance both generation quality and alignment with input conditions. However, CFG requires specific training procedures and is limited to…

Graphics · Computer Science · 2025-11-06 · Javad Rajabi, Soroush Mehraban, Seyedmorteza Sadat, Babak Taati

Diffusion Models Beat GANs on Image Synthesis

We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For…

Machine Learning · Computer Science · 2021-06-02 · Prafulla Dhariwal, Alex Nichol

SparseDM: Toward Sparse Efficient Diffusion Models

Diffusion models represent a powerful family of generative models widely used for image and video generation. However, the time-consuming deployment, long inference time, and requirements on large memory hinder their applications on…

Machine Learning · Computer Science · 2025-04-18 · Kafeng Wang, Jianfei Chen, He Li, Zhenpeng Mi, Jun Zhu

DiffIER: Optimizing Diffusion Models with Iterative Error Reduction

Diffusion models have demonstrated remarkable capabilities in generating high-quality samples and enhancing performance across diverse domains through Classifier-Free Guidance (CFG). However, the quality of generated samples is highly…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-21 · Ao Chen, Lihe Ding, Tianfan Xue

Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models

Classifier-free Guidance (CFG) is a widely used technique in modern diffusion models for enhancing sample quality and prompt adherence. However, through an empirical analysis on Gaussian mixture modeling with a closed-form solution, we…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-05 · Chubin Chen, Jiashu Zhu, Xiaokun Feng, Nisha Huang, Chen Zhu, Meiqi Wu, Fangyuan Mao, Jiahong Wu, Xiangxiang Chu, Xiu Li

Feedback Guidance of Diffusion Models

While Classifier-Free Guidance (CFG) has become standard for improving sample fidelity in conditional diffusion models, it can harm diversity and induce memorization by applying constant guidance regardless of whether a particular sample…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-10 · Felix Koulischer, Florian Handke, Johannes Deleu, Thomas Demeester, Luca Ambrogioni

Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling

Diffusion models have emerged as a powerful tool for generating high-quality images, videos, and 3D content. While sampling guidance techniques like CFG improve quality, they reduce diversity and motion. Autoguidance mitigates these issues…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-02 · Junha Hyung, Kinam Kim, Susung Hong, Min-Jung Kim, Jaegul Choo

SSG: Scaled Spatial Guidance for Multi-Scale Visual Autoregressive Generation

Visual autoregressive (VAR) models generate images through next-scale prediction, naturally achieving coarse-to-fine, fast, high-fidelity synthesis mirroring human perception. In practice, this hierarchy can drift at inference time, as…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-06 · Youngwoo Shin, Jiwan Hur, Junmo Kim

Self-Guidance: Boosting Flow and Diffusion Generation on Their Own

Proper guidance strategies are essential to achieve high-quality generation results without retraining diffusion and flow-based text-to-image models. Existing guidance either requires specific training or strong inductive biases of…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-29 · Tiancheng Li, Weijian Luo, Zhiyang Chen, Liyuan Ma, Guo-Jun Qi

How Much To Guide: Revisiting Adaptive Guidance in Classifier-Free Guidance Text-to-Vision Diffusion Models

With the rapid development of text-to-vision generation diffusion models, classifier-free guidance has emerged as the most prevalent method for conditioning. However, this approach inherently requires twice as many steps for model…

Computer Vision and Pattern Recognition · Computer Science · 2025-06-11 · Huixuan Zhang, Junzhe Zhang, Xiaojun Wan

Sparsely Supervised Diffusion

Diffusion models have shown remarkable success across a wide range of generative tasks. However, they often suffer from spatially inconsistent generation, arguably due to the inherent locality of their denoising mechanisms. This can yield…

Machine Learning · Computer Science · 2026-02-04 · Wenshuai Zhao, Zhiyuan Li, Yi Zhao, Mohammad Hassan Vali, Martin Trapp, Joni Pajarinen, Juho Kannala, Arno Solin

Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation

Diffusion-based text-to-image generation models trained on extensive text-image pairs have demonstrated the ability to produce photorealistic images aligned with textual descriptions. However, a significant limitation of these models is…

Computer Vision and Pattern Recognition · Computer Science · 2025-02-11 · Mingyuan Zhou, Zhendong Wang, Huangjie Zheng, Hai Huang