DiffRatio: Training One-Step Diffusion Models Without Teacher Supervision

Machine Learning 2026-01-29 v5 Computer Vision and Pattern Recognition

Abstract

Score-based distillation methods (e.g., variational score distillation) train one-step diffusion models by first pre-training a teacher score model and then distilling it into a one-step student model. However, the gradient estimator in the distillation stage usually suffers from two sources of bias: (1) biased teacher supervision due to score estimation error incurred during pre-training, and (2) the student model's score estimation error during distillation. These biases can degrade the quality of the resulting one-step diffusion model. To address this, we propose DiffRatio, a new framework for training one-step diffusion models: instead of estimating the teacher and student scores independently and then taking their difference, we directly estimate the score difference as the gradient of a learned log density ratio between the student and data distributions across diffusion time steps. This approach greatly simplifies the training pipeline, significantly reduces gradient estimation bias, and improves one-step generation quality. It also shrinks the auxiliary network by replacing two full score networks with a single lightweight density-ratio network, which improves computational and memory efficiency. DiffRatio achieves competitive one-step generation results on CIFAR-10 and ImageNet (64×64 and 512×512), outperforming most teacher-supervised distillation methods. Moreover, the learned density ratio naturally serves as a verifier, enabling a principled inference-time parallel scaling scheme that further improves generation quality without external rewards or additional sequential computation.
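
The identity behind the abstract's central claim, that the difference between two scores equals the gradient of the log density ratio, can be checked numerically on a toy case. The sketch below is purely illustrative and is not the paper's implementation: it assumes two 1D Gaussians as stand-ins for the student and data distributions at a single noise level, and it uses closed-form densities where DiffRatio would use a learned density-ratio network. All function and variable names are hypothetical.

```python
import numpy as np

def gaussian_log_pdf(x, mean, std):
    # Log density of N(mean, std^2) evaluated at x.
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std) - 0.5 * np.log(2 * np.pi)

def gaussian_score(x, mean, std):
    # Score function: derivative of the log density with respect to x.
    return -(x - mean) / std ** 2

def log_ratio(x, student, data):
    # Log density ratio log(p_student(x) / p_data(x)).
    return gaussian_log_pdf(x, *student) - gaussian_log_pdf(x, *data)

def grad_log_ratio(x, student, data, eps=1e-5):
    # Central finite difference of the log ratio with respect to x.
    return (log_ratio(x + eps, student, data) - log_ratio(x - eps, student, data)) / (2 * eps)

# Assumed toy parameters (mean, std); not taken from the paper.
student = (0.5, 1.2)
data = (0.0, 1.0)

x = np.linspace(-3.0, 3.0, 7)

# Route 1: estimate each score separately, then subtract.
score_diff = gaussian_score(x, *student) - gaussian_score(x, *data)

# Route 2: differentiate a single log-ratio function.
ratio_grad = grad_log_ratio(x, student, data)

max_gap = float(np.max(np.abs(score_diff - ratio_grad)))
print("max |score difference - grad log ratio| =", max_gap)
```

In this toy setting both routes agree up to finite-difference error. With learned models, the first route accumulates the estimation error of two separate score networks, which is the bias the abstract describes; the second needs only one estimated function.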

Cite

@article{arxiv.2502.08005,
  title  = {DiffRatio: Training One-Step Diffusion Models Without Teacher Supervision},
  author = {Wenlin Chen and Mingtian Zhang and Jiajun He and Zijing Ou and José Miguel Hernández-Lobato and Bernhard Schölkopf and David Barber},
  journal= {arXiv preprint arXiv:2502.08005},
  year   = {2026}
}

Comments

22 pages, 8 figures, 5 tables, 2 algorithms
