Interpreting and Improving Diffusion Models from an Optimization Perspective

Machine Learning | 2024-06-04 | v4 | Computer Vision and Pattern Recognition | Optimization and Control | Machine Learning

Abstract

Denoising is intuitively related to projection. Indeed, under the manifold hypothesis, adding random noise is approximately equivalent to orthogonal perturbation. Hence, learning to denoise is approximately learning to project. In this paper, we use this observation to interpret denoising diffusion models as approximate gradient descent applied to the Euclidean distance function. We then provide a straightforward convergence analysis of the DDIM sampler under simple assumptions on the projection error of the denoiser. Finally, we propose a new gradient-estimation sampler, generalizing DDIM using insights from our theoretical results. In as few as 5-10 function evaluations, our sampler achieves state-of-the-art FID scores on pretrained CIFAR-10 and CelebA models and can generate high-quality samples on latent diffusion models.
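
The "denoising as projection" reading can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the data manifold is the unit circle in the plane, the denoiser is a hypothetical ideal one that predicts noise as (x - proj(x)) / sigma, and the sampler step uses the common noise-level form x + (sigma_next - sigma) * eps of the DDIM update. Under these assumptions, each sampler step coincides with a gradient-descent step on f(x) = 0.5 * dist(x)^2 with step size 1 - sigma_next / sigma.

```python
import math


def project_to_circle(x):
    """Project a 2-D point onto the unit circle (toy stand-in for the data manifold)."""
    norm = math.hypot(x[0], x[1])
    if norm == 0.0:
        return (1.0, 0.0)  # the projection is not unique at the origin; pick one point
    return (x[0] / norm, x[1] / norm)


def distance_to_circle(x):
    """Euclidean distance from x to the unit circle."""
    return abs(math.hypot(x[0], x[1]) - 1.0)


def ideal_denoiser(x, sigma):
    """Hypothetical ideal noise predictor: eps = (x - proj(x)) / sigma."""
    p = project_to_circle(x)
    return ((x[0] - p[0]) / sigma, (x[1] - p[1]) / sigma)


def ddim_step(x, sigma, sigma_next):
    """Assumed DDIM-style update in noise-level form: x + (sigma_next - sigma) * eps."""
    eps = ideal_denoiser(x, sigma)
    scale = sigma_next - sigma
    return (x[0] + scale * eps[0], x[1] + scale * eps[1])


def gradient_step(x, sigma, sigma_next):
    """Gradient descent on f(x) = 0.5 * dist(x)^2, whose gradient is x - proj(x)."""
    p = project_to_circle(x)
    step_size = 1.0 - sigma_next / sigma
    return (x[0] - step_size * (x[0] - p[0]), x[1] - step_size * (x[1] - p[1]))


def sample(x_start, sigmas):
    """Run the toy sampler along a decreasing noise schedule."""
    x = x_start
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        x = ddim_step(x, sigma, sigma_next)
    return x


# Geometric noise schedule 4, 2, 1, ..., 0.03125 (an arbitrary illustrative choice).
sigmas = [4.0 * 0.5 ** k for k in range(8)]
x_start = (3.0, 4.0)  # starts at distance 4 from the circle, matching sigmas[0]
x_final = sample(x_start, sigmas)
print(distance_to_circle(x_start), distance_to_circle(x_final))
```

In this toy setting the distance to the manifold halves at every step, so it tracks the noise level along the schedule. A learned denoiser only approximates the projection, which is the kind of error the abstract's convergence analysis makes assumptions about.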

Cite

@article{arxiv.2306.04848,
  title  = {Interpreting and Improving Diffusion Models from an Optimization Perspective},
  author = {Frank Permenter and Chenyang Yuan},
  journal= {arXiv preprint arXiv:2306.04848},
  year   = {2024}
}

Comments

24 pages, 9 figures, 4 tables. To appear in ICML 2024
