Fixed Point Diffusion Models

Computer Vision and Pattern Recognition; 2024-01-18; v1; Artificial Intelligence; Machine Learning

Abstract

We introduce the Fixed Point Diffusion Model (FPDM), a novel approach to image generation that integrates the concept of fixed point solving into the framework of diffusion-based generative modeling. Our approach embeds an implicit fixed point solving layer into the denoising network of a diffusion model, transforming the diffusion process into a sequence of closely related fixed point problems. Combined with a new stochastic training method, this approach significantly reduces model size, reduces memory usage, and accelerates training. Moreover, it enables the development of two new techniques to improve sampling efficiency: reallocating computation across timesteps and reusing fixed point solutions between timesteps. We conduct extensive experiments with state-of-the-art models on ImageNet, FFHQ, CelebA-HQ, and LSUN-Church, demonstrating substantial improvements in performance and efficiency. Compared to the state-of-the-art DiT model, FPDM contains 87% fewer parameters, consumes 60% less memory during training, and improves image generation quality in situations where sampling computation or time is limited. Our code and pretrained models are available at https://lukemelas.github.io/fixed-point-diffusion-models.
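
The idea of reusing fixed point solutions between timesteps can be illustrated with a toy example. The sketch below is not the authors' implementation: a small contractive map stands in for the implicit layer of the denoising network, and the names `layer`, `solve_fixed_point`, and `run` are hypothetical. It only shows that, when consecutive timesteps pose closely related fixed point problems, starting each solve from the previous solution tends to need fewer iterations than starting from zero.

```python
import numpy as np

def layer(z, x, t):
    # Toy stand-in for an implicit layer (an assumption, not the FPDM network).
    # The derivative w.r.t. z is at most 0.5, so fixed point iteration converges.
    return 0.5 * np.tanh(z) + x + 0.1 * t

def solve_fixed_point(x, t, z_init, tol=1e-8, max_iters=100):
    """Iterate z <- layer(z, x, t) until the update is smaller than tol."""
    z = z_init
    for n in range(1, max_iters + 1):
        z_next = layer(z, x, t)
        if np.max(np.abs(z_next - z)) < tol:
            return z_next, n
        z = z_next
    return z, max_iters

def run(timesteps, x, reuse):
    """Solve one fixed point problem per timestep and count total iterations."""
    z = np.zeros_like(x)
    total_iters = 0
    for t in timesteps:
        # Warm start from the previous timestep's solution, or cold start at zero.
        z_init = z if reuse else np.zeros_like(x)
        z, n = solve_fixed_point(x, t, z_init)
        total_iters += n
    return z, total_iters

x = np.array([0.3, -0.2, 0.5])          # hypothetical input
timesteps = np.linspace(1.0, 0.0, 20)   # slowly varying conditioning

z_cold, iters_cold = run(timesteps, x, reuse=False)
z_warm, iters_warm = run(timesteps, x, reuse=True)
print("cold-start iterations:", iters_cold)
print("warm-start iterations:", iters_warm)
```

Both runs reach the same solution at the final timestep; the warm-started run simply spends fewer iterations getting there, which is the intuition behind reusing solutions when sampling computation is limited.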

Cite

@article{arxiv.2401.08741,
  title  = {Fixed Point Diffusion Models},
  author = {Xingjian Bai and Luke Melas-Kyriazi},
  journal= {arXiv preprint arXiv:2401.08741},
  year   = {2024}
}

Comments

Project page: https://lukemelas.github.io/fixed-point-diffusion-models