Effective Quantization for Diffusion Models on CPUs

Computer Vision and Pattern Recognition 2023-11-30 v2 Artificial Intelligence

Abstract

Diffusion models have become popular for generating images from text descriptions. However, their substantial computational requirements remain a significant challenge and make image generation time-consuming. Quantization, a technique for compressing deep learning models to improve efficiency, is difficult to apply to diffusion models: they are notably more sensitive to quantization than other model types, which can degrade image quality. In this paper, we introduce a novel approach to quantizing diffusion models that combines quantization-aware training with distillation. Our results show that the quantized models maintain high image quality while delivering efficient inference on CPUs. The code is publicly available at: https://github.com/intel/intel-extension-for-transformers.
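
To make the two ingredients named in the abstract concrete, the following is a minimal, generic sketch and not the paper's implementation. It assumes symmetric per-row 8-bit quantization and a mean-squared-error distillation objective; the helper names `fake_quantize`, `linear`, and `distillation_loss` are hypothetical and do not come from the linked repository. In actual quantization-aware training this loss would be minimized by gradient descent, which is omitted here.

```python
def fake_quantize(values, num_bits=8):
    """Quantize then dequantize a list of floats (symmetric scheme, assumed)."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)                   # nothing to scale
    scale = max_abs / qmax                    # float value of one integer step
    result = []
    for v in values:
        q = round(v / scale)                  # snap to the nearest integer level
        q = max(-qmax, min(qmax, q))          # clamp to the representable range
        result.append(q * scale)              # map back to floating point
    return result


def linear(weights, x):
    """Matrix-vector product for a tiny stand-in layer."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def distillation_loss(student_out, teacher_out):
    """Mean squared error between student and teacher outputs."""
    n = len(teacher_out)
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / n


# Illustrative values only: the full-precision "teacher" layer and an input.
teacher_weights = [[0.50, -1.20], [0.03, 0.75]]
x = [1.0, 2.0]

# The "student" sees the same weights through simulated 8-bit quantization.
student_weights = [fake_quantize(row) for row in teacher_weights]

teacher_out = linear(teacher_weights, x)
student_out = linear(student_weights, x)
loss = distillation_loss(student_out, teacher_out)
print(loss)
```

The printed loss measures how far the quantized layer's output drifts from the full-precision one, which is the quantity a distillation-based training loop would try to reduce.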

Cite

@article{arxiv.2311.16133,
  title  = {Effective Quantization for Diffusion Models on CPUs},
  author = {Hanwen Chang and Haihao Shen and Yiyang Cai and Xinyu Ye and Zhenzhong Xu and Wenhua Cheng and Kaokao Lv and Weiwei Zhang and Yintong Lu and Heng Guo},
  journal= {arXiv preprint arXiv:2311.16133},
  year   = {2023}
}