Effective Quantization for Diffusion Models on CPUs

Hanwen Chang; Haihao Shen; Yiyang Cai; Xinyu Ye; Zhenzhong Xu; Wenhua Cheng; Kaokao Lv; Weiwei Zhang; Yintong Lu; Heng Guo

Computer Vision and Pattern Recognition 2023-11-30 v2 Artificial Intelligence

Abstract

Diffusion models have gained popularity for generating images from textual descriptions. However, their substantial computational requirements remain a significant challenge and make inference time-consuming. Quantization, a technique for compressing deep learning models to improve efficiency, is difficult to apply to diffusion models: they are notably more sensitive to quantization than other model types, which can degrade image quality. In this paper, we introduce a novel approach that quantizes diffusion models by leveraging both quantization-aware training and distillation. Our results show that the quantized models maintain high image quality while delivering efficient inference on CPUs. The code is publicly available at: https://github.com/intel/intel-extension-for-transformers.
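As a rough illustration of the two ingredients named in the abstract, the sketch below shows the forward pass of symmetric "fake" quantization (quantize to signed integers, then dequantize) and a distillation loss that compares a quantized student against a full-precision teacher. This is a generic, minimal sketch and not the authors' implementation: the function names, the per-row INT8 scheme, the mean-squared-error loss, and the toy weights are all assumptions made for illustration, and the training loop that would minimize the loss is omitted.

```python
def fake_quantize(values, num_bits=8):
    """Symmetric fake quantization: round to signed integers, then dequantize.

    Returns the dequantized values and the scale that was used.
    (Hypothetical helper; the paper's quantization scheme may differ.)
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values), 1.0  # nothing to scale
    scale = max_abs / qmax
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return [q * scale for q in quantized], scale


def distillation_loss(student_out, teacher_out):
    """Mean squared error between student and teacher outputs (assumed loss)."""
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / len(teacher_out)


def linear(weights, inputs):
    """Tiny dense layer without bias: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# Toy full-precision "teacher" layer and an input vector (illustrative values).
teacher_weights = [[0.50, -1.27], [0.031, 0.90]]
inputs = [1.0, 2.0]

# The "student" uses fake-quantized copies of the teacher weights, one scale per row.
student_weights = [fake_quantize(row)[0] for row in teacher_weights]

teacher_out = linear(teacher_weights, inputs)
student_out = linear(student_weights, inputs)
loss = distillation_loss(student_out, teacher_out)
print(f"distillation loss: {loss:.3e}")
```

In quantization-aware training, a loss of this kind would be minimized by updating the student's underlying full-precision weights, with gradients typically passed through the rounding step by a straight-through estimator.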

Keywords

diffusion model, quantization, knowledge distillation

Cite

@article{arxiv.2311.16133,
  title  = {Effective Quantization for Diffusion Models on CPUs},
  author = {Hanwen Chang and Haihao Shen and Yiyang Cai and Xinyu Ye and Zhenzhong Xu and Wenhua Cheng and Kaokao Lv and Weiwei Zhang and Yintong Lu and Heng Guo},
  journal= {arXiv preprint arXiv:2311.16133},
  year   = {2023}
}
