Related papers: SegQuant: A Semantics-Aware and Generalizable Quan…

PQD: Post-training Quantization for Efficient Diffusion Models

Diffusion models (DMs) have demonstrated remarkable achievements in synthesizing images of high fidelity and diversity. However, the extensive computational requirements and slow generative speed of diffusion models have limited their widespread…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Jiaojiao Ye , Zhen Wang , Linnan Jiang

Diffusion Model Quantization: A Review

Recent success of large text-to-image models has empirically underscored the exceptional performance of diffusion models in generative tasks. To facilitate their efficient deployment on resource-constrained edge devices, model quantization…

Computer Vision and Pattern Recognition · Computer Science 2025-05-09 Qian Zeng , Chenggong Hu , Mingli Song , Jie Song

PermuQuant: Lowering Per-Group Quantization Error by Reordering Channels for Diffusion Models

Large-scale visual generative models have achieved remarkable performance. However, their high computational and memory costs make deployment challenging in resource-constrained scenarios, such as interactive applications and personal…

Computer Vision and Pattern Recognition · Computer Science 2026-05-12 Yongsen Cheng , Kai Liu , Kaiwen Tao , Junxian Li , Zhixin Wang , Zhikai Chen , Renjing Pei , Yulun Zhang

Gradient-Aligned Calibration for Post-Training Quantization of Diffusion Models

Diffusion models have shown remarkable performance in image synthesis by progressively estimating a smooth transition from a Gaussian distribution of noise to a real image. Unfortunately, their practical deployment is limited by slow…

Machine Learning · Computer Science 2026-03-03 Dung Anh Hoang , Cuong Pham , Trung Le , Jianfei Cai , Thanh-Toan Do

Q-Diffusion: Quantizing Diffusion Models

Diffusion models have achieved great success in image synthesis through iterative noise estimation using deep neural networks. However, the slow inference, high memory consumption, and computation intensity of the noise estimation model…

Computer Vision and Pattern Recognition · Computer Science 2023-06-09 Xiuyu Li , Yijiang Liu , Long Lian , Huanrui Yang , Zhen Dong , Daniel Kang , Shanghang Zhang , Kurt Keutzer

DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation

Model quantization is a promising method for accelerating and compressing diffusion models. Nevertheless, since post-training quantization (PTQ) fails catastrophically at low-bit cases, quantization-aware training (QAT) is essential.…

Computer Vision and Pattern Recognition · Computer Science 2025-07-10 Xuewen Liu , Zhikai Li , Minhao Jiang , Mengjuan Chen , Jianquan Li , Qingyi Gu

Temporal Feature Matters: A Framework for Diffusion Model Quantization

Diffusion models, widely used for image generation, face significant challenges related to their broad applicability due to prolonged inference times and high memory demands. Efficient Post-Training Quantization (PTQ) is crucial to…

Computer Vision and Pattern Recognition · Computer Science 2025-07-15 Yushi Huang , Ruihao Gong , Xianglong Liu , Jing Liu , Yuhang Li , Jiwen Lu , Dacheng Tao

DLLMQuant: Quantizing Diffusion-based Large Language Models

Diffusion-based large language models (DLLMs) have shown promise for non-autoregressive text generation, but their deployment is constrained by large model sizes and heavy computational costs. Post-training quantization (PTQ), a widely used…

Computation and Language · Computer Science 2025-08-27 Chen Xu , Dawei Yang

RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization

Large transformer models have demonstrated remarkable success. Post-training quantization (PTQ), which requires only a small dataset for calibration and avoids end-to-end retraining, is a promising solution for compressing these large…

Machine Learning · Computer Science 2024-02-09 Zhikai Li , Xuewen Liu , Jing Zhang , Qingyi Gu

PTQD: Accurate Post-Training Quantization for Diffusion Models

Diffusion models have recently dominated image synthesis tasks. However, the iterative denoising process is expensive in computations at inference time, making diffusion models less practical for low-latency and scalable real-world…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Yefei He , Luping Liu , Jing Liu , Weijia Wu , Hong Zhou , Bohan Zhuang

DMQ: Dissecting Outliers of Diffusion Models for Post-Training Quantization

Diffusion models have achieved remarkable success in image generation but come with significant computational costs, posing challenges for deployment in resource-constrained environments. Recent post-training quantization (PTQ) methods have…

Computer Vision and Pattern Recognition · Computer Science 2025-07-18 Dongyeun Lee , Jiwan Hur , Hyounguk Shon , Jae Young Lee , Junmo Kim

Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion

Text-to-image generation via Stable Diffusion models (SDM) has demonstrated remarkable capabilities. However, their computational intensity, particularly in the iterative denoising process, hinders real-time deployment in latency-sensitive…

Computer Vision and Pattern Recognition · Computer Science 2025-05-08 Shuaiting Li , Juncan Deng , Zeyu Wang , Kedong Xu , Rongtao Deng , Hong Gu , Haibin Shen , Kejie Huang

SEPTQ: A Simple and Effective Post-Training Quantization Paradigm for Large Language Models

Large language models (LLMs) have shown remarkable performance in various domains, but they are constrained by massive computational and storage costs. Quantization, an effective technique for compressing models to fit resource-limited…

Computation and Language · Computer Science 2026-04-14 Han Liu , Haotian Gao , Xiaotong Zhang , Changya Li , Feng Zhang , Wei Wang , Fenglong Ma , Hong Yu

EfficientQuant: An Efficient Post-Training Quantization for CNN-Transformer Hybrid Models on Edge Devices

Hybrid models that combine convolutional and transformer blocks offer strong performance in computer vision (CV) tasks but are resource-intensive for edge deployment. Although post-training quantization (PTQ) can help reduce resource…

Computer Vision and Pattern Recognition · Computer Science 2025-06-16 Shaibal Saha , Lanyu Xu

Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers

Recent advancements in diffusion models, particularly the architectural transformation from UNet-based models to Diffusion Transformers (DiTs), significantly improve the quality and scalability of image and video generation. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-21 Lei Chen , Yuan Meng , Chen Tang , Xinzhu Ma , Jingyan Jiang , Xin Wang , Zhi Wang , Wenwu Zhu

AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models

We present in this paper a novel post-training quantization (PTQ) method, dubbed AccuQuant, for diffusion models. We show analytically and empirically that quantization errors for diffusion models are accumulated over denoising steps in a…

Computer Vision and Pattern Recognition · Computer Science 2025-10-24 Seunghoon Lee , Jeongwoo Choi , Byunggwan Son , Jaehyeon Moon , Jeimin Jeon , Bumsub Ham

Temporal Dynamic Quantization for Diffusion Models

The diffusion model has gained popularity in vision applications due to its remarkable generative performance and versatility. However, high storage and computation demands, resulting from the model size and iterative generation, hinder its…

Computer Vision and Pattern Recognition · Computer Science 2023-12-12 Junhyuk So , Jungwon Lee , Daehyun Ahn , Hyungjun Kim , Eunhyeok Park

DVD-Quant: Data-free Video Diffusion Transformers Quantization

Diffusion Transformers (DiTs) have emerged as the state-of-the-art architecture for video generation, yet their computational and memory demands hinder practical deployment. While post-training quantization (PTQ) presents a promising…

Computer Vision and Pattern Recognition · Computer Science 2026-03-09 Zhiteng Li , Hanxuan Li , Junyi Wu , Kai Liu , Haotong Qin , Linghe Kong , Guihai Chen , Yulun Zhang , Xiaokang Yang

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

The Diffusion model, a prevalent framework for image generation, encounters significant challenges in terms of broad applicability due to its extended inference times and substantial memory requirements. Efficient Post-training Quantization…

Computer Vision and Pattern Recognition · Computer Science 2024-03-12 Yushi Huang , Ruihao Gong , Jing Liu , Tianlong Chen , Xianglong Liu

Efficient Quantization Strategies for Latent Diffusion Models

Latent Diffusion Models (LDMs) capture the dynamic evolution of latent variables over time, blending patterns and multimodality in a generative system. Despite the proficiency of LDM in various applications, such as text-to-image…

Computer Vision and Pattern Recognition · Computer Science 2023-12-12 Yuewei Yang , Xiaoliang Dai , Jialiang Wang , Peizhao Zhang , Hongbo Zhang