Related papers: FP4DiT: Towards Effective Floating Point Quantizat…

Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers

Recent advancements in diffusion models, particularly the architectural transformation from UNet-based models to Diffusion Transformers (DiTs), significantly improve the quality and scalability of image and video generation. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-21 Lei Chen , Yuan Meng , Chen Tang , Xinzhu Ma , Jingyan Jiang , Xin Wang , Zhi Wang , Wenwu Zhu

PTQ4DiT: Post-training Quantization for Diffusion Transformers

The recent introduction of Diffusion Transformers (DiTs) has demonstrated exceptional capabilities in image generation by using a different backbone architecture, departing from traditional U-Nets and embracing the scalable nature of…

Computer Vision and Pattern Recognition · Computer Science 2024-10-18 Junyi Wu , Haoxuan Wang , Yuzhang Shang , Mubarak Shah , Yan Yan

HQ-DiT: Efficient Diffusion Transformer with FP4 Hybrid Quantization

Diffusion Transformers (DiTs) have recently gained substantial attention in both industrial and academic fields for their superior visual generation capabilities, outperforming traditional diffusion models that use U-Net. However, the…

Computer Vision and Pattern Recognition · Computer Science 2024-06-03 Wenxuan Liu , Sai Qian Zhang

DiRotQ: Rotation-Aware Quantization for 4-bit Diffusion Transformers

Diffusion Transformers (DiTs) achieve state-of-the-art image generation quality but incur substantial memory and computational costs at inference. While aggressive Post-Training Quantization (PTQ) to 4-bit precision offers significant…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Sayeh Sharify , Mahsa Salmani , Hesham Mostafa

ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation

Diffusion transformers have demonstrated remarkable performance in visual generation tasks, such as generating realistic images or videos based on textual instructions. However, larger model sizes and multi-frame processing for video…

Computer Vision and Pattern Recognition · Computer Science 2025-02-25 Tianchen Zhao , Tongcheng Fang , Haofeng Huang , Enshu Liu , Rui Wan , Widyadewi Soedarmadji , Shiyao Li , Zinan Lin , Guohao Dai , Shengen Yan , Huazhong Yang , Xuefei Ning , Yu Wang

EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models

Diffusion models have demonstrated remarkable capabilities in image synthesis and related generative tasks. Nevertheless, their practicality for real-world applications is constrained by substantial computational costs and latency issues.…

Computer Vision and Pattern Recognition · Computer Science 2024-04-16 Yefei He , Jing Liu , Weijia Wu , Hong Zhou , Bohan Zhuang

Post-Training Quantization for Diffusion Transformer via Hierarchical Timestep Grouping

Diffusion Transformer (DiT) has now become the preferred choice for building image generation models due to its great generation capability. Unlike previous convolution-based UNet models, DiT is purely composed of a stack of transformer…

Computer Vision and Pattern Recognition · Computer Science 2025-04-01 Ning Ding , Jing Han , Yuchuan Tian , Chao Xu , Kai Han , Yehui Tang

DVD-Quant: Data-free Video Diffusion Transformers Quantization

Diffusion Transformers (DiTs) have emerged as the state-of-the-art architecture for video generation, yet their computational and memory demands hinder practical deployment. While post-training quantization (PTQ) presents a promising…

Computer Vision and Pattern Recognition · Computer Science 2026-03-09 Zhiteng Li , Hanxuan Li , Junyi Wu , Kai Liu , Haotong Qin , Linghe Kong , Guihai Chen , Yulun Zhang , Xiaokang Yang

VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers

The Diffusion Transformers Models (DiTs) have transitioned the network architecture from traditional UNets to transformers, demonstrating exceptional capabilities in image generation. Although DiTs have been widely applied to…

Computer Vision and Pattern Recognition · Computer Science 2024-09-02 Juncan Deng , Shuaiting Li , Zeyu Wang , Hong Gu , Kedong Xu , Kejie Huang

Q-Diffusion: Quantizing Diffusion Models

Diffusion models have achieved great success in image synthesis through iterative noise estimation using deep neural networks. However, the slow inference, high memory consumption, and computation intensity of the noise estimation model…

Computer Vision and Pattern Recognition · Computer Science 2023-06-09 Xiuyu Li , Yijiang Liu , Long Lian , Huanrui Yang , Zhen Dong , Daniel Kang , Shanghang Zhang , Kurt Keutzer

LRQ-DiT: Log-Rotation Post-Training Quantization of Diffusion Transformers for Image and Video Generation

Diffusion Transformers (DiTs) have achieved impressive performance in text-to-image and text-to-video generation. However, their high computational cost and large parameter sizes pose significant challenges for usage in resource-constrained…

Computer Vision and Pattern Recognition · Computer Science 2025-09-24 Lianwei Yang , Haokun Lin , Tianchen Zhao , Yichen Wu , Hongyu Zhu , Ruiqi Xie , Zhenan Sun , Yu Wang , Qingyi Gu

Q-DiT4SR: Exploration of Detail-Preserving Diffusion Transformer Quantization for Real-World Image Super-Resolution

Recently, Diffusion Transformers (DiTs) have emerged in Real-World Image Super-Resolution (Real-ISR) to generate high-quality textures, yet their heavy inference burden hinders real-world deployment. While Post-Training Quantization (PTQ)…

Computer Vision and Pattern Recognition · Computer Science 2026-05-21 Xun Zhang , Kaicheng Yang , Hongliang Lu , Haotong Qin , Yong Guo , Yulun Zhang

TCAQ-DM: Timestep-Channel Adaptive Quantization for Diffusion Models

Diffusion models have achieved remarkable success in the image and video generation tasks. Nevertheless, they often require a large amount of memory and time overhead during inference, due to the complex network architecture and…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Haocheng Huang , Jiaxin Chen , Jinyang Guo , Ruiyi Zhan , Yunhong Wang

DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing

Diffusion Transformers (DiTs) have recently attracted significant interest from both industry and academia due to their enhanced capabilities in visual generation, surpassing the performance of traditional diffusion models that employ…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Zhenyuan Dong , Sai Qian Zhang

Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models

Diffusion models are emerging models that generate images by iteratively denoising random Gaussian noise using deep neural networks. These models typically exhibit high computational and memory demands, necessitating effective post-training…

Computer Vision and Pattern Recognition · Computer Science 2024-08-14 Cheng Chen , Christina Giannoula , Andreas Moshovos

TaQ-DiT: Time-aware Quantization for Diffusion Transformers

Transformer-based diffusion models, dubbed Diffusion Transformers (DiTs), have achieved state-of-the-art performance in image and video generation tasks. However, their large model size and slow inference speed limit their practical…

Image and Video Processing · Electrical Eng. & Systems 2026-01-26 Xinyan Liu , Huihong Shi , Yang Xu , Zhongfeng Wang

DMQ: Dissecting Outliers of Diffusion Models for Post-Training Quantization

Diffusion models have achieved remarkable success in image generation but come with significant computational costs, posing challenges for deployment in resource-constrained environments. Recent post-training quantization (PTQ) methods have…

Computer Vision and Pattern Recognition · Computer Science 2025-07-18 Dongyeun Lee , Jiwan Hur , Hyounguk Shon , Jae Young Lee , Junmo Kim

An Analysis on Quantizing Diffusion Transformers

Diffusion Models (DMs) utilize an iterative denoising process to transform random noise into synthetic data. Initially proposed with a UNet structure, DMs excel at producing images that are virtually indistinguishable with or without…

Computer Vision and Pattern Recognition · Computer Science 2024-06-18 Yuewei Yang , Jialiang Wang , Xiaoliang Dai , Peizhao Zhang , Hongbo Zhang

PQD: Post-training Quantization for Efficient Diffusion Models

Diffusion models (DMs) have demonstrated remarkable achievements in synthesizing images of high fidelity and diversity. However, the extensive computational requirements and slow generative speed of diffusion models have limited their widespread…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Jiaojiao Ye , Zhen Wang , Linnan Jiang

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization

Diffusion models have achieved significant visual generation quality. However, their significant computational and memory costs pose challenges for their application on resource-constrained mobile devices or even desktop GPUs. Recent…

Computer Vision and Pattern Recognition · Computer Science 2024-05-31 Tianchen Zhao , Xuefei Ning , Tongcheng Fang , Enshu Liu , Guyue Huang , Zinan Lin , Shengen Yan , Guohao Dai , Yu Wang