Related papers: Instance-Aware Group Quantization for Vision Trans…

AIQViT: Architecture-Informed Post-Training Quantization for Vision Transformers

Post-training quantization (PTQ) has emerged as a promising solution for reducing the storage and computational cost of vision transformers (ViTs). Recent advances primarily target at crafting quantizers to deal with peculiar activations…

Computer Vision and Pattern Recognition · Computer Science 2025-02-10 Runqing Jiang, Ye Zhang, Longguang Wang, Pengpeng Yu, Yulan Guo

FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer

Network quantization significantly reduces model inference complexity and has been widely used in real-world deployments. However, most existing quantization methods have been developed mainly on Convolutional Neural Networks (CNNs), and…

Computer Vision and Pattern Recognition · Computer Science 2023-02-20 Yang Lin, Tianyu Zhang, Peiqin Sun, Zheng Li, Shuchang Zhou

PTQ4ViT: Post-training quantization for vision transformers with twin uniform quantization

Quantization is one of the most effective methods to compress neural networks, which has achieved great success on convolutional neural networks (CNNs). Recently, vision transformers have demonstrated great potential in computer vision.…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Zhihang Yuan, Chenhao Xue, Yiqi Chen, Qiang Wu, Guangyu Sun

RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers

Post-training quantization (PTQ), which only requires a tiny dataset for calibration without end-to-end retraining, is a light and practical model compression technique. Recently, several PTQ schemes for vision transformers (ViTs) have been…

Computer Vision and Pattern Recognition · Computer Science 2023-08-08 Zhikai Li, Junrui Xiao, Lianwei Yang, Qingyi Gu

Q-HyViT: Post-Training Quantization of Hybrid Vision Transformers with Bridge Block Reconstruction for IoT Systems

Recently, vision transformers (ViTs) have superseded convolutional neural networks in numerous applications, including classification, detection, and segmentation. However, the high computational requirements of ViTs hinder their widespread…

Computer Vision and Pattern Recognition · Computer Science 2024-05-20 Jemin Lee, Yongin Kwon, Sihyeong Park, Misun Yu, Jeman Park, Hwanjun Song

I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization

Albeit the scalable performance of vision transformers (ViTs), the dense computational costs (training & inference) undermine their position in industrial applications. Post-training quantization (PTQ), tuning ViTs with a tiny dataset and…

Computer Vision and Pattern Recognition · Computer Science 2025-10-10 Yunshan Zhong, Jiawei Hu, Mingbao Lin, Mengzhao Chen, Rongrong Ji

Efficiently Training A Flat Neural Network Before It has been Quantizated

Post-training quantization (PTQ) for vision transformers (ViTs) has garnered significant attention due to its efficiency in compressing models. However, existing methods typically overlook the relationship between a well-trained NN and the…

Computer Vision and Pattern Recognition · Computer Science 2025-11-04 Peng Xia, Junbiao Pang, Tianyang Cai

Towards Accurate Post-Training Quantization for Vision Transformer

Vision transformer emerges as a potential architecture for vision tasks. However, the intense computation and non-negligible delay hinder its application in the real world. As a widespread model compression technique, existing post-training…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Yifu Ding, Haotong Qin, Qinghua Yan, Zhenhua Chai, Junjie Liu, Xiaolin Wei, Xianglong Liu

IPTQ-ViT: Post-Training Quantization of Non-linear Functions for Integer-only Vision Transformers

Previous Quantization-Aware Training (QAT) methods for vision transformers rely on expensive retraining to recover accuracy loss in non-linear layer quantization, limiting their use in resource-constrained environments. In contrast,…

Computer Vision and Pattern Recognition · Computer Science 2025-11-20 Gihwan Kim, Jemin Lee, Hyungshin Kim

DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers

Vision Transformers (ViTs) have gained significant attention, but their high computing cost limits the practical applications. While post-training quantization (PTQ) reduces model size and speeds up inference, it often degrades performance,…

Computer Vision and Pattern Recognition · Computer Science 2025-06-23 Lianwei Yang, Haisong Gong, Haokun Lin, Yichen Wu, Zhenan Sun, Qingyi Gu

Patch-wise Mixed-Precision Quantization of Vision Transformer

As emerging hardware begins to support mixed bit-width arithmetic computation, mixed-precision quantization is widely used to reduce the complexity of neural networks. However, Vision Transformers (ViTs) require complex self-attention…

Computer Vision and Pattern Recognition · Computer Science 2023-05-12 Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu

ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers

Vision Transformers (ViTs) have exhibited exceptional performance across diverse computer vision tasks, while their substantial parameter size incurs significantly increased memory and computational demands, impeding effective inference on…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Yanfeng Jiang, Ning Sun, Xueshuo Xie, Fei Yang, Tao Li

Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction

Post-training quantization (PTQ) for vision transformers (ViTs) has received increasing attention from both academic and industrial communities due to its minimal data needs and high time efficiency. However, many current methods fail to…

Computer Vision and Pattern Recognition · Computer Science 2025-02-05 Yunshan Zhong, You Huang, Jiawei Hu, Yuxin Zhang, Rongrong Ji

Quant Experts: Token-aware Adaptive Error Reconstruction with Mixture of Experts for Large Vision-Language Models Quantization

Post-Training Quantization (PTQ) has emerged as an effective technique for alleviating the substantial computational and memory overheads of Vision-Language Models (VLMs) by compressing both weights and activations without retraining the…

Computer Vision and Pattern Recognition · Computer Science 2026-03-02 Chenwei Jia, Baoting Li, Xuchong Zhang, Mingzhuo Wei, Bochen Lin, Hongbin Sun

UWC: Unit-wise Calibration Towards Rapid Network Compression

This paper introduces a post-training quantization (PTQ) method achieving highly efficient Convolutional Neural Network (CNN) quantization with high performance. Previous PTQ methods usually reduce compression error via performing…

Computer Vision and Pattern Recognition · Computer Science 2022-01-19 Chen Lin, Zheyang Li, Bo Peng, Haoji Hu, Wenming Tan, Ye Ren, Shiliang Pu

FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation

Post-training quantization (PTQ) has stood out as a cost-effective and promising model compression paradigm in recent years, as it avoids computationally intensive model retraining. Nevertheless, current PTQ methods for Vision Transformers…

Computer Vision and Pattern Recognition · Computer Science 2025-06-16 Zhuguanyu Wu, Shihe Wang, Jiayi Zhang, Jiaxin Chen, Yunhong Wang

Post-Training Quantization for Video Matting

Video matting is crucial for applications such as film production and virtual reality, yet deploying its computationally intensive models on resource-constrained devices presents challenges. Quantization is a key technique for model…

Computer Vision and Pattern Recognition · Computer Science 2025-06-13 Tianrui Zhu, Houyuan Chen, Ruihao Gong, Michele Magno, Haotong Qin, Kai Zhang

Gradient-Aligned Calibration for Post-Training Quantization of Diffusion Models

Diffusion models have shown remarkable performance in image synthesis by progressively estimating a smooth transition from a Gaussian distribution of noise to a real image. Unfortunately, their practical deployment is limited by slow…

Machine Learning · Computer Science 2026-03-03 Dung Anh Hoang, Cuong Pham, Trung Le, Jianfei Cai, Thanh-Toan Do

PD-Quant: Post-Training Quantization based on Prediction Difference Metric

Post-training quantization (PTQ) is a neural network compression technique that converts a full-precision model into a quantized model using lower-precision data types. Although it can help reduce the size and computational cost of deep…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Jiawei Liu, Lin Niu, Zhihang Yuan, Dawei Yang, Xinggang Wang, Wenyu Liu

MPTQ-ViT: Mixed-Precision Post-Training Quantization for Vision Transformer

While vision transformers (ViTs) have shown great potential in computer vision tasks, their intense computation and memory requirements pose challenges for practical applications. Existing post-training quantization methods leverage value…

Computer Vision and Pattern Recognition · Computer Science 2024-02-02 Yu-Shan Tai, An-Yeu Wu