Related papers: Rate-Distortion Optimized Post-Training Quantizati…

PD-Quant: Post-Training Quantization based on Prediction Difference Metric

Post-training quantization (PTQ) is a neural network compression technique that converts a full-precision model into a quantized model using lower-precision data types. Although it can help reduce the size and computational cost of deep…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Jiawei Liu , Lin Niu , Zhihang Yuan , Dawei Yang , Xinggang Wang , Wenyu Liu

PTQD: Accurate Post-Training Quantization for Diffusion Models

Diffusion models have recently dominated image synthesis tasks. However, the iterative denoising process is expensive in computations at inference time, making diffusion models less practical for low-latency and scalable real-world…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Yefei He , Luping Liu , Jing Liu , Weijia Wu , Hong Zhou , Bohan Zhuang

LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection

Due to highly constrained computing power and memory, deploying 3D lidar-based detectors on edge devices equipped in autonomous vehicles and robots poses a crucial challenge. Being a convenient and straightforward model compression…

Computer Vision and Pattern Recognition · Computer Science 2024-01-30 Sifan Zhou , Liang Li , Xinyu Zhang , Bo Zhang , Shipeng Bai , Miao Sun , Ziyu Zhao , Xiaobo Lu , Xiangxiang Chu

UWC: Unit-wise Calibration Towards Rapid Network Compression

This paper introduces a post-training quantization~(PTQ) method achieving highly efficient Convolutional Neural Network~ (CNN) quantization with high performance. Previous PTQ methods usually reduce compression error via performing…

Computer Vision and Pattern Recognition · Computer Science 2022-01-19 Chen Lin , Zheyang Li , Bo Peng , Haoji Hu , Wenming Tan , Ye Ren , Shiliang Pu

LoPRo: Enhancing Low-Rank Quantization via Permuted Block-Wise Rotation

Post-training quantization (PTQ) enables effective model compression while preserving relatively high accuracy. Current weight-only PTQ methods primarily focus on the challenging sub-3-bit regime, where approaches often suffer significant…

Machine Learning · Computer Science 2026-01-28 Hongyaoxing Gu , Lijuan Hu , Liye Yu , Haowei Li , Fangfang Liu

Sensitivity-Aware Post-Training Quantization for Deep Neural Networks

Model quantization reduces neural network parameter precision to achieve compression, but often compromises accuracy. Existing post-training quantization (PTQ) methods employ iterative parameter updates to preserve accuracy under high…

Computer Vision and Pattern Recognition · Computer Science 2025-09-09 Zekang Zheng , Haokun Li , Yaofo Chen , Mingkui Tan , Qing Du

2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution

Low-bit quantization has become widespread for compressing image super-resolution (SR) models for edge deployment, which allows advanced SR models to enjoy compact low-bit parameters and efficient integer/bitwise constructions for storage…

Image and Video Processing · Electrical Eng. & Systems 2024-06-12 Kai Liu , Haotong Qin , Yong Guo , Xin Yuan , Linghe Kong , Guihai Chen , Yulun Zhang

COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization

Post-training quantization (PTQ) has emerged as a practical approach to compress large neural networks, making them highly efficient for deployment. However, effectively reducing these models to their low-bit counterparts without…

Machine Learning · Computer Science 2024-10-22 Aozhong Zhang , Zi Yang , Naigang Wang , Yingyong Qi , Jack Xin , Xin Li , Penghang Yin

Improving Post-Training Quantization on Object Detection with Task Loss-Guided Lp Metric

Efficient inference for object detection networks is a major challenge on edge devices. Post-Training Quantization (PTQ), which transforms a full-precision model into low bit-width directly, is an effective and convenient approach to reduce…

Computer Vision and Pattern Recognition · Computer Science 2023-05-09 Lin Niu , Jiawei Liu , Zhihang Yuan , Dawei Yang , Xinggang Wang , Wenyu Liu

Rotation Invariant Quantization for Model Compression

Post-training Neural Network (NN) model compression is an attractive approach for deploying large, memory-consuming models on devices with limited memory resources. In this study, we investigate the rate-distortion tradeoff for NN model…

Machine Learning · Computer Science 2024-12-03 Joseph Kampeas , Yury Nahshan , Hanoch Kremer , Gil Lederman , Shira Zaloshinski , Zheng Li , Emir Haleva

Flexible Mixed Precision Quantization for Learned Image Compression

Despite its improvements in coding performance compared to traditional codecs, Learned Image Compression (LIC) suffers from large computational costs for storage and deployment. Model quantization offers an effective solution to reduce the…

Image and Video Processing · Electrical Eng. & Systems 2025-06-03 Md Adnan Faisal Hossain , Zhihao Duan , Fengqing Zhu

Parallelized Rate-Distortion Optimized Quantization Using Deep Learning

Rate-Distortion Optimized Quantization (RDOQ) has played an important role in the coding performance of recent video compression standards such as H.264/AVC, H.265/HEVC, VP9 and AV1. This scheme yields significant reductions in bit-rate at…

Machine Learning · Computer Science 2020-12-14 Dana Kianfar , Auke Wiggers , Amir Said , Reza Pourreza , Taco Cohen

Post-Training Quantization in Brain-Computer Interfaces based on Event-Related Potential Detection

Post-training quantization (PTQ) is a technique used to optimize and reduce the memory footprint and computational requirements of machine learning models. It has been used primarily for neural networks. For Brain-Computer Interfaces (BCI)…

Human-Computer Interaction · Computer Science 2024-10-11 Hubert Cecotti , Dalvir Dhaliwal , Hardip Singh , Yogesh Kumar Meena

ZeroQuant-V2: Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation

Post-training quantization (PTQ) has emerged as a promising technique for mitigating memory consumption and computational costs in large language models (LLMs). However, a systematic examination of various quantization schemes, model…

Machine Learning · Computer Science 2023-05-29 Zhewei Yao , Xiaoxia Wu , Cheng Li , Stephen Youn , Yuxiong He

PQD: Post-training Quantization for Efficient Diffusion Models

Diffusionmodels(DMs)havedemonstratedremarkableachievements in synthesizing images of high fidelity and diversity. However, the extensive computational requirements and slow generative speed of diffusion models have limited their widespread…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Jiaojiao Ye , Zhen Wang , Linnan Jiang

A Practical Approach for Rate-Distortion-Perception Analysis in Learned Image Compression

Rate-distortion optimization (RDO) of codecs, where distortion is quantified by the mean-square error, has been a standard practice in image/video compression over the years. RDO serves well for optimization of codec performance for…

Image and Video Processing · Electrical Eng. & Systems 2021-05-03 Ogun Kirmemis , A. Murat Tekalp

Gradient-Aligned Calibration for Post-Training Quantization of Diffusion Models

Diffusion models have shown remarkable performance in image synthesis by progressively estimating a smooth transition from a Gaussian distribution of noise to a real image. Unfortunately, their practical deployment is limited by slow…

Machine Learning · Computer Science 2026-03-03 Dung Anh Hoang , Cuong Pham anh Trung Le , Jianfei Cai , Thanh-Toan Do

Post-Training Quantization for Video Matting

Video matting is crucial for applications such as film production and virtual reality, yet deploying its computationally intensive models on resource-constrained devices presents challenges. Quantization is a key technique for model…

Computer Vision and Pattern Recognition · Computer Science 2025-06-13 Tianrui Zhu , Houyuan Chen , Ruihao Gong , Michele Magno , Haotong Qin , Kai Zhang

On Quantizing Neural Representation for Variable-Rate Video Coding

This work introduces NeuroQuant, a novel post-training quantization (PTQ) approach tailored to non-generalized Implicit Neural Representations for variable-rate Video Coding (INR-VC). Unlike existing methods that require extensive weight…

Image and Video Processing · Electrical Eng. & Systems 2025-02-18 Junqi Shi , Zhujia Chen , Hanfei Li , Qi Zhao , Ming Lu , Tong Chen , Zhan Ma

RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers

Post-training quantization (PTQ), which only requires a tiny dataset for calibration without end-to-end retraining, is a light and practical model compression technique. Recently, several PTQ schemes for vision transformers (ViTs) have been…

Computer Vision and Pattern Recognition · Computer Science 2023-08-08 Zhikai Li , Junrui Xiao , Lianwei Yang , Qingyi Gu