Related papers: Memoryless scalar quantization for random frames

MSQ: Memory-Efficient Bit Sparsification Quantization

As deep neural networks (DNNs) see increased deployment on mobile and edge devices, optimizing model efficiency has become crucial. Mixed-precision quantization is widely favored, as it offers a superior balance between efficiency and…

Machine Learning · Computer Science 2025-07-31 Seokho Han, Seoyeon Yoon, Jinhee Kim, Dongwei Wang, Kang Eun Jeon, Huanrui Yang, Jong Hwan Ko

MWQ: Multiscale Wavelet Quantized Neural Networks

Model quantization can reduce the model size and computational latency; it has become an essential technique for the deployment of deep neural networks on resource-constrained hardware (e.g., mobile phones and embedded devices). The existing…

Computer Vision and Pattern Recognition · Computer Science 2021-03-10 Qigong Sun, Yan Ren, Licheng Jiao, Xiufang Li, Fanhua Shang, Fang Liu

Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models

Quantization is a promising approach for reducing memory overhead and accelerating inference, especially in large pre-trained language model (PLM) scenarios. While having no access to original training data due to security and privacy…

Computation and Language · Computer Science 2023-10-23 Miaoxi Zhu, Qihuang Zhong, Li Shen, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao

Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks

Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently accelerate the inference and meanwhile reduce memory consumption of the deep neural networks, which is crucial for model deployment on…

Computer Vision and Pattern Recognition · Computer Science 2019-08-15 Ruihao Gong, Xianglong Liu, Shenghu Jiang, Tianxiang Li, Peng Hu, Jiazhen Lin, Fengwei Yu, Junjie Yan

MRI Tissue Magnetism Quantification through Total Field Inversion with Deep Neural Networks

Quantitative susceptibility mapping (QSM) utilizes MRI signal phase to infer estimates of local tissue magnetism (magnetic susceptibility), which has been shown useful to provide novel image contrast and as biomarkers of abnormal tissue.…

Medical Physics · Physics 2019-04-16 Juan Liu, Kevin M. Koch

SPFQ: A Stochastic Algorithm and Its Error Analysis for Neural Network Quantization

Quantization is a widely used compression method that effectively reduces redundancies in over-parameterized neural networks. However, existing quantization techniques for deep neural networks often lack a comprehensive error analysis due…

Machine Learning · Computer Science 2023-09-21 Jinjie Zhang, Rayan Saab

HAQ: Hardware-Aware Automated Quantization with Mixed Precision

Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency,…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han

Statistical Signal Processing for Quantum Error Mitigation

In the noisy intermediate-scale quantum (NISQ) era, quantum error mitigation (QEM) is essential for producing reliable outputs from quantum circuits. We present a statistical signal processing approach to QEM that estimates the most likely…

Quantum Physics · Physics 2025-07-15 Kausthubh Chandramouli, Kelly Mae Allen, Christopher Mori, Dror Baron, Mário A. T. Figueiredo

Hardware-Centric AutoML for Mixed-Precision Quantization

Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency,…

Computer Vision and Pattern Recognition · Computer Science 2020-08-14 Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han

Frame Permutation Quantization

Frame permutation quantization (FPQ) is a new vector quantization technique using finite frames. In FPQ, a vector is encoded using a permutation source code to quantize its frame expansion. This means that the encoding is a partial ordering…

Information Theory · Computer Science 2015-03-24 Ha Q. Nguyen, Vivek K Goyal, Lav R. Varshney

Statistically-Lossless Quantization of Large Language Models

Model quantization has become essential for efficient large language model deployment, yet existing approaches involve clear trade-offs: methods such as GPTQ and AWQ achieve practical compression but are lossy, while lossless techniques…

Machine Learning · Computer Science 2026-05-05 Michael Helcig, Eldar Kurtic, Dan Alistarh

Spatial-Temporal Multi-Scale Quantization for Flexible Motion Generation

Despite significant advancements in human motion generation, current motion representations, typically formulated as discrete frame sequences, still face two critical limitations: (i) they fail to capture motion from a multi-scale…

Computer Vision and Pattern Recognition · Computer Science 2025-08-13 Zan Wang, Jingze Zhang, Yixin Chen, Baoxiong Jia, Wei Liang, Siyuan Huang

Learnable Companding Quantization for Accurate Low-bit Neural Networks

Quantizing deep neural networks is an effective method for reducing memory consumption and improving inference speed, and is thus useful for implementation in resource-constrained devices. However, it is still hard for extremely low-bit…

Computer Vision and Pattern Recognition · Computer Science 2021-11-03 Kohei Yamamoto

Quantization in Compressive Sensing: A Signal Processing Approach

Influence of the finite-length registers and quantization effects on the reconstruction of sparse and approximately sparse signals is analyzed in this paper. For the nonquantized measurements, the compressive sensing (CS) framework provides…

Information Theory · Computer Science 2019-07-03 Isidora Stankovic, Milos Brajovic, Milos Dakovic, Cornel Ioana, Ljubisa Stankovic

The Sparse Multivariate Method of Simulated Quantiles

In this paper the method of simulated quantiles (MSQ) of Dominicy and Veredas (2013) and Dominicy et al. (2013) is extended to a general multivariate framework (MMSQ) and to provide a sparse estimator of the scale matrix (sparse-MMSQ). The…

Methodology · Statistics 2017-10-11 Mauro Bernardi, Lea Petrella, Paola Stolfi

Machine Learning for Quantum Noise Reduction

Quantum noise fundamentally limits the utility of near-term quantum devices, making error mitigation essential for practical quantum computation. While traditional quantum error correction codes require substantial qubit overhead and…

Quantum Physics · Physics 2025-09-23 Karan Kendre

Long-Range Zero-Shot Generative Deep Network Quantization

Quantization approximates a deep network model with floating-point numbers by the one with low bit width numbers, in order to accelerate inference and reduce computation. Quantizing a model without access to the original data, zero-shot…

Computer Vision and Pattern Recognition · Computer Science 2022-11-18 Yan Luo, Yangcheng Gao, Zhao Zhang, Haijun Zhang, Mingliang Xu, Meng Wang

Quantized Compressed Sensing for Partial Random Circulant Matrices

We provide the first analysis of a non-trivial quantization scheme for compressed sensing measurements arising from structured measurements. Specifically, our analysis studies compressed sensing matrices consisting of rows selected at…

Information Theory · Computer Science 2017-02-16 Joe-Mei Feng, Felix Krahmer, Rayan Saab

Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning

Mixed Precision Quantization (MPQ) has become an essential technique for optimizing neural network by determining the optimal bitwidth per layer. Existing MPQ methods, however, face a major hurdle: they require a computationally expensive…

Computer Vision and Pattern Recognition · Computer Science 2025-05-20 Lianbo Ma, Jianlun Ma, Yuee Zhou, Guoyang Xie, Qiang He, Zhichao Lu

Mixed-Precision Inference Quantization: Radically Towards Faster inference speed, Lower Storage requirement, and Lower Loss

Based on the model's resilience to computational noise, model quantization is important for compressing models and improving computing speed. Existing quantization techniques rely heavily on experience and "fine-tuning" skills. In the…

Machine Learning · Computer Science 2022-07-22 Daning Cheng, Wenguang Chen