Related papers: Memoryless scalar quantization for random frames

200 papers

As deep neural networks (DNNs) see increased deployment on mobile and edge devices, optimizing model efficiency has become crucial. Mixed-precision quantization is widely favored, as it offers a superior balance between efficiency and…

Machine Learning · Computer Science 2025-07-31 Seokho Han , Seoyeon Yoon , Jinhee Kim , Dongwei Wang , Kang Eun Jeon , Huanrui Yang , Jong Hwan Ko

Model quantization can reduce the model size and computational latency, it has become an essential technique for the deployment of deep neural networks on resource-constrained hardware (e.g., mobile phones and embedded devices). The existing…

Computer Vision and Pattern Recognition · Computer Science 2021-03-10 Qigong Sun , Yan Ren , Licheng Jiao , Xiufang Li , Fanhua Shang , Fang Liu

Quantization is a promising approach for reducing memory overhead and accelerating inference, especially in large pre-trained language model (PLM) scenarios. While having no access to original training data due to security and privacy…

Computation and Language · Computer Science 2023-10-23 Miaoxi Zhu , Qihuang Zhong , Li Shen , Liang Ding , Juhua Liu , Bo Du , Dacheng Tao

Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently accelerate the inference and meanwhile reduce memory consumption of the deep neural networks, which is crucial for model deployment on…

Computer Vision and Pattern Recognition · Computer Science 2019-08-15 Ruihao Gong , Xianglong Liu , Shenghu Jiang , Tianxiang Li , Peng Hu , Jiazhen Lin , Fengwei Yu , Junjie Yan

Quantitative susceptibility mapping (QSM) utilizes MRI signal phase to infer estimates of local tissue magnetism (magnetic susceptibility), which has been shown useful to provide novel image contrast and as biomarkers of abnormal tissue.…

Medical Physics · Physics 2019-04-16 Juan Liu , Kevin M. Koch

Quantization is a widely used compression method that effectively reduces redundancies in over-parameterized neural networks. However, existing quantization techniques for deep neural networks often lack a comprehensive error analysis due…

Machine Learning · Computer Science 2023-09-21 Jinjie Zhang , Rayan Saab

Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency,…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Kuan Wang , Zhijian Liu , Yujun Lin , Ji Lin , Song Han

In the noisy intermediate-scale quantum (NISQ) era, quantum error mitigation (QEM) is essential for producing reliable outputs from quantum circuits. We present a statistical signal processing approach to QEM that estimates the most likely…

Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency,…

Computer Vision and Pattern Recognition · Computer Science 2020-08-14 Kuan Wang , Zhijian Liu , Yujun Lin , Ji Lin , Song Han

Frame permutation quantization (FPQ) is a new vector quantization technique using finite frames. In FPQ, a vector is encoded using a permutation source code to quantize its frame expansion. This means that the encoding is a partial ordering…

Information Theory · Computer Science 2015-03-24 Ha Q. Nguyen , Vivek K Goyal , Lav R. Varshney

Model quantization has become essential for efficient large language model deployment, yet existing approaches involve clear trade-offs: methods such as GPTQ and AWQ achieve practical compression but are lossy, while lossless techniques…

Machine Learning · Computer Science 2026-05-05 Michael Helcig , Eldar Kurtic , Dan Alistarh

Despite significant advancements in human motion generation, current motion representations, typically formulated as discrete frame sequences, still face two critical limitations: (i) they fail to capture motion from a multi-scale…

Computer Vision and Pattern Recognition · Computer Science 2025-08-13 Zan Wang , Jingze Zhang , Yixin Chen , Baoxiong Jia , Wei Liang , Siyuan Huang

Quantizing deep neural networks is an effective method for reducing memory consumption and improving inference speed, and is thus useful for implementation in resource-constrained devices. However, it is still hard for extremely low-bit…

Computer Vision and Pattern Recognition · Computer Science 2021-11-03 Kohei Yamamoto

Influence of the finite-length registers and quantization effects on the reconstruction of sparse and approximately sparse signals is analyzed in this paper. For the nonquantized measurements, the compressive sensing (CS) framework provides…

Information Theory · Computer Science 2019-07-03 Isidora Stankovic , Milos Brajovic , Milos Dakovic , Cornel Ioana , Ljubisa Stankovic

In this paper the method of simulated quantiles (MSQ) of Dominicy and Veredas (2013) and Dominick et al. (2013) is extended to a general multivariate framework (MMSQ) and to provide a sparse estimator of the scale matrix (sparse-MMSQ). The…

Methodology · Statistics 2017-10-11 Mauro Bernardi , Lea Petrella , Paola Stolfi

Quantum noise fundamentally limits the utility of near-term quantum devices, making error mitigation essential for practical quantum computation. While traditional quantum error correction codes require substantial qubit overhead and…

Quantum Physics · Physics 2025-09-23 Karan Kendre

Quantization approximates a deep network model with floating-point numbers by the one with low bit width numbers, in order to accelerate inference and reduce computation. Quantizing a model without access to the original data, zero-shot…

Computer Vision and Pattern Recognition · Computer Science 2022-11-18 Yan Luo , Yangcheng Gao , Zhao Zhang , Haijun Zhang , Mingliang Xu , Meng Wang

We provide the first analysis of a non-trivial quantization scheme for compressed sensing measurements arising from structured measurements. Specifically, our analysis studies compressed sensing matrices consisting of rows selected at…

Information Theory · Computer Science 2017-02-16 Joe-Mei Feng , Felix Krahmer , Rayan Saab

Mixed Precision Quantization (MPQ) has become an essential technique for optimizing neural network by determining the optimal bitwidth per layer. Existing MPQ methods, however, face a major hurdle: they require a computationally expensive…

Computer Vision and Pattern Recognition · Computer Science 2025-05-20 Lianbo Ma , Jianlun Ma , Yuee Zhou , Guoyang Xie , Qiang He , Zhichao Lu

Based on the model's resilience to computational noise, model quantization is important for compressing models and improving computing speed. Existing quantization techniques rely heavily on experience and "fine-tuning" skills. In the…

Machine Learning · Computer Science 2022-07-22 Daning Cheng , Wenguang Chen