Related papers: Quantized Proximal Averaging Network for Analysis …

Quantized Neural Networks for Low-Precision Accumulation with Guaranteed Overflow Avoidance

We introduce a quantization-aware training algorithm that guarantees avoiding numerical overflow when reducing the precision of accumulators during inference. We leverage weight normalization as a means of constraining parameters during…

Machine Learning · Computer Science 2023-02-01 Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig

Training Sparse Neural Networks using Compressed Sensing

Pruning the weights of neural networks is an effective and widely-used technique for reducing model size and inference complexity. We develop and test a novel method based on compressed sensing which combines the pruning and training into a…

Computer Vision and Pattern Recognition · Computer Science 2021-04-08 Jonathan W. Siegel, Jianhong Chen, Pengchuan Zhang, Jinchao Xu

Model compression as constrained optimization, with application to neural nets. Part II: quantization

We consider the problem of deep neural net compression by quantization: given a large, reference net, we want to quantize its real-valued weights using a codebook with $K$ entries so that the training loss of the quantized net is minimal.…

Machine Learning · Computer Science 2017-07-17 Miguel Á. Carreira-Perpiñán, Yerlan Idelbayev

Neural Networks with Quantization Constraints

Enabling low precision implementations of deep learning models, without considerable performance degradation, is necessary in resource and latency constrained settings. Moreover, exploiting the differences in sensitivity to quantization…

Machine Learning · Computer Science 2022-10-28 Ignacio Hounie, Juan Elenter, Alejandro Ribeiro

Quantization Aware Factorization for Deep Neural Network Compression

Tensor decomposition of convolutional and fully-connected layers is an effective way to reduce parameters and FLOP in neural networks. Due to memory and power consumption limitations of mobile or embedded devices, the quantization step is…

Machine Learning · Computer Science 2023-08-10 Daria Cherniuk, Stanislav Abukhovich, Anh-Huy Phan, Ivan Oseledets, Andrzej Cichocki, Julia Gusak

Robust Training of Neural Networks at Arbitrary Precision and Sparsity

The discontinuous operations inherent in quantization and sparsification introduce a long-standing obstacle to backpropagation, particularly in ultra-low precision and sparse regimes. While the community has long viewed quantization as…

Machine Learning · Computer Science 2026-03-11 Chengxi Ye, Grace Chu, Yanfeng Liu, Yichi Zhang, Lukasz Lew, Li Zhang, Mark Sandler, Andrew Howard

Simple, Efficient, and Neural Algorithms for Sparse Coding

Sparse coding is a basic task in many fields including signal processing, neuroscience and machine learning where the goal is to learn a basis that enables a sparse representation of a given set of data, if one exists. Its standard…

Machine Learning · Computer Science 2015-03-04 Sanjeev Arora, Rong Ge, Tengyu Ma, Ankur Moitra

Training Quantized Nets: A Deeper Understanding

Currently, deep neural networks are deployed on low-power portable devices by first training a full-precision model using powerful hardware, and then deriving a corresponding low-precision model for efficient inference on such systems.…

Machine Learning · Computer Science 2017-11-15 Hao Li, Soham De, Zheng Xu, Christoph Studer, Hanan Samet, Tom Goldstein

Quantization of Neural Network Equalizers in Optical Fiber Transmission Experiments

The quantization of neural networks for the mitigation of the nonlinear and components' distortions in dual-polarization optical fiber transmission is studied. Two low-complexity neural network equalizers are applied in three 16-QAM 34.4…

Signal Processing · Electrical Eng. & Systems 2023-10-11 Jamal Darweesh, Nelson Costa, Antonio Napoli, Bernhard Spinnler, Yves Jaouen, Mansoor Yousefi

Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference

Efficient machine learning implementations optimized for inference in hardware have wide-ranging benefits, depending on the application, from lower inference latency to higher data throughput and reduced energy consumption. Two popular…

Machine Learning · Computer Science 2021-07-21 Benjamin Hawks, Javier Duarte, Nicholas J. Fraser, Alessandro Pappalardo, Nhan Tran, Yaman Umuroglu

Post-Training Quantization for Re-parameterization via Coarse & Fine Weight Splitting

Although neural networks have made remarkable advancements in various applications, they require substantial computational and memory resources. Network quantization is a powerful technique to compress neural networks, allowing for more…

Computer Vision and Pattern Recognition · Computer Science 2023-12-19 Dawei Yang, Ning He, Xing Hu, Zhihang Yuan, Jiangyong Yu, Chen Xu, Zhe Jiang

SPFQ: A Stochastic Algorithm and Its Error Analysis for Neural Network Quantization

Quantization is a widely used compression method that effectively reduces redundancies in over-parameterized neural networks. However, existing quantization techniques for deep neural networks often lack a comprehensive error analysis due…

Machine Learning · Computer Science 2023-09-21 Jinjie Zhang, Rayan Saab

SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks

We provide a new efficient version of the backpropagation algorithm, specialized to the case where the weights of the neural network being trained are sparse. Our algorithm is general, as it applies to arbitrary (unstructured) sparsity and…

Machine Learning · Computer Science 2023-02-10 Mahdi Nikdan, Tommaso Pegolotti, Eugenia Iofinova, Eldar Kurtic, Dan Alistarh

Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization

Quantization is a widely used technique to compress and accelerate deep neural networks. However, conventional quantization methods use the same bit-width for all (or most of) the layers, which often suffer significant accuracy degradation…

Computer Vision and Pattern Recognition · Computer Science 2021-10-14 Weihan Chen, Peisong Wang, Jian Cheng

AMS-Net: Adaptive Multiscale Sparse Neural Network with Interpretable Basis Expansion for Multiphase Flow Problems

In this work, we propose an adaptive sparse learning algorithm that can be applied to learn the physical processes and obtain a sparse representation of the solution given a large snapshot space. Assume that there is a rich class of…

Machine Learning · Computer Science 2022-07-26 Yating Wang, Wing Tat Leung, Guang Lin

An efficient algorithm for structured sparse quantile regression

Quantile regression is studied in combination with a penalty which promotes structured (or group) sparsity. A mixed $\ell_{1,\infty}$-norm on the parameter vector is used to impose structured sparsity on the traditional quantile regression…

Methodology · Statistics 2013-02-26 Vahid Nassiri, Ignace Loris

Sharpness-aware Quantization for Deep Neural Networks

Network quantization is a dominant paradigm of model compression. However, the abrupt changes in quantized weights during training often lead to severe loss fluctuations and result in a sharp loss landscape, making the gradients unstable…

Computer Vision and Pattern Recognition · Computer Science 2023-03-22 Jing Liu, Jianfei Cai, Bohan Zhuang

Weight Pruning via Adaptive Sparsity Loss

Pruning neural networks has regained interest in recent years as a means to compress state-of-the-art deep neural networks and enable their deployment on resource-constrained devices. In this paper, we propose a robust compressive learning…

Machine Learning · Computer Science 2020-06-05 George Retsinas, Athena Elafrou, Georgios Goumas, Petros Maragos

Efficient and Sparse Neural Networks by Pruning Weights in a Multiobjective Learning Approach

Overparameterization and overfitting are common concerns when designing and training deep neural networks, that are often counteracted by pruning and regularization strategies. However, these strategies remain secondary to most learning…

Machine Learning · Computer Science 2020-09-01 Malena Reiners, Kathrin Klamroth, Michael Stiglmayr

StatQAT: Statistical Quantizer Optimization for Deep Networks

Quantization is essential for reducing the computational cost and memory usage of deep neural networks, enabling efficient inference on low-precision hardware. Despite the growing adoption of uniform and floating-point quantization schemes,…

Machine Learning · Statistics 2026-05-19 Mehmet Aktukmak, Daniel Huang, Ke Ding