Related papers: Weight Compander: A Simple Weight Reparameterizati…

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning…

Machine Learning · Computer Science 2016-06-07 Tim Salimans , Diederik P. Kingma

Implicit Regularization and Convergence for Weight Normalization

Normalization methods such as batch [Ioffe and Szegedy, 2015], weight [Salimansand Kingma, 2016], instance [Ulyanov et al., 2016], and layer normalization [Baet al., 2016] have been widely used in modern machine learning. Here, we study the…

Machine Learning · Computer Science 2022-08-31 Xiaoxia Wu , Edgar Dobriban , Tongzheng Ren , Shanshan Wu , Zhiyuan Li , Suriya Gunasekar , Rachel Ward , Qiang Liu

On the Benefits of Weight Normalization for Overparameterized Matrix Sensing

While normalization techniques are widely used in deep learning, their theoretical understanding remains relatively limited. In this work, we establish the benefits of (generalized) weight normalization (WN) applied to the overparameterized…

Machine Learning · Computer Science 2025-10-02 Yudong Wei , Liang Zhang , Bingcong Li , Niao He

Quantization-Aware Regularizers for Deep Neural Networks Compression

Deep Neural Networks reached state-of-the-art performance across numerous domains, but this progress has come at the cost of increasingly large and over-parameterized models, posing serious challenges for deployment on resource-constrained…

Machine Learning · Computer Science 2026-02-04 Dario Malchiodi , Mattia Ferraretto , Marco Frasca

Compression-aware Training of Deep Networks

In recent years, great progress has been made in a variety of application domains thanks to the development of increasingly deeper neural networks. Unfortunately, the huge number of units of these networks makes them expensive both…

Computer Vision and Pattern Recognition · Computer Science 2018-10-12 Jose M. Alvarez , Mathieu Salzmann

Regularization-based Pruning of Irrelevant Weights in Deep Neural Architectures

Deep neural networks exploiting millions of parameters are nowadays the norm in deep learning applications. This is a potential issue because of the great amount of computational resources needed for training, and of the possible loss of…

Computation and Language · Computer Science 2022-10-31 Giovanni Bonetta , Matteo Ribero , Rossella Cancelliere

Weight Concentration Regularization for Improving Pruning Robustness Under High Sparsity

Deep neural networks achieve outstanding performance across vision and language tasks, yet their large parameter counts limit deployment in resource-constrained settings. One-shot pruning reduces model size without retraining, but models…

Machine Learning · Computer Science 2026-05-18 Vincent-Daniel Yun , Junhyuk Jo , Sunwoo Lee

Weight Normalization based Quantization for Deep Neural Network Compression

With the development of deep neural networks, the size of network models becomes larger and larger. Model compression has become an urgent need for deploying these network models to mobile or embedded devices. Model quantization is a…

Machine Learning · Computer Science 2019-07-02 Wen-Pu Cai , Wu-Jun Li

Weight Rescaling: Effective and Robust Regularization for Deep Neural Networks with Batch Normalization

Weight decay is often used to ensure good generalization in the training practice of deep neural networks with batch normalization (BN-DNNs), where some convolution layers are invariant to weight rescaling due to the normalization. In this…

Machine Learning · Computer Science 2022-06-22 Ziquan Liu , Yufei Cui , Jia Wan , Yu Mao , Antoni B. Chan

Stochastic Weight Matrix-based Regularization Methods for Deep Neural Networks

The aim of this paper is to introduce two widely applicable regularization methods based on the direct modification of weight matrices. The first method, Weight Reinitialization, utilizes a simplified Bayesian assumption with partially…

Machine Learning · Computer Science 2022-06-07 Patrik Reizinger , Bálint Gyires-Tóth

A Partial Regularization Method for Network Compression

Deep Neural Networks have achieved remarkable success relying on the developing availability of GPUs and large-scale datasets with increasing network depth and width. However, due to the expensive computation and intensive memory,…

Machine Learning · Computer Science 2020-09-07 E Zhenqian , Gao Weiguo

Function Norms and Regularization in Deep Networks

Deep neural networks (DNNs) have become increasingly important due to their excellent empirical performance on a wide range of problems. However, regularization is generally achieved by indirect means, largely due to the complex set of…

Machine Learning · Computer Science 2018-07-02 Amal Rannen Triki , Maxim Berman , Matthew B. Blaschko

RepQ: Generalizing Quantization-Aware Training for Re-Parametrized Architectures

Existing neural networks are memory-consuming and computationally intensive, making deploying them challenging in resource-constrained environments. However, there are various methods to improve their efficiency. Two such methods are…

Machine Learning · Computer Science 2023-11-10 Anastasiia Prutianova , Alexey Zaytsev , Chung-Kuei Lee , Fengyu Sun , Ivan Koryakovskiy

Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

Normalization techniques have become a basic component in modern convolutional neural networks (ConvNets). In particular, many recent works demonstrate that promoting the orthogonality of the weights helps train deep models and improve…

Computer Vision and Pattern Recognition · Computer Science 2022-01-05 Sheng Liu , Xiao Li , Yuexiang Zhai , Chong You , Zhihui Zhu , Carlos Fernandez-Granda , Qing Qu

Scalable Weight Reparametrization for Efficient Transfer Learning

This paper proposes a novel, efficient transfer learning method, called Scalable Weight Reparametrization (SWR) that is efficient and effective for multiple downstream tasks. Efficient transfer learning involves utilizing a pre-trained…

Machine Learning · Computer Science 2023-02-28 Byeonggeun Kim , Jun-Tae Lee , Seunghan yang , Simyung Chang

Weight-Sharing Regularization

Weight-sharing is ubiquitous in deep learning. Motivated by this, we propose a "weight-sharing regularization" penalty on the weights $w \in \mathbb{R}^d$ of a neural network, defined as $\mathcal{R}(w) = \frac{1}{d - 1}\sum_{i > j}^d |w_i…

Machine Learning · Computer Science 2024-03-12 Mehran Shakerinava , Motahareh Sohrabi , Siamak Ravanbakhsh , Simon Lacoste-Julien

Learning Discrete Weights and Activations Using the Local Reparameterization Trick

In computer vision and machine learning, a crucial challenge is to lower the computation and memory demands for neural network inference. A commonplace solution to address this challenge is through the use of binarization. By binarizing the…

Machine Learning · Computer Science 2023-07-06 Guy Berger , Aviv Navon , Ethan Fetaya

Robust Implicit Regularization via Weight Normalization

Overparameterized models may have many interpolating solutions; implicit regularization refers to the hidden preference of a particular optimization method towards a certain interpolating solution among the many. A by now established line…

Machine Learning · Computer Science 2024-09-18 Hung-Hsu Chou , Holger Rauhut , Rachel Ward

Post-Training Quantization for Re-parameterization via Coarse & Fine Weight Splitting

Although neural networks have made remarkable advancements in various applications, they require substantial computational and memory resources. Network quantization is a powerful technique to compress neural networks, allowing for more…

Computer Vision and Pattern Recognition · Computer Science 2023-12-19 Dawei Yang , Ning He , Xing Hu , Zhihang Yuan , Jiangyong Yu , Chen Xu , Zhe Jiang

Weight Conditioning for Smooth Optimization of Neural Networks

In this article, we introduce a novel normalization technique for neural network weight matrices, which we term weight conditioning. This approach aims to narrow the gap between the smallest and largest singular values of the weight…

Computer Vision and Pattern Recognition · Computer Science 2026-03-16 Hemanth Saratchandran , Thomas X. Wang , Simon Lucey