Related papers: Weight-Sharing Regularization

Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries

Sparse regularization techniques are well-established in machine learning, yet their application in neural networks remains challenging due to the non-differentiability of penalties like the $L_1$ norm, which is incompatible with stochastic…

Machine Learning · Computer Science 2025-02-10 Chris Kolb, Tobias Weber, Bernd Bischl, David Rügamer

Weight Compander: A Simple Weight Reparameterization for Regularization

Regularization is a set of techniques that are used to improve the generalization ability of deep neural networks. In this paper, we introduce weight compander (WC), a novel effective method to improve generalization by reparameterizing…

Machine Learning · Computer Science 2023-06-30 Rinor Cakaj, Jens Mehnert, Bin Yang

Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search

Weight sharing has become a de facto standard in neural architecture search because it enables the search to be done on commodity hardware. However, recent works have empirically shown a ranking disorder between the performance of…

Machine Learning · Computer Science 2021-04-13 Kaicheng Yu, Rene Ranftl, Mathieu Salzmann

Can We Gain More from Orthogonality Regularizations in Training Deep CNNs?

This paper seeks to answer the question: as the (near-) orthogonality of weights is found to be a favorable property for training deep convolutional neural networks, how can we enforce it in more effective and easy-to-use ways? We develop…

Machine Learning · Computer Science 2018-10-23 Nitin Bansal, Xiaohan Chen, Zhangyang Wang

Learning in the Machine: To Share or Not to Share?

Weight-sharing is one of the pillars behind Convolutional Neural Networks and their successes. However, in physical neural systems such as the brain, weight-sharing is implausible. This discrepancy raises the fundamental question of whether…

Machine Learning · Computer Science 2019-10-08 Jordan Ott, Erik Linstead, Nicholas LaHaye, Pierre Baldi

Fiedler Regularization: Learning Neural Networks with Graph Sparsity

We introduce a novel regularization approach for deep learning that incorporates and respects the underlying graphical structure of the neural network. Existing regularization methods often focus on dropping/penalizing weights in a global…

Machine Learning · Statistics 2020-08-18 Edric Tam, David Dunson

Regularization-based Pruning of Irrelevant Weights in Deep Neural Architectures

Deep neural networks exploiting millions of parameters are nowadays the norm in deep learning applications. This is a potential issue because of the great amount of computational resources needed for training, and of the possible loss of…

Computation and Language · Computer Science 2022-10-31 Giovanni Bonetta, Matteo Ribero, Rossella Cancelliere

Function Norms and Regularization in Deep Networks

Deep neural networks (DNNs) have become increasingly important due to their excellent empirical performance on a wide range of problems. However, regularization is generally achieved by indirect means, largely due to the complex set of…

Machine Learning · Computer Science 2018-07-02 Amal Rannen Triki, Maxim Berman, Matthew B. Blaschko

A Comparative Study on Regularization Strategies for Embedding-based Neural Networks

This paper aims to compare different regularization strategies to address a common phenomenon, severe overfitting, in embedding-based neural networks for NLP. We chose two widely studied neural models and tasks as our testbed. We tried…

Computation and Language · Computer Science 2015-08-18 Hao Peng, Lili Mou, Ge Li, Yunchuan Chen, Yangyang Lu, Zhi Jin

Weight Rescaling: Effective and Robust Regularization for Deep Neural Networks with Batch Normalization

Weight decay is often used to ensure good generalization in the training practice of deep neural networks with batch normalization (BN-DNNs), where some convolution layers are invariant to weight rescaling due to the normalization. In this…

Machine Learning · Computer Science 2022-06-22 Ziquan Liu, Yufei Cui, Jia Wan, Yu Mao, Antoni B. Chan

Estimating Implicit Regularization in Deep Learning

Deep learning systems are known to exhibit implicit regularization (alt. implicit bias), favoring simple solutions instead of merely minimizing the loss function. In some cases, we can analytically derive the implicit regularization --…

Machine Learning · Statistics 2026-05-08 Joseph H. Rudoler, Kevin Tan, Giles Hooker, Konrad P. Kording

Neural Pruning via Growing Regularization

Regularization has long been utilized to learn sparsity in deep neural network pruning. However, its role is mainly explored in the small penalty strength regime. In this work, we extend its application to a new scenario where the…

Computer Vision and Pattern Recognition · Computer Science 2021-04-07 Huan Wang, Can Qin, Yulun Zhang, Yun Fu

Gram Regularization for Multi-view 3D Shape Retrieval

How to obtain the desirable representation of a 3D shape is a key challenge in 3D shape retrieval task. Most existing 3D shape retrieval methods focus on capturing shape representation with different neural network architectures, while the…

Computer Vision and Pattern Recognition · Computer Science 2020-11-17 Zhaoqun Li

Quantization-Aware Regularizers for Deep Neural Networks Compression

Deep Neural Networks reached state-of-the-art performance across numerous domains, but this progress has come at the cost of increasingly large and over-parameterized models, posing serious challenges for deployment on resource-constrained…

Machine Learning · Computer Science 2026-02-04 Dario Malchiodi, Mattia Ferraretto, Marco Frasca

Projection Based Weight Normalization for Deep Neural Networks

Optimizing deep neural networks (DNNs) often suffers from the ill-conditioned problem. We observe that the scaling-based weight space symmetry property in rectified nonlinear network will cause this negative effect. Therefore, we propose to…

Machine Learning · Computer Science 2017-10-09 Lei Huang, Xianglong Liu, Bo Lang, Bo Li

Random Weight Factorization Improves the Training of Continuous Neural Representations

Continuous neural representations have recently emerged as a powerful and flexible alternative to classical discretized representations of signals. However, training them to capture fine details in multi-scale signals is difficult and…

Machine Learning · Computer Science 2022-10-06 Sifan Wang, Hanwen Wang, Jacob H. Seidman, Paris Perdikaris

Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks

Neural architecture search has attracted wide attentions in both academia and industry. To accelerate it, researchers proposed weight-sharing methods which first train a super-network to reuse computation among different operators, from…

Machine Learning · Computer Science 2020-12-16 Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu, Qi Tian

Understanding Weight Similarity of Neural Networks via Chain Normalization Rule and Hypothesis-Training-Testing

We present a weight similarity measure method that can quantify the weight similarity of non-convex neural networks. To understand the weight similarity of different trained models, we propose to extract the feature representation from the…

Machine Learning · Computer Science 2022-08-10 Guangcong Wang, Guangrun Wang, Wenqi Liang, Jianhuang Lai

Deep Learning Meets Sparse Regularization: A Signal Processing Perspective

Deep learning has been wildly successful in practice and most state-of-the-art machine learning methods are based on neural networks. Lacking, however, is a rigorous mathematical theory that adequately explains the amazing performance of…

Machine Learning · Statistics 2023-10-03 Rahul Parhi, Robert D. Nowak

Heavy-Tailed Regularization of Weight Matrices in Deep Neural Networks

Unraveling the reasons behind the remarkable success and exceptional generalization capabilities of deep neural networks presents a formidable challenge. Recent insights from random matrix theory, specifically those concerning the spectral…

Machine Learning · Statistics 2023-04-10 Xuanzhe Xiao, Zeng Li, Chuanlong Xie, Fengwei Zhou