Related papers: Learning Compact Neural Networks with Regularizati…

Regularization-based Pruning of Irrelevant Weights in Deep Neural Architectures

Deep neural networks exploiting millions of parameters are nowadays the norm in deep learning applications. This is a potential issue because of the great amount of computational resources needed for training, and of the possible loss of…

Computation and Language · Computer Science 2022-10-31 Giovanni Bonetta, Matteo Ribero, Rossella Cancelliere

Learning in Compact Spaces with Approximately Normalized Transformer

The successful training of deep neural networks requires addressing challenges such as overfitting, numerical instabilities leading to divergence, and increasing variance in the residual stream. A common solution is to apply regularization…

Machine Learning · Computer Science 2025-11-20 Jörg K. H. Franke, Urs Spiegelhalter, Marianna Nezhurina, Jenia Jitsev, Frank Hutter, Michael Hefenbrock

An Improving Framework of regularization for Network Compression

Deep Neural Networks have achieved remarkable success relying on the developing high computation capability of GPUs and large-scale datasets with increasing network depth and width in image recognition, object detection and many other…

Machine Learning · Computer Science 2020-01-08 E Zhenqian, Gao Weiguo

Neural Pruning via Growing Regularization

Regularization has long been utilized to learn sparsity in deep neural network pruning. However, its role is mainly explored in the small penalty strength regime. In this work, we extend its application to a new scenario where the…

Computer Vision and Pattern Recognition · Computer Science 2021-04-07 Huan Wang, Can Qin, Yulun Zhang, Yun Fu

Gradient-Coherent Strong Regularization for Deep Neural Networks

Regularization plays an important role in generalization of deep neural networks, which are often prone to overfitting with their numerous parameters. L1 and L2 regularizers are common regularization tools in machine learning with their…

Machine Learning · Computer Science 2019-10-21 Dae Hoon Park, Chiu Man Ho, Yi Chang, Huaqing Zhang

Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel

Recent works have shown that on sufficiently over-parametrized neural nets, gradient descent with relatively large initialization optimizes a prediction function in the RKHS of the Neural Tangent Kernel (NTK). This analysis leads to global…

Machine Learning · Statistics 2020-04-28 Colin Wei, Jason D. Lee, Qiang Liu, Tengyu Ma

A Partial Regularization Method for Network Compression

Deep Neural Networks have achieved remarkable success relying on the developing availability of GPUs and large-scale datasets with increasing network depth and width. However, due to the expensive computation and intensive memory,…

Machine Learning · Computer Science 2020-09-07 E Zhenqian, Gao Weiguo

Smaller Models, Better Generalization

Reducing network complexity has been a major research focus in recent years with the advent of mobile technology. Convolutional Neural Networks that perform various vision tasks without memory overhaul is the need of the hour. This paper…

Machine Learning · Computer Science 2019-08-30 Mayank Sharma, Suraj Tripathi, Abhimanyu Dubey, Jayadeva, Sai Guruju, Nihal Goalla

Combining Explicit and Implicit Regularization for Efficient Learning in Deep Networks

Works on implicit regularization have studied gradient trajectories during the optimization process to explain why deep networks favor certain kinds of solutions over others. In deep linear networks, it has been shown that gradient descent…

Machine Learning · Computer Science 2023-06-02 Dan Zhao

Efficient Continual Learning in Neural Networks with Embedding Regularization

Continual learning of deep neural networks is a key requirement for scaling them up to more complex applicative scenarios and for achieving real lifelong learning of these architectures. Previous approaches to the problem have considered…

Machine Learning · Computer Science 2020-06-25 Jary Pomponi, Simone Scardapane, Vincenzo Lomonaco, Aurelio Uncini

Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization

While deep learning is successful in a number of applications, it is not yet well understood theoretically. A satisfactory theoretical characterization of deep learning however, is beginning to emerge. It covers the following questions: 1)…

Machine Learning · Computer Science 2019-08-27 Tomaso Poggio, Andrzej Banburski, Qianli Liao

ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks

We introduce an approach to training a given compact network. To this end, we leverage over-parameterization, which typically improves both neural network optimization and generalization. Specifically, we propose to expand each linear layer…

Computer Vision and Pattern Recognition · Computer Science 2021-04-15 Shuxuan Guo, Jose M. Alvarez, Mathieu Salzmann

Constraint-Based Regularization of Neural Networks

We propose a method for efficiently incorporating constraints into a stochastic gradient Langevin framework for the training of deep neural networks. Constraints allow direct control of the parameter space of the model. Appropriately…

Machine Learning · Computer Science 2021-06-22 Benedict Leimkuhler, Timothée Pouchon, Tiffany Vlaar, Amos Storkey

Learning Sparse Low-Precision Neural Networks With Learnable Regularization

We consider learning deep neural networks (DNNs) that consist of low-precision weights and activations for efficient inference of fixed-point operations. In training low-precision networks, gradient descent in the backward pass is performed…

Computer Vision and Pattern Recognition · Computer Science 2020-05-26 Yoojin Choi, Mostafa El-Khamy, Jungwon Lee

An Improved Analysis of Training Over-parameterized Deep Neural Networks

A recent line of research has shown that gradient-based algorithms with random initialization can converge to the global minima of the training loss for over-parameterized (i.e., sufficiently wide) deep neural networks. However, the…

Machine Learning · Computer Science 2019-06-12 Difan Zou, Quanquan Gu

Deep Learning Meets Sparse Regularization: A Signal Processing Perspective

Deep learning has been wildly successful in practice and most state-of-the-art machine learning methods are based on neural networks. Lacking, however, is a rigorous mathematical theory that adequately explains the amazing performance of…

Machine Learning · Statistics 2023-10-03 Rahul Parhi, Robert D. Nowak

Neural Networks Regularization Through Class-wise Invariant Representation Learning

Training deep neural networks is known to require a large number of training samples. However, in many applications only few training samples are available. In this work, we tackle the issue of training neural networks for classification…

Machine Learning · Computer Science 2017-12-25 Soufiane Belharbi, Clément Chatelain, Romain Hérault, Sébastien Adam

Neural Network Reduction with Guided Regularizers

Regularization techniques such as $\mathcal{L}_1$ and $\mathcal{L}_2$ regularizers are effective in sparsifying neural networks (NNs). However, to remove a certain neuron or channel in NNs, all weight elements related to that neuron or…

Machine Learning · Computer Science 2023-05-31 Ali Haisam Muhammad Rafid, Adrian Sandu

Nearly Minimal Over-Parametrization of Shallow Neural Networks

A recent line of work has shown that an overparametrized neural network can perfectly fit the training data, an otherwise often intractable nonconvex optimization problem. For (fully-connected) shallow networks, in the best case scenario,…

Machine Learning · Computer Science 2019-10-30 Armin Eftekhari, ChaeHwan Song, Volkan Cevher

Regularization and Optimization strategies in Deep Convolutional Neural Network

Convolution Neural Networks, known as ConvNets exceptionally perform well in many complex machine learning tasks. The architecture of ConvNets demands the huge and rich amount of data and involves with a vast number of parameters that leads…

Computer Vision and Pattern Recognition · Computer Science 2017-12-14 Pushparaja Murugan, Shanmugasundaram Durairaj