English
Related papers

Related papers: Learning Compact Neural Networks with Regularizati…

200 papers

Deep neural networks exploiting millions of parameters are nowadays the norm in deep learning applications. This is a potential issue because of the great amount of computational resources needed for training, and of the possible loss of…

Computation and Language · Computer Science 2022-10-31 Giovanni Bonetta , Matteo Ribero , Rossella Cancelliere

The successful training of deep neural networks requires addressing challenges such as overfitting, numerical instabilities leading to divergence, and increasing variance in the residual stream. A common solution is to apply regularization…

Machine Learning · Computer Science 2025-11-20 Jörg K. H. Franke , Urs Spiegelhalter , Marianna Nezhurina , Jenia Jitsev , Frank Hutter , Michael Hefenbrock

Deep Neural Networks have achieved remarkable success relying on the developing high computation capability of GPUs and large-scale datasets with increasing network depth and width in image recognition, object detection and many other…

Machine Learning · Computer Science 2020-01-08 E Zhenqian , Gao Weiguo

Regularization has long been utilized to learn sparsity in deep neural network pruning. However, its role is mainly explored in the small penalty strength regime. In this work, we extend its application to a new scenario where the…

Computer Vision and Pattern Recognition · Computer Science 2021-04-07 Huan Wang , Can Qin , Yulun Zhang , Yun Fu

Regularization plays an important role in generalization of deep neural networks, which are often prone to overfitting with their numerous parameters. L1 and L2 regularizers are common regularization tools in machine learning with their…

Machine Learning · Computer Science 2019-10-21 Dae Hoon Park , Chiu Man Ho , Yi Chang , Huaqing Zhang

Recent works have shown that on sufficiently over-parametrized neural nets, gradient descent with relatively large initialization optimizes a prediction function in the RKHS of the Neural Tangent Kernel (NTK). This analysis leads to global…

Machine Learning · Statistics 2020-04-28 Colin Wei , Jason D. Lee , Qiang Liu , Tengyu Ma

Deep Neural Networks have achieved remarkable success relying on the developing availability of GPUs and large-scale datasets with increasing network depth and width. However, due to the expensive computation and intensive memory,…

Machine Learning · Computer Science 2020-09-07 E Zhenqian , Gao Weiguo

Reducing network complexity has been a major research focus in recent years with the advent of mobile technology. Convolutional Neural Networks that perform various vision tasks without memory overhaul is the need of the hour. This paper…

Machine Learning · Computer Science 2019-08-30 Mayank Sharma , Suraj Tripathi , Abhimanyu Dubey , Jayadeva , Sai Guruju , Nihal Goalla

Works on implicit regularization have studied gradient trajectories during the optimization process to explain why deep networks favor certain kinds of solutions over others. In deep linear networks, it has been shown that gradient descent…

Machine Learning · Computer Science 2023-06-02 Dan Zhao

Continual learning of deep neural networks is a key requirement for scaling them up to more complex applicative scenarios and for achieving real lifelong learning of these architectures. Previous approaches to the problem have considered…

Machine Learning · Computer Science 2020-06-25 Jary Pomponi , Simone Scardapane , Vincenzo Lomonaco , Aurelio Uncini

While deep learning is successful in a number of applications, it is not yet well understood theoretically. A satisfactory theoretical characterization of deep learning however, is beginning to emerge. It covers the following questions: 1)…

Machine Learning · Computer Science 2019-08-27 Tomaso Poggio , Andrzej Banburski , Qianli Liao

We introduce an approach to training a given compact network. To this end, we leverage over-parameterization, which typically improves both neural network optimization and generalization. Specifically, we propose to expand each linear layer…

Computer Vision and Pattern Recognition · Computer Science 2021-04-15 Shuxuan Guo , Jose M. Alvarez , Mathieu Salzmann

We propose a method for efficiently incorporating constraints into a stochastic gradient Langevin framework for the training of deep neural networks. Constraints allow direct control of the parameter space of the model. Appropriately…

Machine Learning · Computer Science 2021-06-22 Benedict Leimkuhler , Timothée Pouchon , Tiffany Vlaar , Amos Storkey

We consider learning deep neural networks (DNNs) that consist of low-precision weights and activations for efficient inference of fixed-point operations. In training low-precision networks, gradient descent in the backward pass is performed…

Computer Vision and Pattern Recognition · Computer Science 2020-05-26 Yoojin Choi , Mostafa El-Khamy , Jungwon Lee

A recent line of research has shown that gradient-based algorithms with random initialization can converge to the global minima of the training loss for over-parameterized (i.e., sufficiently wide) deep neural networks. However, the…

Machine Learning · Computer Science 2019-06-12 Difan Zou , Quanquan Gu

Deep learning has been wildly successful in practice and most state-of-the-art machine learning methods are based on neural networks. Lacking, however, is a rigorous mathematical theory that adequately explains the amazing performance of…

Machine Learning · Statistics 2023-10-03 Rahul Parhi , Robert D. Nowak

Training deep neural networks is known to require a large number of training samples. However, in many applications only few training samples are available. In this work, we tackle the issue of training neural networks for classification…

Machine Learning · Computer Science 2017-12-25 Soufiane Belharbi , Clément Chatelain , Romain Hérault , Sébastien Adam

Regularization techniques such as $\mathcal{L}_1$ and $\mathcal{L}_2$ regularizers are effective in sparsifying neural networks (NNs). However, to remove a certain neuron or channel in NNs, all weight elements related to that neuron or…

Machine Learning · Computer Science 2023-05-31 Ali Haisam Muhammad Rafid , Adrian Sandu

A recent line of work has shown that an overparametrized neural network can perfectly fit the training data, an otherwise often intractable nonconvex optimization problem. For (fully-connected) shallow networks, in the best case scenario,…

Machine Learning · Computer Science 2019-10-30 Armin Eftekhari , ChaeHwan Song , Volkan Cevher

Convolution Neural Networks, known as ConvNets exceptionally perform well in many complex machine learning tasks. The architecture of ConvNets demands the huge and rich amount of data and involves with a vast number of parameters that leads…

Computer Vision and Pattern Recognition · Computer Science 2017-12-14 Pushparaja Murugan , Shanmugasundaram Durairaj
‹ Prev 1 2 3 10 Next ›