Related papers: When Does Re-initialization Work?

200 papers

Recent results suggest that reinitializing a subset of the parameters of a neural network during training can improve generalization, particularly for small training sets. We study the impact of different reinitialization methods in several…

Machine Learning · Computer Science 2021-09-02 Ibrahim Alabdulmohsin , Hartmut Maennel , Daniel Keysers

Regularization is typically understood as improving generalization by altering the landscape of local extrema to which the model eventually converges. Deep neural networks (DNNs), however, challenge this view: We show that removing…

Machine Learning · Computer Science 2019-06-03 Aditya Golatkar , Alessandro Achille , Stefano Soatto

Despite huge successes on a wide range of tasks, neural networks are known to sometimes struggle to generalise to unseen data. Many approaches have been proposed over the years to promote the generalisation ability of neural networks,…

Machine Learning · Computer Science 2026-02-02 Christiaan P. Opperman , Anna S. Bosman , Katherine M. Malan

Training neural networks is an optimization problem, and finding a decent set of parameters through gradient descent can be a difficult task. A host of techniques has been developed to aid this process before and during the training phase.…

Machine Learning · Computer Science 2020-08-19 Divya Gaur , Joachim Folz , Andreas Dengel

Normalization techniques have only recently begun to be exploited in supervised learning tasks. Batch normalization exploits mini-batch statistics to normalize the activations. This was shown to speed up training and result in better…

Machine Learning · Computer Science 2017-03-08 Mengye Ren , Renjie Liao , Raquel Urtasun , Fabian H. Sinz , Richard S. Zemel

Over-parameterized deep neural networks trained by simple first-order methods are known to be able to fit any labeling of data. Such over-fitting ability hinders generalization when mislabeled training examples are present. On the other…

Machine Learning · Computer Science 2020-10-06 Wei Hu , Zhiyuan Li , Dingli Yu

Overfitting is one of the most critical challenges in deep neural networks, and there are various types of regularization methods to improve generalization performance. Injecting noises to hidden units during training, e.g., dropout, is…

Machine Learning · Computer Science 2017-11-10 Hyeonwoo Noh , Tackgeun You , Jonghwan Mun , Bohyung Han

Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the…

Machine Learning · Statistics 2016-07-22 Jimmy Lei Ba , Jamie Ryan Kiros , Geoffrey E. Hinton

Deep Reinforcement Learning (Deep RL) has been receiving increasingly more attention thanks to its encouraging performance on a variety of control tasks. Yet, conventional regularization techniques in training neural networks (e.g., $L_2$…

Machine Learning · Computer Science 2021-11-30 Zhuang Liu , Xuanlin Li , Bingyi Kang , Trevor Darrell

A significant advance in accelerating neural network training has been the development of normalization methods, permitting the training of deep models both faster and with better accuracy. These advances come with practical challenges: for…

Machine Learning · Computer Science 2019-03-05 Jasmine Collins , Johannes Balle , Jonathon Shlens

Normalization techniques are essential for accelerating the training and improving the generalization of deep neural networks (DNNs), and have successfully been used in various applications. This paper reviews and comments on the past,…

Machine Learning · Computer Science 2020-09-29 Lei Huang , Jie Qin , Yi Zhou , Fan Zhu , Li Liu , Ling Shao

In recent years, research on learning with noisy labels has focused on devising novel algorithms that can achieve robustness to noisy training labels while generalizing to clean data. These algorithms often incorporate sophisticated…

Machine Learning · Computer Science 2023-07-12 Hui Kang , Sheng Liu , Huaxi Huang , Jun Yu , Bo Han , Dadong Wang , Tongliang Liu

Deep convolutional neural networks are known to be unstable during training at high learning rate unless normalization techniques are employed. Normalizing weights or activations allows the use of higher learning rates, resulting in faster…

Machine Learning · Computer Science 2019-12-02 Brendan Ruff , Taylor Beck , Joscha Bach

Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature, with several works highlighting diverse benefits such as improving loss landscape conditioning and combatting…

Machine Learning · Computer Science 2024-07-03 Clare Lyle , Zeyu Zheng , Khimya Khetarpal , James Martens , Hado van Hasselt , Razvan Pascanu , Will Dabney

It has been observed in practice that applying pruning-at-initialization methods to neural networks and training the sparsified networks can not only retain the testing performance of the original dense models, but also sometimes even…

Machine Learning · Computer Science 2023-01-31 Hongru Yang , Yingbin Liang , Xiaojie Guo , Lingfei Wu , Zhangyang Wang

Residual networks (ResNet) and weight normalization play an important role in various deep learning applications. However, parameter initialization strategies have not been studied previously for weight normalized networks and, in practice,…

Machine Learning · Statistics 2019-10-31 Devansh Arpit , Victor Campos , Yoshua Bengio

In recent years, a variety of normalization methods have been proposed to help train neural networks, such as batch normalization (BN), layer normalization (LN), weight normalization (WN), group normalization (GN), etc. However,…

Machine Learning · Computer Science 2020-06-17 Jiacheng Sun , Xiangyong Cao , Hanwen Liang , Weiran Huang , Zewei Chen , Zhenguo Li

Advancements in parallel processing have led to a surge in multilayer perceptrons' (MLP) applications and deep learning in the past decades. Recurrent Neural Networks (RNNs) give additional representational power to feedforward MLPs by…

Machine Learning · Statistics 2014-10-22 Saahil Ognawala , Justin Bayer

Optimal parameter initialization remains a crucial problem for neural network training. A poor weight initialization may take longer to train and/or converge to sub-optimal solutions. Here, we propose a method of weight re-initialization by…

Machine Learning · Computer Science 2021-04-21 Norman Mu , Zhewei Yao , Amir Gholami , Kurt Keutzer , Michael Mahoney

Standard practice in training neural networks involves initializing the weights in an independent fashion. The results of recent work suggest that feature "diversity" at initialization plays an important role in training the network.…

Machine Learning · Computer Science 2020-07-06 Yaniv Blumenfeld , Dar Gilboa , Daniel Soudry