English
Related papers

Related papers: TaskNorm: Rethinking Batch Normalization for Meta-…

200 papers

Batch Normalization (BatchNorm) is a widely adopted technique that enables faster and more stable training of deep neural networks (DNNs). Despite its pervasiveness, the exact reasons for BatchNorm's effectiveness are still poorly…

Machine Learning · Statistics 2019-04-16 Shibani Santurkar , Dimitris Tsipras , Andrew Ilyas , Aleksander Madry

Batch normalization (BatchNorm) is a popular layer normalization technique used when training deep neural networks. It has been shown to enhance the training speed and accuracy of deep learning models. However, the mechanics by which…

Machine Learning · Computer Science 2025-02-14 Hermanus L. Potgieter , Coenraad Mouton , Marelie H. Davel

Batch normalization was introduced in 2015 to speed up training of deep convolution networks by normalizing the activations across the current batch to have zero mean and unity variance. The results presented here show an interesting aspect…

Computer Vision and Pattern Recognition · Computer Science 2018-02-22 Mohamed Hajaj , Duncan Gillies

Batch Normalization (BatchNorm) is a technique that improves the training of deep neural networks, especially Convolutional Neural Networks (CNN). It has been empirically demonstrated that BatchNorm increases performance, stability, and…

Machine Learning · Computer Science 2023-03-24 Yashna Peerthum , Mark Stamp

Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the…

Machine Learning · Statistics 2016-07-22 Jimmy Lei Ba , Jamie Ryan Kiros , Geoffrey E. Hinton

Batch Normalization (BatchNorm) is an extremely useful component of modern neural network architectures, enabling optimization using higher learning rates and achieving faster convergence. In this paper, we use mean-field theory to…

Machine Learning · Computer Science 2019-03-08 Mingwei Wei , James Stokes , David J Schwab

Inspired by BatchNorm, there has been an explosion of normalization layers in deep learning. Recent works have identified a multitude of beneficial properties in BatchNorm to explain its success. However, given the pursuit of alternative…

Machine Learning · Computer Science 2021-10-27 Ekdeep Singh Lubana , Robert P. Dick , Hidenori Tanaka

Implementation of quantized neural networks on computing hardware leads to considerable speed up and memory saving. However, quantized deep networks are difficult to train and batch~normalization (BatchNorm) layer plays an important role in…

Machine Learning · Computer Science 2020-04-30 Eyyüb Sari , Vahid Partovi Nia

Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates…

Machine Learning · Computer Science 2015-03-03 Sergey Ioffe , Christian Szegedy

Normalization techniques such as Batch Normalization have been applied successfully for training deep neural networks. Yet, despite its apparent empirical benefits, the reasons behind the success of Batch Normalization are mostly…

Machine Learning · Statistics 2018-10-09 Jonas Kohler , Hadi Daneshmand , Aurelien Lucchi , Ming Zhou , Klaus Neymeyr , Thomas Hofmann

BatchNorm is a critical building block in modern convolutional neural networks. Its unique property of operating on "batches" instead of individual samples introduces significantly different behaviors from most other operations in deep…

Computer Vision and Pattern Recognition · Computer Science 2021-05-18 Yuxin Wu , Justin Johnson

The ability to learn new concepts with small amounts of data is a critical aspect of intelligence that has proven challenging for deep learning methods. Meta-learning has emerged as a promising technique for leveraging data from previous…

Machine Learning · Computer Science 2020-04-29 Mingzhang Yin , George Tucker , Mingyuan Zhou , Sergey Levine , Chelsea Finn

Recurrent Neural Networks (RNNs) are powerful models for sequential data that have the potential to learn long-term dependencies. However, they are computationally expensive to train and difficult to parallelize. Recent work has shown that…

Machine Learning · Statistics 2015-10-07 César Laurent , Gabriel Pereyra , Philémon Brakel , Ying Zhang , Yoshua Bengio

A key component of most neural network architectures is the use of normalization layers, such as Batch Normalization. Despite its common use and large utility in optimizing deep architectures, it has been challenging both to generically…

Machine Learning · Computer Science 2020-02-17 Cecilia Summers , Michael J. Dinneen

Training neural networks is an optimization problem, and finding a decent set of parameters through gradient descent can be a difficult task. A host of techniques has been developed to aid this process before and during the training phase.…

Machine Learning · Computer Science 2020-08-19 Divya Gaur , Joachim Folz , Andreas Dengel

Deep multitask networks, in which one neural network produces multiple predictive outputs, can offer better speed and performance than their single-task counterparts but are challenging to train properly. We present a gradient normalization…

Computer Vision and Pattern Recognition · Computer Science 2018-07-16 Zhao Chen , Vijay Badrinarayanan , Chen-Yu Lee , Andrew Rabinovich

The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent years. Contrary to conventional approaches to AI where tasks are solved from scratch using a fixed learning algorithm, meta-learning aims to…

Machine Learning · Computer Science 2020-11-10 Timothy Hospedales , Antreas Antoniou , Paul Micaelli , Amos Storkey

Normalization techniques are crucial for enhancing Transformer models' performance and stability in time series analysis tasks, yet traditional methods like batch and layer normalization often lead to issues such as token shift, attention…

Machine Learning · Computer Science 2024-05-28 Nan Huang , Christian Kümmerle , Xiang Zhang

While modern convolutional neural networks achieve outstanding accuracy on many image classification tasks, they are, compared to humans, much more sensitive to image degradation. Here, we describe a variant of Batch Normalization,…

Computer Vision and Pattern Recognition · Computer Science 2019-03-05 Bojian Yin , Siebren Schaafsma , Henk Corporaal , H. Steven Scholte , Sander M. Bohte

Binary Neural Networks (BNNs) are difficult to train, and suffer from drop of accuracy. It appears in practice that BNNs fail to train in the absence of Batch Normalization (BatchNorm) layer. We find the main role of BatchNorm is to avoid…

Machine Learning · Computer Science 2020-04-30 Eyyüb Sari , Mouloud Belbahri , Vahid Partovi Nia
‹ Prev 1 2 3 10 Next ›