Related papers: Understanding and Improving Group Normalization

Towards Understanding Regularization in Batch Normalization

Batch Normalization (BN) improves both convergence and generalization in training neural networks. This work understands these phenomena theoretically. We analyze BN by using a basic block of neural networks, consisting of a kernel layer, a…

Machine Learning · Computer Science 2019-04-25 Ping Luo , Xinjiang Wang , Wenqi Shao , Zhanglin Peng

Group Normalization

Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train. However, normalizing along the batch dimension introduces problems --- BN's error increases rapidly when the batch…

Computer Vision and Pattern Recognition · Computer Science 2018-06-13 Yuxin Wu , Kaiming He

Batch Group Normalization

Deep Convolutional Neural Networks (DCNNs) are hard and time-consuming to train. Normalization is one of the effective solutions. Among previous normalization methods, Batch Normalization (BN) performs well at medium and large batch sizes…

Machine Learning · Computer Science 2020-12-10 Xiao-Yun Zhou , Jiacheng Sun , Nanyang Ye , Xu Lan , Qijun Luo , Bo-Lin Lai , Pedro Esperanca , Guang-Zhong Yang , Zhenguo Li

Understanding Batch Normalization

Batch normalization (BN) is a technique to normalize activations in intermediate layers of deep neural networks. Its tendency to improve accuracy and speed up training have established BN as a favorite technique in deep learning. Yet,…

Machine Learning · Computer Science 2018-12-03 Johan Bjorck , Carla Gomes , Bart Selman , Kilian Q. Weinberger

Exploring the Efficacy of Group-Normalization in Deep Learning Models for Alzheimer's Disease Classification

Batch Normalization is an important approach to advancing deep learning since it allows multiple networks to train simultaneously. A problem arises when normalizing along the batch dimension because B.N.'s error increases significantly as…

Computer Vision and Pattern Recognition · Computer Science 2024-04-02 Gousia Habib , Ishfaq Ahmed Malik , Jameel Ahmad , Imtiaz Ahmed , Shaima Qureshi

Layer Normalization

Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the…

Machine Learning · Statistics 2016-07-22 Jimmy Lei Ba , Jamie Ryan Kiros , Geoffrey E. Hinton

Making Batch Normalization Great in Federated Deep Learning

Batch Normalization (BN) is widely used in {centralized} deep learning to improve convergence and generalization. However, in {federated} learning (FL) with decentralized data, prior work has observed that training with BN could hinder…

Machine Learning · Computer Science 2024-04-01 Jike Zhong , Hong-You Chen , Wei-Lun Chao

Group Whitening: Balancing Learning Efficiency and Representational Capacity

Batch normalization (BN) is an important technique commonly incorporated into deep learning models to perform standardization within mini-batches. The merits of BN in improving a model's learning efficiency can be further amplified by…

Machine Learning · Computer Science 2021-04-07 Lei Huang , Yi Zhou , Li Liu , Fan Zhu , Ling Shao

Training Deep Neural Networks Without Batch Normalization

Training neural networks is an optimization problem, and finding a decent set of parameters through gradient descent can be a difficult task. A host of techniques has been developed to aid this process before and during the training phase.…

Machine Learning · Computer Science 2020-08-19 Divya Gaur , Joachim Folz , Andreas Dengel

Understanding the Failure of Batch Normalization for Transformers in NLP

Batch Normalization (BN) is a core and prevalent technique in accelerating the training of deep neural networks and improving the generalization on Computer Vision (CV) tasks. However, it fails to defend its position in Natural Language…

Computation and Language · Computer Science 2022-10-14 Jiaxi Wang , Ji Wu , Lei Huang

An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation

Batch normalization has been widely used to improve optimization in deep neural networks. While the uncertainty in batch statistics can act as a regularizer, using these dataset statistics specific to the training set impairs generalization…

Computer Vision and Pattern Recognition · Computer Science 2019-08-02 Vincent Michalski , Vikram Voleti , Samira Ebrahimi Kahou , Anthony Ortiz , Pascal Vincent , Chris Pal , Doina Precup

Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes

Normalization techniques have only recently begun to be exploited in supervised learning tasks. Batch normalization exploits mini-batch statistics to normalize the activations. This was shown to speed up training and result in better…

Machine Learning · Computer Science 2017-03-08 Mengye Ren , Renjie Liao , Raquel Urtasun , Fabian H. Sinz , Richard S. Zemel

Restructuring Batch Normalization to Accelerate CNN Training

Batch Normalization (BN) has become a core design block of modern Convolutional Neural Networks (CNNs). A typical modern CNN has a large number of BN layers in its lean and deep architecture. BN requires mean and variance calculations over…

Computer Vision and Pattern Recognition · Computer Science 2019-03-04 Wonkyung Jung , Daejin Jung , and Byeongho Kim , Sunjung Lee , Wonjong Rhee , Jung Ho Ahn

Batch Normalized Recurrent Neural Networks

Recurrent Neural Networks (RNNs) are powerful models for sequential data that have the potential to learn long-term dependencies. However, they are computationally expensive to train and difficult to parallelize. Recent work has shown that…

Machine Learning · Statistics 2015-10-07 César Laurent , Gabriel Pereyra , Philémon Brakel , Ying Zhang , Yoshua Bengio

Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization

Batch Normalization (BN) is a commonly used technique to accelerate and stabilize training of deep neural networks. Despite its empirical success, a full theoretical understanding of BN is yet to be developed. In this work, we analyze BN…

Machine Learning · Computer Science 2022-03-22 Tolga Ergen , Arda Sahiner , Batu Ozturkler , John Pauly , Morteza Mardani , Mert Pilanci

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates…

Machine Learning · Computer Science 2015-03-03 Sergey Ioffe , Christian Szegedy

Counterbalancing Teacher: Regularizing Batch Normalized Models for Robustness

Batch normalization (BN) is a ubiquitous technique for training deep neural networks that accelerates their convergence to reach higher accuracy. However, we demonstrate that BN comes with a fundamental drawback: it incentivizes the model…

Machine Learning · Computer Science 2022-07-05 Saeid Asgari Taghanaki , Ali Gholami , Fereshte Khani , Kristy Choi , Linh Tran , Ran Zhang , Aliasghar Khani

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction

Normalization layers (e.g., Batch Normalization, Layer Normalization) were introduced to help with optimization difficulties in very deep nets, but they clearly also help generalization, even in not-so-deep nets. Motivated by the long-held…

Machine Learning · Computer Science 2023-01-18 Kaifeng Lyu , Zhiyuan Li , Sanjeev Arora

Batch Normalization Preconditioning for Neural Network Training

Batch normalization (BN) is a popular and ubiquitous method in deep learning that has been shown to decrease training time and improve generalization performance of neural networks. Despite its success, BN is not theoretically well…

Machine Learning · Computer Science 2022-01-21 Susanna Lange , Kyle Helfrich , Qiang Ye

Towards Deeper Graph Neural Networks with Differentiable Group Normalization

Graph neural networks (GNNs), which learn the representation of a node by aggregating its neighbors, have become an effective computational tool in downstream applications. Over-smoothing is one of the key issues which limit the performance…

Machine Learning · Computer Science 2020-06-15 Kaixiong Zhou , Xiao Huang , Yuening Li , Daochen Zha , Rui Chen , Xia Hu