Stochastic Normalizations as Bayesian Learning

Alexander Shekhovtsov; Boris Flach

Date: 2018-11-05 (v1). Subjects: Machine Learning; Neural and Evolutionary Computing; Machine Learning

Authors: Alexander Shekhovtsov, Boris Flach

Abstract

In this work we investigate why Batch Normalization (BN) improves the generalization performance of deep networks. We argue that one major reason, distinguishing it from data-independent normalization methods, is the randomness of batch statistics. This randomness appears in the parameters rather than in the activations and admits an interpretation as a practical form of Bayesian learning. We apply this idea to other (deterministic) normalization techniques that are oblivious to the batch size. We show that their generalization performance can be improved significantly by Bayesian learning of the same form. We obtain test performance comparable to BN and, at the same time, better validation losses suitable for subsequent output uncertainty estimation through an approximate Bayesian posterior.

Keywords

batch normalization; Bayesian deep learning; generalization in machine learning

Cite

@article{arxiv.1811.00639,
  title  = {Stochastic Normalizations as Bayesian Learning},
  author = {Alexander Shekhovtsov and Boris Flach},
  journal= {arXiv preprint arXiv:1811.00639},
  year   = {2018}
}

Comments

Accepted to ACCV 2018
