TaskNorm: Rethinking Batch Normalization for Meta-Learning

John Bronskill; Jonathan Gordon; James Requeima; Sebastian Nowozin; Richard E. Turner

Machine Learning, 2020-07-14, v2

Abstract

Modern meta-learning approaches for image classification rely on increasingly deep networks to achieve state-of-the-art performance, making batch normalization an essential component of meta-learning pipelines. However, the hierarchical nature of the meta-learning setting presents several challenges that can render conventional batch normalization ineffective, giving rise to the need to rethink normalization in this setting. We evaluate a range of approaches to batch normalization for meta-learning scenarios, and develop a novel approach that we call TaskNorm. Experiments on fourteen datasets demonstrate that the choice of batch normalization has a dramatic effect on both classification accuracy and training time for both gradient-based and gradient-free meta-learning approaches. Importantly, TaskNorm is found to consistently improve performance. Finally, we provide a set of best practices for normalization that will allow fair comparison of meta-learning algorithms.

Keywords

multi-task learning, deep learning, neural networks

Cite

@article{arxiv.2003.03284,
  title  = {TaskNorm: Rethinking Batch Normalization for Meta-Learning},
  author = {John Bronskill and Jonathan Gordon and James Requeima and Sebastian Nowozin and Richard E. Turner},
  journal= {arXiv preprint arXiv:2003.03284},
  year   = {2020}
}
