Shake-Shake regularization

Machine Learning 2017-05-24 v2 Computer Vision and Pattern Recognition

Abstract

The method introduced in this paper aims to help deep learning practitioners facing an overfitting problem. The idea is to replace, in a multi-branch network, the standard summation of parallel branches with a stochastic affine combination. Applied to 3-branch residual networks, shake-shake regularization improves on the best published single-shot results on CIFAR-10 and CIFAR-100, reaching test errors of 2.86% and 15.85% respectively. Experiments on architectures without skip connections or Batch Normalization show encouraging results and open the door to a large set of applications. Code is available at https://github.com/xgastaldi/shake-shake
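
To make the phrase "stochastic affine combination" concrete, here is a minimal forward-pass sketch in plain Python. It is illustrative only and is not taken from the linked repository. The names `shake_combine` and `residual_block` are hypothetical, and two details are assumptions that the abstract does not state: the mixing coefficient is drawn uniformly from [0, 1) during training, and it is fixed at 0.5 (a plain average) at evaluation time. Any separate treatment of the backward pass is out of scope here.

```python
import random


def shake_combine(branch_a, branch_b, training=True, rng=random):
    """Mix two branch outputs with weights alpha and (1 - alpha), which sum to 1."""
    # Assumption: alpha ~ Uniform[0, 1) while training, 0.5 at evaluation time.
    alpha = rng.random() if training else 0.5
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(branch_a, branch_b)]


def residual_block(x, branch_a_fn, branch_b_fn, training=True, rng=random):
    """Skip connection plus a stochastically mixed pair of residual branches."""
    # A standard multi-branch block would compute x + branch_a_fn(x) + branch_b_fn(x).
    mixed = shake_combine(branch_a_fn(x), branch_b_fn(x), training, rng)
    return [xi + mi for xi, mi in zip(x, mixed)]
```

Because the two weights always sum to 1, the mixed output stays between the two branch outputs element by element, so the random draw perturbs the block without changing its overall scale.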

Cite

@article{arxiv.1705.07485,
  title  = {Shake-Shake regularization},
  author = {Xavier Gastaldi},
  journal= {arXiv preprint arXiv:1705.07485},
  year   = {2017}
}