EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones

Yulin Wang; Yang Yue; Rui Lu; Tianjiao Liu; Zhao Zhong; Shiji Song; Gao Huang

Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning (v3, 2023-08-17)

Abstract

The superior performance of modern deep networks usually comes with a costly training procedure. This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers). Our work is inspired by the inherent learning dynamics of deep networks: we experimentally show that at an earlier training stage, the model mainly learns to recognize some 'easier-to-learn' discriminative patterns within each example, e.g., the lower-frequency components of images and the original information before data augmentation. Driven by this phenomenon, we propose a curriculum in which the model always leverages all the training data at each epoch, but starts by exposing only the 'easier-to-learn' patterns of each example and gradually introduces more difficult patterns as training progresses. To implement this idea, we 1) introduce a cropping operation in the Fourier spectrum of the inputs, which enables the model to learn from only the lower-frequency components efficiently, 2) demonstrate that exposing the features of original images amounts to adopting weaker data augmentation, and 3) integrate 1) and 2) and design a curriculum learning schedule with a greedy-search algorithm. The resulting approach, EfficientTrain, is simple, general, yet surprisingly effective. As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models (e.g., ResNet, ConvNeXt, DeiT, PVT, Swin, and CSWin) by >1.5x on ImageNet-1K/22K without sacrificing accuracy. It is also effective for self-supervised learning (e.g., MAE). Code is available at https://github.com/LeapLabTHU/EfficientTrain.
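The cropping operation in the Fourier spectrum mentioned in step 1) can be illustrated with a short NumPy sketch. This is not the authors' implementation (see the linked repository for that); the function name `low_frequency_crop` and the parameter `bandwidth` are hypothetical names chosen for illustration. The sketch assumes an image stored as an `(H, W)` or `(H, W, C)` array, keeps only the central `bandwidth x bandwidth` window of the centered 2D spectrum, and inverts it, so the output is a smaller image that carries only the lower-frequency components.

```python
import numpy as np


def low_frequency_crop(image: np.ndarray, bandwidth: int) -> np.ndarray:
    """Illustrative low-pass crop (hypothetical helper, not the paper's code).

    Keeps the central bandwidth x bandwidth window of the centered 2D
    spectrum and returns the corresponding bandwidth x bandwidth image.
    """
    height, width = image.shape[:2]
    if not 0 < bandwidth <= min(height, width):
        raise ValueError("bandwidth must be in (0, min(height, width)]")

    # 2D FFT over the spatial axes, shifted so the zero frequency sits
    # at index (height // 2, width // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(image, axes=(0, 1)), axes=(0, 1))

    # Central window; the zero frequency lands at index bandwidth // 2.
    top = height // 2 - bandwidth // 2
    left = width // 2 - bandwidth // 2
    cropped = spectrum[top:top + bandwidth, left:left + bandwidth]

    # Undo the shift and invert. The real part is taken because an
    # even-sized window is not perfectly conjugate-symmetric.
    small = np.fft.ifft2(np.fft.ifftshift(cropped, axes=(0, 1)), axes=(0, 1)).real

    # The forward FFT is unnormalized, so rescale to preserve mean intensity.
    return small * (bandwidth * bandwidth) / (height * width)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((224, 224, 3))
    low_res = low_frequency_crop(image, 160)
    print(low_res.shape)
```

Because the output has fewer pixels than the input, a forward and backward pass on it costs less, which is one plausible reading of why the abstract says the model can learn from the lower-frequency components "efficiently". A curriculum would then raise `bandwidth` over the course of training; the actual schedule in the paper is found by greedy search and is not reproduced here.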

Keywords

neural network training; vision transformer; deep learning for image classification

Cite

@article{arxiv.2211.09703,
  title  = {EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones},
  author = {Yulin Wang and Yang Yue and Rui Lu and Tianjiao Liu and Zhao Zhong and Shiji Song and Gao Huang},
  journal= {arXiv preprint arXiv:2211.09703},
  year   = {2023}
}

Comments

ICCV 2023
