Stage-based Hyper-parameter Optimization for Deep Learning

Ahnjae Shin; Dong-Jin Shin; Sungwoo Cho; Do Yoon Kim; Eunji Jeong; Gyeong-In Yu; Byung-Gon Chun

Machine Learning, 2019-11-26, v1

Abstract

As deep learning techniques advance, hyper-parameter optimization has become a major workload in deep learning clusters. Although hyper-parameter optimization is crucial for training deep learning models to high performance, executing such a computation-heavy workload efficiently remains a challenge. We observe that many trials issued by existing hyper-parameter optimization algorithms share common hyper-parameter sequence prefixes, which implies redundant computation: the same hyper-parameter sequence is trained multiple times. We propose a stage-based execution strategy for efficient execution of hyper-parameter optimization algorithms. Our strategy removes this redundancy by splitting the hyper-parameter sequences of trials into homogeneous stages and merging their common prefixes into a tree of stages. Our preliminary experimental results show that applying stage-based execution to hyper-parameter optimization algorithms outperforms the original trial-based method, reducing required GPU-hours by up to 6.60 times and end-to-end training time by up to 4.13 times.
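
The prefix-merging idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a hypothetical encoding in which each trial is a list of stages, each stage is a (hyper-parameter value, epochs) pair, and trials have already been split at common boundaries so that shared prefixes compare equal. The names `StageNode`, `build_stage_tree`, `tree_epochs`, and `trial_epochs` are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, Sequence, Tuple

# Assumed encoding: a stage is (hyper-parameter value, number of epochs),
# and a trial is a sequence of such stages.
Stage = Tuple[Hashable, int]


@dataclass
class StageNode:
    """One node of the stage tree; children are keyed by the next stage."""
    children: Dict[Stage, "StageNode"] = field(default_factory=dict)


def build_stage_tree(trials: Sequence[Sequence[Stage]]) -> StageNode:
    """Merge trials into a prefix tree so each shared prefix appears once."""
    root = StageNode()
    for trial in trials:
        node = root
        for stage in trial:
            node = node.children.setdefault(stage, StageNode())
    return root


def tree_epochs(node: StageNode) -> int:
    """Epochs trained when every stage in the tree runs exactly once."""
    return sum(stage[1] + tree_epochs(child)
               for stage, child in node.children.items())


def trial_epochs(trials: Sequence[Sequence[Stage]]) -> int:
    """Epochs trained when every trial runs independently from scratch."""
    return sum(epochs for trial in trials for _, epochs in trial)


# Three hypothetical trials over a learning-rate schedule; all share the
# first stage, and two of them share the first two stages.
trials = [
    [(0.1, 10), (0.01, 10)],
    [(0.1, 10), (0.001, 10)],
    [(0.1, 10), (0.01, 10), (0.001, 5)],
]
tree = build_stage_tree(trials)
print(trial_epochs(trials))  # 65 epochs with trial-based execution
print(tree_epochs(tree))     # 35 epochs with merged stages
```

In this toy example the merged tree trains 35 epochs instead of 65. A real system would also need to checkpoint the model at each stage boundary so that child stages can resume from the shared prefix, which this sketch does not model.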

Keywords

algorithm selection; hyperparameter optimization; parallel algorithm

Cite

@article{arxiv.1911.10504,
  title  = {Stage-based Hyper-parameter Optimization for Deep Learning},
  author = {Ahnjae Shin and Dong-Jin Shin and Sungwoo Cho and Do Yoon Kim and Eunji Jeong and Gyeong-In Yu and Byung-Gon Chun},
  journal= {arXiv preprint arXiv:1911.10504},
  year   = {2019}
}
