Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale

Yang Li; Yu Shen; Huaijun Jiang; Wentao Zhang; Jixiang Li; Ji Liu; Ce Zhang; Bin Cui

Machine Learning 2022-01-19 v1

Abstract

The ever-growing demand for and complexity of machine learning are putting pressure on hyper-parameter tuning systems: while the evaluation cost of models continues to increase, the scalability of state-of-the-art systems is becoming a crucial bottleneck. In this paper, motivated by our experience deploying hyper-parameter tuning in a real-world production application and by the limitations of existing systems, we propose Hyper-Tune, an efficient and robust distributed hyper-parameter tuning framework. Compared with existing systems, Hyper-Tune features multiple system optimizations, including (1) automatic resource allocation, (2) asynchronous scheduling, and (3) a multi-fidelity optimizer. We conduct extensive evaluations on benchmark datasets and a large-scale real-world dataset in production. Empirically, with the aid of these optimizations, Hyper-Tune outperforms competitive hyper-parameter tuning systems in a wide range of scenarios, including XGBoost, CNN, RNN, and some architectural hyper-parameters for neural networks. Compared with the state-of-the-art BOHB and A-BOHB, Hyper-Tune achieves up to 11.2x and 5.1x speedups, respectively.

Keywords

algorithm selection; process optimization; large language model training

Cite

@article{arxiv.2201.06834,
  title  = {Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale},
  author = {Yang Li and Yu Shen and Huaijun Jiang and Wentao Zhang and Jixiang Li and Ji Liu and Ce Zhang and Bin Cui},
  journal= {arXiv preprint arXiv:2201.06834},
  year   = {2022}
}

Comments

48th International Conference on Very Large Data Bases (VLDB'22, Scalable Data Science track)
