In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization

Machine Learning 2024-08-14 v3

Abstract

As the computational cost of deep learning grows, automated hyperparameter optimization (HPO) methods that rely heavily on black-box Bayesian optimization (BO) face limitations. Freeze-thaw BO offers a promising grey-box alternative that strategically allocates scarce resources incrementally across configurations. However, this approach requires frequent surrogate model updates, which pose challenges for existing methods: they must retrain or fine-tune their neural network surrogates online, which introduces overhead, instability, and hyper-hyperparameters. In this work, we propose FT-PFN, a novel surrogate for freeze-thaw style BO. FT-PFN is a prior-data fitted network (PFN) that leverages the in-context learning ability of transformers to perform Bayesian learning curve extrapolation efficiently and reliably in a single forward pass. Our empirical analysis across three benchmark suites shows that the predictions made by FT-PFN are more accurate and 10-100 times faster than those of the deep Gaussian process and deep ensemble surrogates used in previous work. Furthermore, we show that, when combined with our novel acquisition mechanism (MFPI-random), the resulting in-context freeze-thaw BO method (ifBO) yields new state-of-the-art performance on the same three families of deep learning HPO benchmarks considered in prior work.
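
The incremental allocation idea behind freeze-thaw BO can be illustrated with a small sketch. It is not the method of this paper: the surrogate below is a naive one-step extrapolation rather than FT-PFN, the selection rule is a greedy stand-in rather than MFPI-random, and the names `toy_loss`, `optimistic_score`, and `freeze_thaw` are hypothetical. It only shows the loop structure, assuming each unit of budget buys one more training step for a single configuration, after which that configuration is frozen again.

```python
import math


def toy_loss(config, step):
    """Hypothetical learning curve: loss decays from 1.0 toward a floor."""
    floor, rate = config
    return floor + (1.0 - floor) * (1.0 - rate) ** step


def optimistic_score(curve):
    """Placeholder surrogate (an assumption, not FT-PFN).

    Scores a partial curve by the negated loss expected after one more
    step, assuming the last observed improvement simply repeats.
    """
    if not curve:
        return math.inf          # never-started configurations are tried first
    if len(curve) == 1:
        return -curve[-1]        # no trend yet: use the observed loss
    last_gain = curve[-2] - curve[-1]
    return -(curve[-1] - last_gain)


def freeze_thaw(configs, budget):
    """Spend `budget` training steps, one at a time, across configurations."""
    curves = [[] for _ in configs]
    for _ in range(budget):
        # Thaw the configuration the surrogate currently rates highest.
        i = max(range(len(configs)), key=lambda j: optimistic_score(curves[j]))
        # Train it for one more step, record the loss, and freeze it again.
        curves[i].append(toy_loss(configs[i], len(curves[i]) + 1))
    return curves


# Three hypothetical configurations given as (loss floor, convergence rate).
configs = [(0.30, 0.5), (0.10, 0.2), (0.20, 0.8)]
curves = freeze_thaw(configs, budget=12)
steps_per_config = [len(c) for c in curves]
```

Because the surrogate is re-queried after every single training step, its prediction cost and stability dominate the loop, which is the bottleneck the abstract attributes to surrogates that must be retrained or fine-tuned online.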

Cite

@article{arxiv.2404.16795,
  title  = {In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization},
  author = {Herilalaina Rakotoarison and Steven Adriaensen and Neeratyoy Mallik and Samir Garibov and Edward Bergman and Frank Hutter},
  journal= {arXiv preprint arXiv:2404.16795},
  year   = {2024}
}

Comments

Published at the 41st International Conference on Machine Learning (ICML), Vienna, Austria