Related papers: An Asymptotically Optimal Multi-Armed Bandit Algor…

Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization

Performance of machine learning algorithms depends critically on identifying a good set of hyperparameters. While recent approaches use Bayesian optimization to adaptively select configurations, we focus on speeding up random search through…

Machine Learning · Computer Science 2018-06-20 Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, Ameet Talwalkar

Sub-sampling for Efficient Non-Parametric Bandit Exploration

In this paper we propose the first multi-armed bandit algorithm based on re-sampling that achieves asymptotically optimal regret simultaneously for different families of arms (namely Bernoulli, Gaussian and Poisson distributions). Unlike…

Machine Learning · Statistics 2020-10-28 Dorian Baudry, Emilie Kaufmann, Odalric-Ambrym Maillard

BOHB: Robust and Efficient Hyperparameter Optimization at Scale

Modern deep learning methods are very sensitive to many hyperparameters, and, due to the long training times of state-of-the-art models, vanilla Bayesian hyperparameter optimization is typically computationally infeasible. On the other…

Machine Learning · Computer Science 2018-07-06 Stefan Falkner, Aaron Klein, Frank Hutter

Bayesian Optimization for Categorical and Category-Specific Continuous Inputs

Many real-world functions are defined over both categorical and category-specific continuous variables and thus cannot be optimized by traditional Bayesian optimization (BO) methods. To optimize such functions, we propose a new method that…

Machine Learning · Computer Science 2019-12-02 Dang Nguyen, Sunil Gupta, Santu Rana, Alistair Shilton, Svetha Venkatesh

Stochastic Optimization with Bandit Sampling

Many stochastic optimization algorithms work by estimating the gradient of the cost function on the fly by sampling datapoints uniformly at random from a training set. However, the estimator might have a large variance, which inadvertently…

Machine Learning · Computer Science 2017-08-10 Farnood Salehi, L. Elisa Celis, Patrick Thiran

From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits

The stochastic multi-arm bandit problem has been extensively studied under standard assumptions on the arm's distribution (e.g., bounded with known support, exponential family, etc.). These assumptions are suitable for many real-world problems…

Machine Learning · Statistics 2021-11-19 Dorian Baudry, Patrick Saux, Odalric-Ambrym Maillard

Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits

In stochastic contextual bandits, an agent sequentially makes actions from a time-dependent action set based on past experience to minimize the cumulative regret. Like many other machine learning algorithms, the performance of bandits…

Machine Learning · Computer Science 2024-04-09 Yue Kang, Cho-Jui Hsieh, Thomas C. M. Lee

On Limited-Memory Subsampling Strategies for Bandits

There has been a recent surge of interest in nonparametric bandit algorithms based on subsampling. One drawback however of these approaches is the additional complexity required by random subsampling and the storage of the full history of…

Artificial Intelligence · Computer Science 2021-06-22 Dorian Baudry, Yoan Russac, Olivier Cappé

Non-stochastic Best Arm Identification and Hyperparameter Optimization

Motivated by the task of hyperparameter optimization, we introduce the non-stochastic best-arm identification problem. Within the multi-armed bandit literature, the cumulative regret objective enjoys algorithms and analyses for both the…

Machine Learning · Computer Science 2015-03-02 Kevin Jamieson, Ameet Talwalkar

HyperArm Bandit Optimization: A Novel approach to Hyperparameter Optimization and an Analysis of Bandit Algorithms in Stochastic and Adversarial Settings

This paper explores the application of bandit algorithms in both stochastic and adversarial settings, with a focus on theoretical analysis and practical applications. The study begins by introducing bandit problems, distinguishing between…

Machine Learning · Computer Science 2025-03-14 Samih Karroum, Saad Mazhar

Reinforcement-based Simultaneous Algorithm and its Hyperparameters Selection

Many algorithms for data analysis exist, especially for classification problems. To solve a data analysis problem, a proper algorithm should be chosen, and also its hyperparameters should be selected. In this paper, we present a new method…

Machine Learning · Computer Science 2016-11-08 Valeria Efimova, Andrey Filchenkov, Anatoly Shalyto

Neural Thompson Sampling

Thompson Sampling (TS) is one of the most effective algorithms for solving contextual multi-armed bandit problems. In this paper, we propose a new algorithm, called Neural Thompson Sampling, which adapts deep neural networks for both…

Machine Learning · Computer Science 2022-01-03 Weitong Zhang, Dongruo Zhou, Lihong Li, Quanquan Gu

A Bayesian Sampling Approach to Exploration in Reinforcement Learning

We present a modular approach to reinforcement learning that uses a Bayesian representation of the uncertainty over models. The approach, BOSS (Best of Sampled Set), drives exploration by sampling multiple models from the posterior and…

Machine Learning · Computer Science 2012-05-14 John Asmuth, Lihong Li, Michael L. Littman, Ali Nouri, David Wingate

Multi-armed Bandits with Cost Subsidy

In this paper, we consider a novel variant of the multi-armed bandit (MAB) problem, MAB with cost subsidy, which models many real-life applications where the learning agent has to pay to select an arm and is concerned about optimizing…

Machine Learning · Computer Science 2021-03-16 Deeksha Sinha, Karthik Abinav Sankararaman, Abbas Kazerouni, Vashist Avadhanula

Fast, Precise Thompson Sampling for Bayesian Optimization

Thompson sampling (TS) has optimal regret and excellent empirical performance in multi-armed bandit problems. Yet, in Bayesian optimization, TS underperforms popular acquisition functions (e.g., EI, UCB). TS samples arms according to the…

Machine Learning · Statistics 2024-12-02 David Sweet

Neural Bandit Based Optimal LLM Selection for a Pipeline of Subtasks

As large language models (LLMs) become increasingly popular, there is a growing need to predict which out of a set of LLMs will yield a successful answer to a given query at low cost. This problem promises to become even more relevant as…

Computation and Language · Computer Science 2026-04-23 Baran Atalar, Eddie Zhang, Carlee Joe-Wong

Asymptotically Optimal Information-Directed Sampling

We introduce a simple and efficient algorithm for stochastic linear bandits with finitely many actions that is asymptotically optimal and (nearly) worst-case optimal in finite time. The approach is based on the frequentist…

Machine Learning · Statistics 2021-07-05 Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvári

A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms

In this paper we propose a general methodology to derive regret bounds for randomized multi-armed bandit algorithms. It consists in checking a set of sufficient conditions on the sampling probability of each arm and on the family of…

Machine Learning · Computer Science 2024-11-14 Dorian Baudry, Kazuya Suzuki, Junya Honda

High dimensional Bayesian Optimization Algorithm for Complex System in Time Series

At present, high-dimensional global optimization problems with time-series models have received much attention from engineering fields. Since it was proposed, Bayesian optimization has quickly become a popular and promising approach for…

Machine Learning · Computer Science 2021-08-06 Yuyang Chen, Kaiming Bi, Chih-Hang J. Wu, David Ben-Arieh, Ashesh Sinha

Adaptive Sample Sharing for Multi Agent Linear Bandits

The multi-agent linear bandit setting is a well-known setting for which designing efficient collaboration between agents remains challenging. This paper studies the impact of data sharing among agents on regret minimization. Unlike most…

Machine Learning · Computer Science 2025-05-28 Hamza Cherkaoui, Merwan Barlier, Igor Colin