Related papers: Reinforcement-based Simultaneous Algorithm and its…
Many different machine learning algorithms exist; taking into account each algorithm's hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning…
Performance of machine learning algorithms depends critically on identifying a good set of hyperparameters. While recent approaches use Bayesian optimization to adaptively select configurations, we focus on speeding up random search through…
The improving multi-armed bandits problem is a formal model for allocating effort under uncertainty, motivated by scenarios such as investing research effort into new technologies, performing clinical trials, and hyperparameter selection…
The evaluation of hyperparameters, neural architectures, or data augmentation policies becomes a critical model selection problem in advanced deep learning with a large hyperparameter search space. In this paper, we propose an efficient and…
Many real-world functions are defined over both categorical and category-specific continuous variables and thus cannot be optimized by traditional Bayesian optimization (BO) methods. To optimize such functions, we propose a new method that…
Motivated by applications such as online labor markets we consider a variant of the stochastic multi-armed bandit problem where we have a collection of arms representing strategic agents with different performance characteristics. The…
With the increase of machine learning usage by industries and scientific communities in a variety of tasks such as text mining, image recognition and self-driving cars, automatic setting of hyper-parameter in learning algorithms is a key…
We consider the Max $K$-Armed Bandit problem, where a learning agent is faced with several sources (arms) of items (rewards), and interested in finding the best item overall. At each time step the agent chooses an arm, and obtains a random…
Reinforcement learning has recently gained traction as a means to improve combinatorial optimization methods, yet its effectiveness within local search metaheuristics specifically remains comparatively underexamined. In this study, we…
Automated algorithm selection and hyperparameter tuning facilitates the application of machine learning. Traditional multi-armed bandit strategies look to the history of observed rewards to identify the most promising arms for optimizing…
This paper considers the problem of combinatorial multi-armed bandits with semi-bandit feedback and a cardinality constraint on the super-arm size. Existing algorithms for solving this problem typically involve two key sub-routines: (1) a…
Motivated by the task of hyperparameter optimization, we introduce the non-stochastic best-arm identification problem. Within the multi-armed bandit literature, the cumulative regret objective enjoys algorithms and analyses for both the…
Deep Reinforcement Learning has been shown to be very successful in complex games, e.g. Atari or Go. These games have clearly defined rules, and hence allow simulation. In many practical applications, however, interactions with the…
We consider the best arm identification (BAI) problem in the $K-$armed bandit framework with a modification - the agent is allowed to play a subset of arms at each time slot instead of one arm. Consequently, the agent observes the sample…
We consider the problem of best arm identification in a variant of multi-armed bandits called linked bandits. In a single interaction with linked bandits, multiple arms are played sequentially until one of them receives a positive reward.…
This paper considers the multi-armed thresholding bandit problem -- identifying all arms whose expected rewards are above a predefined threshold via as few pulls (or rounds) as possible -- proposed by Locatelli et al. [2016] recently.…
Contextual dueling bandit is used to model the bandit problems, where a learner's goal is to find the best arm for a given context using observed noisy human preference feedback over the selected arms for the past contexts. However,…
An automatic machine learning (AutoML) task is to select the best algorithm and its hyper-parameters simultaneously. Previously, the hyper-parameters of all algorithms are joint as a single search space, which is not only huge but also…
Recently multi-armed bandit problem arises in many real-life scenarios where arms must be sampled in batches, due to limited time the agent can wait for the feedback. Such applications include biological experimentation and online…
This paper explores the application of bandit algorithms in both stochastic and adversarial settings, with a focus on theoretical analysis and practical applications. The study begins by introducing bandit problems, distinguishing between…