Related papers: Master Algorithms for Active Experts Problems base…

Defensive Universal Learning with Experts

This paper shows how universal learning can be achieved with expert advice. To this aim, we specify an experts algorithm with the following characteristics: (a) it uses only feedback from the actions actually chosen (bandit setup), (b) it…

Machine Learning · Computer Science 2007-05-23 Jan Poland, Marcus Hutter

Dying Experts: Efficient Algorithms with Optimal Regret Bounds

We study a variant of decision-theoretic online learning in which the set of experts that are available to Learner can shrink over time. This is a restricted version of the well-studied sleeping experts problem, itself a generalization of…

Machine Learning · Computer Science 2019-10-31 Hamid Shayestehmanesh, Sajjad Azami, Nishant A. Mehta

Streaming Algorithms for Learning with Experts: Deterministic Versus Robust

In the online learning with experts problem, an algorithm must make a prediction about an outcome on each of $T$ days (or times), given a set of $n$ experts who make predictions on each day (or time). The algorithm is given feedback on the…

Data Structures and Algorithms · Computer Science 2023-03-06 David P. Woodruff, Fred Zhang, Samson Zhou

Adaptive Decision-Making with Constraints and Dependent Losses: Performance Guarantees and Applications to Online and Nonlinear Identification

We consider adaptive decision-making problems where an agent optimizes a cumulative performance objective by repeatedly choosing among a finite set of options. Compared to the classical prediction-with-expert-advice set-up, we consider…

Machine Learning · Computer Science 2023-04-10 Michael Muehlebach

Active Ranking of Experts Based on their Performances in Many Tasks

We consider the problem of ranking n experts based on their performances on d tasks. We make a monotonicity assumption stating that for each pair of experts, one outperforms the other on all tasks. We consider the sequential setting where…

Machine Learning · Statistics 2023-06-06 El Mehdi Saad, Nicolas Verzelen, Alexandra Carpentier

Online Learning with Automata-based Expert Sequences

We consider a general framework of online learning with expert advice where regret is defined with respect to sequences of experts accepted by a weighted automaton. Our framework covers several problems previously studied, including…

Machine Learning · Computer Science 2017-10-24 Mehryar Mohri, Scott Yang

Prediction with expert evaluators' advice

We introduce a new protocol for prediction with expert advice in which each expert evaluates the learner's and his own performance using a loss function that may change over time and may be different from the loss functions used by the…

Machine Learning · Computer Science 2009-03-23 Alexey Chernov, Vladimir Vovk

Bayesian Decision Making around Experts

Complex learning agents are increasingly deployed alongside existing experts, such as human operators or previously trained agents. However, it remains unclear how learners should optimally incorporate certain forms of expert data, which…

Machine Learning · Computer Science 2025-10-10 Daniel Jarne Ornia, Joel Dyer, Nicholas Bishop, Anisoara Calinescu, Michael Wooldridge

Bandits with Abstention under Expert Advice

We study the classic problem of prediction with expert advice under bandit feedback. Our model assumes that one action, corresponding to the learner's abstention from play, has no reward or loss on every trial. We propose the CBA algorithm,…

Machine Learning · Computer Science 2024-11-13 Stephen Pasteris, Alberto Rumi, Maximilian Thiessen, Shota Saito, Atsushi Miyauchi, Fabio Vitale, Mark Herbster

Imitation Learning by Reinforcement Learning

Imitation learning algorithms learn a policy from demonstrations of expert behavior. We show that, for deterministic experts, imitation learning can be done by reduction to reinforcement learning with a stationary reward. Our theoretical…

Machine Learning · Statistics 2022-03-16 Kamil Ciosek

Algorithm Selection as a Bandit Problem with Unbounded Losses

Algorithm selection is typically based on models of algorithm performance, learned during a separate offline training sequence, which can be prohibitively expensive. In recent work, we adopted an online approach, in which a performance…

Artificial Intelligence · Computer Science 2013-01-31 Matteo Gagliolo, Juergen Schmidhuber

Memory Bounds for the Experts Problem

Online learning with expert advice is a fundamental problem of sequential prediction. In this problem, the algorithm has access to a set of $n$ "experts" who make predictions on each day. The goal on each day is to process these…

Data Structures and Algorithms · Computer Science 2022-04-22 Vaidehi Srinivas, David P. Woodruff, Ziyu Xu, Samson Zhou

Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents

Motivated by applications such as online labor markets, we consider a variant of the stochastic multi-armed bandit problem where we have a collection of arms representing strategic agents with different performance characteristics. The…

Computer Science and Game Theory · Computer Science 2025-03-11 Seyed A. Esmaeili, Suho Shin, Aleksandrs Slivkins

Extreme Bandits using Robust Statistics

We consider a multi-armed bandit problem motivated by situations where only the extreme values, as opposed to expected values in the classical bandit setting, are of interest. We propose distribution-free algorithms using robust statistics…

Machine Learning · Statistics 2021-09-10 Sujay Bhatt, Ping Li, Gennady Samorodnitsky

Can We Learn to Beat the Best Stock

A novel algorithm for actively trading stocks is presented. While traditional expert advice and "universal" algorithms (as well as standard technical trading heuristics) attempt to predict winners or trends, our approach relies on…

Artificial Intelligence · Computer Science 2011-07-04 A. Borodin, R. El-Yaniv, V. Gogan

A Multi-Armed Bandit Approach for Online Expert Selection in Markov Decision Processes

We formulate a multi-armed bandit (MAB) approach to choosing expert policies online in Markov decision processes (MDPs). Given a set of expert policies trained on a state and action space, the goal is to maximize the cumulative reward of…

Systems and Control · Computer Science 2017-07-19 Eric Mazumdar, Roy Dong, Vicenç Rúbies Royo, Claire Tomlin, S. Shankar Sastry

A Generalized Online Algorithm for Translation and Scale Invariant Prediction with Expert Advice

In this work, we aim to create a completely online algorithmic framework for prediction with expert advice that is translation-free and scale-free of the expert losses. Our goal is to create a generalized algorithm that is suitable for use…

Machine Learning · Computer Science 2020-09-10 Kaan Gokcesu, Hakan Gokcesu

Generalized Translation and Scale Invariant Online Algorithm for Adversarial Multi-Armed Bandits

We study the adversarial multi-armed bandit problem and create a completely online algorithmic framework that is invariant under arbitrary translations and scales of the arm losses. We study the expected performance of our algorithm against…

Machine Learning · Computer Science 2021-09-21 Kaan Gokcesu, Hakan Gokcesu

Online Bandit Learning against an Adaptive Adversary: from Regret to Policy Regret

Online learning algorithms are designed to learn even when their input is generated by an adversary. The widely-accepted formal definition of an online algorithm's ability to learn is the game-theoretic notion of regret. We argue that the…

Machine Learning · Computer Science 2012-07-03 Raman Arora, Ofer Dekel, Ambuj Tewari

Data Dependent Regret Guarantees Against General Comparators for Full or Bandit Feedback

We study the adversarial online learning problem and create a completely online algorithmic framework that has data dependent regret guarantees in both full expert feedback and bandit feedback settings. We study the expected performance of…

Machine Learning · Computer Science 2023-03-14 Kaan Gokcesu, Hakan Gokcesu