Related papers: Optimal anytime regret with two experts

Optimal discovery with probabilistic expert advice: finite time analysis and macroscopic optimality

We consider an original problem that arises from the issue of security analysis of a power system and that we name optimal discovery with probabilistic expert advice. We address it with an algorithm based on the optimistic paradigm and on…

Machine Learning · Computer Science · 2013-04-02 · Sebastien Bubeck, Damien Ernst, Aurelien Garivier

Best of Both Worlds: Regret Minimization versus Minimax Play

In this paper, we investigate the existence of online learning algorithms with bandit feedback that simultaneously guarantee $O(1)$ regret compared to a given comparator strategy, and $\tilde{O}(\sqrt{T})$ regret compared to any fixed…

Machine Learning · Computer Science · 2025-06-05 · Adrian Müller, Jon Schneider, Stratis Skoulakis, Luca Viano, Volkan Cevher

Combinatorial Optimization Problems with Balanced Regret

For decision making under uncertainty, min-max regret has been established as a popular methodology to find robust solutions. In this approach, we compare the performance of our solution against the best possible performance had we known…

Optimization and Control · Mathematics · 2021-11-25 · Marc Goerigk, Michael Hartisch

Ordering Selection Operators Using the Minmax Regret Rule

Optimising queries in real-world situations under imperfect conditions is still a problem that has not been fully solved. We consider finding the optimal order in which to execute a given set of selection operators under partial ignorance…

Databases · Computer Science · 2015-07-30 · Khaled H. Alyoubi, Sven Helmer, Peter T. Wood

Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime

We consider online learning problems in the realizable setting, where there is a zero-loss solution, and propose new Differentially Private (DP) algorithms that obtain near-optimal regret bounds. For the problem of online prediction from…

Machine Learning · Computer Science · 2023-03-01 · Hilal Asi, Vitaly Feldman, Tomer Koren, Kunal Talwar

Online Learning with Feedback Graphs: The True Shape of Regret

Sequential learning with feedback graphs is a natural extension of the multi-armed bandit problem where the problem is equipped with an underlying graph structure that provides additional information: playing an action reveals the losses…

Machine Learning · Computer Science · 2023-06-06 · Tomáš Kocák, Alexandra Carpentier

A Tight Lower Bound for Non-stochastic Multi-armed Bandits with Expert Advice

We determine the minimax optimal expected regret in the classic non-stochastic multi-armed bandit with expert advice problem, by proving a lower bound that matches the upper bound of Kale (2014). The two bounds determine the minimax optimal…

Machine Learning · Computer Science · 2025-11-04 · Zachary Chase, Shinji Ito, Idan Mehalel

Tracking the Best Expert in Non-stationary Stochastic Environments

We study the dynamic regret of the multi-armed bandit and experts problems in non-stationary stochastic environments. We introduce a new parameter $\Lambda$, which measures the total statistical variance of the loss distributions over $T$ rounds…

Machine Learning · Computer Science · 2019-06-24 · Chen-Yu Wei, Yi-Te Hong, Chi-Jen Lu

Regret Bounds for Batched Bandits

We present simple and efficient algorithms for the batched stochastic multi-armed bandit and batched stochastic linear bandit problems. We prove bounds for their expected regrets that improve over the best-known regret bounds for any number…

Data Structures and Algorithms · Computer Science · 2020-02-19 · Hossein Esfandiari, Amin Karbasi, Abbas Mehrabian, Vahab Mirrokni

Logarithmic Regret and Polynomial Scaling in Online Multi-step-ahead Prediction

This letter studies the problem of online multi-step-ahead prediction for unknown linear stochastic systems. Using conditional distribution theory, we derive an optimal parameterization of the prediction policy as a linear function of…

Machine Learning · Computer Science · 2025-11-18 · Jiachen Qian, Yang Zheng

Learning The Best Expert Efficiently

We consider online learning problems where the aim is to achieve regret which is efficient in the sense that it is the same order as the lowest regret amongst K experts. This is a substantially stronger requirement than achieving…

Machine Learning · Computer Science · 2019-11-12 · Daron Anderson, Douglas J. Leith

Best of many worlds: Robust model selection for online supervised learning

We introduce algorithms for online, full-information prediction that are competitive with contextual tree experts of unknown complexity, in both probabilistic and adversarial settings. We show that by incorporating a probabilistic framework…

Machine Learning · Computer Science · 2018-05-23 · Vidya Muthukumar, Mitas Ray, Anant Sahai, Peter L. Bartlett

Regret Balancing for Bandit and RL Model Selection

We consider model selection in stochastic bandit and reinforcement learning problems. Given a set of base learning algorithms, an effective model selection strategy adapts to the best learning algorithm in an online fashion. We show that by…

Machine Learning · Computer Science · 2020-06-11 · Yasin Abbasi-Yadkori, Aldo Pacchiano, My Phan

Regret Minimization in Scalar, Static, Non-linear Optimization Problems

We study the problem of determining an effective exploration strategy in static and non-linear optimization problems, which depend on an unknown scalar parameter to be learned from online collected noisy data. An optimal trade-off between…

Optimization and Control · Mathematics · 2024-09-13 · Ying Wang, Mirko Pasquini, Kévin Colin, Håkan Hjalmarsson

Stochastic Online Learning with Feedback Graphs: Finite-Time and Asymptotic Optimality

We revisit the problem of stochastic online learning with feedback graphs, with the goal of devising algorithms that are optimal, up to constants, both asymptotically and in finite time. We show that, surprisingly, the notion of optimal…

Machine Learning · Computer Science · 2022-06-22 · Teodor V. Marinov, Mehryar Mohri, Julian Zimmert

Asymptotic Instance-Optimal Algorithms for Interactive Decision Making

Past research on interactive decision making problems (bandits, reinforcement learning, etc.) mostly focuses on the minimax regret that measures the algorithm's performance on the hardest instance. However, an ideal algorithm should adapt…

Machine Learning · Computer Science · 2023-06-13 · Kefan Dong, Tengyu Ma

Efficient Tracking of Large Classes of Experts

In the framework of prediction of individual sequences, sequential prediction methods are to be constructed that perform nearly as well as the best expert from a given class. We consider prediction strategies that compete with the class of…

Machine Learning · Computer Science · 2012-07-12 · András Gyorgy, Tamás Linder, Gábor Lugosi

No-Regret Algorithms for Unconstrained Online Convex Optimization

Some of the most compelling applications of online convex optimization, including online prediction and classification, are unconstrained: the natural feasible set is $\mathbb{R}^n$. Existing algorithms fail to achieve sub-linear regret in this…

Machine Learning · Computer Science · 2012-11-13 · Matthew Streeter, H. Brendan McMahan

Internal Regret with Partial Monitoring. Calibration-Based Optimal Algorithms

We provide consistent random algorithms for sequential decision under partial monitoring, i.e. when the decision maker does not observe the outcomes but receives instead random feedback signals. Those algorithms have no internal regret in…

Machine Learning · Computer Science · 2011-02-23 · Vianney Perchet

Adapting to Non-stationarity with Growing Expert Ensembles

When dealing with time series with complex non-stationarities, low retrospective regret on individual realizations is a more appropriate goal than low prospective risk in expectation. Online learning algorithms provide powerful guarantees…

Machine Learning · Statistics · 2011-06-30 · Cosma Rohilla Shalizi, Abigail Z. Jacobs, Kristina Lisa Klinkner, Aaron Clauset