Related papers: Optimal anytime regret with two experts

200 papers

To deal with changing environments, a new performance measure -- adaptive regret, defined as the maximum static regret over any interval, was proposed in online learning. Under the setting of online convex optimization, several algorithms…

Machine Learning · Computer Science 2025-08-04 Lijun Zhang, Wenhao Yang, Guanghui Wang, Wei Jiang, Zhi-Hua Zhou

We study the regret of optimal strategies for online convex optimization games. Using von Neumann's minimax theorem, we show that the optimal regret in this adversarial setting is closely related to the behavior of the empirical…

Machine Learning · Computer Science 2009-04-01 Jacob Abernethy, Alekh Agarwal, Peter L. Bartlett, Alexander Rakhlin

Consider a sequence of bits where we are trying to predict the next bit from the previous bits. Assume we are allowed to say 'predict 0' or 'predict 1', and our payoff is +1 if the prediction is correct and -1 otherwise. We will say that at…

Data Structures and Algorithms · Computer Science 2012-10-11 Michael Kapralov, Rina Panigrahy

We design differentially private algorithms for the problem of prediction with expert advice under dynamic regret, also known as tracking the best expert. Our work addresses three natural types of adversaries, stochastic with shifting…

Machine Learning · Computer Science 2025-03-14 Aadirupa Saha, Vinod Raman, Hilal Asi

A new algorithm for regret minimization in online convex optimization is described. The regret of the algorithm after $T$ time periods is $O(\sqrt{T \log T})$ - which is the minimum possible up to a logarithmic term. In addition, the new…

Machine Learning · Computer Science 2023-07-24 Elad Hazan, Nimrod Megiddo

We introduce an online convex optimization algorithm which utilizes projected subgradient descent with optimal adaptive learning rates. Our method provides second-order minimax-optimal dynamic regret guarantee (i.e. dependent on the sum of…

Optimization and Control · Mathematics 2022-09-14 Hakan Gokcesu, Suleyman S. Kozat

We study online learning settings in which experts act strategically to maximize their influence on the learning algorithm's predictions by potentially misreporting their beliefs about a sequence of binary events. Our goal is twofold.…

Machine Learning · Computer Science 2020-07-02 Rupert Freeman, David M. Pennock, Chara Podimata, Jennifer Wortman Vaughan

This paper addresses an online convex optimization problem where the cost function at each step depends on a history of past decisions (i.e., memory), and the decision maker has access to limited predictions of future cost values within a…

Optimization and Control · Mathematics 2025-12-29 Zhengmiao Wang, Zhi-Wei Liu, Ming Chi, Xiaoling Wang, Housheng Su, Lintao Ye

In practical applications, data is used to make decisions in two steps: estimation and optimization. First, a machine learning model estimates parameters for a structural model relating decisions to outcomes. Second, a decision is chosen to…

Optimization and Control · Mathematics 2022-10-28 Samuel Tan, Peter I. Frazier

In this paper we study the non-stationary stochastic optimization question with bandit feedback and dynamic regret measures. The seminal work of Besbes et al. (2015) shows that, when the aggregated function changes are known a priori, a simple…

Machine Learning · Statistics 2022-10-12 Yining Wang

We consider the fixed-budget best arm identification problem with rewards following normal distributions. In this problem, the forecaster is given $K$ arms (or treatments) and $T$ time steps. The forecaster attempts to find the arm with the…

Machine Learning · Statistics 2024-04-16 Junpei Komiyama

This study presents an effective global optimization technique designed for multivariate functions that are Hölder continuous. Unlike traditional methods that construct lower bounding proxy functions, this algorithm employs a…

Machine Learning · Computer Science 2023-03-28 Kaan Gokcesu, Hakan Gokcesu

In adversarial multi-armed bandits, two performance measures are commonly used: static regret, which compares the learner to the best fixed arm, and dynamic regret, which compares it to the best sequence of arms. While optimal algorithms…

Machine Learning · Computer Science 2026-02-18 Jian Qian, Chen-Yu Wei

This paper considers a variant of the online paging problem, where the online algorithm has access to multiple predictors, each producing a sequence of predictions for the page arrival times. The predictors may have occasional prediction…

Data Structures and Algorithms · Computer Science 2020-11-20 Yuval Emek, Shay Kutten, Yangguang Shi

We introduce a novel extension of the canonical multi-armed bandit problem that incorporates an additional strategic innovation: abstention. In this enhanced framework, the agent is not only tasked with selecting an arm at each time step,…

Machine Learning · Computer Science 2026-03-24 Junwen Yang, Tianyuan Jin, Vincent Y. F. Tan

We consider a family of learning strategies for online optimization problems that evolve in continuous time and we show that they lead to no regret. From a more traditional, discrete-time viewpoint, this continuous-time approach allows us…

Optimization and Control · Mathematics 2014-02-28 Joon Kwon, Panayotis Mertikopoulos

We consider online algorithms under both the competitive ratio criteria and the regret minimization one. Our main goal is to build a unified methodology that would be able to guarantee both criteria simultaneously. For a general class of…

Machine Learning · Computer Science 2019-04-09 Amit Daniely, Yishay Mansour

We design and analyze minimax-optimal algorithms for online linear optimization games where the player's choice is unconstrained. The player strives to minimize regret, the difference between his loss and the loss of a post-hoc benchmark…

Machine Learning · Computer Science 2013-02-12 H. Brendan McMahan

To deal with changing environments, a new performance measure -- adaptive regret, defined as the maximum static regret over any interval, was proposed in online learning. Under the setting of online convex optimization, several algorithms…

Machine Learning · Computer Science 2021-05-17 Lijun Zhang, Guanghui Wang, Wei-Wei Tu, Zhi-Hua Zhou

In this paper, we provide a generic anytime lower bounding procedure for minmax regret optimization problems. We show that the lower bound obtained is always at least as accurate as the lower bound recently proposed by Chassein and Goerigk…

Data Structures and Algorithms · Computer Science 2017-07-12 Hugo Gilbert, Olivier Spanjaard