Related papers: Optimal anytime regret with two experts

200 papers

We consider the fundamental problem of prediction with expert advice where the experts are "optimizable": there is a black-box optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point…

Machine Learning · Computer Science 2016-01-28 Elad Hazan, Tomer Koren

We consider the problem setting of prediction with expert advice with possibly heavy-tailed losses, i.e. the only assumption on the losses is an upper bound on their second moments, denoted by $\theta$. We develop adaptive algorithms that…

Machine Learning · Computer Science 2026-01-09 Antoine Moulin, Emmanuel Esposito, Dirk van der Hoeven

We consider an online two-stage stochastic optimization with long-term constraints over a finite horizon of $T$ periods. At each period, we take the first-stage action, observe a model parameter realization and then take the second-stage…

Machine Learning · Computer Science 2024-01-03 Piao Hu, Jiashuo Jiang, Guodong Lyu, Hao Su

We revisit the problem of online learning with sleeping experts/bandits: in each time step, only a subset of the actions are available for the algorithm to choose from (and learn about). The work of Kleinberg et al. (2010) showed that there…

Machine Learning · Computer Science 2021-04-27 Ehsan Emamjomeh-Zadeh, Chen-Yu Wei, Haipeng Luo, David Kempe

We consider the classical problem of sequential resource allocation where a decision maker must repeatedly divide a budget between several resources, each with diminishing returns. This can be recast as a specific stochastic optimization…

Machine Learning · Statistics 2020-01-17 Xavier Fontaine, Shie Mannor, Vianney Perchet

The framework of online learning with memory naturally captures learning problems with temporal constraints, and was previously studied for the experts setting. In this work we extend the notion of learning with memory to the general Online…

Machine Learning · Computer Science 2014-06-11 Oren Anava, Elad Hazan, Shie Mannor

In this work, we consider the problem of regret minimization in adaptive minimum variance and linear quadratic control problems. Regret minimization has been extensively studied in the literature for both types of adaptive control problems.…

Optimization and Control · Mathematics 2022-11-16 Kévin Colin, Håkan Hjalmarsson, Xavier Bombois

We study online learning problems in which a decision maker has to take a sequence of decisions subject to $m$ long-term constraints. The goal of the decision maker is to maximize their total reward, while at the same time achieving small…

Machine Learning · Computer Science 2022-09-16 Matteo Castiglioni, Andrea Celli, Alberto Marchesi, Giulia Romano, Nicola Gatti

We consider the classical problem of sequential probability assignment under logarithmic loss while competing against an arbitrary, potentially nonparametric class of experts. We obtain tight bounds on the minimax regret via a new approach…

Machine Learning · Computer Science 2020-08-04 Blair Bilodeau, Dylan J. Foster, Daniel M. Roy

We consider algorithms for "smoothed online convex optimization" problems, a variant of the class of online convex optimization problems that is strongly related to metrical task systems. Prior literature on these problems has focused on…

Data Structures and Algorithms · Computer Science 2015-08-18 Lachlan L. H. Andrew, Siddharth Barman, Katrina Ligett, Minghong Lin, Adam Meyerson, Alan Roytman, Adam Wierman

We address online linear optimization problems when the possible actions of the decision maker are represented by binary vectors. The regret of the decision maker is the difference between her realized loss and the best loss she would have…

Machine Learning · Computer Science 2013-04-02 Jean-Yves Audibert, Sébastien Bubeck, Gábor Lugosi

We consider the setting of iterative learning control, or model-based policy learning in the presence of uncertain, time-varying dynamics. In this setting, we propose a new performance metric, planning regret, which replaces the standard…

Machine Learning · Computer Science 2021-03-01 Naman Agarwal, Elad Hazan, Anirudha Majumdar, Karan Singh

We consider the problem of online learning in Linear Quadratic Control systems whose state transition and state-action transition matrices $A$ and $B$ may be initially unknown. We devise an online learning algorithm and provide guarantees…

Machine Learning · Computer Science 2021-09-30 Yassir Jedra, Alexandre Proutiere

In this paper a class of combinatorial optimization problems is discussed. It is assumed that a feasible solution can be constructed in two stages. In the first stage the objective function costs are known while in the second stage they are…

Data Structures and Algorithms · Computer Science 2020-05-22 Marc Goerigk, Adam Kasperski, Pawel Zielinski

In one view of the classical game of prediction with expert advice with binary outcomes, in each round, each expert maintains an adversarially chosen belief and honestly reports this belief. We consider a recently introduced, strategic…

Machine Learning · Computer Science 2024-04-09 Ali Mortazavi, Junhao Lin, Nishant A. Mehta

We aim to design strategies for sequential decision making that adjust to the difficulty of the learning problem. We study this question both in the setting of prediction with expert advice, and for more general combinatorial decision…

Machine Learning · Computer Science 2015-03-02 Wouter M. Koolen, Tim van Erven

The Lipschitz multi-armed bandit (MAB) problem generalizes the classical multi-armed bandit problem by assuming one is given side information consisting of a priori upper bounds on the difference in expected payoff between certain pairs of…

Data Structures and Algorithms · Computer Science 2009-11-09 Robert Kleinberg, Aleksandrs Slivkins

Online reinforcement learning in infinite-horizon Markov decision processes (MDPs) remains less theoretically and algorithmically developed than its episodic counterpart, with many algorithms suffering from high "burn-in" costs and…

Machine Learning · Computer Science 2026-03-26 Guy Zamir, Matthew Zurek, Yudong Chen

We revisit the classic regret-minimization problem in the stochastic multi-armed bandit setting when the arm-distributions are allowed to be heavy-tailed. Regret minimization has been well studied in simpler settings of either bounded…

Machine Learning · Computer Science 2021-02-09 Shubhada Agrawal, Sandeep Juneja, Wouter M. Koolen

In this paper, we study a variant of the framework of online learning using expert advice with limited/bandit feedback. We consider each expert as a learning entity, seeking to more accurately reflect certain real-world applications. In…

Machine Learning · Computer Science 2017-02-21 Adish Singla, Hamed Hassani, Andreas Krause