Related papers: Optimal anytime regret with two experts

200 papers

This work studies the problem of learning episodic Markov Decision Processes with known transition and bandit feedback. We develop the first algorithm with a ``best-of-both-worlds'' guarantee: it achieves $\mathcal{O}(\log T)$ regret when…

Machine Learning · Computer Science 2020-11-03 Tiancheng Jin, Haipeng Luo

In the experts problem, on each of $T$ days, an agent needs to follow the advice of one of $n$ ``experts''. After each day, the loss associated with each expert's advice is revealed. A fundamental result in learning theory says that the…

Data Structures and Algorithms · Computer Science 2023-03-10 Binghui Peng, Aviad Rubinstein

We consider estimation and control in linear time-varying dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing causal estimators and controllers which…

Machine Learning · Computer Science 2021-06-24 Gautam Goel, Babak Hassibi

In a typical optimization problem, the task is to pick one of a number of options with the lowest cost or the highest value. In practice, these cost/value quantities often come through processes such as measurement or machine learning,…

Data Structures and Algorithms · Computer Science 2022-07-20 Mohammad Mahdian, Jieming Mao, Kangning Wang

We introduce a generic template for developing regret minimization algorithms in the Stochastic Shortest Path (SSP) model, which achieves minimax optimal regret as long as certain properties are ensured. The key to our analysis is a new…

Machine Learning · Computer Science 2021-11-11 Liyu Chen, Mehdi Jafarnia-Jahromi, Rahul Jain, Haipeng Luo

We study how we can adapt a predictor to a non-stationary environment with advice from multiple experts. We study the problem under complete feedback when the best expert changes over time from a decision theoretic point of view. Proposed…

Machine Learning · Computer Science 2017-08-08 Vishnu Raj, Sheetal Kalyani

In this paper, we study the behavior of the Hedge algorithm in the online stochastic setting. We prove that anytime Hedge with decreasing learning rate, which is one of the simplest algorithms for the problem of prediction with expert…

Machine Learning · Statistics 2019-07-10 Jaouad Mourtada, Stéphane Gaïffas

We study the problem of nonstochastic bandits with expert advice, extending the setting from finitely many experts to any countably infinite set: A learner aims to maximize the total reward by taking actions sequentially based on bandit…

Machine Learning · Computer Science 2021-03-29 X. Flora Meng, Tuhin Sarkar, Munther A. Dahleh

In online convex optimization, the player aims to minimize regret, or the difference between her loss and that of the best fixed decision in hindsight over the entire repeated game. Algorithms that minimize (standard) regret may converge to…

Machine Learning · Computer Science 2023-02-14 Zhou Lu, Elad Hazan

We develop an algorithmic framework for solving convex optimization problems using no-regret game dynamics. By converting the problem of minimizing a convex function into an auxiliary problem of solving a min-max game in a sequential…

Machine Learning · Computer Science 2023-02-21 Jun-Kun Wang, Jacob Abernethy, Kfir Y. Levy

We study the problem of online learning with a notion of regret defined with respect to a set of strategies. We develop tools for analyzing the minimax rates and for deriving regret-minimization algorithms in this scenario. While the…

Machine Learning · Statistics 2013-02-13 Wei Han, Alexander Rakhlin, Karthik Sridharan

We consider an online two-stage stochastic optimization with long-term constraints over a finite horizon of $T$ periods. At each period, we take the first-stage action, observe a model parameter realization and then take the second-stage…

Machine Learning · Computer Science 2024-05-21 Jiashuo Jiang

We study a generalization of the online binary prediction with expert advice framework where at each round, the learner is allowed to pick $m\geq 1$ experts from a pool of $K$ experts and the overall utility is a modular or submodular…

Machine Learning · Computer Science 2023-05-25 Omid Sadeghi, Maryam Fazel

Practitioners conducting adaptive experiments often encounter two competing priorities: maximizing total welfare (or `reward') through effective treatment assignment and swiftly concluding experiments to implement population-wide…

Machine Learning · Computer Science 2024-07-31 Chao Qin, Daniel Russo

Optimization problems routinely depend on uncertain parameters that must be predicted before a decision is made. Classical robust and regret formulations are designed to handle erroneous predictions and can provide statistical error bounds…

Optimization and Control · Mathematics 2026-03-30 Jannis Kurtz, Bart P. G. van Parys

In this correspondence, we introduce a minimax regret criterion for least squares problems with bounded data uncertainties and solve it using semidefinite programming. We investigate a robust minimax least squares approach that minimizes…

Systems and Control · Computer Science 2012-03-20 Nargiz Kalantarova, Mehmet A. Donmez, Suleyman S. Kozat

This paper considers the stability of online learning algorithms and its implications for learnability (bounded regret). We introduce a novel quantity called forward regret that intuitively measures how good an online learning…

Machine Learning · Computer Science 2012-11-28 Ankan Saha, Prateek Jain, Ambuj Tewari

We present an online learning analysis of minimax adaptive control for the case where the uncertainty includes a finite set of linear dynamical systems. Precisely, for each system inside the uncertainty set, we define the model-based regret…

Systems and Control · Electrical Eng. & Systems 2023-09-12 Venkatraman Renganathan, Andrea Iannelli, Anders Rantzer

We present safe control of partially-observed linear time-varying systems in the presence of unknown and unpredictable process and measurement noise. We introduce a control algorithm that minimizes dynamic regret, i.e., that minimizes the…

Systems and Control · Electrical Eng. & Systems 2023-04-03 Hongyu Zhou, Vasileios Tzoumas

In this paper, we address the efficient implementation of moving horizon state estimation of constrained discrete-time linear systems. We propose a novel iteration scheme which employs a proximity-based formulation of the underlying…

Optimization and Control · Mathematics 2021-11-09 Meriem Gharbi, Bahman Gharesifard, Christian Ebenbauer