Related papers: Optimal anytime regret with two experts

200 papers

In this paper, we consider a best action identification problem in the stochastic linear bandit setup with a fixed confidence constraint. In the considered best action identification problem, instead of minimizing the cumulative regret as…

Machine Learning · Computer Science 2018-12-04 Jun Geng, Lifeng Lai

We introduce efficient algorithms which achieve nearly optimal regrets for the problem of stochastic online shortest path routing with end-to-end feedback. The setting is a natural application of the combinatorial stochastic bandits…

Machine Learning · Computer Science 2018-12-20 Ruihao Zhu, Eytan Modiano

The filtering problem of causally estimating a desired signal from a related observation signal is investigated through the lens of regret optimization. Classical filter designs, such as $\mathcal H_2$ (Kalman) and $\mathcal H_\infty$,…

Optimization and Control · Mathematics 2022-11-23 Oron Sabag, Babak Hassibi

We address the online linear optimization problem with bandit feedback. Our contribution is twofold. First, we provide an algorithm (based on exponential weights) with a regret of order $\sqrt{d n \log N}$ for any finite action set with $N$…

Machine Learning · Computer Science 2012-02-15 Sébastien Bubeck, Nicolò Cesa-Bianchi, Sham M. Kakade

Fast changing states or volatile environments pose a significant challenge to online optimization, which needs to perform rapid adaptation under limited observation. In this paper, we give query and regret optimal bandit algorithms under…

Machine Learning · Computer Science 2024-01-18 Zhou Lu, Qiuyi Zhang, Xinyi Chen, Fred Zhang, David Woodruff, Elad Hazan

The multi-armed bandit problem is a popular model for studying exploration/exploitation trade-off in sequential decision problems. Many algorithms are now available for this well-studied problem. One of the earliest algorithms, given by W.…

Machine Learning · Computer Science 2012-04-10 Shipra Agrawal, Navin Goyal

We present an algorithm guaranteeing dynamic regret bounds for online omniprediction with long term constraints. The goal in this recently introduced problem is for a learner to generate a sequence of predictions which are broadcast to a…

Machine Learning · Computer Science 2025-10-09 Yahav Bechavod, Jiuyao Lu, Aaron Roth

We prove a new minimax theorem connecting the worst-case Bayesian regret and minimax regret under partial monitoring with no assumptions on the space of signals or decisions of the adversary. We then generalise the information-theoretic…

Machine Learning · Computer Science 2019-05-30 Tor Lattimore, Csaba Szepesvari

We investigate online convex optimization in non-stationary environments and choose dynamic regret as the performance measure, defined as the difference between cumulative loss incurred by the online algorithm and that of any feasible…

Machine Learning · Computer Science 2024-04-09 Peng Zhao, Yu-Jie Zhang, Lijun Zhang, Zhi-Hua Zhou

We study the contextual continuum bandits problem, where the learner sequentially receives a side information vector and has to choose an action in a convex set, minimizing a function associated with the context. The goal is to minimize all…

Machine Learning · Statistics 2025-10-28 Arya Akhavan, Karim Lounici, Massimiliano Pontil, Alexandre B. Tsybakov

The regret bound of an optimization algorithm is one of the basic criteria for evaluating the performance of the given algorithm. By inspecting the differences between the regret bounds of traditional algorithms and adaptive ones, we…

Machine Learning · Statistics 2017-07-07 HyoungSeok Kim, JiHoon Kang, WooMyoung Park, SukHyun Ko, YoonHo Cho, DaeSung Yu, YoungSook Song, JungWon Choi

The problem of regret minimization for online adaptive control of linear-quadratic systems is studied. In this problem, the true system transition parameters (matrices $A$ and $B$) are unknown, and the objective is to design and analyze…

Optimization and Control · Mathematics 2022-10-31 Mohammad Akbari, Bahman Gharesifard, Tamas Linder

Designing efficient general-purpose contextual bandit algorithms that work with large -- or even continuous -- action spaces would facilitate application to important scenarios such as information retrieval, recommendation systems, and…

Machine Learning · Computer Science 2022-07-14 Yinglun Zhu, Paul Mineiro

In this work, we propose a computationally efficient algorithm for the problem of global optimization in univariate loss functions. For the performance evaluation, we study the cumulative regret of the algorithm instead of the simple regret…

Machine Learning · Computer Science 2022-01-19 Kaan Gokcesu, Hakan Gokcesu

Most bandit algorithm designs are purely theoretical. Therefore, they have strong regret guarantees, but also are often too conservative in practice. In this work, we pioneer the idea of algorithm design by minimizing the empirical Bayes…

Machine Learning · Computer Science 2020-06-12 Chih-Wei Hsu, Branislav Kveton, Ofer Meshi, Martin Mladenov, Csaba Szepesvari

Adaptively controlling and minimizing regret in unknown dynamical systems while controlling the growth of the system state is crucial in real-world applications. In this work, we study the problem of stabilization and regret minimization of…

Systems and Control · Electrical Eng. & Systems 2022-02-10 Jafar Abbaszadeh Chekan, Kamyar Azizzadenesheli, Cedric Langbort

Online prediction from experts is a fundamental problem in machine learning and several works have studied this problem under privacy constraints. We propose and analyze new algorithms for this problem that improve over the regret bounds of…

Machine Learning · Computer Science 2023-07-03 Hilal Asi, Vitaly Feldman, Tomer Koren, Kunal Talwar

The minmax regret problem for combinatorial optimization under uncertainty can be viewed as a zero-sum game played between an optimizing player and an adversary, where the optimizing player selects a solution and the adversary selects costs…

Discrete Mathematics · Computer Science 2014-09-23 Andrew Mastin, Patrick Jaillet, Sang Chin

We consider the problem of controlling an unknown linear dynamical system in the presence of (nonstochastic) adversarial perturbations and adversarial convex loss functions. In contrast to classical control, the a priori determination of an…

Machine Learning · Computer Science 2020-01-22 Elad Hazan, Sham M. Kakade, Karan Singh

We study the problem of incentive-compatible online learning with bandit feedback. In this class of problems, the experts are self-interested agents who might misrepresent their preferences with the goal of being selected most often. The…

Machine Learning · Computer Science 2024-05-13 Julian Zimmert, Teodor V. Marinov