Related papers: Optimal anytime regret with two experts

200 papers

Many techniques for online optimization problems involve making decisions based solely on presently available information: fewer works take advantage of potential predictions. In this paper, we discuss the problem of online convex…

Optimization and Control · Mathematics 2019-02-04 Robert Ravier, Vahid Tarokh

We consider minimisation of dynamic regret in non-stationary bandits with a slowly varying property. Namely, we assume that arms' rewards are stochastic and independent over time, but that the absolute difference between the expected…

Machine Learning · Computer Science 2021-10-26 Ramakrishnan Krishnamurthy, Aditya Gopalan

In online learning the performance of an algorithm is typically compared to the performance of a fixed function from some class, with a quantity called regret. Forster proposed a last-step min-max algorithm which was somewhat simpler than…

Machine Learning · Computer Science 2013-01-28 Edward Moroshko, Koby Crammer

This paper addresses Online Convex Optimization (OCO) problems where the constraints have additive perturbations that (i) vary over time and (ii) are not known at the time to make a decision. Perturbations may not be i.i.d. generated and…

Optimization and Control · Mathematics 2019-06-04 Víctor Valls, George Iosifidis, Douglas J. Leith, Leandros Tassiulas

Making judicious channel access and transmission scheduling decisions is essential for improving performance as well as energy and spectral efficiency in multichannel wireless systems. This problem has been a subject of extensive study in…

Machine Learning · Computer Science 2015-04-07 Yang Liu, Mingyan Liu

Upper Confidence Bound (UCB) algorithms are a widely-used class of sequential algorithms for the $K$-armed bandit problem. Despite extensive research over the past decades aimed at understanding their asymptotic and (near) minimax…

Statistics Theory · Mathematics 2024-12-10 Qiyang Han, Koulik Khamaru, Cun-Hui Zhang

We study the $K$-armed contextual dueling bandit problem, a sequential decision making setting in which the learner uses contextual information to make two decisions, but only observes preference-based feedback suggesting that one…

Machine Learning · Computer Science 2021-11-25 Aadirupa Saha, Akshay Krishnamurthy

Recent literature on online learning has focused on developing adaptive algorithms that take advantage of a regularity of the sequence of observations, yet retain worst-case performance guarantees. A complementary direction is to develop…

Machine Learning · Computer Science 2015-01-27 Ali Jadbabaie, Alexander Rakhlin, Shahin Shahrampour, Karthik Sridharan

We develop a novel and generic algorithm for the adversarial multi-armed bandit problem (or more generally the combinatorial semi-bandit problem). When instantiated differently, our algorithm achieves various new data-dependent regret…

Machine Learning · Computer Science 2018-06-08 Chen-Yu Wei, Haipeng Luo

We consider the setting of online logistic regression and consider the regret with respect to the $\ell_2$-ball of radius $B$. It is known (see [Hazan et al., 2014]) that any proper algorithm which has logarithmic regret in the number of samples…

Machine Learning · Computer Science 2020-11-04 Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi

We study online conformal prediction for non-stationary data streams subject to unknown distribution drift. While most prior work studied this problem under adversarial settings and/or assessed performance in terms of gaps of time-averaged…

Statistics Theory · Mathematics 2026-03-06 Jiadong Liang, Zhimei Ren, Yuxin Chen

We study an online mixed discrete and continuous optimization problem where a decision maker interacts with an unknown environment for a number of $T$ rounds. At each round, the decision maker needs to first jointly choose a discrete and a…

Optimization and Control · Mathematics 2024-08-27 Lintao Ye, Ming Chi, Zhi-Wei Liu, Xiaoling Wang, Vijay Gupta

We introduce two new no-regret algorithms for the stochastic shortest path (SSP) problem with a linear MDP that significantly improve over the only existing results of (Vial et al., 2021). Our first algorithm is computationally efficient…

Machine Learning · Computer Science 2021-12-21 Liyu Chen, Rahul Jain, Haipeng Luo

This paper considers two fundamental sequential decision-making problems: the problem of prediction with expert advice and the multi-armed bandit problem. We focus on stochastic regimes in which an adversary may corrupt losses, and we…

Machine Learning · Statistics 2021-09-24 Shinji Ito

In this paper we propose a novel experimental design-based algorithm to minimize regret in online stochastic linear and combinatorial bandits. While existing literature tends to focus on optimism-based algorithms--which have been shown to…

Machine Learning · Computer Science 2021-03-02 Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

We study online prediction where regret of the algorithm is measured against a benchmark defined via evolving constraints. This framework captures online prediction on graphs, as well as other prediction problems with combinatorial…

Machine Learning · Computer Science 2015-06-15 Alexander Rakhlin, Karthik Sridharan

We propose a novel approach for analyzing dynamic regret of first-order constrained online convex optimization algorithms for strongly convex and Lipschitz-smooth objectives. Crucially, we provide a general analysis that is applicable to a…

Optimization and Control · Mathematics 2025-08-22 Fabian Jakob, Andrea Iannelli

Stochastic shortest path (SSP) is a well-known problem in planning and control, in which an agent has to reach a goal state in minimum total expected cost. In the learning formulation of the problem, the agent is unaware of the environment…

Machine Learning · Computer Science 2020-02-25 Alon Cohen, Haim Kaplan, Yishay Mansour, Aviv Rosenberg

Practical online learning tasks are often naturally defined on unconstrained domains, where optimal algorithms for general convex losses are characterized by the notion of comparator adaptivity. In this paper, we design such algorithms in…

Machine Learning · Computer Science 2022-10-13 Zhiyu Zhang, Ashok Cutkosky, Ioannis Ch. Paschalidis

In this paper, we consider the problem of scheduling jobs on parallel identical machines, where the processing times of jobs are uncertain: only interval bounds of processing times are known. The optimality criterion of a schedule is the…

Data Structures and Algorithms · Computer Science 2016-03-25 Maciej Drwal, Roman Rischke