Related papers: Adaptive Oracle-Efficient Online Learning
We revisit the question of reducing online learning to approximate optimization of the offline problem. In this setting, we give two algorithms with near-optimal performance in the full information setting: they guarantee optimal regret and…
We study the problems of offline and online contextual optimization with feedback information, where instead of observing the loss, we observe, after-the-fact, the optimal action an oracle with full knowledge of the objective function would…
In this paper, we study oracle-efficient algorithms for beyond worst-case analysis of online learning. We focus on two settings. First, the smoothed analysis setting of [RST11,HRS22] where an adversary is constrained to generating samples…
Smoothed online learning has emerged as a popular framework to mitigate the substantial loss in statistical and computational complexity that arises when one moves from classical to adversarial learning. Unfortunately, for some spaces, it…
We consider the design of computationally efficient online learning algorithms in an adversarial setting in which the learner has access to an offline optimization oracle. We present an algorithm called Generalized…
We study the problem of online multi-group learning, a learning model in which an online learner must simultaneously achieve small prediction regret on a large collection of (possibly overlapping) subsequences corresponding to a family of…
The Hybrid Online Learning Problem, where features are drawn i.i.d. from an unknown distribution but labels are generated adversarially, is a well-motivated setting positioned between statistical and fully-adversarial online learning. Prior…
We consider adaptive decision-making problems where an agent optimizes a cumulative performance objective by repeatedly choosing among a finite set of options. Compared to the classical prediction-with-expert-advice set-up, we consider…
We investigate the problem of online learning, which has gained significant attention in recent years due to its applicability in a wide range of fields from machine learning to game theory. Specifically, we study the online optimization of…
We study online alignment of large language models under misspecified preference feedback, where the observed preference oracle deviates from an ideal but unknown ground-truth oracle. The online LLM alignment problem is a bi-level…
Distributionally robust offline reinforcement learning (RL), which seeks robust policy training against environment perturbation by modeling dynamics uncertainty, calls for function approximations when facing large state-action spaces.…
One of the main strengths of online algorithms is their ability to adapt to arbitrary data sequences. This is especially important in nonparametric settings, where performance is measured against rich classes of comparator functions that…
We approach the fundamental problem of obstacle avoidance for robotic systems via the lens of online learning. In contrast to prior work that either assumes worst-case realizations of uncertainty in the environment or a stationary…
A fundamental challenge in contextual bandits is to develop flexible, general-purpose algorithms with computational requirements no worse than classical supervised learning tasks such as classification and regression. Algorithms based on…
$ $The classical theory of statistical estimation aims to estimate a parameter of interest under data generated from a fixed design ("offline estimation"), while the contemporary theory of online learning provides algorithms for estimation…
We present methods for online linear optimization that take advantage of benign (as opposed to worst-case) sequences. Specifically if the sequence encountered by the learner is described well by a known "predictable process", the algorithms…
We study learning control in an online reset-free lifelong learning scenario, where mistakes can compound catastrophically into the future and the underlying dynamics of the environment may change. Traditional model-free policy learning…
Offline policy learning (OPL) leverages existing data collected a priori for policy optimization without any active exploration. Despite the prevalence and recent interest in this problem, its theoretical and algorithmic foundations in…
A major approach to saddle point optimization $\min_x\max_y f(x, y)$ is a gradient based approach as is popularized by generative adversarial networks (GANs). In contrast, we analyze an alternative approach relying only on an oracle that…
We revisit the problem of \textit{online linear optimization} in case the set of feasible actions is accessible through an approximated linear optimization oracle with a factor $\alpha$ multiplicative approximation guarantee. This setting…