Related papers: Efficient Constrained Regret Minimization
We study online learning problems in which a decision maker has to take a sequence of decisions subject to $m$ long-term constraints. The goal of the decision maker is to maximize their total reward, while at the same time achieving small…
We consider prediction with expert advice for strongly convex and bounded losses, and investigate trade-offs between regret and "variance" (i.e., squared difference of learner's predictions and best expert predictions). With $K$ experts,…
In online learning an algorithm plays against an environment with losses possibly picked by an adversary at each round. The generality of this framework includes problems that are not adversarial, for example offline optimization, or saddle…
In decision-making under uncertainty, several criteria have been studied to aggregate the performance of a solution over multiple possible scenarios. This paper introduces a novel variant of ordered weighted averaging (OWA) for optimization…
We study online learning problems in which a decision maker wants to maximize their expected reward without violating a finite set of $m$ resource constraints. By casting the learning process over a suitably defined space of strategy…
We study online decision making problems under resource constraints, where both reward and cost functions are drawn from distributions that may change adversarially over time. We focus on two canonical settings: $(i)$ online resource…
Online learning methods, like the online gradient algorithm (OGA) and exponentially weighted aggregation (EWA), often depend on tuning parameters that are difficult to set in practice. We consider an online meta-learning scenario, and we…
We consider the online convex optimization problem. In the setting of arbitrary sequences and finite set of parameters, we establish a new fast-rate quantile regret bound. Then we investigate the optimization into the L1-ball by…
Online learning has traditionally focused on the expected rewards. In this paper, a risk-averse online learning problem under the performance measure of the mean-variance of the rewards is studied. Both the bandit and full information…
A novel Follow-the-Perturbed-Leader type algorithm is proposed and analyzed for solving general long-term constrained optimization problems in an online manner, where the target and constraint functions are oblivious adversarially generated…
We consider an online learning problem on a continuum. A decision maker is given a compact feasible set $S$, and is faced with the following sequential problem: at iteration~$t$, the decision maker chooses a distribution $x^{(t)} \in…
We propose a framework which generalizes "decision making with structured observations" by allowing robust (i.e. multivalued) models. In this framework, each model associates each decision with a convex set of probability distributions over…
We consider a generalization of the celebrated Online Convex Optimization (OCO) framework with adversarial online constraints. In this problem, an online learner interacts with an adversary sequentially over multiple rounds. At the…
We address online linear optimization problems when the possible actions of the decision maker are represented by binary vectors. The regret of the decision maker is the difference between her realized loss and the best loss she would have…
In this paper we propose a framework for solving constrained online convex optimization problem. Our motivation stems from the observation that most algorithms proposed for online convex optimization require a projection onto the convex set…
This paper considers the distributed online convex optimization problem with time-varying constraints over a network of agents. This is a sequential decision making problem with two sequences of arbitrarily varying convex loss and…
This paper studies an online optimal resource reservation problem in communication networks with job transfers where the goal is to minimize the reservation cost while maintaining the blocking cost under a certain budget limit. To tackle…
We consider model selection in stochastic bandit and reinforcement learning problems. Given a set of base learning algorithms, an effective model selection strategy adapts to the best learning algorithm in an online fashion. We show that by…
We consider an online two-stage stochastic optimization with long-term constraints over a finite horizon of $T$ periods. At each period, we take the first-stage action, observe a model parameter realization and then take the second-stage…
The online meta-learning framework has arisen as a powerful tool for the continual lifelong learning setting. The goal for an agent is to quickly learn new tasks by drawing on prior experience, while it faces with tasks one after another.…