Related papers: Online Linear Optimization with Many Hints
We consider a variant of the classical online linear optimization problem in which at every step, the online player receives a "hint" vector before choosing the action for that round. Rather surprisingly, it was shown that if the hint…
We consider the online linear optimization problem, where at every step the algorithm plays a point $x_t$ in the unit ball, and suffers loss $\langle c_t, x_t\rangle$ for some cost vector $c_t$ that is then revealed to the algorithm. Recent…
In this paper, we broaden the horizon of online convex optimization (OCO), and consider multi-objective OCO, where there are $K$ distinct loss function sequences, and an algorithm has to choose its action at time $t$, before the $K$ loss…
A well-studied generalization of the standard online convex optimization (OCO) framework is constrained online convex optimization (COCO). In COCO, on every round, a convex cost function and a convex constraint function are revealed to the…
We study the problem of online convex optimization (OCO) under unknown linear constraints that are either static, or stochastically time-varying. For this problem, we introduce an algorithm that we term Optimistically Safe OCO (OSOCO) and…
A new algorithm for regret minimization in online convex optimization is described. The regret of the algorithm after $T$ time periods is $O(\sqrt{T \log T})$ - which is the minimum possible up to a logarithmic term. In addition, the new…
This paper addresses Online Convex Optimization (OCO) problems where the constraints have additive perturbations that (i) vary over time and (ii) are not known at the time to make a decision. Perturbations may not be i.i.d. generated and…
We consider online learning problems where the aim is to achieve regret which is efficient in the sense that it is the same order as the lowest regret amongst K experts. This is a substantially stronger requirement than achieving…
The framework of online learning with memory naturally captures learning problems with temporal constraints, and was previously studied for the experts setting. In this work we extend the notion of learning with memory to the general Online…
A natural goal when designing online learning algorithms for non-stationary environments is to bound the regret of the algorithm in terms of the temporal variation of the input sequence. Intuitively, when the variation is small, it should…
In this paper we propose a framework for solving constrained online convex optimization problems. Our motivation stems from the observation that most algorithms proposed for online convex optimization require a projection onto the convex set…
We study the setting of optimizing with bandit feedback with additional prior knowledge provided to the learner in the form of an initial hint of the optimal action. We present a novel algorithm for stochastic linear bandits that uses this…
In the convex optimization approach to online regret minimization, many methods have been developed to guarantee an $O(\sqrt{T})$ bound on regret for subdifferentiable convex loss functions with bounded subgradients, by using a reduction to…
We consider an assortment selection and pricing problem in which a seller has $N$ different items available for sale. In each round, the seller observes a $d$-dimensional contextual preference information vector for the user, and offers to…
In \citep{Hazan-2008-extract}, the authors showed that the regret of online linear optimization can be bounded by the total variation of the cost vectors. In this paper, we extend this result to general online convex optimization. We first…
We study Online Convex Optimization (OCO) with adversarial constraints, where an online algorithm must make sequential decisions to minimize both convex loss functions and cumulative constraint violations. We focus on a setting where the…
We study the problem of online learning (OL) from revealed preferences: a learner wishes to learn a non-strategic agent's private utility function through observing the agent's utility-maximizing actions in a changing environment. We adopt…
In this paper, we study a special bandit setting of online stochastic linear optimization, where only one bit of information is revealed to the learner at each round. This problem has found many applications including online advertisement…
Existing approaches to online convex optimization (OCO) make sequential one-slot-ahead decisions, which lead to (possibly adversarial) losses that drive subsequent decision iterates. Their performance is evaluated by the so-called regret…
We consider Constrained Online Convex Optimization (COCO) with adversarially chosen constraints. At each round, the learner chooses an action before observing the loss and constraint function for that round. The goal is to achieve small…