Related papers: Optimization, Learning, and Games with Predictable…
We study how to learn $\epsilon$-optimal strategies in zero-sum imperfect information games (IIG) with trajectory feedback. In this setting, players update their policies sequentially based on their observations over a fixed number of…
In a recent series of papers it has been established that variants of Gradient Descent/Ascent and Mirror Descent exhibit last iterate convergence in convex-concave zero-sum games. Specifically, \cite{DISZ17, LiangS18} show last iterate…
We design and analyze minimax-optimal algorithms for online linear optimization games where the player's choice is unconstrained. The player strives to minimize regret, the difference between his loss and the loss of a post-hoc benchmark…
In this paper, we develop a composite version of the Mirror Prox algorithm for solving convex-concave saddle point problems and monotone variational inequalities of special structure, which makes it possible to cover saddle point/variational analogues of what…
Most existing results about \emph{last-iterate convergence} of learning dynamics are limited to two-player zero-sum games, and only apply under rigid assumptions about what dynamics the players follow. In this paper we provide new results…
Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games, both in theory and practice. Particularly popular algorithms include optimistic multiplicative weights update (OMWU) and optimistic…
We consider the problem of minimizing a smooth convex function by reducing the optimization to computing the Nash equilibrium of a particular zero-sum convex-concave game. Zero-sum games can be solved using online learning dynamics, where a…
Based on the ideas of arXiv:1710.06612, we consider the problem of minimizing a Hölder-continuous non-smooth functional $f$ with a non-positive convex (generally, non-smooth) Lipschitz-continuous functional constraint. We propose some…
In this paper we study two-player bilinear zero-sum games with constrained strategy spaces. Such constraints arise naturally when mixed strategies are used, which corresponds to a probability simplex constraint. We…
We study the unconstrained and the minimax saddle point variants of the convex multi-stage stochastic programming problem, where consecutive decisions are coupled through the objective functions, rather than through the constraints. We…
Online Convex Optimization plays a key role in large-scale machine learning. Early approaches to this problem were conservative, focusing mainly on protection against the worst-case scenario. But recently several algorithms have…
In game-theoretic learning, several agents are simultaneously following their individual interests, so the environment is non-stationary from each player's perspective. In this context, the performance of a learning algorithm is often…
We consider the problem of online learning and its application to solving minimax games. For the online learning problem, Follow the Perturbed Leader (FTPL) is a widely studied algorithm which enjoys the optimal $O(T^{1/2})$ worst-case…
We propose a novel adaptive, accelerated algorithm for the stochastic constrained convex optimization setting. Our method, which is inspired by the Mirror-Prox method, \emph{simultaneously} achieves the optimal rates for smooth/non-smooth…
Learning-to-optimize is an emerging framework that leverages training data to speed up the solution of certain optimization problems. One such approach is based on the classical mirror descent algorithm, where the mirror map is modelled…
An $\alpha$-potential game is a multi-player non-cooperative interaction in which a global potential function approximates individual player rewards up to a structural bias $\alpha$. While identifying a Nash Equilibrium (NE) in generic…
We study stochastic convex optimization under infinite noise variance. Specifically, when the stochastic gradient is unbiased and has uniformly bounded $(1+\kappa)$-th moment, for some $\kappa \in (0,1]$, we quantify the convergence rate of…
We develop a first-order accelerated algorithm for a class of constrained bilinear saddle-point problems with applications to network systems. The algorithm is a modified time-varying primal-dual version of an accelerated mirror-descent…
High-velocity streams of high-dimensional data pose significant "big data" analysis challenges across a range of applications and settings. Online learning and online convex programming play a significant role in the rapid recovery of…
Optimistic Online Learning aims to exploit experts conveying reliable information to predict the future. However, such implicit optimism may be challenged when it comes to practical crafting of such experts. A fundamental example consists…