Related papers: Sparsified Linear Programming for Zero-Sum Equilib…
Counterfactual Regret Minimization and variants (e.g. Public Chance Sampling CFR and Pure CFR) have been known as the best approaches for creating approximate Nash equilibrium solutions for imperfect information games such as poker. This…
The article provides a solution algorithm for the linear programming problem (LPP) with the latter being presented as an antagonistic matrix game so the game's further solution is based on the iterative method. The algorithm is presented as…
We derive sublinear-time quantum algorithms for computing the Nash equilibrium of two-player zero-sum games, based on efficient Gibbs sampling methods. We are able to achieve speed-ups for both dense and sparse payoff matrices at the cost…
Positive linear programs (LP), also known as packing and covering linear programs, are an important class of problems that bridges computer science, operations research, and optimization. Despite the consistent efforts on this problem, all…
Driven by recent successes in two-player, zero-sum game solving and playing, artificial intelligence work on games has increasingly focused on algorithms that produce equilibrium-based strategies. However, this approach has been less…
Counterfactual Regret Minimization (CFR) and its variants developed based upon Regret Matching (RM) have been considered to be the best method to solve incomplete information extensive form games. In addition to RM and CFR, Fictitious Play…
Despite the many recent practical and theoretical breakthroughs in computational game theory, equilibrium finding in extensive-form team games remains a significant challenge. While NP-hard in the worst case, there are provably efficient…
The practical scalability of many optimization algorithms for large extensive-form games is often limited by the games' huge payoff matrices. To ameliorate the issue, Zhang and Sandholm (2020) recently proposed a sparsification technique…
Extensive-form games provide a versatile framework for modeling interactions of multiple agents subjected to imperfect observations and stochastic events. In recent years, two paradigms, policy space response oracles (PSRO) and…
There has been significant recent progress in algorithms for approximation of Nash equilibrium in large two-player zero-sum imperfect-information games and exact computation of Nash equilibrium in multiplayer strategic-form games. While…
We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games with simultaneous moves. To incorporate function approximation, we consider a family of Markov games where the reward…
A linear program with linear complementarity constraints (LPCC) requires the minimization of a linear objective over a set of linear constraints together with additional linear complementarity constraints. This class has emerged as a…
There has been tremendous recent progress on equilibrium-finding algorithms for zero-sum imperfect-information extensive-form games, but there has been a puzzling gap between theory and practice. First-order methods have significantly…
Counterfactual Regret Minimization (CFR) is the leading framework for solving large imperfect-information games. It converges to an equilibrium by iteratively traversing the game tree. In order to deal with extremely large games,…
Counterfactual regret minimization (CFR) is a family of iterative algorithms that are the most popular and, in practice, fastest approach to approximately solving large imperfect-information games. In this paper we introduce novel CFR…
While the topic of mean-field games (MFGs) has a relatively long history, heretofore there has been limited work concerning algorithms for the computation of equilibrium control policies. In this paper, we develop a computable policy…
In zero-sum games, the optimal strategy is well-defined by the Nash equilibrium. However, it is overly conservative when playing against suboptimal opponents and it can not exploit their weaknesses. Limited look-ahead game solving in…
In this paper, we present exploitability descent, a new algorithm to compute approximate equilibria in two-player zero-sum extensive-form games with imperfect information, by direct policy optimization against worst-case opponents. We prove…
We propose a novel Linear Program (LP) based formula- tion for solving jigsaw puzzles. We formulate jigsaw solving as a set of successive global convex relaxations of the stan- dard NP-hard formulation, that can describe both jigsaws with…
Regret matching (RM) -- and its modern variants -- is a foundational online algorithm that has been at the heart of many AI breakthrough results in solving benchmark zero-sum games, such as poker. Yet, surprisingly little is known so far in…