Related papers: Efficient Policy Space Response Oracles

Global Policy-Space Response Oracles for Two-Player Zero-Sum Games

The Policy-Space Response Oracles (PSRO) framework scales equilibrium computation to large zero-sum games by iteratively expanding a restricted strategy set using deep reinforcement learning (DRL). A central challenge is to construct, under…

Artificial Intelligence · Computer Science · 2026-05-28 · Junyu Zhang, Feihong Yang, Jian Wang, Chao Wang, Xudong Zhang

Simulation-Free PSRO: Removing Game Simulation from Policy Space Response Oracles

Policy Space Response Oracles (PSRO) combines game-theoretic equilibrium computation with learning and is effective in approximating Nash Equilibrium in zero-sum games. However, the computational cost of PSRO has become a significant…

Multiagent Systems · Computer Science · 2026-01-12 · Yingzhuo Liu, Shuodi Liu, Weijun Luo, Liuyu Xiang, Zhaofeng He

Anytime PSRO for Two-Player Zero-Sum Games

Policy space response oracles (PSRO) is a multi-agent reinforcement learning algorithm that has achieved state-of-the-art performance in very large two-player zero-sum games. PSRO is based on the tabular double oracle (DO) method, an…

Computer Science and Game Theory · Computer Science · 2022-02-01 · Stephen McAleer, Kevin Wang, John Lanier, Marc Lanctot, Pierre Baldi, Tuomas Sandholm, Roy Fox

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

Finding approximate Nash equilibria in zero-sum imperfect-information games is challenging when the number of information states is large. Policy Space Response Oracles (PSRO) is a deep reinforcement learning algorithm grounded in game…

Computer Science and Game Theory · Computer Science · 2021-02-22 · Stephen McAleer, John Lanier, Roy Fox, Pierre Baldi

XDO: A Double Oracle Algorithm for Extensive-Form Games

Policy Space Response Oracles (PSRO) is a reinforcement learning (RL) algorithm for two-player zero-sum games that has been empirically shown to find approximate Nash equilibria in large games. Although PSRO is guaranteed to converge to an…

Computer Science and Game Theory · Computer Science · 2022-02-01 · Stephen McAleer, John Lanier, Kevin Wang, Pierre Baldi, Roy Fox

Policy Abstraction and Nash Refinement in Tree-Exploiting PSRO

Policy Space Response Oracles (PSRO) interleaves empirical game-theoretic analysis with deep reinforcement learning (DRL) to solve games too complex for traditional analytic methods. Tree-exploiting PSRO (TE-PSRO) is a variant of this…

Computer Science and Game Theory · Computer Science · 2025-02-18 · Christine Konicki, Mithun Chakraborty, Michael P. Wellman

Policy Space Diversity for Non-Transitive Games

Policy-Space Response Oracles (PSRO) is an influential algorithm framework for approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous studies have been trying to promote policy diversity in PSRO. A major…

Computer Science and Game Theory · Computer Science · 2023-11-09 · Jian Yao, Weiming Liu, Haobo Fu, Yaodong Yang, Stephen McAleer, Qiang Fu, Wei Yang

Self-Play PSRO: Toward Optimal Populations in Two-Player Zero-Sum Games

In competitive two-agent environments, deep reinforcement learning (RL) methods based on the Double Oracle (DO) algorithm, such as Policy Space Response Oracles (PSRO) and Anytime PSRO (APSRO), iteratively add RL best…

Computer Science and Game Theory · Computer Science · 2022-07-15 · Stephen McAleer, JB Lanier, Kevin Wang, Pierre Baldi, Roy Fox, Tuomas Sandholm

Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles

For solving zero-sum games involving non-transitivity, a useful approach is to maintain a policy population to approximate the Nash Equilibrium (NE). Previous studies have shown that the Policy Space Response Oracles (PSRO) algorithm is an…

Computer Science and Game Theory · Computer Science · 2026-01-06 · Jiesong Lian, Yucong Huang, Chengdong Ma, Mingzhi Wang, Ying Wen, Long Hu, Yixue Hao

Conflux-PSRO: Effectively Leveraging Collective Advantages in Policy Space Response Oracles

Policy Space Response Oracles (PSRO) with policy population construction has been demonstrated as an effective method for approximating Nash Equilibrium (NE) in zero-sum games. Existing studies have attempted to improve diversity in policy…

Computer Science and Game Theory · Computer Science · 2024-11-14 · Yucong Huang, Jiesong Lian, Mingzhi Wang, Chengdong Ma, Ying Wen

A-PSRO: A Unified Strategy Learning Method with Advantage Function for Normal-form Games

Solving Nash equilibrium is the key challenge in normal-form games with large strategy spaces, where open-ended learning frameworks offer an efficient approach. In this work, we propose an innovative unified open-ended learning framework…

Computer Science and Game Theory · Computer Science · 2024-03-25 · Yudong Hu, Haoran Li, Congying Han, Tiande Guo, Mingqiang Li, Bonan Li

A Generalized Training Approach for Multiagent Learning

This paper investigates a population-based training regime based on game-theoretic principles called Policy-Space Response Oracles (PSRO). PSRO is general in the sense that it (1) encompasses well-known algorithms such as fictitious play…

Multiagent Systems · Computer Science · 2020-02-17 · Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Perolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Remi Munos

Online Double Oracle

Solving strategic games with huge action space is a critical yet under-explored topic in economics, operations research and artificial intelligence. This paper proposes new learning algorithms for solving two-player zero-sum normal-form…

Artificial Intelligence · Computer Science · 2023-02-16 · Le Cong Dinh, Yaodong Yang, Stephen McAleer, Zheng Tian, Nicolas Perez Nieves, Oliver Slumbers, David Henry Mguni, Haitham Bou Ammar, Jun Wang

Policy Space Response Oracles: A Survey

Game theory provides a mathematical way to study the interaction between multiple decision makers. However, classical game-theoretic analysis is limited in scalability due to the large number of strategies, precluding direct application to…

Computer Science and Game Theory · Computer Science · 2024-05-28 · Ariyan Bighashdel, Yongzhao Wang, Stephen McAleer, Rahul Savani, Frans A. Oliehoek

Iterative Empirical Game Solving via Single Policy Best Response

Policy-Space Response Oracles (PSRO) is a general algorithmic framework for learning policies in multiagent systems by interleaving empirical game analysis with deep reinforcement learning (Deep RL). At each iteration, Deep RL is invoked to…

Multiagent Systems · Computer Science · 2021-06-04 · Max Olan Smith, Thomas Anthony, Michael P. Wellman

Self-adaptive PSRO: Towards an Automatic Population-based Game Solver

Policy-Space Response Oracles (PSRO) as a general algorithmic framework has achieved state-of-the-art performance in learning equilibrium policies of two-player zero-sum games. However, the hand-crafted hyperparameter value selection in…

Artificial Intelligence · Computer Science · 2024-04-18 · Pengdeng Li, Shuxin Li, Chang Yang, Xinrun Wang, Xiao Huang, Hau Chan, Bo An

Fictitious Cross-Play: Learning Global Nash Equilibrium in Mixed Cooperative-Competitive Games

Self-play (SP) is a popular multi-agent reinforcement learning (MARL) framework for solving competitive games, where each agent optimizes policy by treating others as part of the environment. Despite the empirical successes, the theoretical…

Artificial Intelligence · Computer Science · 2023-10-06 · Zelai Xu, Yancheng Liang, Chao Yu, Yu Wang, Yi Wu

Learning Equilibria in Mean-Field Games: Introducing Mean-Field PSRO

Recent advances in multiagent learning have seen the introduction of a family of algorithms that revolve around the population-based training method PSRO, showing convergence to Nash, correlated and coarse correlated equilibria. Notably,…

Computer Science and Game Theory · Computer Science · 2022-08-30 · Paul Muller, Mark Rowland, Romuald Elie, Georgios Piliouras, Julien Perolat, Mathieu Lauriere, Raphael Marinier, Olivier Pietquin, Karl Tuyls

Regret-Minimizing Double Oracle for Extensive-Form Games

By incorporating regret minimization, double oracle methods have demonstrated rapid convergence to Nash Equilibrium (NE) in normal-form games and extensive-form games, through algorithms such as online double oracle (ODO) and extensive-form…

Computer Science and Game Theory · Computer Science · 2023-07-14 · Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang

Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games

The ex ante equilibrium for two-team zero-sum games, where agents within each team collaborate to compete against the opposing team, is known to be the best a team can do for coordination. Many existing works on ex ante equilibrium…

Computer Science and Game Theory · Computer Science · 2024-10-03 · Naming Liu, Mingzhi Wang, Xihuai Wang, Weinan Zhang, Yaodong Yang, Youzhi Zhang, Bo An, Ying Wen