Related papers: XDO: A Double Oracle Algorithm for Extensive-Form …

Anytime PSRO for Two-Player Zero-Sum Games

Policy space response oracles (PSRO) is a multi-agent reinforcement learning algorithm that has achieved state-of-the-art performance in very large two-player zero-sum games. PSRO is based on the tabular double oracle (DO) method, an…

Computer Science and Game Theory · Computer Science · 2022-02-01 · Stephen McAleer, Kevin Wang, John Lanier, Marc Lanctot, Pierre Baldi, Tuomas Sandholm, Roy Fox

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

Finding approximate Nash equilibria in zero-sum imperfect-information games is challenging when the number of information states is large. Policy Space Response Oracles (PSRO) is a deep reinforcement learning algorithm grounded in game…

Computer Science and Game Theory · Computer Science · 2021-02-22 · Stephen McAleer, John Lanier, Roy Fox, Pierre Baldi

Global Policy-Space Response Oracles for Two-Player Zero-Sum Games

The Policy-Space Response Oracles (PSRO) framework scales equilibrium computation to large zero-sum games by iteratively expanding a restricted strategy set using deep reinforcement learning (DRL). A central challenge is to construct, under…

Artificial Intelligence · Computer Science · 2026-05-28 · Junyu Zhang, Feihong Yang, Jian Wang, Chao Wang, Xudong Zhang

Regret-Minimizing Double Oracle for Extensive-Form Games

By incorporating regret minimization, double oracle methods have demonstrated rapid convergence to Nash Equilibrium (NE) in normal-form games and extensive-form games, through algorithms such as online double oracle (ODO) and extensive-form…

Computer Science and Game Theory · Computer Science · 2023-07-14 · Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang

Efficient Policy Space Response Oracles

Policy Space Response Oracle methods (PSRO) provide a general solution to learn Nash equilibrium in two-player zero-sum games but suffer from two drawbacks: (1) the computation inefficiency due to the need for consistent meta-game…

Computer Science and Game Theory · Computer Science · 2022-06-02 · Ming Zhou, Jingxiao Chen, Ying Wen, Weinan Zhang, Yaodong Yang, Yong Yu, Jun Wang

A Generalized Training Approach for Multiagent Learning

This paper investigates a population-based training regime based on game-theoretic principles called Policy-Space Response Oracles (PSRO). PSRO is general in the sense that it (1) encompasses well-known algorithms such as fictitious play…

Multiagent Systems · Computer Science · 2020-02-17 · Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Perolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Remi Munos

Self-Play PSRO: Toward Optimal Populations in Two-Player Zero-Sum Games

In competitive two-agent environments, deep reinforcement learning (RL) methods based on the Double Oracle (DO) algorithm, such as Policy Space Response Oracles (PSRO) and Anytime PSRO (APSRO), iteratively add RL best…

Computer Science and Game Theory · Computer Science · 2022-07-15 · Stephen McAleer, JB Lanier, Kevin Wang, Pierre Baldi, Roy Fox, Tuomas Sandholm

Online Double Oracle

Solving strategic games with huge action space is a critical yet under-explored topic in economics, operations research and artificial intelligence. This paper proposes new learning algorithms for solving two-player zero-sum normal-form…

Artificial Intelligence · Computer Science · 2023-02-16 · Le Cong Dinh, Yaodong Yang, Stephen McAleer, Zheng Tian, Nicolas Perez Nieves, Oliver Slumbers, David Henry Mguni, Haitham Bou Ammar, Jun Wang

Iterative Empirical Game Solving via Single Policy Best Response

Policy-Space Response Oracles (PSRO) is a general algorithmic framework for learning policies in multiagent systems by interleaving empirical game analysis with deep reinforcement learning (Deep RL). At each iteration, Deep RL is invoked to…

Multiagent Systems · Computer Science · 2021-06-04 · Max Olan Smith, Thomas Anthony, Michael P. Wellman

Policy Space Diversity for Non-Transitive Games

Policy-Space Response Oracles (PSRO) is an influential algorithm framework for approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous studies have been trying to promote policy diversity in PSRO. A major…

Computer Science and Game Theory · Computer Science · 2023-11-09 · Jian Yao, Weiming Liu, Haobo Fu, Yaodong Yang, Stephen McAleer, Qiang Fu, Wei Yang

A Unified Perspective on Deep Equilibrium Finding

Extensive-form games provide a versatile framework for modeling interactions of multiple agents subjected to imperfect observations and stochastic events. In recent years, two paradigms, policy space response oracles (PSRO) and…

Computer Science and Game Theory · Computer Science · 2022-04-12 · Xinrun Wang, Jakub Cerny, Shuxin Li, Chang Yang, Zhuyun Yin, Hau Chan, Bo An

Policy Abstraction and Nash Refinement in Tree-Exploiting PSRO

Policy Space Response Oracles (PSRO) interleaves empirical game-theoretic analysis with deep reinforcement learning (DRL) to solve games too complex for traditional analytic methods. Tree-exploiting PSRO (TE-PSRO) is a variant of this…

Computer Science and Game Theory · Computer Science · 2025-02-18 · Christine Konicki, Mithun Chakraborty, Michael P. Wellman

Self-adaptive PSRO: Towards an Automatic Population-based Game Solver

Policy-Space Response Oracles (PSRO) as a general algorithmic framework has achieved state-of-the-art performance in learning equilibrium policies of two-player zero-sum games. However, the hand-crafted hyperparameter value selection in…

Artificial Intelligence · Computer Science · 2024-04-18 · Pengdeng Li, Shuxin Li, Chang Yang, Xinrun Wang, Xiao Huang, Hau Chan, Bo An

Double Oracle Algorithm for Computing Equilibria in Continuous Games

Many efficient algorithms have been designed to recover Nash equilibria of various classes of finite games. Special classes of continuous games with infinite strategy spaces, such as polynomial games, can be solved by semidefinite…

Computer Science and Game Theory · Computer Science · 2020-10-01 · Lukáš Adam, Rostislav Horčík, Tomáš Kasl, Tomáš Kroupa

Sample-Efficient Regret-Minimizing Double Oracle in Extensive-Form Games

Extensive-Form Game (EFG) represents a fundamental model for analyzing sequential interactions among multiple agents and the primary challenge to solve it lies in mitigating sample complexity. Existing research indicated that Double Oracle…

Computer Science and Game Theory · Computer Science · 2024-11-05 · Xiaohang Tang, Chiyuan Wang, Chengdong Ma, Ilija Bogunovic, Stephen McAleer, Yaodong Yang

Fictitious Cross-Play: Learning Global Nash Equilibrium in Mixed Cooperative-Competitive Games

Self-play (SP) is a popular multi-agent reinforcement learning (MARL) framework for solving competitive games, where each agent optimizes policy by treating others as part of the environment. Despite the empirical successes, the theoretical…

Artificial Intelligence · Computer Science · 2023-10-06 · Zelai Xu, Yancheng Liang, Chao Yu, Yu Wang, Yi Wu

Policy Space Response Oracles: A Survey

Game theory provides a mathematical way to study the interaction between multiple decision makers. However, classical game-theoretic analysis is limited in scalability due to the large number of strategies, precluding direct application to…

Computer Science and Game Theory · Computer Science · 2024-05-28 · Ariyan Bighashdel, Yongzhao Wang, Stephen McAleer, Rahul Savani, Frans A. Oliehoek

Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles

For solving zero-sum games involving non-transitivity, a useful approach is to maintain a policy population to approximate the Nash Equilibrium (NE). Previous studies have shown that the Policy Space Response Oracles (PSRO) algorithm is an…

Computer Science and Game Theory · Computer Science · 2026-01-06 · Jiesong Lian, Yucong Huang, Chengdong Ma, Mingzhi Wang, Ying Wen, Long Hu, Yixue Hao

Conflux-PSRO: Effectively Leveraging Collective Advantages in Policy Space Response Oracles

Policy Space Response Oracle (PSRO) with policy population construction has been demonstrated as an effective method for approximating Nash Equilibrium (NE) in zero-sum games. Existing studies have attempted to improve diversity in policy…

Computer Science and Game Theory · Computer Science · 2024-11-14 · Yucong Huang, Jiesong Lian, Mingzhi Wang, Chengdong Ma, Ying Wen

A-PSRO: A Unified Strategy Learning Method with Advantage Function for Normal-form Games

Solving Nash equilibrium is the key challenge in normal-form games with large strategy spaces, where open-ended learning frameworks offer an efficient approach. In this work, we propose an innovative unified open-ended learning framework…

Computer Science and Game Theory · Computer Science · 2024-03-25 · Yudong Hu, Haoran Li, Congying Han, Tiande Guo, Mingqiang Li, Bonan Li