Related papers: Neural Replicator Dynamics

200 papers

Neural replicator dynamics (NeuRD) is an alternative to the foundational softmax policy gradient (SPG) algorithm motivated by online learning and evolutionary game theory. The NeuRD expected update is designed to be nearly identical to that…

Machine Learning · Computer Science 2022-06-07 Dustin Morrill, Esra'a Saleh, Michael Bowling, Amy Greenwald

Optimization of parameterized policies for reinforcement learning (RL) is an important and challenging problem in artificial intelligence. Among the most common approaches are algorithms based on gradient ascent of a score function…

Machine Learning · Computer Science 2020-06-15 Sriram Srinivasan, Marc Lanctot, Vinicius Zambaldi, Julien Perolat, Karl Tuyls, Remi Munos, Michael Bowling

This paper introduces two metrics (cycle-based and memory-based metrics), grounded on a dynamical game-theoretic solution concept called sink equilibrium, for the evaluation, ranking, and computation of policies in multi-agent learning. We…

Computer Science and Game Theory · Computer Science 2020-06-23 Rui Yan, Xiaoming Duan, Zongying Shi, Yisheng Zhong, Jason R. Marden, Francesco Bullo

We formulate a general framework for competitive gradient-based learning that encompasses a wide breadth of multi-agent learning algorithms, and analyze the limiting behavior of competitive gradient-based learning algorithms using dynamical…

Machine Learning · Computer Science 2020-02-21 Eric Mazumdar, Lillian J. Ratliff, S. Shankar Sastry

In high-stake scenarios like medical treatment and auto-piloting, it's risky or even infeasible to collect online experimental data to train the agent. Simulation-based training can alleviate this issue, but may suffer from its inherent…

Machine Learning · Computer Science 2022-03-16 Jialian Li, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu

Reinforcement Learning is a powerful framework for training agents to navigate different situations, but it is susceptible to changes in environmental dynamics. However, solving Markov Decision Processes that are robust to changes is…

Machine Learning · Computer Science 2024-06-21 Etash Kumar Guha

Solving partially observable Markov decision processes (POMDPs) remains a fundamental challenge in reinforcement learning (RL), primarily due to the curse of dimensionality induced by the non-stationarity of optimal policies. In this work,…

Optimization and Control · Mathematics 2025-10-20 Semih Cayci, Atilla Eryilmaz

We show by counterexample that policy-gradient algorithms have no guarantees of even local convergence to Nash equilibria in continuous action and state space multi-agent settings. To do so, we analyze gradient-play in N-player general-sum…

Machine Learning · Computer Science 2019-12-18 Eric Mazumdar, Lillian J. Ratliff, Michael I. Jordan, S. Shankar Sastry

Despite its groundbreaking success, multi-agent reinforcement learning (MARL) still suffers from instability and nonstationarity. Replicator dynamics, the most well-known model from evolutionary game theory (EGT), provide a theoretical…

Machine Learning · Computer Science 2025-01-28 Tuo Zhang, Leonardo Stella, Julian Barreiro-Gomez

Most reinforcement learning methods are based upon the key assumption that the transition dynamics and reward functions are fixed, that is, the underlying Markov decision process is stationary. However, in many real-world applications, this…

Machine Learning · Computer Science 2020-09-23 Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas

Accurate and efficient simulation of modern robots remains challenging due to their high degrees of freedom and intricate mechanisms. Neural simulators have emerged as a promising alternative to traditional analytical simulators, capable of…

Robotics · Computer Science 2025-08-22 Jie Xu, Eric Heiden, Iretiayo Akinola, Dieter Fox, Miles Macklin, Yashraj Narang

Multistage decision policies provide useful control strategies in high-dimensional state spaces, particularly in complex control tasks. However, they exhibit weak performance guarantees in the presence of disturbance, model mismatch, or…

Robotics · Computer Science 2018-08-07 Olalekan Ogunmolu, Nicholas Gans, Tyler Summers

A fundamental challenge in multiagent reinforcement learning is to learn beneficial behaviors in a shared environment with other simultaneously learning agents. In particular, each agent perceives the environment as effectively…

In this paper, we study the global convergence of model-based and model-free policy gradient descent and natural policy gradient descent algorithms for linear quadratic deep structured teams. In such systems, agents are partitioned into a…

Multiagent Systems · Computer Science 2020-12-16 Vida Fathi, Jalal Arabneydi, Amir G. Aghdam

Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks. The two main…

Machine Learning · Computer Science 2018-10-23 John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel

Multi-agent interactions are increasingly important in the context of reinforcement learning, and the theoretical foundations of policy gradient methods have attracted surging research interest. We investigate the global convergence of…

Optimization and Control · Mathematics 2023-03-21 Sarath Pattathil, Kaiqing Zhang, Asuman Ozdaglar

In this work, we consider policy-based methods for solving the reinforcement learning problem, and establish the sample complexity guarantees. A policy-based algorithm typically consists of an actor and a critic. We consider using various…

Machine Learning · Computer Science 2023-01-16 Zaiwei Chen, Siva Theja Maguluri

Policy-gradient methods are widely used in reinforcement learning, yet training often becomes unstable or slows down as learning progresses. We study this phenomenon through the noise-to-signal ratio (NSR) of a policy-gradient estimator,…

Optimization and Control · Mathematics 2026-02-10 Haoyu Han, Heng Yang

Reinforcement learning considers the problem of finding policies that maximize an expected cumulative reward in a Markov decision process with unknown transition probabilities. In this paper we consider the problem of finding optimal…

Machine Learning · Computer Science 2020-10-19 Santiago Paternain, Juan Andres Bazerque, Alejandro Ribeiro

Projected policy gradient under the simplex parameterization, policy gradient and natural policy gradient under the softmax parameterization, are fundamental algorithms in reinforcement learning. There have been a flurry of recent…

Optimization and Control · Mathematics 2024-04-12 Jiacai Liu, Wenye Li, Ke Wei