Related papers: 3DPG: Distributed Deep Deterministic Policy Gradie…

Continuous-Time Distributed Dynamic Programming for Networked Multi-Agent Markov Decision Processes

The main goal of this paper is to investigate continuous-time distributed dynamic programming (DP) algorithms for networked multi-agent Markov decision problems (MAMDPs). In our study, we adopt a distributed multi-agent framework where…

Systems and Control · Electrical Eng. & Systems 2024-06-14 Donghwan Lee, Han-Dong Lim, Do Wan Kim

Almost Sure Convergence of Networked Policy Gradient over Time-Varying Networks in Markov Potential Games

We propose networked policy gradient play for solving Markov potential games with continuous and/or discrete state-action pairs. During the game, agents use parametrized and differentiable policies that depend on the current state and the…

Systems and Control · Electrical Eng. & Systems 2025-10-02 Sarper Aydin, Ceyhun Eksin

Multi-agent Deep Reinforcement Learning with Extremely Noisy Observations

Multi-agent reinforcement learning systems aim to provide interacting agents with the ability to collaboratively learn and adapt to the behaviour of other agents. In many real-world applications, the agents can only acquire a partial view…

Machine Learning · Computer Science 2018-12-04 Ozsel Kilinc, Giovanni Montana

Fully-Decentralized MADDPG with Networked Agents

In this paper, we devise three actor-critic algorithms with decentralized training for multi-agent reinforcement learning in cooperative, adversarial, and mixed settings with continuous action spaces. To this goal, we adapt the MADDPG…

Machine Learning · Computer Science 2025-03-11 Diego Bolliger, Lorenz Zauter, Robert Ziegler

Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning

Deep reinforcement learning for multi-agent cooperation and competition has been a hot topic recently. This paper focuses on cooperative multi-agent problems based on actor-critic methods under local-observation settings. Multi agent deep…

Artificial Intelligence · Computer Science 2017-10-04 Xiangxiang Chu, Hangjun Ye

Policy Gradient with Self-Attention for Model-Free Distributed Nonlinear Multi-Agent Games

Multi-agent games in dynamic nonlinear settings are challenging due to the time-varying interactions among the agents and the non-stationarity of the (potential) Nash equilibria. In this paper we consider model-free games, where agent…

Systems and Control · Electrical Eng. & Systems 2025-09-24 Eduardo Sebastián, Maitrayee Keskar, Eeman Iqbal, Eduardo Montijano, Carlos Sagüés, Nikolay Atanasov

Sample-Efficient Distributionally Robust Multi-Agent Reinforcement Learning via Online Interaction

Well-trained multi-agent systems can fail when deployed in real-world environments due to model mismatches between the training and deployment environments, caused by environment uncertainties including noise or adversarial attacks.…

Machine Learning · Computer Science 2026-03-03 Zain Ulabedeen Farhat, Debamita Ghosh, George K. Atia, Yue Wang

$K$-Level Policy Gradients for Multi-Agent Reinforcement Learning

Actor-critic algorithms for deep multi-agent reinforcement learning (MARL) typically employ a policy update that responds to the current strategies of other agents. While being straightforward, this approach does not account for the updates…

Machine Learning · Computer Science 2025-09-16 Aryaman Reddi, Gabriele Tiboni, Jan Peters, Carlo D'Eramo

Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games

Potential games are arguably one of the most important and widely studied classes of normal form games. They define the archetypal setting of multi-agent coordination as all agent utilities are perfectly aligned with each other via a common…

Machine Learning · Computer Science 2025-09-24 Stefanos Leonardos, Will Overman, Ioannis Panageas, Georgios Piliouras

Consolidation via Policy Information Regularization in Deep RL for Multi-Agent Games

This paper introduces an information-theoretic constraint on learned policy complexity in the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) reinforcement learning algorithm. Previous research with a related approach in continuous…

Artificial Intelligence · Computer Science 2025-05-16 Tailia Malloy, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro, Chris R. Sims

Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models

In most classical Autonomous Vehicle (AV) stacks, the prediction and planning layers are separated, limiting the planner to react to predictions that are not informed by the planned trajectory of the AV. This work presents a module that…

Robotics · Computer Science 2022-04-06 Jose L. Vazquez, Alexander Liniger, Wilko Schwarting, Daniela Rus, Luc Van Gool

Policy Gradient Methods for Non-Markovian Reinforcement Learning

We study policy gradient methods for reinforcement learning in non-Markovian decision processes (NMDPs), where observations and rewards depend on the entire interaction history. To handle this dependence, the agent maintains an internal…

Machine Learning · Computer Science 2026-05-12 Avik Kar, Siddharth Chandak, Rahul Singh, Soumitra Sinhahajari, Eric Moulines, Shalabh Bhatnagar, Nicholas Bambos

Empirical Policy Optimization for $n$-Player Markov Games

In single-agent Markov decision processes, an agent can optimize its policy based on the interaction with environment. In multi-player Markov games (MGs), however, the interaction is non-stationary due to the behaviors of other players, so…

Computer Science and Game Theory · Computer Science 2021-10-19 Yuanheng Zhu, Dongbin Zhao, Mengchen Zhao, Dong Li

AoI-Aware Resource Allocation with Deep Reinforcement Learning for HAPS-V2X Networks

Sixth-generation (6G) networks are designed to meet the hyper-reliable and low-latency communication (HRLLC) requirements of safety-critical applications such as autonomous driving. Integrating non-terrestrial networks (NTN) into the 6G…

Networking and Internet Architecture · Computer Science 2025-08-04 Ahmet Melih Ince, Ayse Elif Canbilen, Halim Yanikomeroglu

Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence

Multi-agent interactions are increasingly important in the context of reinforcement learning, and the theoretical foundations of policy gradient methods have attracted surging research interest. We investigate the global convergence of…

Optimization and Control · Mathematics 2023-03-21 Sarath Pattathil, Kaiqing Zhang, Asuman Ozdaglar

A Digital Twin-based Multi-Agent Reinforcement Learning Framework for Vehicle-to-Grid Coordination

The coordination of large-scale, decentralised systems, such as a fleet of Electric Vehicles (EVs) in a Vehicle-to-Grid (V2G) network, presents a significant challenge for modern control systems. While collaborative Digital Twins have been…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-01 Zhengchang Hua, Panagiotis Oikonomou, Karim Djemame, Nikos Tziritas, Georgios Theodoropoulos

Distributed Differential Graphical Game for Control of Double-Integrator Multi-Agent Systems with Input Delay

This paper studies cooperative control of noncooperative double-integrator multi-agent systems (MASs) with input delay on connected directed graphs in the context of a differential graphical game (DGG). In the distributed DGG, each agent…

Systems and Control · Electrical Eng. & Systems 2024-03-04 Hossein B. Jond

Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward

Many cooperative multi-agent problems require agents to learn individual tasks while contributing to the collective success of the group. This is a challenging task for current state-of-the-art multi-agent reinforcement algorithms that are…

Multiagent Systems · Computer Science 2020-03-25 Hassam Ullah Sheikh, Ladislau Bölöni

Edge Delayed Deep Deterministic Policy Gradient: efficient continuous control for edge scenarios

Deep Reinforcement Learning is gaining increasing attention thanks to its capability to learn complex policies in high-dimensional settings. Recent advancements utilize a dual-network architecture to learn optimal policies through the…

Machine Learning · Computer Science 2025-10-14 Alberto Sinigaglia, Niccolò Turcato, Ruggero Carli, Gian Antonio Susto

Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning

We propose a fully distributed actor-critic algorithm approximated by deep neural networks, named Diff-DAC, with application to single-task and to average multitask reinforcement learning (MRL). Each agent has access to data from…

Machine Learning · Computer Science 2020-10-27 Sergio Valcarcel Macua, Aleksi Tukiainen, Daniel García-Ocaña Hernández, David Baldazo, Enrique Munoz de Cote, Santiago Zazo