Related papers: Policy Gradient With Value Function Approximation …

Multi-agent active perception with prediction rewards

Multi-agent active perception is a task where a team of agents cooperatively gathers observations to compute a joint estimate of a hidden variable. The task is decentralized and the joint estimate can only be computed after the task ends by…

Artificial Intelligence · Computer Science 2020-10-24 Mikko Lauri, Frans A. Oliehoek

Macro-Action-Based Multi-Agent/Robot Deep Reinforcement Learning under Partial Observability

The state-of-the-art multi-agent reinforcement learning (MARL) methods have provided promising solutions to a variety of complex problems. Yet, these methods all assume that agents perform synchronized primitive-action executions so that…

Artificial Intelligence · Computer Science 2022-10-12 Yuchen Xiao

Policy Iteration for Decentralized Control of Markov Decision Processes

Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov…

Artificial Intelligence · Computer Science 2014-01-16 Daniel S. Bernstein, Christopher Amato, Eric A. Hansen, Shlomo Zilberstein

Centralized Model and Exploration Policy for Multi-Agent RL

Reinforcement learning (RL) in partially observable, fully cooperative multi-agent settings (Dec-POMDPs) can in principle be used to address many real-world challenges such as controlling a swarm of rescue robots or a team of quadcopters.…

Artificial Intelligence · Computer Science 2022-02-08 Qizhen Zhang, Chris Lu, Animesh Garg, Jakob Foerster

Planning for Decentralized Control of Multiple Robots Under Uncertainty

We describe a probabilistic framework for synthesizing control policies for general multi-robot systems, given environment and sensor models and a cost function. Decentralized, partially observable Markov decision processes (Dec-POMDPs) are…

Robotics · Computer Science 2014-02-13 Christopher Amato, George D. Konidaris, Gabriel Cruz, Christopher A. Maynor, Jonathan P. How, Leslie P. Kaelbling

Macro-Action-Based Deep Multi-Agent Reinforcement Learning

In real-world multi-robot systems, performing high-quality, collaborative behaviors requires robots to asynchronously reason about high-level action selection at varying time durations. Macro-Action Decentralized Partially Observable Markov…

Machine Learning · Computer Science 2021-10-19 Yuchen Xiao, Joshua Hoffman, Christopher Amato

Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward

Many cooperative multi-agent problems require agents to learn individual tasks while contributing to the collective success of the group. This is a challenging task for current state-of-the-art multi-agent reinforcement algorithms that are…

Multiagent Systems · Computer Science 2020-03-25 Hassam Ullah Sheikh, Ladislau Bölöni

Information Gathering in Decentralized POMDPs by Policy Graph Improvement

Decentralized policies for information gathering are required when multiple autonomous agents are deployed to collect data about a phenomenon of interest without the ability to communicate. Decentralized partially observable Markov decision…

Artificial Intelligence · Computer Science 2019-02-27 Mikko Lauri, Joni Pajarinen, Jan Peters

Decentralized Multi-Agent Reinforcement Learning: An Off-Policy Method

We discuss the problem of decentralized multi-agent reinforcement learning (MARL) in this work. In our setting, the global state, action, and reward are assumed to be fully observable, while the local policy is protected as privacy by each…

Multiagent Systems · Computer Science 2021-11-02 Kuo Li, Qing-Shan Jia

Approximate Linear Programming for Decentralized Policy Iteration in Cooperative Multi-agent Markov Decision Processes

In this work, we consider a cooperative multi-agent Markov decision process (MDP) involving m agents. At each decision epoch, all the m agents independently select actions in order to maximize a common long-term objective. In the policy…

Machine Learning · Computer Science 2024-05-01 Lakshmi Mandal, Chandrashekar Lakshminarayanan, Shalabh Bhatnagar

Scalable Solution Methods for Dec-POMDPs with Deterministic Dynamics

Many high-level multi-agent planning problems, including multi-robot navigation and path planning, can be effectively modeled using deterministic actions and observations. In this work, we focus on such domains and introduce the class of…

Artificial Intelligence · Computer Science 2025-09-01 Yang You, Alex Schutz, Zhikun Li, Bruno Lacerda, Robert Skilton, Nick Hawes

Option-Critic in Cooperative Multi-agent Systems

In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems, using the options framework (Sutton et al., 1999). First, we address the planning problem for the decentralized POMDP represented by the…

Artificial Intelligence · Computer Science 2020-03-23 Jhelum Chakravorty, Nadeem Ward, Julien Roy, Maxime Chevalier-Boisvert, Sumana Basu, Andrei Lupu, Doina Precup

Multi-Agent Learning of Numerical Methods for Hyperbolic PDEs with Factored Dec-MDP

Factored decentralized Markov decision process (Dec-MDP) is a framework for modeling sequential decision making problems in multi-agent systems. In this paper, we formalize the learning of numerical methods for hyperbolic partial…

Machine Learning · Computer Science 2022-10-17 Yiwei Fu, Dheeraj S. K. Kapilavai, Elliot Way

Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes

The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems. Since macro-actions last for stochastic…

Artificial Intelligence · Computer Science 2019-05-30 Kunal Menda, Yi-Chun Chen, Justin Grana, James W. Bono, Brendan D. Tracey, Mykel J. Kochenderfer, David Wolpert

Decentralized Natural Policy Gradient with Variance Reduction for Collaborative Multi-Agent Reinforcement Learning

This paper studies a policy optimization problem arising from collaborative multi-agent reinforcement learning in a decentralized setting where agents communicate with their neighbors over an undirected graph to maximize the sum of their…

Optimization and Control · Mathematics 2022-09-07 Jinchi Chen, Jie Feng, Weiguo Gao, Ke Wei

Fully Decentralized Cooperative Multi-Agent Reinforcement Learning is A Context Modeling Problem

This paper studies fully decentralized cooperative multi-agent reinforcement learning, where each agent solely observes the states, its local actions, and the shared rewards. The inability to access other agents' actions often leads to…

Machine Learning · Computer Science 2026-05-12 Chao Li, Bingkun Bao, Yang Gao

Learning to Collaborate: Multi-Scenario Ranking via Multi-Agent Reinforcement Learning

Ranking is a fundamental and widely studied problem in scenarios such as search, advertising, and recommendation. However, joint optimization for multi-scenario ranking, which aims to improve the overall performance of several ranking…

Artificial Intelligence · Computer Science 2018-09-18 Jun Feng, Heng Li, Minlie Huang, Shichen Liu, Wenwu Ou, Zhirong Wang, Xiaoyan Zhu

Multi-Agent Fully Decentralized Value Function Learning with Linear Convergence Rates

This work develops a fully decentralized multi-agent algorithm for policy evaluation. The proposed scheme can be applied to two distinct scenarios. In the first scenario, a collection of agents have distinct datasets gathered following…

Machine Learning · Computer Science 2019-08-13 Lucas Cassano, Kun Yuan, Ali H. Sayed

Off-Policy Multi-Agent Decomposed Policy Gradients

Multi-agent policy gradient (MAPG) methods recently witness vigorous progress. However, there is a significant performance discrepancy between MAPG methods and state-of-the-art multi-agent value-based approaches. In this paper, we…

Machine Learning · Computer Science 2020-10-06 Yihan Wang, Beining Han, Tonghan Wang, Heng Dong, Chongjie Zhang

Efficient Multiagent Planning via Shared Action Suggestions

Decentralized partially observable Markov decision processes with communication (Dec-POMDP-Com) provide a framework for multiagent decision making under uncertainty, but the NEXP-complete complexity for finite-horizon problems renders…

Multiagent Systems · Computer Science 2025-11-18 Dylan M. Asmar, Mykel J. Kochenderfer