Related papers: Approximate Dec-POMDP Solving Using Multi-Agent A*

MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs

We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially-observable Markov decision problems (DEC-POMDPs) with finite horizon. The algorithm is suitable for computing…

Artificial Intelligence · Computer Science 2012-07-09 Daniel Szer , Francois Charpillet , Shlomo Zilberstein

Anytime Planning for Decentralized POMDPs using Expectation Maximization

Decentralized POMDPs provide an expressive framework for multi-agent sequential decision making. While finite-horizon DEC-POMDPs have enjoyed significant success, progress remains slow for the infinite-horizon case mainly due to the inherent…

Artificial Intelligence · Computer Science 2012-03-19 Akshat Kumar , Shlomo Zilberstein

My Brain is Full: When More Memory Helps

We consider the problem of finding good finite-horizon policies for POMDPs under the expected reward metric. The policies considered are free finite-memory policies with limited memory; a policy is a mapping from the space of…

Artificial Intelligence · Computer Science 2013-01-30 Christopher Lusena , Tong Li , Shelia Sittinger , Chris Wells , Judy Goldsmith

Distribution over Beliefs for Memory Bounded Dec-POMDP Planning

We propose a new point-based method for approximate planning in Dec-POMDP which outperforms the state-of-the-art approaches in terms of solution quality. It uses a heuristic estimation of the prior probability of beliefs to choose a bounded…

Artificial Intelligence · Computer Science 2012-03-19 Gabriel Corona , Francois Charpillet

Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs

This article presents the state-of-the-art in optimal solution methods for decentralized partially observable Markov decision processes (Dec-POMDPs), which are general models for collaborative multiagent planning under uncertainty. Building…

Artificial Intelligence · Computer Science 2014-02-05 Frans Adriaan Oliehoek , Matthijs T. J. Spaan , Christopher Amato , Shimon Whiteson

Scaling Up Decentralized MDPs Through Heuristic Search

Decentralized partially observable Markov decision processes (Dec-POMDPs) are rich models for cooperative decision-making under uncertainty, but are often intractable to solve optimally (NEXP-complete). The transition and observation…

Artificial Intelligence · Computer Science 2012-10-19 Jilles S. Dibangoye , Christopher Amato , Arnoud Doniec

Multiagent Rollout and Policy Iteration for POMDP with Application to Multi-Robot Repair Problems

In this paper we consider infinite horizon discounted dynamic programming problems with finite state and control spaces, partial state observations, and a multiagent structure. We discuss and compare algorithms that simultaneously or…

Robotics · Computer Science 2020-11-10 Sushmita Bhattacharya , Siva Kailas , Sahil Badyal , Stephanie Gil , Dimitri Bertsekas

Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs

Memory-Bounded Dynamic Programming (MBDP) has proved extremely effective in solving decentralized POMDPs with large horizons. We generalize the algorithm and improve its scalability by reducing the complexity with respect to the number of…

Artificial Intelligence · Computer Science 2012-06-26 Sven Seuken , Shlomo Zilberstein

Risk-seeking conservative policy iteration with agent-state based policies for Dec-POMDPs with guaranteed convergence

Optimally solving decentralized decision-making problems modeled as Dec-POMDPs is known to be NEXP-complete. These optimal solutions are policies based on the entire history of observations and actions of an agent. However, some…

Multiagent Systems · Computer Science 2026-04-13 Amit Sinha , Matthieu Geist , Aditya Mahajan

Optimizing Memory-Bounded Controllers for Decentralized POMDPs

We present a memory-bounded optimization approach for solving infinite-horizon decentralized POMDPs. Policies for each agent are represented by stochastic finite state controllers. We formulate the problem of optimizing these policies as a…

Artificial Intelligence · Computer Science 2012-06-26 Christopher Amato , Daniel S Bernstein , Shlomo Zilberstein

Policy Iteration for Decentralized Control of Markov Decision Processes

Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov…

Artificial Intelligence · Computer Science 2014-01-16 Daniel S. Bernstein , Christopher Amato , Eric A. Hansen , Shlomo Zilberstein

Mixed Integer Linear Programming For Exact Finite-Horizon Planning In Decentralized Pomdps

We consider the problem of finding an n-agent joint-policy for the optimal finite-horizon control of a decentralized Pomdp (Dec-Pomdp). This is a problem of very high complexity (NEXP-hard in n >= 2). In this paper, we propose a new…

Artificial Intelligence · Computer Science 2016-03-28 Raghav Aras , Alain Dutech , François Charpillet

An Investigation into Mathematical Programming for Finite Horizon Decentralized POMDPs

Decentralized planning in uncertain environments is a complex task generally dealt with by using a decision-theoretic approach, mainly through the framework of Decentralized Partially Observable Markov Decision Processes (DEC-POMDPs).…

Artificial Intelligence · Computer Science 2014-01-17 Raghav Aras , Alain Dutech

Rollout Sampling Policy Iteration for Decentralized POMDPs

We present decentralized rollout sampling policy iteration (DecRSPI) - a new algorithm for multi-agent decision problems formalized as DEC-POMDPs. DecRSPI is designed to improve scalability and tackle problems that lack an explicit model.…

Artificial Intelligence · Computer Science 2012-03-19 Feng Wu , Shlomo Zilberstein , Xiaoping Chen

Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration with Application to Autonomous Sequential Repair Problems

In this paper we consider infinite horizon discounted dynamic programming problems with finite state and control spaces, and partial state observations. We discuss an algorithm that uses multistep lookahead, truncated rollout with a known…

Robotics · Computer Science 2020-02-12 Sushmita Bhattacharya , Sahil Badyal , Thomas Wheeler , Stephanie Gil , Dimitri Bertsekas

Optimal and Approximate Q-value Functions for Decentralized POMDPs

Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out…

Artificial Intelligence · Computer Science 2011-11-02 Frans A. Oliehoek , Matthijs T. J. Spaan , Nikos Vlassis

Policy Gradient With Value Function Approximation For Collective Multiagent Planning

Decentralized (PO)MDPs provide an expressive framework for sequential decision making in a multiagent system. Given their computational complexity, recent research has focused on tractable yet practical subclasses of Dec-POMDPs. We address…

Artificial Intelligence · Computer Science 2018-04-11 Duc Thien Nguyen , Akshat Kumar , Hoong Chuin Lau

Scalable Solution Methods for Dec-POMDPs with Deterministic Dynamics

Many high-level multi-agent planning problems, including multi-robot navigation and path planning, can be effectively modeled using deterministic actions and observations. In this work, we focus on such domains and introduce the class of…

Artificial Intelligence · Computer Science 2025-09-01 Yang You , Alex Schutz , Zhikun Li , Bruno Lacerda , Robert Skilton , Nick Hawes

The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs

We consider the problem of finding the best memoryless stochastic policy for an infinite-horizon partially observable Markov decision process (POMDP) with finite state and action spaces with respect to either the discounted or mean reward…

Optimization and Control · Mathematics 2022-05-02 Johannes Müller , Guido Montúfar

Multi-Environment POMDPs with Finite-Horizon Objectives

Partially Observable Markov Decision Processes (POMDPs) are systems in which one agent interacts with a stochastic environment, and receives only partial information about the current state. In a multi-environment POMDP (MEPOMDP), the…

Artificial Intelligence · Computer Science 2026-05-11 Léonard Brice , Filip Cano , Krishnendu Chatterjee , Thomas A. Henzinger , Stefanie Muroya