Related papers: Improved Memory-Bounded Dynamic Programming for De…

Optimizing Memory-Bounded Controllers for Decentralized POMDPs

We present a memory-bounded optimization approach for solving infinite-horizon decentralized POMDPs. Policies for each agent are represented by stochastic finite state controllers. We formulate the problem of optimizing these policies as a…

Artificial Intelligence · Computer Science 2012-06-26 Christopher Amato , Daniel S Bernstein , Shlomo Zilberstein

An Investigation into Mathematical Programming for Finite Horizon Decentralized POMDPs

Decentralized planning in uncertain environments is a complex task generally dealt with by using a decision-theoretic approach, mainly through the framework of Decentralized Partially Observable Markov Decision Processes (DEC-POMDPs).…

Artificial Intelligence · Computer Science 2014-01-17 Raghav Aras , Alain Dutech

Anytime Planning for Decentralized POMDPs using Expectation Maximization

Decentralized POMDPs provide an expressive framework for multi-agent sequential decision making. While fnite-horizon DECPOMDPs have enjoyed signifcant success, progress remains slow for the infnite-horizon case mainly due to the inherent…

Artificial Intelligence · Computer Science 2012-03-19 Akshat Kumar , Shlomo Zilberstein

Scaling Up Decentralized MDPs Through Heuristic Search

Decentralized partially observable Markov decision processes (Dec-POMDPs) are rich models for cooperative decision-making under uncertainty, but are often intractable to solve optimally (NEXP-complete). The transition and observation…

Artificial Intelligence · Computer Science 2012-10-19 Jilles S. Dibangoye , Christopher Amato , Arnoud Doniec

Scaling Up Robust MDPs by Reinforcement Learning

We consider large-scale Markov decision processes (MDPs) with parameter uncertainty, under the robust MDP paradigm. Previous studies showed that robust MDPs, based on a minimax approach to handle uncertainty, can be solved using dynamic…

Machine Learning · Computer Science 2013-06-27 Aviv Tamar , Huan Xu , Shie Mannor

Approximate Dynamic Programming By Minimizing Distributionally Robust Bounds

Approximate dynamic programming is a popular method for solving large Markov decision processes. This paper describes a new class of approximate dynamic programming (ADP) methods- distributionally robust ADP-that address the curse of…

Machine Learning · Statistics 2012-05-22 Marek Petrik

Towards Scalable Semidefinite Programming: Optimal Metric ADMM with A Worst-case Performance Guarantee

Despite the numerous uses of semidefinite programming (SDP) and its universal solvability via interior point methods (IPMs), it is rarely applied to practical large-scale problems. This mainly owes to the computational cost of IPMs that…

Optimization and Control · Mathematics 2024-03-19 Yifan Ran , Stefan Vlaski , Wei Dai

Strengthening Deterministic Policies for POMDPs

The synthesis problem for partially observable Markov decision processes (POMDPs) is to compute a policy that satisfies a given specification. Such policies have to take the full execution history of a POMDP into account, rendering the…

Artificial Intelligence · Computer Science 2020-07-20 Leonore Winterer , Ralf Wimmer , Nils Jansen , Bernd Becker

PODDP: Partially Observable Differential Dynamic Programming for Latent Belief Space Planning

Autonomous agents are limited in their ability to observe the world state. Partially observable Markov decision processes (POMDPs) formally model the problem of planning under world state uncertainty, but POMDPs with continuous actions and…

Robotics · Computer Science 2020-07-08 Dicong Qiu , Yibiao Zhao , Chris L. Baker

Linear Programming for Decision Processes with Partial Information

Markov Decision Processes (MDPs) are stochastic optimization problems that model situations where a decision maker controls a system based on its state. Partially observed Markov decision processes (POMDPs) are generalizations of MDPs where…

Optimization and Control · Mathematics 2019-03-26 Victor Cohen , Axel Parmentier

The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs

We consider the problem of finding the best memoryless stochastic policy for an infinite-horizon partially observable Markov decision process (POMDP) with finite state and action spaces with respect to either the discounted or mean reward…

Optimization and Control · Mathematics 2022-05-02 Johannes Müller , Guido Montúfar

Approximate Constrained Discounted Dynamic Programming with Uniform Feasibility and Optimality

We consider a dynamic programming (DP) approach to approximately solving an infinite-horizon constrained Markov decision process (CMDP) problem with a fixed initial-state for the expected total discounted-reward criterion with a…

Optimization and Control · Mathematics 2023-08-08 Hyeong Soo Chang

Scalable Solution Methods for Dec-POMDPs with Deterministic Dynamics

Many high-level multi-agent planning problems, including multi-robot navigation and path planning, can be effectively modeled using deterministic actions and observations. In this work, we focus on such domains and introduce the class of…

Artificial Intelligence · Computer Science 2025-09-01 Yang You , Alex Schutz , Zhikun Li , Bruno Lacerda , Robert Skilton , Nick Hawes

Geometry and Determinism of Optimal Stationary Control in Partially Observable Markov Decision Processes

It is well known that for any finite state Markov decision process (MDP) there is a memoryless deterministic policy that maximizes the expected reward. For partially observable Markov decision processes (POMDPs), optimal memoryless policies…

Optimization and Control · Mathematics 2016-02-16 Guido Montufar , Keyan Ghazi-Zahedi , Nihat Ay

B$^3$RTDP: A Belief Branch and Bound Real-Time Dynamic Programming Approach to Solving POMDPs

Partially Observable Markov Decision Processes (POMDPs) offer a promising world representation for autonomous agents, as they can model both transitional and perceptual uncertainties. Calculating the optimal solution to POMDP problems can…

Artificial Intelligence · Computer Science 2022-10-25 Sigurdur Orn Adalgeirsson , Cynthia Breazeal

Decentralized Control of Partially Observable Markov Decision Processes using Belief Space Macro-actions

The focus of this paper is on solving multi-robot planning problems in continuous spaces with partial observability. Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for multi-robot coordination…

Multiagent Systems · Computer Science 2015-02-24 Shayegan Omidshafiei , Ali-akbar Agha-mohammadi , Christopher Amato , Jonathan P. How

Scaling Internal-State Policy-Gradient Methods for POMDPs

Policy-gradient methods have received increased attention recently as a mechanism for learning to act in partially observable environments. They have shown promise for problems admitting memoryless policies but have been less successful…

Machine Learning · Computer Science 2025-12-08 Douglas Aberdeen , Jonathan Baxter

A Low-Complexity Multi-Survivor Dynamic Programming for Constrained Discrete Optimization

Constrained discrete optimization problems are encountered in many areas of communication and machine learning. We consider the case where the objective function satisfies Bellman's optimality principle without the constraints on which we…

Optimization and Control · Mathematics 2021-05-14 I. Zakir Ahmed , Hamid Sadjadpour , Shahram Yousefi

Solving Truly Massive Budgeted Monotonic POMDPs with Oracle-Guided Meta-Reinforcement Learning

Monotonic Partially Observable Markov Decision Processes (POMDPs), where the system state progressively decreases until a restorative action is performed, can be used to model sequential repair problems effectively. This paper considers the…

Machine Learning · Computer Science 2025-09-17 Manav Vora , Jonas Liang , Michael N. Grussing , Melkior Ornik

Region-Based Incremental Pruning for POMDPs

We present a major improvement to the incremental pruning algorithm for solving partially observable Markov decision processes. Our technique targets the cross-sum step of the dynamic programming (DP) update, a key source of complexity in…

Artificial Intelligence · Computer Science 2012-07-19 Zhengzhu Feng , Shlomo Zilberstein