Related papers: Influence-Optimistic Local Values for Multiagent P…

Factored Online Planning in Many-Agent POMDPs

In centralized multi-agent systems, often modeled as multi-agent partially observable Markov decision processes (MPOMDPs), the action and observation spaces grow exponentially with the number of agents, making the value and belief…

Artificial Intelligence · Computer Science 2024-02-26 Maris F. L. Galesloot, Thiago D. Simão, Sebastian Junges, Nils Jansen

Decentralized Planning Using Probabilistic Hyperproperties

Multi-agent planning under stochastic dynamics is usually formalised using decentralized (partially observable) Markov decision processes (MDPs) and reachability or expected reward specifications. In this paper, we propose a different…

Logic in Computer Science · Computer Science 2025-02-20 Francesco Pontiggia, Filip Macák, Roman Andriushchenko, Michele Chiari, Milan Češka

Scaling Up Decentralized MDPs Through Heuristic Search

Decentralized partially observable Markov decision processes (Dec-POMDPs) are rich models for cooperative decision-making under uncertainty, but are often intractable to solve optimally (NEXP-complete). The transition and observation…

Artificial Intelligence · Computer Science 2012-10-19 Jilles S. Dibangoye, Christopher Amato, Arnoud Doniec

Myopic Policy Bounds for Information Acquisition POMDPs

This paper addresses the problem of optimal control of robotic sensing systems aimed at autonomous information gathering in scenarios such as environmental monitoring, search and rescue, and surveillance and reconnaissance. The information…

Systems and Control · Computer Science 2016-01-28 Mikko Lauri, Nikolay Atanasov, George J. Pappas, Risto Ritala

Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives

Partially Observable Markov Decision Processes (POMDPs) are powerful models for sequential decision making under transition and observation uncertainties. This paper studies the challenging yet important problem in POMDPs known as the…

Artificial Intelligence · Computer Science 2024-06-06 Qi Heng Ho, Martin S. Feather, Federico Rossi, Zachary N. Sunberg, Morteza Lahijanian

Point-Based Methods for Model Checking in Partially Observable Markov Decision Processes

Autonomous systems are often required to operate in partially observable environments. They must reliably execute a specified objective even with incomplete information about the state of the environment. We propose a methodology to…

Artificial Intelligence · Computer Science 2020-01-14 Maxime Bouton, Jana Tumova, Mykel J. Kochenderfer

Improving Learnt Local MAPF Policies with Heuristic Search

Multi-agent path finding (MAPF) is the problem of finding collision-free paths for a team of agents to reach their goal locations. State-of-the-art classical MAPF solvers typically employ heuristic search to find solutions for hundreds of…

Multiagent Systems · Computer Science 2024-04-01 Rishi Veerapaneni, Qian Wang, Kevin Ren, Arthur Jakobsson, Jiaoyang Li, Maxim Likhachev

Approximate Linear Programming for Decentralized Policy Iteration in Cooperative Multi-agent Markov Decision Processes

In this work, we consider a cooperative multi-agent Markov decision process (MDP) involving m agents. At each decision epoch, all the m agents independently select actions in order to maximize a common long-term objective. In the policy…

Machine Learning · Computer Science 2024-05-01 Lakshmi Mandal, Chandrashekar Lakshminarayanan, Shalabh Bhatnagar

Online POMDP Planning with Anytime Deterministic Optimality Guarantees

Decision-making under uncertainty is a critical aspect of many practical autonomous systems due to incomplete information. Partially Observable Markov Decision Processes (POMDPs) offer a mathematically principled framework for formulating…

Artificial Intelligence · Computer Science 2025-10-28 Moran Barenboim, Vadim Indelman

Optimal Control of Logically Constrained Partially Observable and Multi-Agent Markov Decision Processes

Autonomous systems often have logical constraints arising, for example, from safety, operational, or regulatory requirements. Such constraints can be expressed using temporal logic specifications. The system state is often partially…

Artificial Intelligence · Computer Science 2024-06-21 Krishna C. Kalagarla, Dhruva Kartik, Dongming Shen, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

Best Policy Identification in Linear MDPs

We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed confidence setting under a generative model. We first derive an instance-specific lower bound on the expected number of…

Machine Learning · Computer Science 2022-08-12 Jerome Taupin, Yassir Jedra, Alexandre Proutiere

On the probabilistic feasibility of solutions in multi-agent optimization problems under uncertainty

We investigate the probabilistic feasibility of randomized solutions to two distinct classes of uncertain multi-agent optimization programs. We first assume that only the constraints of the program are affected by uncertainty, while the…

Optimization and Control · Mathematics 2020-09-29 George Pantazis, Filiberto Fele, Kostas Margellos

Optimal Control of Partially Observable Markov Decision Processes with Finite Linear Temporal Logic Constraints

Autonomous agents often operate in scenarios where the state is partially observed. In addition to maximizing their cumulative reward, agents must execute complex tasks with rich temporal and logical structures. These tasks can be expressed…

Systems and Control · Electrical Eng. & Systems 2022-03-18 Krishna C. Kalagarla, Dhruva Kartik, Dongming Shen, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

Policy optimization methods with function approximation are widely used in multi-agent reinforcement learning. However, it remains elusive how to design such algorithms with statistical guarantees. Leveraging a multi-agent performance…

Machine Learning · Computer Science 2023-05-09 Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D. Lee

Multiagent Rollout and Policy Iteration for POMDP with Application to Multi-Robot Repair Problems

In this paper we consider infinite horizon discounted dynamic programming problems with finite state and control spaces, partial state observations, and a multiagent structure. We discuss and compare algorithms that simultaneously or…

Robotics · Computer Science 2020-11-10 Sushmita Bhattacharya, Siva Kailas, Sahil Badyal, Stephanie Gil, Dimitri Bertsekas

Solving POMDPs by Searching the Space of Finite Policies

Solving partially observable Markov decision processes (POMDPs) is highly intractable in general, at least in part because the optimal policy may be infinitely large. In this paper, we explore the problem of finding the optimal policy from…

Artificial Intelligence · Computer Science 2013-01-30 Nicolas Meuleau, Kee-Eung Kim, Leslie Pack Kaelbling, Anthony R. Cassandra

Maximizing Expected Impact in an Agent Reputation Network -- Technical Report

Many multi-agent systems (MASs) are situated in stochastic environments. Some such systems that are based on the partially observable Markov decision process (POMDP) do not take the benevolence of other agents for granted. We propose a new…

Artificial Intelligence · Computer Science 2018-05-15 Gavin Rens, Abhaya Nayak, Thomas Meyer

Compositional Planning for Logically Constrained Multi-Agent Markov Decision Processes

Designing control policies for large, distributed systems is challenging, especially in the context of critical, temporal logic based specifications (e.g., safety) that must be met with high probability. Compositional methods for such…

Systems and Control · Electrical Eng. & Systems 2024-10-08 Krishna C. Kalagarla, Matthew Low, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

Strong Polynomiality of the Value Iteration Algorithm for Computing Nearly Optimal Policies for Discounted Dynamic Programming

This note provides upper bounds on the number of operations required to compute by value iterations a nearly optimal policy for an infinite-horizon discounted Markov decision process with a finite number of states and actions. For a given…

Optimization and Control · Mathematics 2020-01-29 Eugene A. Feinberg, Gaojin He

Multi-agent Reach-avoid MDP via Potential Games and Low-rank Policy Structure

We optimize finite horizon multi-agent reach-avoid Markov decision process (MDP) via local feedback policies. The global feedback policy solution yields global optimality but its communication complexity, memory usage and computation…

Systems and Control · Electrical Eng. & Systems 2026-04-10 Adam Casselman, Abraham P. Vinod, Sarah H. Q. Li