Related papers: Multiagent Value Iteration Algorithms in Dynamic P…

Multiagent Rollout Algorithms and Reinforcement Learning

We consider finite and infinite horizon dynamic programming problems, where the control at each stage consists of several distinct decisions, each one made by one of several agents. We introduce an approach, whereby at every stage, each…

Machine Learning · Computer Science 2020-04-15 Dimitri Bertsekas

Value and Policy Iteration in Optimal Control and Adaptive Dynamic Programming

In this paper, we consider discrete-time infinite horizon problems of optimal control to a terminal set of states. These are the problems that are often taken as the starting point for adaptive dynamic programming. Under very general…

Systems and Control · Computer Science 2015-10-05 Dimitri P. Bertsekas

Multi-agent Hierarchical Reinforcement Learning with Dynamic Termination

In a multi-agent system, an agent's optimal policy will typically depend on the policies chosen by others. Therefore, a key issue in multi-agent systems research is that of predicting the behaviours of others, and responding promptly to…

Multiagent Systems · Computer Science 2019-10-22 Dongge Han , Wendelin Boehmer , Michael Wooldridge , Alex Rogers

On-Line Policy Iteration for Infinite Horizon Dynamic Programming

In this paper we propose an on-line policy iteration (PI) algorithm for finite-state infinite horizon discounted dynamic programming, whereby the policy improvement operation is done on-line, only for the states that are encountered during…

Optimization and Control · Mathematics 2021-06-03 Dimitri Bertsekas

Multiagent Rollout and Policy Iteration for POMDP with Application to Multi-Robot Repair Problems

In this paper we consider infinite horizon discounted dynamic programming problems with finite state and control spaces, partial state observations, and a multiagent structure. We discuss and compare algorithms that simultaneously or…

Robotics · Computer Science 2020-11-10 Sushmita Bhattacharya , Siva Kailas , Sahil Badyal , Stephanie Gil , Dimitri Bertsekas

Approximate Linear Programming for Decentralized Policy Iteration in Cooperative Multi-agent Markov Decision Processes

In this work, we consider a cooperative multi-agent Markov decision process (MDP) involving m agents. At each decision epoch, all the m agents independently select actions in order to maximize a common long-term objective. In the policy…

Machine Learning · Computer Science 2024-05-01 Lakshmi Mandal , Chandrashekar Lakshminarayanan , Shalabh Bhatnagar

Generalized Dual Dynamic Programming for Infinite Horizon Problems in Continuous State and Action Spaces

We describe a nonlinear generalization of dual dynamic programming theory and its application to value function estimation for deterministic control problems over continuous state and action spaces, in a discrete-time infinite horizon…

Optimization and Control · Mathematics 2018-10-05 Joseph Warrington , Paul N. Beuchat , John Lygeros

Convergence Analysis of Policy Iteration

Adaptive optimal control of nonlinear dynamic systems with deterministic and known dynamics under a known undiscounted infinite-horizon cost function is investigated. Policy iteration scheme initiated using a stabilizing initial control is…

Systems and Control · Computer Science 2015-05-21 Ali Heydari

Scalable Centralized Deep Multi-Agent Reinforcement Learning via Policy Gradients

In this paper, we explore using deep reinforcement learning for problems with multiple agents. Most existing methods for deep multi-agent reinforcement learning consider only a small number of agents. When the number of agents increases,…

Machine Learning · Computer Science 2018-05-24 Arbaaz Khan , Clark Zhang , Daniel D. Lee , Vijay Kumar , Alejandro Ribeiro

Dynamic Programming Through the Lens of Semismooth Newton-Type Methods (Extended Version)

Policy iteration and value iteration are at the core of many (approximate) dynamic programming methods. For Markov Decision Processes with finite state and action spaces, we show that they are instances of semismooth Newton-type methods to…

Optimization and Control · Mathematics 2022-06-28 Matilde Gargiani , Andrea Zanelli , Dominic Liao-McPherson , Tyler Summers , John Lygeros

An Efficient Policy Iteration Algorithm for Dynamic Programming Equations

We present an accelerated algorithm for the solution of static Hamilton-Jacobi-Bellman equations related to optimal control problems. Our scheme is based on a classic policy iteration procedure, which is known to have superlinear…

Optimization and Control · Mathematics 2016-02-22 Alessandro Alla , Maurizio Falcone , Dante Kalise

Reward-Reinforced Reinforcement Learning for Multi-agent Systems

Reinforcement learning algorithms in multi-agent systems deliver highly resilient and adaptable solutions for common problems in telecommunications,aerospace, and industrial robotics. However, achieving an optimal global goal remains a…

Multiagent Systems · Computer Science 2021-05-18 Changgang Zheng , Shufan Yang , Juan Parra-Ullauri , Antonio Garcia-Dominguez , Nelly Bencomo

Finite-Time Analysis of Distributed TD(0) with Linear Function Approximation for Multi-Agent Reinforcement Learning

We study the policy evaluation problem in multi-agent reinforcement learning. In this problem, a group of agents works cooperatively to evaluate the value function for the global discounted accumulative reward problem, which is composed of…

Optimization and Control · Mathematics 2019-06-04 Thinh T. Doan , Siva Theja Maguluri , Justin Romberg

Influencing Long-Term Behavior in Multiagent Reinforcement Learning

The main challenge of multiagent reinforcement learning is the difficulty of learning useful policies in the presence of other simultaneously learning agents whose changing behaviors jointly affect the environment's transition and reward…

Machine Learning · Computer Science 2022-10-18 Dong-Ki Kim , Matthew Riemer , Miao Liu , Jakob N. Foerster , Michael Everett , Chuangchuang Sun , Gerald Tesauro , Jonathan P. How

Regular Policies in Abstract Dynamic Programming

We consider challenging dynamic programming models where the associated Bellman equation, and the value and policy iteration algorithms commonly exhibit complex and even pathological behavior. Our analysis is based on the new notion of…

Optimization and Control · Mathematics 2016-09-13 Dimitri P. Bertsekas

Multi-agent Multi-target Path Planning in Markov Decision Processes

Missions for autonomous systems often require agents to visit multiple targets in complex operating conditions. This work considers the problem of visiting a set of targets in minimum time by a team of non-communicating agents in a Markov…

Optimization and Control · Mathematics 2023-06-21 Farhad Nawaz , Melkior Ornik

Fast Multi-Agent Temporal-Difference Learning via Homotopy Stochastic Primal-Dual Optimization

We study the policy evaluation problem in multi-agent reinforcement learning where a group of agents, with jointly observed states and private local actions and rewards, collaborate to learn the value function of a given policy via local…

Optimization and Control · Mathematics 2021-11-08 Dongsheng Ding , Xiaohan Wei , Zhuoran Yang , Zhaoran Wang , Mihailo R. Jovanović

Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes

The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems. Since macro-actions last for stochastic…

Artificial Intelligence · Computer Science 2019-05-30 Kunal Menda , Yi-Chun Chen , Justin Grana , James W. Bono , Brendan D. Tracey , Mykel J. Kochenderfer , David Wolpert

Multi-Agent Fully Decentralized Value Function Learning with Linear Convergence Rates

This work develops a fully decentralized multi-agent algorithm for policy evaluation. The proposed scheme can be applied to two distinct scenarios. In the first scenario, a collection of agents have distinct datasets gathered following…

Machine Learning · Computer Science 2019-08-13 Lucas Cassano , Kun Yuan , Ali H. Sayed

Policy Iteration for Decentralized Control of Markov Decision Processes

Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov…

Artificial Intelligence · Computer Science 2014-01-16 Daniel S. Bernstein , Christopher Amato , Eric A. Hansen , Shlomo Zilberstein