Related papers: Multi-Agent Guided Policy Optimization

JointPPO: Diving Deeper into the Effectiveness of PPO in Multi-Agent Reinforcement Learning

While Centralized Training with Decentralized Execution (CTDE) has become the prevailing paradigm in Multi-Agent Reinforcement Learning (MARL), it may not be suitable for scenarios in which agents can fully communicate and share…

Multiagent Systems · Computer Science 2024-07-08 Chenxing Liu, Guizhong Liu

Rethinking Ratio-Based Trust Regions for Policy Optimization in Multi-Agent Reinforcement Learning

Centralized training with decentralized execution (CTDE) is a standard framework for cooperative multi-agent policy-gradient reinforcement learning, allowing agents to learn from joint information while acting from local observations.…

Machine Learning · Computer Science 2026-05-12 Chulabhaya Wijesundara, Andrea Baisero, Zhongheng Li, Gregory Castañón, Alan Carlin, Christopher Amato

Research on Multi-Agent Communication and Collaborative Decision-Making Based on Deep Reinforcement Learning

In a multi-agent environment, the mainstream way to alleviate non-stationarity is to adopt the Centralized Training with Decentralized Execution (CTDE) framework. This thesis is…

Multiagent Systems · Computer Science 2023-05-30 Zeng Da

Decentralized Policy Optimization

The study of decentralized learning, or independent learning, in cooperative multi-agent reinforcement learning has a history of decades. Recent empirical studies show that independent PPO (IPPO) can obtain good performance, close to or…

Machine Learning · Computer Science 2022-11-08 Kefan Su, Zongqing Lu

Model-Based Decentralized Policy Optimization

Decentralized policy optimization has been commonly used in cooperative multi-agent tasks. However, since all agents are updating their policies simultaneously, from the perspective of individual agents, the environment is non-stationary,…

Machine Learning · Computer Science 2023-02-17 Hao Luo, Jiechuan Jiang, Zongqing Lu

Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?

Centralized Training with Decentralized Execution (CTDE) has recently emerged as a popular framework for cooperative Multi-Agent Reinforcement Learning (MARL), where agents can use additional global state information to guide training in a…

Artificial Intelligence · Computer Science 2025-05-14 Yihe Zhou, Shunyu Liu, Yunpeng Qing, Kaixuan Chen, Tongya Zheng, Jie Song, Mingli Song

GTDE: Grouped Training with Decentralized Execution for Multi-agent Actor-Critic

The rapid advancement of multi-agent reinforcement learning (MARL) has given rise to diverse training paradigms to learn the policies of each agent in the multi-agent system. The paradigms of decentralized training and execution (DTDE) and…

Multiagent Systems · Computer Science 2025-01-22 Mengxian Li, Qi Wang, Yongjun Xu

Multi-Agent Tool-Integrated Policy Optimization

Large language models (LLMs) increasingly rely on multi-turn tool-integrated planning for knowledge-intensive and complex reasoning tasks. Existing implementations typically rely on a single agent, but they suffer from limited context…

Computation and Language · Computer Science 2025-10-07 Zhanfeng Mo, Xingxuan Li, Yuntao Chen, Lidong Bing

AgentMixer: Multi-Agent Correlated Policy Factorization

In multi-agent reinforcement learning, centralized training with decentralized execution (CTDE) methods typically assume that agents make decisions based on their local observations independently, which may not lead to a correlated joint…

Multiagent Systems · Computer Science 2024-12-16 Zhiyuan Li, Wenshuai Zhao, Lijun Wu, Joni Pajarinen

More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization

In cooperative multi-agent reinforcement learning (MARL), combining value decomposition with actor-critic enables agents to learn stochastic policies, which are more suitable for the partially observable environment. Given the goal of…

Machine Learning · Computer Science 2023-02-13 Jiangxing Wang, Deheng Ye, Zongqing Lu

Centralized Permutation Equivariant Policy for Cooperative Multi-Agent Reinforcement Learning

The Centralized Training with Decentralized Execution (CTDE) paradigm has gained significant attention in multi-agent reinforcement learning (MARL) and is the foundation of many recent algorithms. However, decentralized policies operate…

Multiagent Systems · Computer Science 2025-08-19 Zhuofan Xu, Benedikt Bollig, Matthias Függer, Thomas Nowak, Vincent Le Dréau

Reinforcement Learning-Augmented LLM Agents for Collaborative Decision Making and Performance Optimization

Large Language Models (LLMs) perform well in language tasks but often lack collaborative awareness and struggle to optimize global performance in multi-agent settings. We present a reinforcement learning-augmented LLM agent framework that…

Artificial Intelligence · Computer Science 2026-01-01 Dong Qiu, Duo Xu, Limengxi Yue

Multi-Agent Trust Region Policy Optimization

We extend trust region policy optimization (TRPO) to multi-agent reinforcement learning (MARL) problems. We show that the policy update of TRPO can be transformed into a distributed consensus optimization problem for multi-agent cases. By…

Artificial Intelligence · Computer Science 2023-08-08 Hepeng Li, Haibo He

MAMBPO: Sample-efficient multi-robot reinforcement learning using learned world models

Multi-robot systems can benefit from reinforcement learning (RL) algorithms that learn behaviours in a small number of trials, a property known as sample efficiency. This research thus investigates the use of learned world models to improve…

Robotics · Computer Science 2021-03-08 Daniël Willemsen, Mario Coppola, Guido C. H. E. de Croon

Multi-Agent Constrained Policy Optimisation

Developing reinforcement learning algorithms that satisfy safety constraints is becoming increasingly important in real-world applications. In multi-agent reinforcement learning (MARL) settings, policy optimisation with safety awareness is…

Artificial Intelligence · Computer Science 2022-02-11 Shangding Gu, Jakub Grudzien Kuba, Munning Wen, Ruiqing Chen, Ziyan Wang, Zheng Tian, Jun Wang, Alois Knoll, Yaodong Yang

Multi-agent Continual Coordination via Progressive Task Contextualization

Cooperative Multi-agent Reinforcement Learning (MARL) has attracted significant attention and shown potential for many real-world applications. Previous work mainly focuses on facilitating the coordination ability from different aspects…

Multiagent Systems · Computer Science 2023-05-24 Lei Yuan, Lihe Li, Ziqian Zhang, Fuxiang Zhang, Cong Guan, Yang Yu

Multi-Agent Deep Reinforcement Learning Under Constrained Communications

Centralized training with decentralized execution (CTDE) has been the dominant paradigm in multi-agent reinforcement learning (MARL), but its reliance on global state information during training introduces scalability, robustness, and…

Machine Learning · Computer Science 2026-01-27 Shahil Shaik, Jonathon M. Smereka, Yue Wang

Scalable Model-based Policy Optimization for Decentralized Networked Systems

Reinforcement learning algorithms require a large number of samples; this often limits their real-world applications on even simple tasks. This challenge is more pronounced in multi-agent tasks, as each step of operation is more costly…

Machine Learning · Computer Science 2022-09-05 Yali Du, Chengdong Ma, Yuchen Liu, Runji Lin, Hao Dong, Jun Wang, Yaodong Yang

End-to-End Optimization of LLM-Driven Multi-Agent Search Systems via Heterogeneous-Group-Based Reinforcement Learning

Large language models (LLMs) are versatile, yet their deployment in complex real-world settings is limited by static knowledge cutoffs and the difficulty of producing controllable behavior within a single inference. Multi-agent search…

Machine Learning · Computer Science 2026-04-21 Guanzhong Chen, Shaoxiong Yang, Chao Li, Wei Liu, Jian Luan, Zenglin Xu

Macro-Action-Based Multi-Agent/Robot Deep Reinforcement Learning under Partial Observability

The state-of-the-art multi-agent reinforcement learning (MARL) methods have provided promising solutions to a variety of complex problems. Yet, these methods all assume that agents perform synchronized primitive-action executions so that…

Artificial Intelligence · Computer Science 2022-10-12 Yuchen Xiao