Related papers: Multi-agent Policy Optimization with Approximative…
This work develops a fully decentralized multi-agent algorithm for policy evaluation. The proposed scheme can be applied to two distinct scenarios. In the first scenario, a collection of agents have distinct datasets gathered following…
Policy optimization methods with function approximation are widely used in multi-agent reinforcement learning. However, it remains elusive how to design such algorithms with statistical guarantees. Leveraging a multi-agent performance…
Synchronizing decisions across multiple agents in realistic settings is problematic since it requires agents to wait for other agents to terminate and communicate about termination reliably. Ideally, agents should learn and execute…
We present Coordinated Proximal Policy Optimization (CoPPO), an algorithm that extends the original Proximal Policy Optimization (PPO) to the multi-agent setting. The key idea lies in the coordinated adaptation of step size during the…
Cooperative multi-agent reinforcement learning (MARL) aims to coordinate multiple agents to achieve a common goal. A key challenge in MARL is credit assignment, which involves assessing each agent's contribution to the shared reward. Given…
Cooperative problems under continuous control have always been the focus of multi-agent reinforcement learning. Existing algorithms suffer from the problem of uneven learning degree with the increase of the number of agents. In this paper,…
We study the policy evaluation problem in multi-agent reinforcement learning. In this problem, a group of agents works cooperatively to evaluate the value function for the global discounted accumulative reward problem, which is composed of…
Recent technological progress in the development of Unmanned Aerial Vehicles (UAVs) together with decreasing acquisition costs make the application of drone fleets attractive for a wide variety of tasks. In agriculture, disaster management,…
In this paper, we propose a distributed off-policy actor critic method to solve multi-agent reinforcement learning problems. Specifically, we assume that all agents keep local estimates of the global optimal policy parameter and update…
Many real world tasks require multiple agents to work together. Multi-agent reinforcement learning (RL) methods have been proposed in recent years to solve these tasks, but current methods often fail to efficiently learn policies. We thus…
This paper proposes the Cooperative Soft Actor Critic (CSAC) method of enabling consecutive reinforcement learning agents to cooperatively solve a long time horizon multi-stage task. This method is achieved by modifying the policy of each…
A key challenge in multi-robot and multi-agent systems is generating solutions that are robust to other self-interested or even adversarial parties who actively try to prevent the agents from achieving their goals. The practicality of…
Multi-agent proximal policy optimization (MAPPO) has recently demonstrated state-of-the-art performance on challenging multi-agent reinforcement learning tasks. However, MAPPO still struggles with the credit assignment problem, wherein the…
Many recent successful off-policy multi-agent reinforcement learning (MARL) algorithms for cooperative partially observable environments focus on finding factorized value functions, leading to convoluted network structures. Building on the…
The main challenge of multiagent reinforcement learning is the difficulty of learning useful policies in the presence of other simultaneously learning agents whose changing behaviors jointly affect the environment's transition and reward…
Multi-agent reinforcement learning shines as the pinnacle of multi-agent systems, conquering intricate real-world challenges, fostering collaboration and coordination among agents, and unleashing the potential for intelligent…
Learning in games has been widely used to solve many cooperative multi-agent problems such as coverage control, consensus, self-reconfiguration or vehicle-target assignment. One standard approach in this domain is to formulate the problem…
This paper extends off-policy reinforcement learning to the multi-agent case in which a set of networked agents communicating with their neighbors according to a time-varying graph collaboratively evaluates and improves a target policy…
This paper examines a novel type of multi-agent problem, in which an agent makes multiple identical copies of itself in order to achieve a single agent task better or more efficiently. This strategy improves performance if the environment…
Many reality tasks such as robot coordination can be naturally modelled as multi-agent cooperative system where the rewards are sparse. This paper focuses on learning decentralized policies for such tasks using sub-optimal demonstration. To…