Related papers: Coding for Distributed Multi-Agent Reinforcement L…

Distributed Multi-Agent Reinforcement Learning with One-hop Neighbors and Compute Straggler Mitigation

Most multi-agent reinforcement learning (MARL) methods are limited in the scale of problems they can handle. With increasing numbers of agents, the number of training iterations required to find the optimal behaviors increases exponentially…

Multiagent Systems · Computer Science 2025-01-03 Baoqian Wang, Junfei Xie, Nikolay Atanasov

Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning for incomplete information environments has attracted extensive attention from researchers. However, due to the slow sample collection and poor sample exploration, there are still some problems in…

Artificial Intelligence · Computer Science 2022-05-12 Shuhan Qi, Shuhao Zhang, Xiaohan Hou, Jiajia Zhang, Xuan Wang, Jing Xiao

Design and Optimization of Hierarchical Gradient Coding for Distributed Learning at Edge Devices

Edge computing has recently emerged as a promising paradigm to boost the performance of distributed learning by leveraging the distributed resources at edge nodes. Architecturally, the introduction of edge nodes adds an additional…

Networking and Internet Architecture · Computer Science 2024-06-18 Weiheng Tang, Jingyi Li, Lin Chen, Xu Chen

Approximate Gradient Coding for Heterogeneous Nodes

In distributed machine learning (DML), the training data is distributed across multiple worker nodes to perform the underlying training in parallel. One major problem affecting the performance of DML algorithms is presence of stragglers.…

Information Theory · Computer Science 2021-05-14 Amogh Johri, Arti Yardi, Tejas Bodas

Nested Gradient Codes for Straggler Mitigation in Distributed Machine Learning

We consider distributed learning in the presence of slow and unresponsive worker nodes, referred to as stragglers. In order to mitigate the effect of stragglers, gradient coding redundantly assigns partial computations to the worker such…

Information Theory · Computer Science 2022-12-19 Luis Maßny, Christoph Hofmeister, Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh

Approximate Gradient Coding for Distributed Learning with Heterogeneous Stragglers

In this paper, we propose an optimally structured gradient coding scheme to mitigate the straggler problem in distributed learning. Conventional gradient coding methods often assume homogeneous straggler models or rely on excessive data…

Systems and Control · Electrical Eng. & Systems 2025-10-28 Heekang Song, Wan Choi

Coded Computation across Shared Heterogeneous Workers with Communication Delay

Distributed computing enables large-scale computation tasks to be processed over multiple workers in parallel. However, the randomness of communication and computation delays across workers causes the straggler effect, which may degrade the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-20 Yuxuan Sun, Fan Zhang, Junlin Zhao, Sheng Zhou, Zhisheng Niu, Deniz Gündüz

Speeding Up Distributed Machine Learning Using Codes

Codes are widely used in many engineering applications to offer robustness against noise. In large-scale systems there are several types of noise that can affect the performance of distributed machine learning algorithms -- straggler nodes,…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-30 Kangwook Lee, Maximilian Lam, Ramtin Pedarsani, Dimitris Papailiopoulos, Kannan Ramchandran

Heterogeneous Coded Computation across Heterogeneous Workers

Coded distributed computing framework enables large-scale machine learning (ML) models to be trained efficiently in a distributed manner, while mitigating the straggler effect. In this work, we consider a multi-task assignment problem in a…

Information Theory · Computer Science 2019-05-21 Yuxuan Sun, Junlin Zhao, Sheng Zhou, Deniz Gündüz

Multi-Agent Reinforcement Learning for Sample-Efficient Deep Neural Network Mapping

Mapping deep neural networks (DNNs) to hardware is critical for optimizing latency, energy consumption, and resource utilization, making it a cornerstone of high-performance accelerator design. Due to the vast and complex mapping space,…

Machine Learning · Computer Science 2025-07-23 Srivatsan Krishnan, Jason Jabbour, Dan Zhang, Natasha Jaques, Aleksandra Faust, Shayegan Omidshafiei, Vijay Janapa Reddi

Optimization-based Block Coordinate Gradient Coding for Mitigating Partial Stragglers in Distributed Learning

Gradient coding schemes effectively mitigate full stragglers in distributed learning by introducing identical redundancy in coded local partial derivatives corresponding to all model parameters. However, they are no longer effective for…

Information Theory · Computer Science 2023-04-26 Qi Wang, Ying Cui, Chenglin Li, Junni Zou, Hongkai Xiong

Straggler-Robust Distributed Optimization with the Parameter Server Utilizing Coded Gradient

Optimization in distributed networks plays a central role in almost all distributed machine learning problems. In principle, the use of distributed task allocation has reduced the computational time, allowing better response rates and…

Optimization and Control · Mathematics 2020-07-28 Elie Atallah, Nazanin Rahnavard, Chinwendu Enyioha

Distributed Multi-Agent Reinforcement Learning Based on Graph-Induced Local Value Functions

Achieving distributed reinforcement learning (RL) for large-scale cooperative multi-agent systems (MASs) is challenging because: (i) each agent has access to only limited information; (ii) issues on convergence or computational complexity…

Machine Learning · Computer Science 2024-04-15 Gangshan Jing, He Bai, Jemin George, Aranya Chakrabortty, Piyush K. Sharma

Distributed Policy Gradient with Variance Reduction in Multi-Agent Reinforcement Learning

This paper studies a distributed policy gradient in collaborative multi-agent reinforcement learning (MARL), where agents over a communication network aim to find the optimal policy to maximize the average of all agents' local returns. Due…

Multiagent Systems · Computer Science 2022-12-06 Xiaoxiao Zhao, Jinlong Lei, Li Li, Jie Chen

Safe Multi-Agent Reinforcement Learning through Decentralized Multiple Control Barrier Functions

Multi-Agent Reinforcement Learning (MARL) algorithms show amazing performance in simulation in recent years, but placing MARL in real-world applications may suffer safety problems. MARL with centralized shields was proposed and verified in…

Multiagent Systems · Computer Science 2021-03-24 Zhiyuan Cai, Huanhui Cao, Wenjie Lu, Lin Zhang, Hao Xiong

Multi-Target Pursuit by a Decentralized Heterogeneous UAV Swarm using Deep Multi-Agent Reinforcement Learning

Multi-agent pursuit-evasion tasks involving intelligent targets are notoriously challenging coordination problems. In this paper, we investigate new ways to learn such coordinated behaviors of unmanned aerial vehicles (UAVs) aimed at…

Robotics · Computer Science 2023-03-06 Maryam Kouzeghar, Youngbin Song, Malika Meghjani, Roland Bouffanais

Straggler-resistant distributed matrix computation via coding theory

The current BigData era routinely requires the processing of large scale data on massive distributed computing clusters. Such large scale clusters often suffer from the problem of "stragglers", which are defined as slow or failed nodes. The…

Information Theory · Computer Science 2020-02-11 Aditya Ramamoorthy, Anindya Bijoy Das, Li Tang

Distributed Stochastic Gradient Descent Using LDGM Codes

We consider a distributed learning problem in which the computation is carried out on a system consisting of a master node and multiple worker nodes. In such systems, the existence of slow-running machines called stragglers will cause a…

Information Theory · Computer Science 2019-01-16 Shunsuke Horii, Takahiro Yoshida, Manabu Kobayashi, Toshiyasu Matsushima

A Review of Cooperative Multi-Agent Deep Reinforcement Learning

Deep Reinforcement Learning has made significant progress in multi-agent systems in recent years. In this review article, we have focused on presenting recent approaches on Multi-Agent Reinforcement Learning (MARL) algorithms. In…

Machine Learning · Computer Science 2021-05-03 Afshin OroojlooyJadid, Davood Hajinezhad

Dynamic Network-Assisted D2D-Aided Coded Distributed Learning

Today, various machine learning (ML) applications offer continuous data processing and real-time data analytics at the edge of a wireless network. Distributed real-time ML solutions are highly sensitive to the so-called straggler effect…

Machine Learning · Computer Science 2024-10-28 Nikita Zeulin, Olga Galinina, Nageen Himayat, Sergey Andreev, Robert W. Heath