Related papers: Boosting Asynchronous Decentralized Learning with …

Fair Decentralized Learning

Decentralized learning (DL) is an emerging approach that enables nodes to collaboratively train a machine learning model without sharing raw data. In many application domains, such as healthcare, this approach faces challenges due to the…

Machine Learning · Computer Science 2025-05-30 Sayan Biswas, Anne-Marie Kermarrec, Rishi Sharma, Thibaud Trinca, Martijn de Vos

Decentralized Learning Made Practical with Client Sampling

Decentralized learning (DL) leverages edge devices for collaborative model training while avoiding coordination by a central server. Due to privacy concerns, DL has become an attractive alternative to centralized learning schemes since…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-08 Martijn de Vos, Akash Dhasade, Anne-Marie Kermarrec, Erick Lavoie, Johan Pouwelse, Rishi Sharma

Low-Cost Privacy-Preserving Decentralized Learning

Decentralized learning (DL) is an emerging paradigm of collaborative machine learning that enables nodes in a network to train models collectively without sharing their raw data or relying on a central server. This paper introduces Zip-DL,…

Machine Learning · Computer Science 2025-05-30 Sayan Biswas, Davide Frey, Romaric Gaudel, Anne-Marie Kermarrec, Dimitri Lerévérend, Rafael Pires, Rishi Sharma, François Taïani

ScaDLES: Scalable Deep Learning over Streaming data at the Edge

Distributed deep learning (DDL) training systems are designed for cloud and data-center environments that assume homogeneous compute resources, high network bandwidth, sufficient memory and storage, as well as independent and identically…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-30 Sahil Tyagi, Martin Swany

Energy-Aware Decentralized Learning with Intermittent Model Training

Decentralized learning (DL) offers a powerful framework where nodes collaboratively train models without sharing raw data and without the coordination of a central server. In the iterative rounds of DL, models are trained locally, shared…

Machine Learning · Computer Science 2024-07-02 Akash Dhasade, Paolo Dini, Elia Guerra, Anne-Marie Kermarrec, Marco Miozzo, Rafael Pires, Rishi Sharma, Martijn de Vos

Dynamic Network-Assisted D2D-Aided Coded Distributed Learning

Today, various machine learning (ML) applications offer continuous data processing and real-time data analytics at the edge of a wireless network. Distributed real-time ML solutions are highly sensitive to the so-called straggler effect…

Machine Learning · Computer Science 2024-10-28 Nikita Zeulin, Olga Galinina, Nageen Himayat, Sergey Andreev, Robert W. Heath

Speed Up Federated Learning in Heterogeneous Environment: A Dynamic Tiering Approach

Federated learning (FL) enables collaboratively training a model while keeping the training data decentralized and private. However, one significant impediment to training a model using FL, especially large models, is the resource…

Machine Learning · Computer Science 2023-12-12 Seyed Mahmoud Sajjadi Mohammadabadi, Syed Zawad, Feng Yan, Lei Yang

Mosaic Learning: A Framework for Decentralized Learning with Model Fragmentation

Decentralized learning (DL) enables collaborative machine learning (ML) without a central server, making it suitable for settings where training data cannot be centrally hosted. We introduce Mosaic Learning, a DL framework that decomposes…

Machine Learning · Computer Science 2026-02-05 Sayan Biswas, Davide Frey, Romaric Gaudel, Nirupam Gupta, Anne-Marie Kermarrec, Dimitri Lerévérend, Rafael Pires, Rishi Sharma, François Taïani, Martijn de Vos

Locally Asynchronous Stochastic Gradient Descent for Decentralised Deep Learning

Distributed training algorithms of deep neural networks show impressive convergence speedup properties on very large problems. However, they inherently suffer from communication related slowdowns and communication topology becomes a crucial…

Machine Learning · Computer Science 2022-03-25 Tomer Avidor, Nadav Tal Israel

Communication-Efficient Distributed Deep Learning: A Comprehensive Survey

Distributed deep learning (DL) has become prevalent in recent years to reduce training time by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and datasets. However, system scalability is limited by…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-04 Zhenheng Tang, Shaohuai Shi, Wei Wang, Bo Li, Xiaowen Chu

Communication-Efficient Distributed Deep Learning via Federated Dynamic Averaging

The ever-growing volume and decentralized nature of data, coupled with the need to harness it and extract knowledge, have led to the extensive use of distributed deep learning (DDL) techniques for training. These techniques rely on local…

Machine Learning · Computer Science 2024-11-22 Michail Theologitis, Georgios Frangias, Georgios Anestis, Vasilis Samoladas, Antonios Deligiannakis

DRACO: Decentralized Asynchronous Federated Learning over Row-Stochastic Wireless Networks

Recent developments and emerging use cases, such as smart Internet of Things (IoT) and Edge AI, have sparked considerable interest in the training of neural networks over fully decentralized (serverless) networks. One of the major…

Machine Learning · Computer Science 2025-01-30 Eunjeong Jeong, Marios Kountouris

Loss Landscape Dependent Self-Adjusting Learning Rates in Decentralized Stochastic Gradient Descent

Distributed Deep Learning (DDL) is essential for large-scale Deep Learning (DL) training. Synchronous Stochastic Gradient Descent (SSGD) is the de facto DDL optimization method. Using a sufficiently large batch size is critical to…

Machine Learning · Computer Science 2021-12-03 Wei Zhang, Mingrui Liu, Yu Feng, Xiaodong Cui, Brian Kingsbury, Yuhai Tu

D2D-Enabled Data Sharing for Distributed Machine Learning at Wireless Network Edge

Mobile edge learning is an emerging technique that enables distributed edge devices to collaborate in training shared machine learning models by exploiting their local data samples and communication and computation resources. To deal with…

Signal Processing · Electrical Eng. & Systems 2020-01-31 Xiaoran Cai, Xiaopeng Mo, Junyang Chen, Jie Xu

Resource-efficient Parallel Split Learning in Heterogeneous Edge Computing

Edge AI has been recently proposed to facilitate the training and deployment of Deep Neural Network (DNN) models in proximity to the sources of data. To enable the training of large models on resource-constrained edge devices and protect…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-03-26 Mingjin Zhang, Jiannong Cao, Yuvraj Sahni, Xiangchun Chen, Shan Jiang

SADDLe: Sharpness-Aware Decentralized Deep Learning with Heterogeneous Data

Decentralized training enables learning with distributed datasets generated at different locations without relying on a central server. In realistic scenarios, the data distribution across these sparsely connected learning agents can be…

Machine Learning · Computer Science 2025-02-27 Sakshi Choudhary, Sai Aparna Aketi, Kaushik Roy

Straggler-Resilient Federated Learning: Leveraging the Interplay Between Statistical Accuracy and System Heterogeneity

Federated Learning is a novel paradigm that involves learning from data samples distributed across a large network of clients while the data remains local. It is, however, known that federated learning is prone to multiple system challenges…

Machine Learning · Computer Science 2021-01-01 Amirhossein Reisizadeh, Isidoros Tziotis, Hamed Hassani, Aryan Mokhtari, Ramtin Pedarsani

Oscars: Adaptive Semi-Synchronous Parallel Model for Distributed Deep Learning with Global View

Deep learning has become an indispensable part of life, such as face recognition, NLP, etc., but the training of deep model has always been a challenge, and in recent years, the complexity of training data and models has shown explosive…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-02-18 Sheng Huang

Communication-Efficient Decentralized Learning with Sparsification and Adaptive Peer Selection

Distributed learning techniques such as federated learning have enabled multiple workers to train machine learning models together to reduce the overall training time. However, current distributed training algorithms (centralized or…

Machine Learning · Computer Science 2020-02-25 Zhenheng Tang, Shaohuai Shi, Xiaowen Chu

SplitBrain: Hybrid Data and Model Parallel Deep Learning

The recent success of deep learning applications has coincided with those widely available powerful computational resources for training sophisticated machine learning models with huge datasets. Nonetheless, training large models such as…

Machine Learning · Computer Science 2022-01-03 Farley Lai, Asim Kadav, Erik Kruus