Related papers: Launchpad: A Programming Model for Distributed Mac…

A Survey on Distributed Machine Learning

The demand for artificial intelligence has grown significantly over the last decade and this growth has been fueled by advances in machine learning techniques and the ability to leverage hardware acceleration. However, in order to increase…

Machine Learning · Computer Science 2022-11-28 Joost Verbraeken, Matthijs Wolting, Jonathan Katzy, Jeroen Kloppenburg, Tim Verbelen, Jan S. Rellermeyer

Revisiting Large Scale Distributed Machine Learning

Nowadays, with the widespread of smartphones and other portable gadgets equipped with a variety of sensors, data is ubiquitous available and the focus of machine learning has shifted from being able to infer from small training samples to…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-07-07 Radu Cristian Ionescu

An Explorative Study on Distributed Computing Techniques in Training and Inference of Large Language Models

Large language models (LLM) are advanced AI systems trained on extensive textual data, leveraging deep learning techniques to understand and generate human-like language. Today's LLMs with billions of parameters are so huge that hardly any…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-14 Sheikh Azizul Hakim, Saem Hasan

Declarative Learning-Based Programming as an Interface to AI Systems

Data-driven approaches are becoming more common as problem-solving techniques in many areas of research and industry. In most cases, machine learning models are the key component of these solutions, but a solution involves multiple such…

Artificial Intelligence · Computer Science 2019-06-20 Parisa Kordjamshidi, Dan Roth, Kristian Kersting

A Survey From Distributed Machine Learning to Distributed Deep Learning

Artificial intelligence has made remarkable progress in handling complex tasks, thanks to advances in hardware acceleration and machine learning algorithms. However, to acquire more accurate outcomes and solve more complex issues,…

Machine Learning · Computer Science 2023-09-12 Mohammad Dehghani, Zahra Yazdanparast

Modeling Scalability of Distributed Machine Learning

Present day machine learning is computationally intensive and processes large amounts of data. It is implemented in a distributed fashion in order to address these scalability issues. The work is parallelized across a number of computing…

Machine Learning · Computer Science 2017-03-28 Alexander Ulanov, Andrey Simanovsky, Manish Marwah

Strategies and Principles of Distributed Machine Learning on Big Data

The rise of Big Data has led to new demands for Machine Learning (ML) systems to learn complex models with millions to billions of parameters, that promise adequate capacity to digest massive datasets and offer powerful predictive analytics…

Machine Learning · Statistics 2016-01-01 Eric P. Xing, Qirong Ho, Pengtao Xie, Wei Dai

Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions

Language models (LMs) are machine learning models designed to predict linguistic patterns by estimating the probability of word sequences based on large-scale datasets, such as text. LMs have a wide range of applications in natural language…

Computation and Language · Computer Science 2025-03-24 Hadi Amini, Md Jueal Mia, Yasaman Saadati, Ahmed Imteaj, Seyedsina Nabavirazavi, Urmish Thakker, Md Zarif Hossain, Awal Ahmed Fime, S. S. Iyengar

Distributed Learning Systems with First-order Methods

Scalable and efficient distributed learning is one of the main driving forces behind the recent rapid advancement of machine learning and artificial intelligence. One prominent feature of this topic is that recent progresses have been made…

Machine Learning · Computer Science 2021-04-13 Ji Liu, Ce Zhang

Distributed learning of deep neural network over multiple agents

In domains such as health care and finance, shortage of labeled data and computational resources is a critical issue while developing machine learning algorithms. To address the issue of labeled data scarcity in training and deployment of…

Machine Learning · Computer Science 2018-10-16 Otkrist Gupta, Ramesh Raskar

Combining Federated and Active Learning for Communication-efficient Distributed Failure Prediction in Aeronautics

Machine Learning has proven useful in the recent years as a way to achieve failure prediction for industrial systems. However, the high computational resources necessary to run learning algorithms are an obstacle to its widespread…

Artificial Intelligence · Computer Science 2020-01-22 Nicolas Aussel, Sophie Chabridon, Yohan Petetin

Distributed Hybrid Parallelism for Large Language Models: Comparative Study and System Design Guide

With the rapid growth of large language models (LLMs), a wide range of methods have been developed to distribute computation and memory across hardware devices for efficient training and inference. While existing surveys provide descriptive…

Machine Learning · Computer Science 2026-02-11 Hossam Amer, Rezaul Karim, Ali Pourranjbar, Weiwei Zhang, Walid Ahmed, Boxing Chen

Rethinking Machine Learning Development and Deployment for Edge Devices

Machine learning (ML), especially deep learning is made possible by the availability of big data, enormous compute power and, often overlooked, development tools or frameworks. As the algorithms become mature and efficient, more and more ML…

Machine Learning · Computer Science 2018-06-21 Liangzhen Lai, Naveen Suda

A Learning-based Distributed Algorithm for Scheduling in Multi-hop Wireless Networks

We address the joint problem of learning and scheduling in multi-hop wireless network without a prior knowledge on link rates. Previous scheduling algorithms need the link rate information, and learning algorithms often require a…

Networking and Internet Architecture · Computer Science 2023-12-11 Daehyun Park, Sunjung Kang, Changhee Joo

Language Model Teams as Distributed Systems

Large language models (LLMs) are growing increasingly capable, prompting recent interest in LLM teams. Yet, despite increased deployment of LLM teams at scale, we lack a principled framework for addressing key questions such as when a team…

Multiagent Systems · Computer Science 2026-03-13 Elizabeth Mieczkowski, Katherine M. Collins, Ilia Sucholutsky, Natalia Vélez, Thomas L. Griffiths

Distributed Distance-Bounded Network Design Through Distributed Convex Programming

Solving linear programs is often a challenging task in distributed settings. While there are good algorithms for solving packing and covering linear programs in a distributed manner (Kuhn et al. 2006), this is essentially the only class of…

Data Structures and Algorithms · Computer Science 2017-09-12 Michael Dinitz, Yasamin Nazari

Data Driven Resource Allocation for Distributed Learning

In distributed machine learning, data is dispatched to multiple machines for processing. Motivated by the fact that similar data points often belong to the same or similar classes, and more generally, classification rules of high accuracy…

Machine Learning · Computer Science 2016-12-16 Travis Dick, Mu Li, Venkata Krishna Pillutla, Colin White, Maria Florina Balcan, Alex Smola

Self-organizing Democratized Learning: Towards Large-scale Distributed Learning Systems

Emerging cross-device artificial intelligence (AI) applications require a transition from conventional centralized learning systems towards large-scale distributed AI systems that can collaboratively perform complex learning tasks. In this…

Machine Learning · Computer Science 2022-04-29 Minh N. H. Nguyen, Shashi Raj Pandey, Tri Nguyen Dang, Eui-Nam Huh, Nguyen H. Tran, Walid Saad, Choong Seon Hong

Distributed and Democratized Learning: Philosophy and Research Challenges

Due to the availability of huge amounts of data and processing abilities, current artificial intelligence (AI) systems are effective in solving complex tasks. However, despite the success of AI in different areas, the problem of designing…

Artificial Intelligence · Computer Science 2020-10-15 Minh N. H. Nguyen, Shashi Raj Pandey, Kyi Thar, Nguyen H. Tran, Mingzhe Chen, Walid Saad, Choong Seon Hong

Distributed Supervised Learning using Neural Networks

Distributed learning is the problem of inferring a function in the case where training data is distributed among multiple geographically separated sources. Particularly, the focus is on designing learning strategies with low computational…

Machine Learning · Statistics 2016-07-22 Simone Scardapane