Related papers: Launchpad: A Programming Model for Distributed Mac…
The demand for artificial intelligence has grown significantly over the last decade and this growth has been fueled by advances in machine learning techniques and the ability to leverage hardware acceleration. However, in order to increase…
Nowadays, with the widespread of smartphones and other portable gadgets equipped with a variety of sensors, data is ubiquitous available and the focus of machine learning has shifted from being able to infer from small training samples to…
Large language models (LLM) are advanced AI systems trained on extensive textual data, leveraging deep learning techniques to understand and generate human-like language. Today's LLMs with billions of parameters are so huge that hardly any…
Data-driven approaches are becoming more common as problem-solving techniques in many areas of research and industry. In most cases, machine learning models are the key component of these solutions, but a solution involves multiple such…
Artificial intelligence has made remarkable progress in handling complex tasks, thanks to advances in hardware acceleration and machine learning algorithms. However, to acquire more accurate outcomes and solve more complex issues,…
Present day machine learning is computationally intensive and processes large amounts of data. It is implemented in a distributed fashion in order to address these scalability issues. The work is parallelized across a number of computing…
The rise of Big Data has led to new demands for Machine Learning (ML) systems to learn complex models with millions to billions of parameters, that promise adequate capacity to digest massive datasets and offer powerful predictive analytics…
Language models (LMs) are machine learning models designed to predict linguistic patterns by estimating the probability of word sequences based on large-scale datasets, such as text. LMs have a wide range of applications in natural language…
Scalable and efficient distributed learning is one of the main driving forces behind the recent rapid advancement of machine learning and artificial intelligence. One prominent feature of this topic is that recent progresses have been made…
In domains such as health care and finance, shortage of labeled data and computational resources is a critical issue while developing machine learning algorithms. To address the issue of labeled data scarcity in training and deployment of…
Machine Learning has proven useful in the recent years as a way to achieve failure prediction for industrial systems. However, the high computational resources necessary to run learning algorithms are an obstacle to its widespread…
With the rapid growth of large language models (LLMs), a wide range of methods have been developed to distribute computation and memory across hardware devices for efficient training and inference. While existing surveys provide descriptive…
Machine learning (ML), especially deep learning is made possible by the availability of big data, enormous compute power and, often overlooked, development tools or frameworks. As the algorithms become mature and efficient, more and more ML…
We address the joint problem of learning and scheduling in multi-hop wireless network without a prior knowledge on link rates. Previous scheduling algorithms need the link rate information, and learning algorithms often require a…
Large language models (LLMs) are growing increasingly capable, prompting recent interest in LLM teams. Yet, despite increased deployment of LLM teams at scale, we lack a principled framework for addressing key questions such as when a team…
Solving linear programs is often a challenging task in distributed settings. While there are good algorithms for solving packing and covering linear programs in a distributed manner (Kuhn et al.~2006), this is essentially the only class of…
In distributed machine learning, data is dispatched to multiple machines for processing. Motivated by the fact that similar data points often belong to the same or similar classes, and more generally, classification rules of high accuracy…
Emerging cross-device artificial intelligence (AI) applications require a transition from conventional centralized learning systems towards large-scale distributed AI systems that can collaboratively perform complex learning tasks. In this…
Due to the availability of huge amounts of data and processing abilities, current artificial intelligence (AI) systems are effective in solving complex tasks. However, despite the success of AI in different areas, the problem of designing…
Distributed learning is the problem of inferring a function in the case where training data is distributed among multiple geographically separated sources. Particularly, the focus is on designing learning strategies with low computational…