Related papers: Communication Efficient Algorithms for Top-k Selec…
We consider the scenario of $n$ sensor nodes observing streams of data. The nodes are connected to a central server whose task it is to compute some function over all data items observed by the nodes. In our case, there exists a total order…
Previous parallel sorting algorithms do not scale to the largest available machines, since they either have prohibitive communication volume or prohibitive critical path length. We describe algorithms that are a viable compromise and…
This thesis is concerned with the design of distributed algorithms for solving optimization problems. We consider networks where each node has exclusive access to a cost function, and design algorithms that make all nodes cooperate to find…
There has been surprisingly little work on algorithms for sorting strings on distributed-memory parallel machines. We develop efficient algorithms for this problem based on the multi-way merging principle. These algorithms inspect only…
Split learning is a simple solution for Vertical Federated Learning (VFL), which has drawn substantial attention in both research and application due to its simplicity and efficiency. However, communication efficiency is still a crucial…
We consider algorithmic problems in the setting in which the input data has been partitioned arbitrarily on many servers. The goal is to compute a function of all the data, and the bottleneck is the communication used by the algorithm. We…
In this paper we present a deterministic parallel algorithm solving the multiple selection problem in congested clique model. In this problem for given set of elements S and a set of ranks $K = \{k_1 , k_2 , ..., k_r \}$ we are asking for…
Decentralized optimization methods enable on-device training of machine learning models without a central coordinator. In many scenarios communication between devices is energy demanding and time consuming and forms the bottleneck of the…
Collecting the most informative data from a large dataset distributed over a network is a fundamental problem in many fields, including control, signal processing and machine learning. In this paper, we establish a connection between…
We consider the problem of determining the top-$k$ largest measurements from a dataset distributed among a network of $n$ agents with noisy communication links. We show that this scenario can be cast as a distributed convex optimization…
This paper proposes and analyzes a communication-efficient distributed optimization framework for general nonconvex nonsmooth signal processing and machine learning problems under an asynchronous protocol. At each iteration, worker machines…
We study the scalability of consensus-based distributed optimization algorithms by considering two questions: How many processors should we use for a given problem, and how often should they communicate when communication is not free?…
We consider the problem of communication efficient distributed optimization where multiple nodes exchange important algorithm information in every iteration to solve large problems. In particular, we focus on the stochastic variance-reduced…
Resource allocation is a fundamental problem in Industrial Internet of Things (IIoT) systems, in which devices work together under limited communication bandwidth to complete diverse tasks. This paper proposes a communication-efficient…
Many complex multi-target prediction problems that concern large target spaces are characterised by a need for efficient prediction strategies that avoid the computation of predictions for all targets explicitly. Examples of such problems…
We consider the problem of finding the $k^{th}$ highest element in a totally ordered set of $n$ elements (select), and partitioning a totally ordered set into the top $k$ and bottom $n-k$ elements (partition) using pairwise comparisons.…
Tensors, especially higher-order tensors, are typically represented in low-rank formats to preserve the main information of the high-dimensional data while saving memory space. In practice, only a small fraction elements in high-dimensional…
We consider the selection problem on a completely connected network of $n$ processors with no shared memory. Each processor initially holds a given numeric item of $b$ bits allowed to send a message of $\max ( b, \lg n )$ bits to another…
For massive data sets, efficient computation commonly relies on distributed algorithms that store and process subsets of the data on different machines, minimizing communication costs. Our focus is on regression and classification problems…
String sorting is an important part of tasks such as building index data structures. Unfortunately, current string sorting algorithms do not scale to massively parallel distributed-memory machines since they either have latency (at least)…