Supervised Distributed Computing

Distributed, Parallel, and Cluster Computing 2025-09-11 v2

Abstract

We introduce a new framework for distributed computing that extends and refines the standard master-worker approach to scheduling multi-threaded computations. The framework has four roles: a supervisor, a source, a target, and a collection of workers. Initially, the source stores an instance $I$ of a computational problem, and at the end, the target is supposed to store a correct solution $S(I)$ for that instance. We assume that the computation required for $S(I)$ can be modeled as a directed acyclic graph $G=(V,E)$, where $V$ is a set of tasks and $(v,w) \in E$ if and only if task $w$ needs information from task $v$ in order to be executed. Given $G$, the role of the supervisor is to schedule the execution of the tasks in $G$ by assigning them to the workers. If all workers are honest, information can be exchanged between the workers, and the workers have access to the source and target, then the supervisor only needs to know $G$ to successfully schedule the computations. That is, the supervisor does not have to handle any data itself, unlike in standard master-worker approaches. This has the tremendous benefit that tasks can be run massively in parallel in large distributed environments without the supervisor becoming a bottleneck. But what if a constant fraction of the workers is adversarial? Interestingly, we show that under certain assumptions a data-agnostic scheduling approach works even in an adversarial setting without (asymptotically) increasing the work required for communication and computation. We demonstrate the validity of these assumptions by presenting concrete solutions for supervised matrix multiplication and sorting.
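
To illustrate the honest-worker case only, the following is a minimal sketch of a supervisor that schedules tasks knowing nothing but the graph $G$. It is not the paper's protocol: the function name `schedule_rounds`, the round-based model, and the rule of giving each worker at most one task per round are assumptions made for this example, and the adversarial setting is not handled at all.

```python
from collections import defaultdict


def schedule_rounds(tasks, edges, num_workers):
    """Assign DAG tasks to workers round by round, using only the graph.

    Hypothetical helper: the supervisor never sees task inputs or outputs;
    it only tracks which tasks have all of their predecessors scheduled.
    """
    indegree = {v: 0 for v in tasks}
    successors = defaultdict(list)
    for v, w in edges:  # (v, w): task w needs information from task v
        successors[v].append(w)
        indegree[w] += 1

    ready = sorted(v for v in tasks if indegree[v] == 0)
    rounds = []
    scheduled = 0
    while ready:
        # Each worker receives at most one ready task per round.
        batch, ready = ready[:num_workers], ready[num_workers:]
        rounds.append([(task, worker) for worker, task in enumerate(batch)])
        scheduled += len(batch)
        # Tasks whose predecessors are now all scheduled become ready.
        for task in batch:
            for w in successors[task]:
                indegree[w] -= 1
                if indegree[w] == 0:
                    ready.append(w)

    if scheduled != len(tasks):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return rounds


if __name__ == "__main__":
    tasks = ["a", "b", "c", "d"]
    edges = [("a", "c"), ("b", "c"), ("c", "d")]
    for i, assignments in enumerate(schedule_rounds(tasks, edges, 2), start=1):
        print(f"round {i}: {assignments}")
```

In this sketch the supervisor's state is limited to in-degree counters over $G$, so its workload depends on the size of the graph rather than on the size of the data that the workers exchange among themselves.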

Cite

@article{arxiv.2503.11600,
  title  = {Supervised Distributed Computing},
  author = {John Augustine and Christian Scheideler and Julian Werthmann},
  journal= {arXiv preprint arXiv:2503.11600},
  year   = {2025}
}