Provably-Efficient and Internally-Deterministic Parallel Union-Find

Data Structures and Algorithms 2023-04-24 v1 Distributed, Parallel, and Cluster Computing

Abstract

Determining the degree of inherent parallelism in classical sequential algorithms and leveraging it for fast parallel execution is a key topic in parallel computing, and detailed analyses are known for a wide range of classical algorithms. In this paper, we perform the first such analysis for the fundamental Union-Find problem, in which we are given a graph as a sequence of edges and must maintain its connectivity structure under edge additions. We prove that classic sequential algorithms for this problem are well-parallelizable under reasonable assumptions, addressing a conjecture by [Blelloch, 2017]. More precisely, we show via a new potential argument that, under uniform random edge ordering, parallel union-find operations are unlikely to interfere: $T$ concurrent threads processing the graph in parallel will encounter memory contention $O(T^2 \cdot \log |V| \cdot \log |E|)$ times in expectation, where $|E|$ and $|V|$ are the number of edges and nodes in the graph, respectively. We leverage this result to design a new parallel Union-Find algorithm that is internally deterministic, i.e., its results are guaranteed to match those of a sequential execution, while also being work-efficient and scalable, as long as the number of threads $T$ is $O(|E|^{\frac{1}{3} - \varepsilon})$ for an arbitrarily small constant $\varepsilon > 0$, a condition that holds for most large real-world graphs. We present lower bounds which show that our analysis is close to optimal, and experimental results suggesting that the performance cost of internal determinism is limited.
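For readers unfamiliar with the problem setting, the following is a minimal sketch of the classical sequential baseline the abstract refers to: a union-find structure fed a stream of edges in uniformly random order. It is not the parallel algorithm proposed in the paper. The names `UnionFind` and `count_components`, the choice of union by rank with path halving, and the `seed` parameter are assumptions made for illustration only.

```python
import random


class UnionFind:
    """Sequential union-find with union by rank and path halving (illustrative)."""

    def __init__(self, n):
        self.parent = list(range(n))  # every node starts as its own root
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, x):
        # Path halving: point each visited node at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, u, v):
        """Merge the sets of u and v; return True if they were distinct."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru          # attach the shallower tree under the deeper one
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1
        return True


def count_components(num_nodes, edges, seed=0):
    """Process edges in a uniformly random order and count connected components."""
    order = list(edges)
    random.Random(seed).shuffle(order)  # hypothetical stand-in for the random edge ordering
    uf = UnionFind(num_nodes)
    components = num_nodes
    for u, v in order:
        if uf.union(u, v):
            components -= 1
    return components
```

For example, `count_components(5, [(0, 1), (1, 2), (3, 4)])` yields 2 regardless of the shuffle, since the final connectivity structure does not depend on edge order. An internally deterministic parallel version would have to reproduce exactly the intermediate state this sequential loop produces for a given order.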

Cite

@article{arxiv.2304.09331,
  title  = {Provably-Efficient and Internally-Deterministic Parallel Union-Find},
  author = {Alexander Fedorov and Diba Hashemi and Giorgi Nadiradze and Dan Alistarh},
  journal= {arXiv preprint arXiv:2304.09331},
  year   = {2023}
}