CPMA: An Efficient Batch-Parallel Compressed Set Without Pointers

Subjects: Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing; Performance
Version: v3 (2024-02-20)

Abstract

This paper introduces the batch-parallel Compressed Packed Memory Array (CPMA), a compressed, dynamic, ordered set data structure based on the Packed Memory Array (PMA). Traditionally, batch-parallel sets are built on pointer-based data structures such as trees because pointer-based structures enable fast parallel unions via pointer manipulation. When compared with cache-optimized trees, PMAs were slower to update but faster to scan. The batch-parallel CPMA overcomes this tradeoff between updates and scans by optimizing for cache-friendliness. On average, the CPMA achieves 3x faster batch-insert throughput and 4x faster range-query throughput compared with compressed PaC-trees, a state-of-the-art batch-parallel set library based on cache-optimized trees. We further evaluate the CPMA compared with compressed PaC-trees and Aspen, a state-of-the-art system, on a real-world application of dynamic-graph processing. The CPMA is on average 1.2x faster on a suite of graph algorithms and 2x faster on batch inserts when compared with compressed PaC-trees. Furthermore, the CPMA is on average 1.3x faster on graph algorithms and 2x faster on batch inserts compared with Aspen.

Cite

@article{arxiv.2305.05055,
  title  = {CPMA: An Efficient Batch-Parallel Compressed Set Without Pointers},
  author = {Brian Wheatman and Randal Burns and Aydın Buluç and Helen Xu},
  journal= {arXiv preprint arXiv:2305.05055},
  year   = {2024}
}