Scalable $k$-d trees for distributed data

Data Structures and Algorithms | 2022-01-21 | v1 | Computational Engineering, Finance, and Science | Computation

Abstract

Data structures known as $k$-d trees have numerous applications in scientific computing, particularly in areas of modern statistics and data science such as range search in decision trees, clustering, nearest neighbors search, local regression, and so forth. In this article we present a scalable mechanism to construct $k$-d trees for distributed data, based on approximating medians for each recursive subdivision of the data. We provide theoretical guarantees of the quality of approximation using this approach, along with a simulation study quantifying the accuracy and scalability of our proposed approach in practice.
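
To make the idea of recursive subdivision by approximate medians concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how the medians are approximated, so the sketch assumes a simple merged-histogram estimate, and it simulates the distributed data as a list of in-memory shards. The names `approx_median` and `build_kd_tree`, and the parameters `bins` and `leaf_size`, are hypothetical and chosen only for illustration.

```python
def approx_median(shards, dim, bins=64):
    """Approximate the median of coordinate `dim` across all shards.

    Assumption for illustration: each shard reports only its min, max and
    an equal-width histogram, so no shard has to ship its raw points.
    """
    nonempty = [s for s in shards if s]
    lo = min(min(p[dim] for p in s) for s in nonempty)
    hi = max(max(p[dim] for p in s) for s in nonempty)
    if lo == hi:
        return lo  # every value is identical along this coordinate
    width = (hi - lo) / bins
    total = [0] * bins
    for s in nonempty:  # in a real system each shard would count locally
        for p in s:
            b = min(int((p[dim] - lo) / width), bins - 1)
            total[b] += 1
    half = sum(total) / 2
    cum = 0
    for b, count in enumerate(total):
        if cum + count >= half:
            # Interpolate linearly inside the bin that contains the median.
            return lo + (b + (half - cum) / count) * width
        cum += count
    return hi


def build_kd_tree(shards, k, depth=0, leaf_size=8):
    """Recursively split the sharded points at approximate medians."""
    n = sum(len(s) for s in shards)
    if n <= leaf_size:
        return {"points": [p for s in shards for p in s]}
    dim = depth % k  # cycle through the k coordinates
    split = approx_median(shards, dim)
    left = [[p for p in s if p[dim] <= split] for s in shards]
    right = [[p for p in s if p[dim] > split] for s in shards]
    if not any(left) or not any(right):
        # Degenerate split (e.g. many ties): stop here with a larger leaf.
        return {"points": [p for s in shards for p in s]}
    return {
        "dim": dim,
        "split": split,
        "left": build_kd_tree(left, k, depth + 1, leaf_size),
        "right": build_kd_tree(right, k, depth + 1, leaf_size),
    }
```

In this sketch the split quality depends on `bins`: finer histograms give splits closer to the exact median at the cost of more data exchanged per level. The paper's own guarantees apply to its method, not to this simplified stand-in.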

Cite

@article{arxiv.2201.08288,
  title  = {Scalable $k$-d trees for distributed data},
  author = {Aritra Chakravorty and William S. Cleveland and Patrick J. Wolfe},
  journal= {arXiv preprint arXiv:2201.08288},
  year   = {2022}
}

Comments

34 pages, 3 figures; submitted for publication