Implementation of a Parallel Tree Method on a GPU

Instrumentation and Methods for Astrophysics; Astrophysics of Galaxies; Performance. 2011-12-21 (v1)

Abstract

The kd-tree is a fundamental tool in computer science. One of its most important applications is the tree method, which uses kd-tree search for the fast evaluation of particle interactions and for neighbor search: it reduces the computational complexity of these problems from O(N^2) for a brute-force method to O(N log N), where N is the number of particles. In this paper, we present a parallel implementation of the tree method running on a graphics processing unit (GPU), and we describe in detail how we implemented it on a Cypress GPU. An optimization that we found important is localized particle ordering, which makes effective use of cache memory. We present a number of test results and performance measurements. Our results show that executing the tree traversal of a force calculation on a GPU is practical and efficient.
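
As a rough illustration of what localized particle ordering can look like, the sketch below sorts particles along a Morton (Z-order) curve so that particles close together in space tend to sit close together in memory. The abstract does not say which ordering the paper uses, so the choice of a Morton curve is an assumption, as are the function names `spread_bits`, `morton_key`, and `morton_order`, the 10-bit quantization, and the restriction to coordinates in the unit cube.

```python
# Minimal sketch of a spatially local particle ordering (assumed: Morton/Z-order).
# All names here are hypothetical and are not taken from the paper.

def spread_bits(v, bits=10):
    """Move bit i of v to bit 3*i, leaving two zero bits between neighbors."""
    out = 0
    for i in range(bits):
        out |= ((v >> i) & 1) << (3 * i)
    return out


def morton_key(ix, iy, iz, bits=10):
    """Interleave the bits of three integer cell coordinates into one key."""
    return (spread_bits(ix, bits)
            | (spread_bits(iy, bits) << 1)
            | (spread_bits(iz, bits) << 2))


def morton_order(points, bits=10):
    """Return particle indices sorted by Morton key.

    points: sequence of (x, y, z) tuples, assumed to lie in [0, 1).
    """
    cells = 1 << bits

    def key(index):
        # Quantize each coordinate to an integer cell, clamped to the grid.
        ix, iy, iz = (min(int(c * cells), cells - 1) for c in points[index])
        return morton_key(ix, iy, iz, bits)

    return sorted(range(len(points)), key=key)


if __name__ == "__main__":
    pts = [(0.90, 0.9, 0.9), (0.10, 0.1, 0.1),
           (0.95, 0.9, 0.9), (0.12, 0.1, 0.1)]
    order = morton_order(pts)
    # The two particles near the origin end up adjacent, as do the two
    # particles near the far corner.
    print(order)
```

Reordering the particle arrays by such an index list before the force calculation means that particles processed together tend to touch the same tree nodes and neighboring memory, which is the kind of access pattern a cache rewards.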

Cite

@article{arxiv.1112.4539,
  title  = {Implementation of a Parallel Tree Method on a GPU},
  author = {Naohito Nakasato},
  journal= {arXiv preprint arXiv:1112.4539},
  year   = {2011}
}

Comments

Journal of Computational Science, 2011; See our recent update at http://galaxy.u-aizu.ac.jp/trac/note/wiki/Octree_On_GPU