Related papers: Oct-tree Method on GPU
The kd-tree is a fundamental tool in computer science. Among its applications, kd-tree search (by the tree method) for the fast evaluation of particle interactions and neighbor search is highly important, since the…
We present a new very fast tree-code which runs on massively parallel Graphics Processing Units (GPUs) with the NVIDIA CUDA architecture. The tree construction and calculation of multipole moments are carried out on the host CPU, while the force…
We present parallel algorithms for constructing and traversing sparse octrees on graphics processing units (GPUs). The algorithms are based on parallel-scan and sort methods. To test the performance and feasibility, we implemented them in…
This paper presents an octree construction method, called Cornerstone, that facilitates global domain decomposition and interactions between particles in mesh-free numerical simulations. Our method is based on algorithms developed for 3D…
Computing $k$-Nearest Neighbors (KNN) is one of the core kernels used in many machine learning, data mining and scientific computing applications. Although kd-tree based $O(\log n)$ algorithms have been proposed for computing KNN, due to…
Gravitational $N$-body simulations calculate numerous interactions between particles. The tree algorithm reduces these calculations by constructing a hierarchical oct-tree structure and approximating gravitational forces on particles. Over…
The tree method is a widely implemented algorithm for collisionless $N$-body simulations in astrophysics and is well suited to GPUs. Adopting hierarchical time stepping can accelerate $N$-body simulations; however, it is infrequently…
We propose a hybrid tree algorithm for reducing calculation and communication cost of collision-less N-body simulations. The concept of our algorithm is that we split interaction force into two parts: hard-force from neighbor particles and…
We present an algorithm that allows for building left-balanced and complete k-d trees over k-dimensional points in a trivially parallel and GPU friendly way. Our algorithm requires exactly one int per data point as temporary storage, and…
High-performance implementations of $k$-Nearest Neighbor Search ($k$NN) in low dimensions use tree-based data structures. Tree algorithms are hard to parallelize on GPUs due to their irregularity. However, newer Nvidia GPUs offer hardware…
Modeling of collisionless galactic systems is based on the N-body model, which requires large computational resources due to the long-range nature of gravitational forces. The most common method for calculating gravity is the TreeCode…
Similarity search, the task of identifying objects most similar to a given query object under a specific metric, has gathered significant attention due to its practical applications. However, the absence of coordinate information to…
As an entry for the 2012 Gordon-Bell performance prize, we report performance results of astrophysical N-body simulations of one trillion particles performed on the full system of K computer. This is the first gravitational trillion-body…
This paper proposes an efficient data structure, ikd-Tree, for dynamic space partition. The ikd-Tree incrementally updates a k-d tree with new coming points only, leading to much lower computation time than existing static k-d trees.…
We present a gravitational hierarchical N-body code that is designed to run efficiently on Graphics Processing Units (GPUs). All parts of the algorithm are executed on the GPU which eliminates the need for data transfer between the Central…
We describe a parallel version of our tree-code for the simulation of self-gravitating systems in astrophysics. It is based on a dynamic and adaptive method for the domain decomposition, which exploits the hierarchical data arrangement used…
We present a parallel algorithm for computing the treewidth of a graph on a GPU. We implement this algorithm in OpenCL, and experimentally evaluate its performance. Our algorithm is based on an $O^*(2^{n})$-time algorithm that explores the…
This paper focuses on the parallel implementation of a direct $N$-body method (particle-particle algorithm) and the application of multiple GPUs for galactic dynamics simulations. Application of a hybrid OpenMP-CUDA technology is considered…
We describe the use of Graphics Processing Units (GPUs) for speeding up the code NBODY6 which is widely used for direct $N$-body simulations. Over the years, the $N^2$ nature of the direct force calculation has proved a barrier for…
We examine the problem of optimizing classification tree evaluation for on-line and real-time applications by using GPUs. Looking at trees with continuous attributes often used in image segmentation, we first put the existing algorithms for…