Related papers: Implementation and performance of FDPS: A Framewor…
In this paper, we describe the algorithms we implemented in FDPS to make efficient use of accelerator hardware such as GPGPUs. We have developed FDPS to make it possible for many researchers to develop their own high-performance parallel…
Numerical simulations based on particle methods have been widely used in various fields including astrophysics. To date, simulation software has been developed by individual researchers or research groups in each field, with a huge amount…
Scalable and efficient numerical simulations continue to gain importance, as computation is firmly established as the third pillar of discovery, alongside theory and experiment. Meanwhile, the performance of computing hardware grows through…
We present an algorithm for parallelising the TreePM code. We use both functional and domain decompositions. Functional decomposition is used to separate the computation of long range and short range forces, as well as the task of…
We developed a portable code for dissipative particle dynamics (DPD) simulations. This Fortran program named CAMUS has a couple of notable features. One is the omission of constructing the so-called neighboring particles list, providing a…
The development of cost-effective high-performance parallel computing on multi-processor supercomputers makes it attractive to port excessively time-consuming simulation software from personal computers (PCs) to supercomputers. The power…
We introduce a natively distributed mini-application benchmark representative of plastic spiking neural network simulators. It can be used to measure the performance of existing computing platforms and to drive the development of future…
This article introduces a highly parallel algorithm for molecular dynamics simulations with short-range forces on single node multi- and many-core systems. The algorithm is designed to achieve high parallel speedups for strongly…
We present a computational algorithm for computing short range forces between particles. The algorithm has two distinguishing features. First, it is optimized for multi-processor computers, and will use as many processors as are available.…
A fast $N$-body code has been developed for simulating a stellar disk embedded in a live dark matter halo. In generating its Poisson solver, a self-consistent field (SCF) code which inherently possesses perfect scalability is incorporated…
This article presents a depth-first search (DFS)-based algorithm for evaluating sensitivity gradients in the topology optimization of soft materials exhibiting complex deformation behavior. The algorithm is formulated using a time-dependent…
The current landscape of scientific research is widely based on modeling and simulation, typically with complexity in the simulation's flow of execution and parameterization properties. Execution flows are not necessarily straightforward…
Traditional heterogeneous parallel algorithms, designed for heterogeneous clusters of workstations, are based on the assumption that the absolute speed of the processors does not depend on the size of the computational task. This assumption…
An adaptive integration technique for time advancement of particle motion in the context of coupled computational fluid dynamics (CFD) - discrete element method (DEM) simulations is presented in this work. CFD-DEM models provide an accurate…
I describe here the performance of a parallel treecode with individual particle timesteps. The code is based on the Barnes-Hut algorithm and runs cosmological N-body simulations on parallel machines with a distributed memory architecture…
Particle tracking in large-scale numerical simulations of turbulent flows presents one of the major bottlenecks in parallel performance and scaling efficiency. Here, we describe a particle tracking algorithm for large-scale parallel…
We introduce a particle-based simulation method for granular material at interactive frame rates. We divide the simulation into two decoupled steps. In the first step, a relatively small number of particles is accurately simulated with a…
Fluid flow simulation is a highly active area with applications in a wide range of engineering problems and interactive systems. Meshless methods like the Moving Particle Semi-implicit (MPS) are a great alternative to deal efficiently with…
An improved implementation of an N-body code for simulating collisionless cosmological dynamics is presented. TPM (Tree-Particle-Mesh) combines the PM method on large scales with a tree code to handle particle-particle interactions at small…
We present a scalable dissipative particle dynamics simulation code, fully implemented on the Graphics Processing Units (GPUs) using a hybrid CUDA/MPI programming model, which achieves 10-30 times speedup on a single GPU over 16 CPU cores…