Related papers: Data-Driven Execution of Fast Multipole Methods

Parallel Algorithms for Constructing Data Structures for Fast Multipole Methods

We present efficient algorithms to build data structures and the lists needed for fast multipole methods. The algorithms are capable of being efficiently implemented on both serial, data parallel GPU and on distributed architectures. With…

Mathematical Software · Computer Science 2013-01-10 Qi Hu, Nail A. Gumerov, Ramani Duraiswami

An FMM Based on Dual Tree Traversal for Many-core Architectures

The present work attempts to integrate the independent efforts in the fast N-body community to create the fastest N-body library for many-core and heterogeneous architectures. Focus is placed on low accuracy optimizations, in response to the…

Numerical Analysis · Computer Science 2012-09-20 Rio Yokota

Efficiently Scheduling Parallel DAG Tasks on Identical Multiprocessors

Parallel real-time embedded applications can be modelled as directed acyclic graphs (DAGs) whose nodes model subtasks and whose edges model precedence constraints among subtasks. Efficiently scheduling such parallel tasks can be challenging…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-24 Shardul Lendve, Konstantinos Bletsas, Pedro F. Souto

Asynchronous Execution of the Fast Multipole Method Using Charm++

Fast multipole methods (FMM) on distributed memory have traditionally used a bulk-synchronous model of communicating the local essential tree (LET) and overlapping it with computation of the local data. This could be perceived as an…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-05-30 Mustafa AbdulJabbar, Rio Yokota, David Keyes

PetFMM--A dynamically load-balancing parallel fast multipole library

Fast algorithms for the computation of $N$-body problems can be broadly classified into mesh-based interpolation methods, and hierarchical or multiresolution methods. To this last class belongs the well-known fast multipole method (FMM),…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-09-21 Felipe A. Cruz, Matthew G. Knepley, L. A. Barba

A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems

Among the algorithms that are likely to play a major role in future exascale computing, the fast multipole method (FMM) appears as a rising star. Our previous recent work showed scaling of an FMM on GPU clusters, with problem sizes in the…

Numerical Analysis · Computer Science 2012-10-30 Rio Yokota, Lorena Barba

Dynamic autotuning of adaptive fast multipole methods on hybrid multicore CPU & GPU systems

We discuss an implementation of adaptive fast multipole methods targeting hybrid multicore CPU- and GPU-systems. From previous experiences with the computational profile of our version of the fast multipole algorithm, suitable parts are…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-09-03 Marcus Holm, Stefan Engblom, Anders Goude, Sverker Holmgren

A Performance Model for the Communication in Fast Multipole Methods on HPC Platforms

Exascale systems are predicted to have approximately one billion cores, assuming Gigahertz cores. Limitations on affordable network topologies for distributed memory systems of such massive scale bring new challenges to the current parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-05-27 Huda Ibeid, Rio Yokota, David Keyes

Alternative Tilings for the Fast Multipole Method on the Plane

The fast multipole method (FMM) performs fast approximate kernel summation to a specified tolerance $\epsilon$ by using a hierarchical division of the domain, which groups source and receiver points into regions that satisfy local…

Numerical Analysis · Computer Science 2012-04-17 Yuancheng Luo, Ramani Duraiswami

Characterization of the errors of the FMM in particle simulations

The Fast Multipole Method (FMM) offers an acceleration for pairwise interaction calculation, known as $N$-body problems, from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$ with $N$ particles. This has brought dramatic increase in the capability of…

Data Structures and Algorithms · Computer Science 2011-09-21 Felipe A. Cruz, L. A. Barba

An Efficient Load Balancing Method for Tree Algorithms

Nowadays, multiprocessing is mainstream with exponentially increasing number of processors. Load balancing is, therefore, a critical operation for the efficient execution of parallel algorithms. In this paper we consider the fundamental…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-10-03 Osama Talaat Ibrahim, Ahmed El-Mahdy

Removing the Barrier to Scalability in Parallel FMM

The Fast Multipole Method (FMM) is well known to possess a bottleneck arising from decreasing workload on higher levels of the FMM tree [Greengard and Gropp, Comp. Math. Appl., 20(7), 1990]. We show that this potential bottleneck can be…

Computational Engineering, Finance, and Science · Computer Science 2010-08-17 Matthew G. Knepley

A parallel directional Fast Multipole Method

This paper introduces a parallel directional fast multipole method (FMM) for solving N-body problems with highly oscillatory kernels, with a focus on the Helmholtz kernel in three dimensions. This class of oscillatory kernels requires a…

Numerical Analysis · Mathematics 2018-01-08 Austin R. Benson, Jack Poulson, Kenneth Tran, Björn Engquist, Lexing Ying

Energy-Efficient Scheduling for Homogeneous Multiprocessor Systems

We present a number of novel algorithms, based on mathematical optimization formulations, in order to solve a homogeneous multiprocessor scheduling problem, while minimizing the total energy consumption. In particular, for a system with a…

Operating Systems · Computer Science 2015-11-13 Mason Thammawichai, Eric C. Kerrigan

Fast multipole methods for evaluation of layer potentials with locally-corrected quadratures

While fast multipole methods (FMMs) are in widespread use for the rapid evaluation of potential fields governed by the Laplace, Helmholtz, Maxwell or Stokes equations, their coupling to high-order quadratures for evaluating layer potentials…

Numerical Analysis · Mathematics 2021-04-26 Leslie Greengard, Michael O'Neil, Manas Rachh, Felipe Vico

Learning Greens Operators through Hierarchical Neural Networks Inspired by the Fast Multipole Method

The Fast Multipole Method (FMM) is an efficient numerical algorithm for computation of long-ranged forces in $N$-body problems within gravitational and electrostatic fields. This method utilizes multipole expansions of the Green's function…

Machine Learning · Computer Science 2025-09-26 Emilio McAllister Fognini, Marta M. Betcke, Ben T. Cox

Efficient Parallelization of Short-Range Molecular Dynamics Simulations on Many-Core Systems

This article introduces a highly parallel algorithm for molecular dynamics simulations with short-range forces on single node multi- and many-core systems. The algorithm is designed to achieve high parallel speedups for strongly…

Computational Physics · Physics 2013-11-20 R. Meyer

Data-aware Dynamic Execution of Irregular Workloads on Heterogeneous Systems

Current approaches to scheduling workloads on heterogeneous systems with specialized accelerators often rely on manual partitioning, offloading tasks with specific compute patterns to accelerators. This method requires extensive…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-12 Zhenyu Bai, Dan Wu, Pranav Dangi, Dhananjaya Wijerathne, Venkata Pavan Kumar Miriyala, Tulika Mitra

Dynamic load balancing for large-scale adaptive finite element computation

For the parallel computation of partial differential equations, one key is the grid partitioning. It requires that each process owns the same amount of computations, and also, the partitioning quality should be proper to reduce the…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-02-02 Hui Liu, Tao Cui, Wei Leng, Linbo Zhang

Treecode and fast multipole method for N-body simulation with CUDA

Due to the variety and importance of applications of treecodes and FMM, the combination of algorithmic acceleration with hardware acceleration can have tremendous impact. Alas, programming these algorithms efficiently is no piece of cake.…

Computational Physics · Physics 2012-08-14 Rio Yokota, Lorena Barba