Related papers: A Transformation-Based Approach for the Design of…

200 papers

Parallel parameterized complexity theory studies how fixed-parameter tractable (fpt) problems can be solved in parallel. Previous theoretical work focused on parallel algorithms that are very fast in principle, but did not take into account…

Data Structures and Algorithms · Computer Science 2019-02-21 Max Bannach , Malte Skambath , Till Tantau

The Fast Fourier Transform (FFT) is a fundamental numerical technique with widespread application in a range of scientific problems. As scientific simulations attempt to exploit exascale systems, there has been a growing demand for…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-21 Sana Taghipour Anvari , Julian Samaroo , Matin Raayai Ardakani , David Kaeli

Parallel and distributed processing is becoming the de facto industry standard, and a large part of current research targets how to make computing scalable and distributed, dynamically, without allocating the resources on…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-10 Rajendra Purohit , K R Chowdhary , S D Purohit

The FFT of three-dimensional (3D) input data is an important computational kernel of numerical simulations and is widely used in High Performance Computing (HPC) codes running on a large number of processors. Performance of many scientific…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-28 Vivek Gavane , Supriya Prabhugawankar , Shivam Garg , Archana Achalere , Rajendra Joshi

Multi-dimensional discrete Fourier transforms (DFT) are typically decomposed into multiple 1D transforms. Hence, parallel implementations of any multi-dimensional DFT focus on parallelizing within or across the 1D DFT. Existing DFT packages…

Mathematical Software · Computer Science 2019-12-24 Doru Thom Popovici , Martin D. Schatz , Franz Franchetti , Tze Meng Low

We present a parallel FFT algorithm for SIMD systems following the 'Transpose Algorithm' approach. The method is based on the assignment of the data field onto a 1-dimensional ring of systolic cells. The systolic array can be universally…

High Energy Physics - Lattice · Physics 2015-06-25 Thomas Lippert , Klaus Schilling , Federico Toschi , Sven Trentmann , Raffaele Tripiccione

We present a systematic, algebraically based, design methodology for efficient implementation of computer programs optimized over multiple levels of the processor/memory and network hierarchy. Using a common formalism to describe the…

Mathematical Software · Computer Science 2008-03-18 Lenore R. Mullin , James E. Raynolds

Parallel algorithms relying on synchronous parallelization libraries often experience adverse performance due to global synchronization barriers. Asynchronous many-task runtimes offer task futurization capabilities that minimize or remove…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-05 Alexander Strack , Christopher Taylor , Patrick Diehl , Dirk Pflüger

Matrix Distributed Processing (MDP) is a C++ library for fast development of efficient parallel algorithms. It constitutes the core of FermiQCD. MDP enables programmers to focus on algorithms, while parallelization is dealt with…

High Energy Physics - Lattice · Physics 2007-05-23 Massimo Di Pierro

We present efficient algorithms to build data structures and the lists needed for fast multipole methods. The algorithms are capable of being efficiently implemented on both serial, data parallel GPU and on distributed architectures. With…

Mathematical Software · Computer Science 2013-01-10 Qi Hu , Nail A. Gumerov , Ramani Duraiswami

Algorithmic skeletons are used as building-blocks to ease the task of parallel programming by abstracting the details of parallel implementation from the developer. Most existing libraries provide implementations of skeletons that are…

Programming Languages · Computer Science 2016-07-11 Venkatesh Kannan , G. W. Hamilton

Traditional heterogeneous parallel algorithms, designed for heterogeneous clusters of workstations, are based on the assumption that the absolute speed of the processors does not depend on the size of the computational task. This assumption…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-09-15 Alexey Lastovetsky , Ravi Reddy , Vladimir Rychkov , David Clarke

Current high-performance computer systems used for scientific computing typically combine shared memory computational nodes in a distributed memory environment. Extracting high performance from these complex systems requires tailored…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-14 Afshin Zafari , Elisabeth Larsson , Martin Tillenius

Over the last two decades, scientific workflow management systems (SWfMS) have emerged as a means to facilitate the design, execution, and monitoring of reusable scientific data processing pipelines. At the same time, the amounts of data…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-03-29 Marc Bux , Ulf Leser

The paper is devoted to an analytical study of the "master-worker" framework scalability on multiprocessors with distributed memory. A new model of parallel computations called BSF is proposed. The BSF model is based on BSP and SPMD models.…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-09 L. B. Sokolinsky

The increasing number of processing elements and decreasing memory to core ratio in modern high-performance platforms makes efficient strong scaling a key requirement for numerical algorithms. In order to achieve efficient scalability on…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-01-14 Michael Lange , Gerard Gorman , Michele Weiland , Lawrence Mitchell , James Southern

With the rapid growth of large language models (LLMs), a wide range of methods have been developed to distribute computation and memory across hardware devices for efficient training and inference. While existing surveys provide descriptive…

Machine Learning · Computer Science 2026-02-11 Hossam Amer , Rezaul Karim , Ali Pourranjbar , Weiwei Zhang , Walid Ahmed , Boxing Chen

The application of program transformation and algebraic methods to the development of efficient combinatorial optimization (CO) algorithms relies on an exhaustive combinatorial generator for the problem specification, followed by the fusion…

Discrete Mathematics · Computer Science 2026-05-29 Xi He , Max A. Little

The fast Fourier transform (FFT) is a primitive kernel in numerous fields of science and engineering. OpenFFT is an open-source parallel package for 3-D FFTs, built on a communication-optimal domain decomposition method for achieving…

Mathematical Software · Computer Science 2015-08-27 Truong Vinh Truong Duy , Taisuke Ozaki

Multivariate partial fractioning is a powerful tool for simplifying rational function coefficients in scattering amplitude computations. Since current research problems lead to large sets of complicated rational functions, performance of…

High Energy Physics - Phenomenology · Physics 2022-12-19 Dominik Bendle , Janko Boehm , Murray Heymann , Rourou Ma , Mirko Rahn , Lukas Ristau , Marcel Wittmann , Zihao Wu , Yang Zhang