
Related papers: Strategy Preserving Compilation for Parallel Funct…

200 papers

Graphics processing units (GPUs) have evolved from specialized hardware for rendering high-quality graphics in games into commodity hardware for efficiently processing blocks of data in parallel. This evolution is particularly…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-03-26 Luis Cabellos

In recent years the computing landscape has seen an increasing shift towards specialized accelerators. Field programmable gate arrays (FPGAs) are particularly promising as they offer significant performance and energy improvements…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-11-24 Raghu Prabhakar, David Koeplinger, Kevin Brown, HyoukJoong Lee, Christopher De Sa, Christos Kozyrakis, Kunle Olukotun

In this paper we present an optimized parallel implementation of a flexible MAP decoder for synchronization error correcting codes, supporting a very wide range of code sizes and channel conditions. On mid-range GPUs we demonstrate decoding…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-26 Johann A. Briffa

Efficient parallelization of algorithms on general-purpose GPUs is essential in many areas today. However, it is a non-trivial task for software engineers to utilize GPUs to improve the performance of high-level programs in general.…

Programming Languages · Computer Science 2024-07-09 Lars Hummelgren, John Wikman, Oscar Eriksson, Philipp Haller, David Broman

We present a new adaptive parallel algorithm for the challenging problem of multi-dimensional numerical integration on massively parallel architectures. Adaptive algorithms have demonstrated the best performance, but efficient many-core…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-24 Ioannis Sakiotis, Kamesh Arumugam, Marc Paterno, Desh Ranjan, Balša Terzić, Mohammad Zubair

We study parallel algorithms for the minimization of Deterministic Finite Automata (DFAs). In particular, we implement four different massively parallel algorithms on Graphics Processing Units (GPUs). Our results confirm the expectations…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-31 Jan Martens, Anton Wijs

The application of program transformation and algebraic methods to the development of efficient combinatorial optimization (CO) algorithms relies on an exhaustive combinatorial generator for the problem specification, followed by the fusion…

Discrete Mathematics · Computer Science 2026-05-29 Xi He, Max. A. Little

There is a large body of legacy scientific code written in languages like Fortran that is not optimised to get the best performance out of heterogeneous acceleration devices like GPUs and FPGAs, and manually porting such code into parallel…

Performance · Computer Science 2019-01-25 Wim Vanderbauwhede, Syed Waqar Nabi

Parallel computing is a standard approach to achieving high-performance computing (HPC). Three commonly used methods to implement parallel computing include: 1) applying multithreading technology on single-core or multi-core CPUs; 2)…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-18 Xinyao Yi

Optimizing programs to run efficiently on modern parallel hardware is hard but crucial for many applications. The predominantly used imperative languages - like C or OpenCL - force the programmer to intertwine the code describing…

Programming Languages · Computer Science 2020-02-07 Bastian Hagedorn, Johannes Lenfers, Thomas Koehler, Sergei Gorlatch, Michel Steuwer

OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear; multiple vendors can provide support for application descriptions written according to the standard, thus…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-23 Pekka Jääskeläinen, Carlos Sánchez de La Lama, Erik Schnetter, Kalle Raiskila, Jarmo Takala, Heikki Berg

We study parallel algorithms for the minimisation and equivalence checking of Deterministic Finite Automata (DFAs). Regarding DFA minimisation, we implement four different massively parallel algorithms on Graphics Processing Units (GPUs).…

Formal Languages and Automata Theory · Computer Science 2025-08-29 Jan Heemstra, Jan Martens, Anton Wijs

Future computing systems, from handhelds to supercomputers, will undoubtedly be more parallel and heterogeneous than today's systems to provide more performance and energy efficiency. Thus, GPUs are increasingly being used to accelerate…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-18 Saeed Taheri, Apan Qasem, Martin Burtscher

We present a systematic, algebraically based, design methodology for efficient implementation of computer programs optimized over multiple levels of the processor/memory and network hierarchy. Using a common formalism to describe the…

Mathematical Software · Computer Science 2008-03-18 Lenore R. Mullin, James E. Raynolds

Recent work showed that compiling functional programs to use dense, serialized memory representations for recursive algebraic datatypes can yield significant constant-factor speedups for sequential programs. But serializing data in a…

Programming Languages · Computer Science 2021-07-02 Chaitanya Koparkar, Mike Rainey, Michael Vollmer, Milind Kulkarni, Ryan R. Newton

With the growing complexity and capability of contemporary robotic systems, the necessity of sophisticated computing solutions to efficiently handle tasks such as real-time processing, sensor integration, decision-making, and control…

Robotics · Computer Science 2025-09-09 Md Rafid Islam

Performing Retrieval-Augmented Generation (RAG) directly on mobile devices is promising for data privacy and responsiveness but is hindered by the architectural constraints of mobile NPUs. Specifically, current hardware struggles with the…

Computation and Language · Computer Science 2025-12-18 Zhiyang Chen, Daliang Xu, Haiyang Shen, Chiheng Lou, Mengwei Xu, Shangguang Wang, Xin Jin, Yun Ma

Genetic Algorithms (GAs) are used to solve search and optimization problems in which an optimal solution can be found using an iterative process with probabilistic and non-deterministic transitions. However, depending on the problem's…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-01-23 Matheus F. Torquato, Marcelo A. C. Fernandes

We describe a methodology for designing efficient parallel and distributed scientific software. This methodology utilizes sequences of mechanizable algebra-based optimizing transformations. In this study, we apply our methodology to the…

Software Engineering · Computer Science 2008-11-18 Harry B. Hunt, Lenore R. Mullin, Daniel J. Rosenkrantz, James E. Raynolds

While parallelism remains the main source of performance, architectural implementations and programming models change with each new hardware generation, often leading to costly application re-engineering. Most tools for performance…

Programming Languages · Computer Science 2022-07-04 William S. Moses, Ivan R. Ivanov, Jens Domke, Toshio Endo, Johannes Doerfert, Oleksandr Zinenko