Related papers: Patterns and Rewrite Rules for Systematic Code Gen…
Over recent years heterogeneous systems have become more prevalent across HPC systems, with over 100 supercomputers in the TOP500 incorporating GPUs or other accelerators. These hardware platforms have different performance characteristics…
Modern computer systems typically conbine multicore CPUs with accelerators like GPUs for inproved performance and energy efficiency. However, these sys- tems suffer from poor performance portability, code tuned for one device must be…
Spatial computing architectures promise a major stride in performance and energy efficiency over the traditional load/store devices currently employed in large scale computing systems. The adoption of high-level synthesis (HLS) from…
Heterogeneous systems have become one of the most common architectures today, thanks to their excellent performance and energy consumption. However, due to their heterogeneity they are very complex to program and even more to achieve…
In this survey paper, we review recent work on frameworks for the high-level, portable programming of heterogeneous multi-/manycore systems (especially, GPU-based systems) using high-level constructs such as annotated user-level software…
High-performance computing systems (HPC) provide powerful capabilities for modeling, simulation, and data analytics for a broad class of computational problems. They enable extreme performance of the order of quadrillion floating-point…
The electrical and electronic engineering has used parallel programming to solve its large scale complex problems for performance reasons. However, as parallel programming requires a non-trivial distribution of tasks and data, developers…
Current computational systems are heterogeneous by nature, featuring a combination of CPUs and GPUs. As the latter are becoming an established platform for high-performance computing, the focus is shifting towards the seamless programming…
The popularity of neural networks (NNs) spans academia, industry, and popular culture. In particular, convolutional neural networks (CNNs) have been applied to many image based machine learning tasks and have yielded strong results. The…
High-performance computing (HPC) is a major driver accelerating scientific research and discovery, from quantum simulations to medical therapeutics. While the increasing availability of HPC resources is in many cases pivotal to successful…
Medical image processing is often limited by the computational cost of the involved algorithms. Whereas dedicated computing devices (GPUs in particular) exist and do provide significant efficiency boosts, they have an extra cost of use in…
Programming modern high-performance computing systems is challenging due to the need to efficiently program GPUs and accelerators and to handle data movement between nodes. The C++ language has been continuously enhanced in recent years…
With the growing scale and complexity of high-performance computing (HPC) systems, resilience solutions that ensure continuity of service despite frequent errors and component failures must be methodically designed to balance the…
High-performance computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important…
High Performance Computing (HPC) aims at providing reasonably fast computing solutions to scientific and real life problems. The advent of multicore architectures is noticeable in the HPC history, because it has brought the underlying…
The pervasive adoption of Deep Learning (DL) and Graph Processing (GP) makes it a de facto requirement to build large-scale clusters of heterogeneous accelerators including GPUs and FPGAs. The OpenCL programming framework can be used on the…
When considering different hardware platforms, not just the time-to-solution can be of importance but also the energy necessary to reach it. This is not only the case with battery powered and mobile devices but also with high-performance…
There is a large body of legacy scientific code written in languages like Fortran that is not optimised to get the best performance out of heterogeneous acceleration devices like GPUs and FPGAs, and manually porting such code into parallel…
One of the greatest efforts of computational scientists is to translate the mathematical model describing a class of physical phenomena into large and complex codes. Many of these codes face the difficulty of implementing the mathematical…
OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear; multiple vendors can provide support for application descriptions written according to the standard, thus…