Related papers: Lattice Simulations using OpenACC compilers
This documentation is designed for beginners in Graphics Processing Unit (GPU)-programming and who want to get familiar with OpenACC and OpenMP offloading models. Here we present an overview of these two programming models as well as of the…
The rapid development in computing technology has paved the way for directive-based programming models towards a principal role in maintaining software portability of performance-critical applications. Efforts on such models involve a least…
We report on our implementation of the RHMC algorithm for the simulation of lattice QCD with two staggered flavors on Graphics Processing Units, using the NVIDIA CUDA programming language. The main feature of our code is that the GPU is not…
We present the results of an effort to accelerate a Rational Hybrid Monte Carlo (RHMC) program for lattice quantum chromodynamics (QCD) simulation for 2 flavours of staggered fermions on multiple Kepler K20X GPUs distributed on different…
This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved…
Lattice spin models are useful for studying critical phenomena and allow the extraction of equilibrium and dynamical properties. Simulations of such systems are usually based on Monte Carlo (MC) techniques, and the main difficulty is often…
The rise of exascale supercomputers has fueled competition among GPU vendors, driving lattice QCD developers to write code that supports multiple APIs. Moreover, new developments in algorithms and physics research require frequent updates…
An increasingly large number of HPC systems rely on heterogeneous architectures combining traditional multi-core CPUs with power efficient accelerators. Designing efficient applications for these systems has been troublesome in the past as…
GPU runtimes are historically implemented in CUDA or other vendor specific languages dedicated to GPU programming. In this work we show that OpenMP 5.1, with minor compiler extensions, is capable of replacing existing solutions without a…
GPUs have climbed up to the top of supercomputer systems making life harder to many legacy scientific codes. Nowadays, many recipes are being used in such code's portability, without any clarity of which is the best option. We present a…
This paper presents, to the author's knowledge, the first graphics processing unit (GPU) accelerated program that solves the evolution of interacting scalar fields in an expanding universe. We present the implementation in NVIDIA's Compute…
Graphics Processing Units (GPUs) are being used in many areas of physics, since the performance versus cost is very attractive. The GPUs can be addressed by CUDA which is a NVIDIA's parallel computing architecture. It enables dramatic…
We report on our implementation of LatticeQCD applications using OpenCL. We focus on the general concept and on distributing different parts on hybrid systems, consisting of both CPUs (Central Processing Units) and GPUs (Graphic Processing…
The presence of GPU from different vendors demands the Lattice QCD codes to support multiple architectures. To this end, Open Computing Language (OpenCL) is one of the viable frameworks for writing a portable code. It is of interest to find…
Various kinds of applications take advantage of GPUs through automation tools that attempt to automatically exploit the available performance of the GPU's parallel architecture. Directive-based programming models, such as OpenACC, are one…
The supercomputing platforms available for high performance computing based research evolve at a great rate. However, this rapid development of novel technologies requires constant adaptations and optimizations of the existing codes for…
The speed, bandwidth and cost characteristics of today's PC graphics cards make them an attractive target as general purpose computational platforms. High performance can be achieved also for lattice simulations but the actual…
Current PC processors are equipped with vector processing units and have other advanced features that can be used to accelerate lattice QCD programs. Clusters of PCs with a high-bandwidth network thus become powerful and cost-effective…
We are improving one of the available lattice software packages HiRep by adding GPU acceleration supporting highly-optimized simulations on both NVIDIA and AMD GPUs. HiRep allows lattice simulations of theories with fermions in higher…
Designing hardware is a time-consuming and complex process. Realization of both, embedded and high-performance applications can benefit from a design process on a higher level of abstraction. This helps to reduce development time and allows…