Related papers: Accelerating Lattice QCD Simulations using GPUs
Modern graphics hardware is designed for highly parallel numerical tasks and promises significant cost and performance benefits for many scientific applications. One such application is lattice quantum chromodynamics (lattice QCD), where the…
Markov Chain Monte Carlo simulations of lattice Quantum Chromodynamics (QCD) are the only known tool to investigate non-perturbatively the theory of the strong interaction and are required to perform precision tests of the Standard Model of…
The past decade has witnessed a dramatic acceleration of lattice quantum chromodynamics calculations in nuclear and particle physics. This has been due to both significant progress in accelerating the iterative linear solvers using…
Over the past five years, graphics processing units (GPUs) have had a transformational effect on numerical lattice quantum chromodynamics (LQCD) calculations in nuclear and particle physics. While GPUs have been applied with great success…
We show that using the multi-splitting algorithm as a preconditioner for the domain wall Dirac linear operator, arising in lattice QCD, effectively reduces the inter-node communication cost, at the expense of performing more on-node…
We consider Monte Carlo simulations of classical spin models of statistical mechanics using the massively parallel architecture provided by graphics processing units (GPUs). We discuss simulations of models with discrete and continuous…
Recent developments have shown that a lot can be gained for QCD simulations from GPU hardware. This can be exploited especially in the case of Ginsparg-Wilson fermions when the computational costs are particularly high. In this work, we…
Lattice QCD calculations were one of the first applications to show the potential of GPUs in the area of high performance computing. Our interest is to find ways to effectively use GPUs for lattice calculations using the overlap operator.…
Modern heterogeneous high-performance computing (HPC) systems powered by advanced graphics processing unit (GPU) architectures enable accelerating computing with unprecedented performance and scalability. Here, we present a GPU-accelerated…
This paper presents, to the author's knowledge, the first graphics processing unit (GPU) accelerated program that solves the evolution of interacting scalar fields in an expanding universe. We present the implementation in NVIDIA's Compute…
Clawpack is a library for solving nonlinear hyperbolic partial differential equations using high-resolution finite volume methods based on Riemann solvers and limiters. It supports Adaptive Mesh Refinement (AMR), which is essential in…
When simulating a lattice system near its critical temperature, local algorithms for modeling the system's evolution can introduce very large autocorrelation times into sampled data. This critical slowing down places restrictions on the…
It is now a noticeable trend in High Performance Computing that systems are becoming more and more heterogeneous. Compute nodes with a host CPU are being equipped with accelerators, the latter being GPU or FPGA cards, or both. In many…
Latent Dirichlet Allocation (LDA) is a popular topic model. Given the fact that the input corpus of LDA algorithms consists of millions to billions of tokens, the LDA training process is very time-consuming, which may prevent the usage of…
Extensions to the C++ implementation of the QCD Data Parallel Interface are provided enabling acceleration of expression evaluation on NVIDIA GPUs. Single expressions are off-loaded to the device memory and execution domain leveraging the…
We discuss the implementation and optimization challenges for a Wilson-Dirac solver with Clover term on QPACE, a parallel machine based on Cell processors and a torus network. We choose the mixed-precision Schwarz preconditioned FGCR…
We describe the GPU implementation of shifted or multimass iterative solvers for sparse linear systems of the sort encountered in lattice gauge theory. We provide a generic tool that can be used by those without GPU programming experience…
Numerical simulations of quantum chromodynamics (QCD) on a lattice require the frequent solution of linear systems of equations with large, sparse and typically ill-conditioned matrices. Algebraic multigrid methods are meanwhile the…
Restricted solid on solid surface growth models can be mapped onto binary lattice gases. We show that efficient simulation algorithms can be realized on GPUs either by CUDA or by OpenCL programming. We consider a deposition/evaporation…
We present a GPU-accelerated version of the real-space SPARC electronic structure code for performing hybrid functional calculations in generalized Kohn-Sham density functional theory. In particular, we develop a batch variant of the…