Related papers: A Multi-GPU Programming Library for Real-Time Appl…
In recent years graphical processing units (GPUs) have become a powerful tool in scientific computing. Their potential to speed up highly parallel applications brings the power of high performance computing to a wider range of users.…
Many emerging cyber-physical systems, such as autonomous vehicles and robots, rely heavily on artificial intelligence and machine learning algorithms to perform important system operations. Since these highly parallel applications are…
We present a single-node, multi-GPU programmable graph processing library that allows programmers to easily extend single-GPU graph algorithms to achieve scalable performance on large graphs with billions of edges. Directly using the…
Existing GPU libraries often struggle to fully exploit the parallel resources and on-chip memory (SRAM) of GPUs when chaining multiple GPU functions as individual kernels. While Kernel Fusion (KF) techniques like Horizontal Fusion (HF) and…
A modern graphics processing unit (GPU) is able to perform massively parallel scientific computations at low cost. We extend our implementation of the checkerboard algorithm for the two dimensional Ising model [T. Preis et al., J. Comp.…
Purpose: To develop generic optimization strategies for image reconstruction using graphical processing units (GPUs) in magnetic resonance imaging (MRI) and to exemplarily report about our experience with a highly accelerated implementation…
In this work, we survey the role of GPUs in real-time systems. Originally designed for parallel graphics workloads, GPUs are now widely used in time-critical applications such as machine learning, autonomous vehicles, and robotics due to…
In this paper, we introduce Heteroflow, a new C++ library to help developers quickly write parallel CPU-GPU programs using task dependency graphs. Heteroflow leverages the power of modern C++ and task-based approaches to enable efficient…
Parallel computing can offer an enormous advantage regarding the performance for very large applications in almost any field: scientific computing, computer vision, databases, data mining, and economics. GPUs are high performance many-core…
We investigate the energy efficiency of a library designed for parallel computations with sparse matrices. The library leverages high-performance, energy-efficient Graphics Processing Unit (GPU) accelerators to enable large-scale scientific…
Realistic reservoir simulation is known to be prohibitively expensive in terms of computation time when increasing the accuracy of the simulation or by enlarging the model grid size. One method to address this issue is to parallelize the…
Tremendous advances in parallel computing and graphics hardware opened up several novel real-time GPU applications in the fields of computer vision, computer graphics as well as augmented reality (AR) and virtual reality (VR). Although…
Matrix multiplication is a foundational operation in scientific computing and machine learning, yet its computational complexity makes it a significant bottleneck for large-scale applications. The shift to parallel architectures, primarily…
Nowadays, the paradigm of parallel computing is changing. CUDA is now a popular programming model for general purpose computations on GPUs and a great number of applications were ported to CUDA obtaining speedups of orders of magnitude…
Modern computing platforms tend to deploy multiple GPUs (2, 4, or more) on a single node to boost system performance, with each GPU having a large capacity of global memory and streaming multiprocessors (SMs). GPUs are an expensive…
Recent years have witnessed a rapid advancement in GPU technology, establishing it as a formidable high-performance parallel computing technology with superior floating-point computational capabilities compared to traditional CPUs. This…
While the advances in synchrotron light sources, together with the development of focusing optics and detectors, allow nanoscale ptychographic imaging of materials and biological specimens, the corresponding experiments can yield…
Last several years, GPUs are used to accelerate computations in many computer science domains. We focused on GPU accelerated Support Vector Machines (SVM) training with non-linear kernel functions. We had searched for all available GPU…
Concurrently coupled numerical simulations using heterogeneous solvers are powerful tools for modeling multiscale phenomena. However, major modifications to existing codes are often required to enable such simulations, posing significant…
This paper introduces a new C++/CUDA library for GPU-accelerated stochastic optimization called MPPI-Generic. It provides implementations of Model Predictive Path Integral control, Tube-Model Predictive Path Integral Control, and Robust…