Related papers: Testing GPU Numerics: Finding Numerical Difference…

200 papers

CUDA and OpenCL are two different frameworks for GPU programming. OpenCL is an open standard that can be used to program CPUs, GPUs, and other devices from different vendors, while CUDA is specific to NVIDIA GPUs. Although OpenCL promises a…

Performance · Computer Science 2011-05-17 Kamran Karimi, Neil G. Dickson, Firas Hamze

Hybrid computational architectures based on the joint power of Central Processing Units and Graphic Processing Units (GPUs) are becoming popular and powerful hardware tools for a wide range of simulations in biology, chemistry, engineering,…

Instrumentation and Methods for Astrophysics · Physics 2015-06-15 Roberto Capuzzo-Dolcetta, Mario Spera

GPUs are the most popular platform for accelerating HPC workloads, such as artificial intelligence and science simulations. However, most microarchitectural research in academia relies on GPU core pipeline designs based on architectures…

Hardware Architecture · Computer Science 2025-10-30 Rodrigo Huerta, Mojtaba Abaie Shoushtary, José-Lorenzo Cruz, Antonio González

The last decade has seen a shift in the computer systems industry where heterogeneous computing has become prevalent. Graphics Processing Units (GPUs) are now present in supercomputers to mobile phones and tablets. GPUs are used for…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-09-04 Yehia Arafa, Abdel-Hameed Badawy, Gopinath Chennupati, Nandakishore Santhi, Stephan Eidenbenz

Many studies have focused on developing and improving auto-tuning algorithms for Nvidia Graphics Processing Units (GPUs), but the effectiveness and efficiency of these approaches on AMD devices have hardly been studied. This paper aims to…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-17 Milo Lurati, Stijn Heldens, Alessio Sclocco, Ben van Werkhoven

Unlike developing neural networks (NNs) for general-purpose processors, development for NN chips usually faces some hardware-specific restrictions, such as limited precision of network signals and parameters, constrained…

Neural and Evolutionary Computing · Computer Science 2018-01-19 Yu Ji, YouHui Zhang, WenGuang Chen, Yuan Xie

Numerical features of matrix multiplier hardware units in NVIDIA and AMD data centre GPUs have recently been studied. Features such as rounding, normalisation, and internal precision of the accumulators are of interest. In this paper, we…

Hardware Architecture · Computer Science 2025-10-21 Faizan A Khattak, Mantas Mikaitis

Portability is critical to ensuring high productivity in developing and maintaining scientific software as the diversity in on-node hardware architectures increases. While several programming models provide portability for diverse GPU…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-08 Joshua H. Davis, Pranav Sivaraman, Joy Kitson, Konstantinos Parasyris, Harshitha Menon, Isaac Minn, Giorgis Georgakoudis, Abhinav Bhatele

Since the first idea of using GPUs for general-purpose computing, things have evolved over the years, and there are now several approaches to GPU programming. GPU computing practically began with the introduction of CUDA (Compute Unified…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-09 Bogdan Oancea, Tudorel Andrei, Raluca Mariana Dragoescu

In order to obtain more accurate solutions of polynomial systems with numerical continuation methods we use multiprecision arithmetic. Our goal is to offset the overhead of double double arithmetic accelerating the path trackers and in…

Mathematical Software · Computer Science 2012-01-04 Jan Verschelde, Genady Yoffe

Graphics Processing Unit (GPU) computing is becoming an alternate computing platform for numerical simulations. However, it is not clear which numerical scheme will provide the highest computational efficiency for different types of…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-07 Ben J. Zimmerman, Jonathan D. Regele, Bong Wie

Analysis of processing time and similarity of images generated between CPU and GPU architectures and sequential and parallel programming. For image processing a computer with AMD FX-8350 processor and an Nvidia GTX 960 Maxwell GPU was used,…

Efficiently exploiting GPUs is increasingly essential in scientific computing, as many current and upcoming supercomputers are built using them. To facilitate this, there are a number of programming approaches, such as CUDA, OpenACC and…

Performance · Computer Science 2017-11-07 G. D. Balogh, I. Z. Reguly, G. R. Mudalige

Image Processing is a specialized area of Digital Signal Processing which contains various mathematical and algebraic operations such as matrix inversion, transpose of matrix, derivative, convolution, Fourier Transform etc. Operations like…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-24 Batuhan Hangün, Önder Eyecioğlu

We have developed several autotuning benchmarks in CUDA that take into account performance-relevant source-code parameters and reach near peak-performance on various GPU architectures. We have used them during the development and evaluation…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-02-11 Jiří Filipovič, Jana Hozzová, Amin Nezarat, Jaroslav Oľha, Filip Petrovič

GPUs are playing an increasingly important role in general-purpose computing. Many algorithms require synchronizations at different levels of granularity in a single GPU. Additionally, the emergence of dense GPU nodes also calls for…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-14 Lingqi Zhang, Mohamed Wahib, Haoyu Zhang, Satoshi Matsuoka

Matrix multiplication is a foundational operation in scientific computing and machine learning, yet its computational complexity makes it a significant bottleneck for large-scale applications. The shift to parallel architectures, primarily…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-30 Mufakir Qamar Ansari, Mudabir Qamar Ansari

Hardware accelerators (such as Nvidia's CUDA GPUs) have tremendous promise for computational science, because they can deliver large gains in performance at relatively low cost. In this work, we focus on the use of Nvidia's Tesla GPU for…

Computational Physics · Physics 2010-06-04 Rakesh Ginjupalli, Gaurav Khanna

In recent years, it has become increasingly common for high performance computers (HPC) to possess some level of heterogeneous architecture - typically in the form of GPU accelerators. In some machines these are isolated within a dedicated…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-19 I. Zacharoudiou, J. W. S. McCullough, P. V. Coveney

This paper presents the implementation of a HLLC finite volume solver using GPU technology for the solution of shallow water problems in two dimensions. It compares both CPU and GPU approaches for implementing all the solver's steps. The…

Computational Engineering, Finance, and Science · Computer Science 2018-07-03 Fabrice Zaoui