English
Related papers

Related papers: SIMT/GPU Data Race Verification using ISCC and Int…

200 papers

GPU computing is embracing weak memory concurrency for performance improvement. However, compared to CPUs, modern GPUs provide more fine-grained concurrency features such as scopes, have additional properties like divergence, and thereby…

Logic in Computer Science · Computer Science 2025-05-27 Soham Chakraborty , S. Krishna , Andreas Pavlogiannis , Omkar Tuppe

Data races are egregious parallel programming bugs on CPUs. They are even worse on GPUs due to the hierarchical thread and memory structure, which makes it possible to write code that is correctly synchronized within a thread group while…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-10 John Jacobson , Martin Burtscher , Ganesh Gopalakrishnan

We examine the problem of optimizing classification tree evaluation for on-line and real-time applications by using GPUs. Looking at trees with continuous attributes often used in image segmentation, we first put the existing algorithms for…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-11-08 Jason Spencer

Parallel programming models can encourage performance portability by moving the responsibility for work assignment and data distribution from the programmer to a runtime system. However, analyzing the resulting implicit memory allocations,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-14 Fabian Knorr , Philip Salzmann , Peter Thoman , Thomas Fahringer

The high degree of parallelism and relatively complicated synchronization mechanisms in GPUs make writing correct kernels difficult. Data races pose one such concurrency correctness challenge, and therefore, effective methods of detecting…

Programming Languages · Computer Science 2021-11-25 Sagnik Dey , Mayant Mukul , Parth Sharma , Swarnendu Biswas

Nowadays, several industrial applications are being ported to parallel architectures. These applications take advantage of the potential parallelism provided by multiple core processors. Many-core processors, especially the GPUs(Graphics…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-03-28 Wendell Rodrigues , Frédéric Guyomarc'h , Jean-Luc Dekeyser

This paper presents a tool for repairing errors in GPU kernels written in CUDA or OpenCL due to data races and barrier divergence. Our novel extension to prior work can also remove barriers that are deemed unnecessary for correctness. We…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-18 Saurabh Joshi , Gautam Muduganti

Matrix multiplication is a foundational operation in scientific computing and machine learning, yet its computational complexity makes it a significant bottleneck for large-scale applications. The shift to parallel architectures, primarily…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-30 Mufakir Qamar Ansari , Mudabir Qamar Ansari

Parallel computing is a standard approach to achieving high-performance computing (HPC). Three commonly used methods to implement parallel computing include: 1) applying multithreading technology on single-core or multi-core CPUs; 2)…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-18 Xinyao Yi

The continued growth of the computational capability of throughput processors has made throughput processors the platform of choice for a wide variety of high performance computing applications. Graphics Processing Units (GPUs) are a prime…

Hardware Architecture · Computer Science 2018-05-01 Rachata Ausavarungnirun

Heterogeneous systems are becoming more common on High Performance Computing (HPC) systems. Even using tools like CUDA and OpenCL it is a non-trivial task to obtain optimal performance on the GPU. Approaches to simplifying this task include…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-01-11 Marek Blazewicz , Steven R. Brandt , Peter Diener , David M. Koppelman , Krzysztof Kurowski , Frank Löffler , Erik Schnetter , Jian Tao

In this paper, we explore the limits of graphics processors (GPUs) for general purpose parallel computing by studying problems that require highly irregular data access patterns: parallel graph algorithms for list ranking and connected…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-02-25 Frank Dehne , Kumanan Yogaratnam

The advent of high performance computing (HPC) and graphics processing units (GPU), present an enormous computation resource for Large data transactions (big data) that require parallel processing for robust and prompt data analysis. While…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-10-17 Kato Mivule , Benjamin Harvey , Crystal Cobb , Hoda El Sayed

The number of cores on graphical computing units (GPUs) is reaching thousands nowadays, whereas the clock speed of processors stagnates. Unfortunately, constraint programming solvers do not take advantage yet of GPU parallelism. One reason…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-26 Pierre Talbot , Frédéric Pinel , Pascal Bouvry

It is time-consuming and error-prone to implement inference procedures for each new probabilistic model. Probabilistic programming addresses this problem by allowing a user to specify the model and having a compiler automatically generate…

As concurrent programming becomes increasingly prevalent, effectively identifying and addressing concurrency issues such as data races and deadlocks is critical. This study evaluates the performance of several leading large language models…

Software Engineering · Computer Science 2025-09-05 Ridhi Jain , Rahul Purandare

As in various fields like scientific research and industrial application, the computation time optimization is becoming a task that is of increasing importance because of its highly parallel architecture. The graphics processing unit is…

Performance · Computer Science 2017-10-18 Huichao Hong , Lixin Zheng , Shuwan Pan

Using GPUs as general-purpose processors has revolutionized parallel computing by offering, for a large and growing set of algorithms, massive data-parallelization on desktop machines. An obstacle to widespread adoption, however, is the…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-10-14 Alexey Kolesnichenko , Christopher M. Poskitt , Sebastian Nanz , Bertrand Meyer

The simulation of the two-dimensional Ising model is used as a benchmark to show the computational capabilities of Graphic Processing Units (GPUs). The rich programming environment now available on GPUs and flexible hardware capabilities…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-26 Joshua Romero , Mauro Bisson , Massimiliano Fatica , Massimo Bernaschi

In recent decades, High Performance Computing (HPC) has undergone significant enhancements, particularly in the realm of hardware platforms, aimed at delivering increased processing power while keeping power consumption within reasonable…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-03 S. -Kazem Shekofteh , Christian Alles , Nils Kochendörfer , Holger Fröning
‹ Prev 1 2 3 10 Next ›