English
Related papers

Related papers: Accelerating Concurrent Heap on GPUs

200 papers

The single-source shortest path (SSSP) problem is a well-studied problem that is used in many applications. In the parallel setting, a work-efficient algorithm that additionally attains $o(n)$ parallel depth has been elusive. Alternatively,…

Data Structures and Algorithms · Computer Science 2023-05-15 Kyle Berney , John Iacono , Ben Karsin , Nodari Sitchinava

The convex hull is a fundamental geometrical structure for many applications where groups of points must be enclosed or represented by a convex polygon. Although efficient sequential convex hull algorithms exist, and are constantly being…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-27 Alan Keith , Héctor Ferrada , Cristóbal A. Navarro

In recent years the Cache-Oblivious model of external memory computation has provided an attractive theoretical basis for the analysis of algorithms on massive datasets. Much progress has been made in discovering algorithms that are…

Data Structures and Algorithms · Computer Science 2008-02-08 Benjamin Sach , Raphaël Clifford

In this work, we present an extension of Gaussian process (GP) models with sophisticated parallelization and GPU acceleration. The parallelization scheme arises naturally from the modular computational structure w.r.t. datapoints in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-10-21 Zhenwen Dai , Andreas Damianou , James Hensman , Neil Lawrence

Parallel algorithms on CPU and GPU are implemented for the Unified Gas-Kinetic Scheme and their performances are investigated and compared by a two dimensional channel flow case. The parallel CPU algorithm has a one dimensional block…

Computational Physics · Physics 2018-11-02 Jizhou Liu , Fang Q. Hu , Xiaodong Li

Many techniques in program synthesis, superoptimization, and array programming require parallel rollouts of general-purpose programs. GPUs, while capable targets for domain-specific parallelism, are traditionally underutilized by such…

Programming Languages · Computer Science 2026-04-15 Breandan Considine

Graph coloring has been broadly used to discover concurrency in parallel computing. To speedup graph coloring for large-scale datasets, parallel algorithms have been proposed to leverage modern GPUs. Existing GPU implementations either have…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-22 Xuhao Chen , Pingfan Li , Jianbin Fang , Tao Tang , Zhiying Wang , Canqun Yang

In this paper, we explore the limits of graphics processors (GPUs) for general purpose parallel computing by studying problems that require highly irregular data access patterns: parallel graph algorithms for list ranking and connected…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-02-25 Frank Dehne , Kumanan Yogaratnam

Gaussian processes (GPs) are a widely used regression tool, but the cubic complexity of exact solvers limits their scalability. To address this challenge, we extend the GPRat library by incorporating a fully GPU-resident GP prediction…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-24 Henrik Möllmann , Dirk Pflüger , Alexander Strack

Reduction operations are extensively employed in many computational problems. A reduction consists of, given a finite set of numeric elements, combining into a single value all elements in that set, using for this a combiner function. A…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-10-23 Walid Jradi , Hugo do Nascimento , Wellington Martins

This paper presents a Graphics Processing Units (GPUs) acceleration method of an iterative scheme for gas-kinetic model equations. Unlike the previous GPU parallelization of explicit kinetic schemes, this work features a fast converging…

Computational Physics · Physics 2020-01-08 Lianhua Zhu , Peng Wang , Songze Chen , Zhaoli Guo , Yonghao Zhang

GPUs have been widely used to accelerate computations exhibiting simple patterns of parallelism - such as flat or two-level parallelism - and a degree of parallelism that can be statically determined based on the size of the input dataset.…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-18 Hancheng Wu , Da Li , Michela Becchi

Matrix multiplication is a foundational operation in scientific computing and machine learning, yet its computational complexity makes it a significant bottleneck for large-scale applications. The shift to parallel architectures, primarily…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-30 Mufakir Qamar Ansari , Mudabir Qamar Ansari

This paper presents efforts to improve the hierarchical parallelism of a two scale simulation code. Two methods to improve the GPU parallel performance were developed and compared. The first used the NVIDIA Multi-Process Service and the…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-15 Jacob Merson , Mark S. Shephard

First-order methods based on the PDHG algorithm have recently emerged as a viable option for efficiently solving large-scale linear programming problems. One highly desirable property of these methods is that they can make effective use of…

Optimization and Control · Mathematics 2025-10-29 Edward Rothberg

Sequential models, such as Recurrent Neural Networks and Neural Ordinary Differential Equations, have long suffered from slow training due to their inherent sequential nature. For many years this bottleneck has persisted, as many thought…

Machine Learning · Computer Science 2024-01-17 Yi Heng Lim , Qi Zhu , Joshua Selfridge , Muhammad Firmansyah Kasim

Massively multicore processors, such as Graphics Processing Units (GPUs), provide, at a comparable price, a one order of magnitude higher peak performance than traditional CPUs. This drop in the cost of computation, as any…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-18 Samer Al-Kiswany , Abdullah Gharaibeh , Matei Ripeanu

The number of cores on graphical computing units (GPUs) is reaching thousands nowadays, whereas the clock speed of processors stagnates. Unfortunately, constraint programming solvers do not take advantage yet of GPU parallelism. One reason…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-26 Pierre Talbot , Frédéric Pinel , Pascal Bouvry

Large-scale observational health databases are increasingly popular for conducting comparative effectiveness and safety studies of medical products. However, increasing number of patients poses computational challenges when fitting survival…

Computation · Statistics 2023-10-26 Jianxiao Yang , Martijn J. Schuemie , Xiang Ji , Marc A. Suchard

GPU hash tables are increasingly used to accelerate data processing, but their limited functionality restricts adoption in large-scale data processing applications. Current limitations include incomplete concurrency support and missing…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-24 Hunter McCoy , Prashant Pandey
‹ Prev 1 2 3 10 Next ›