Related papers: Deterministic Sample Sort For GPUs

200 papers

In this paper, we present the design of a sample sort algorithm for manycore GPUs. Despite being one of the most efficient comparison-based sorting algorithms for distributed memory architectures its performance on GPUs was previously…

Data Structures and Algorithms · Computer Science 2009-10-01 Nikolaj Leischner, Vitaly Osipov, Peter Sanders

The Bulk-Synchronous Parallel model of computation has been used for the architecture independent design and analysis of parallel algorithms whose performance is expressed not only in terms of problem size n but also in terms of parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-08-29 Alexandros V. Gerbessiotis, Constantinos J. Siniolakis

This paper describes in detail the bitonic sort algorithm, and implements the bitonic sort algorithm based on the CUDA architecture. At the same time, we conduct two effective optimizations of implementation details according to the characteristics…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-31 Qi Mu, Liqing Cui, Yufei Song

We propose new sequential sorting operations by adapting techniques and methods used for designing parallel sorting algorithms. Although the norm is to parallelize a sequential algorithm to improve performance, we adapt a contrarian…

Data Structures and Algorithms · Computer Science 2016-09-01 Alexandros V Gerbessiotis

Multisplit is a broadly useful parallel primitive that permutes its input data into contiguous buckets or bins, where the function that categorizes an element into a bucket is provided by the programmer. Due to the lack of an efficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-08 Saman Ashkiani, Andrew Davidson, Ulrich Meyer, John D. Owens

Sorting is a fundamental operation in computer science and is a bottleneck in many important fields. Sorting is critical to database applications, online search and indexing, biomedical computing, and many other applications. The explosive…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-11 Dmitri I. Arkhipov, Di Wu, Keqin Li, Amelia C. Regan

Integer sorting on multicores and GPUs can be realized by a variety of approaches that include variants of distribution-based methods such as radix-sort, comparison-oriented algorithms such as deterministic regular sampling and random…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-08-31 Alexandros V. Gerbessiotis

We investigate distributed memory parallel sorting algorithms that scale to the largest available machines and are robust with respect to input size and distribution of the input elements. The main outcome is that four sorting algorithms…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-17 Michael Axtmann, Peter Sanders

Sorting is a primitive operation that is a building block for countless algorithms. As such, it is important to design sorting algorithms that approach peak performance on a range of hardware architectures. Graphics Processing Units (GPUs)…

Data Structures and Algorithms · Computer Science 2017-03-31 Henri Casanova, John Iacono, Ben Karsin, Nodari Sitchinava, Volker Weichert

We discuss how string sorting algorithms can be parallelized on modern multi-core shared memory machines. As a synthesis of the best sequential string sorting algorithms and successful parallel sorting algorithms for atomic objects, we…

Data Structures and Algorithms · Computer Science 2013-05-07 Timo Bingmann, Peter Sanders

Sorting is at the core of many database operations, such as index creation, sort-merge joins, and user-requested output sorting. As GPUs are emerging as a promising platform to accelerate various operations, sorting on GPUs becomes a viable…

Databases · Computer Science 2017-05-22 Elias Stehle, Hans-Arno Jacobsen

A new algorithm, Guidesort, for sorting in the uniprocessor variant of the parallel disk model (PDM) of Vitter and Shriver is presented. The algorithm is deterministic and executes a number of (parallel) I/O operations that comes within a…

Data Structures and Algorithms · Computer Science 2019-02-18 Torben Hagerup

There have been many proposals for sorting integers on multicores/GPUs that include radix-sort and its variants or other approaches that exploit specialized hardware features of a particular multicore architecture. Comparison-based…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-01 Alexandros V. Gerbessiotis

Sorting is one of the most fundamental problems in the field of computer science. With the rapid development of manycore processors, it shows great importance to design efficient parallel sort algorithm on manycore architecture. This paper…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-02-18 Tianyi Yu, Wei Li

Sorting is one of the most basic algorithms, and developing highly parallel sorting programs is becoming increasingly important in high-performance computing because the number of CPU cores per node in modern supercomputers tends to…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-08 Tomoyuki Tokuue, Tomoaki Ishiyama

Machine learning models, and deep neural networks in particular, are increasingly deployed in risk-sensitive domains such as healthcare, environmental forecasting, and finance, where reliable quantification of predictive uncertainty is…

Machine Learning · Computer Science 2026-04-07 Asena Karolin Özdemir, Lars H. Heyen, Arvid Weyrauch, Achim Streit, Markus Götz, Charlotte Debus

Previous parallel sorting algorithms do not scale to the largest available machines, since they either have prohibitive communication volume or prohibitive critical path length. We describe algorithms that are a viable compromise and…

Data Structures and Algorithms · Computer Science 2015-02-26 Michael Axtmann, Timo Bingmann, Peter Sanders, Christian Schulz

We propose a GPU-accelerated distributed optimization algorithm for controlling multi-phase optimal power flow in active distribution systems with dynamically changing topologies. To handle varying network configurations and enable…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-15 Minseok Ryu, Geunyeong Byeon, Kibaek Kim

Linear-time algorithms that are traditionally used to shuffle data on CPUs, such as the method of Fisher-Yates, are not well suited to implementation on GPUs due to inherent sequential dependencies, and existing parallel shuffling…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-02-04 Rory Mitchell, Daniel Stokes, Eibe Frank, Geoffrey Holmes

We present a deterministic parallel multilevel algorithm for balanced hypergraph partitioning that matches the state of the art for non-deterministic algorithms. Deterministic parallel algorithms produce the same result in each invocation,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-13 Robert Krause, Lars Gottesbüren, Nikolai Maas