Related papers: A Bulk-Parallel Priority Queue in External Memory …

Equivalence between Priority Queues and Sorting in External Memory

A priority queue is a fundamental data structure that maintains a dynamic ordered set of keys and supports the following basic operations: insertion of a key, deletion of a key, and finding the smallest key. The complexity of the priority…

Data Structures and Algorithms · Computer Science 2012-07-19 Zhewei Wei , Ke Yi

Scalable Distributed-Memory External Sorting

We engineer algorithms for sorting huge data sets on massively parallel machines. The algorithms are based on the multiway merging paradigm. We first outline an algorithm whose I/O requirement is close to a lower bound. Thus, in contrast to…

Data Structures and Algorithms · Computer Science 2009-10-15 Mirko Rahn , Peter Sanders , Johannes Singler

Parallelizing Query Optimization on Shared-Nothing Architectures

Data processing systems offer an ever increasing degree of parallelism on the levels of cores, CPUs, and processing nodes. Query optimization must exploit high degrees of parallelism in order not to gradually become the bottleneck of query…

Databases · Computer Science 2015-11-06 Immanuel Trummer , Christoph Koch

An Empirical Study of Cache-Oblivious Priority Queues and their Application to the Shortest Path Problem

In recent years the Cache-Oblivious model of external memory computation has provided an attractive theoretical basis for the analysis of algorithms on massive datasets. Much progress has been made in discovering algorithms that are…

Data Structures and Algorithms · Computer Science 2008-02-08 Benjamin Sach , Raphaël Clifford

Parallel External Sorting of ASCII Records Using Learned Models

External sorting is at the core of many operations in large-scale database systems, such as ordering and aggregation queries for large result sets, building indexes, sort-merge joins, duplicate removal, sharding, and record clustering.…

Databases · Computer Science 2023-05-11 Ani Kristo , Tim Kraska

External Memory based Distributed Generation of Massive Scale Social Networks on Small Clusters

Small distributed systems are limited by their main memory when generating massively large graphs. A trivial extension of current graph generators to use external memory leads to a large amount of random I/O and hence does not scale with size. In…

Databases · Computer Science 2012-10-02 Sandeep Gupta

A Faster External Memory Priority Queue with DecreaseKeys

A priority queue is a fundamental data structure that maintains a dynamic set of (key, priority)-pairs and supports Insert, Delete, ExtractMin and DecreaseKey operations. In the external memory model, the current best priority queue…

Data Structures and Algorithms · Computer Science 2018-06-21 Shunhua Jiang , Kasper Green Larsen

CXL-GPU: Pushing GPU Memory Boundaries with the Integration of CXL Technologies

This work introduces a GPU storage expansion solution utilizing CXL, featuring a novel GPU system design with multiple CXL root ports for integrating diverse storage media (DRAMs and/or SSDs). We developed and siliconized a custom CXL…

Hardware Architecture · Computer Science 2025-06-19 Donghyun Gouk , Seungkwan Kang , Seungjun Lee , Jiseon Kim , Kyungkuk Nam , Eojin Ryu , Sangwon Lee , Dongpyung Kim , Junhyeok Jang , Hanyeoreum Bae , Myoungsoo Jung

The Adaptive Priority Queue with Elimination and Combining

Priority queues are fundamental abstract data structures, often used to manage limited resources in parallel programming. Several proposed parallel priority queue implementations are based on skiplists, harnessing the potential for…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-08-06 Irina Calciu , Hammurabi Mendes , Maurice Herlihy

Multi-Resource Parallel Query Scheduling and Optimization

Scheduling query execution plans is a particularly complex problem in shared-nothing parallel systems, where each site consists of a collection of local time-shared (e.g., CPU(s) or disk(s)) and space-shared (e.g., memory) resources and…

Databases · Computer Science 2014-04-01 Minos Garofalakis , Yannis Ioannidis

High-Quality Shared-Memory Graph Partitioning

Partitioning graphs into blocks of roughly equal size such that few edges run between blocks is a frequently needed operation in processing graphs. Recently, size, variety, and structural complexity of these networks has grown dramatically.…

Data Structures and Algorithms · Computer Science 2018-10-16 Yaroslav Akhremtsev , Peter Sanders , Christian Schulz

A Scalable Shared-Memory Parallel Simplex for Large-Scale Linear Programming

The Simplex tableau has been broadly used and investigated in the industry and academia. With the advent of the big data era, ever larger problems are posed to be solved in ever larger machines whose architecture type did not exist in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-29 Demetrios Coutinho , Felipe O. Lins e Silva , Daniel Aloise , Samuel Xavier-de-Souza

Speculative Path Planning

Parallelization of A* path planning is mostly limited by the number of possible motions, which is far less than the level of parallelism that modern processors support. In this paper, we go beyond the limitations of traditional parallelism…

Robotics · Computer Science 2021-02-16 Mohammad Bakhshalipour , Mohamad Qadri , Dominic Guri

Architectural and System Implications of CXL-enabled Tiered Memory

Memory disaggregation is an emerging technology that decouples memory from traditional memory buses, enabling independent scaling of compute and memory. Compute Express Link (CXL), an open-standard interconnect technology, facilitates…

Hardware Architecture · Computer Science 2025-03-27 Yujie Yang , Lingfeng Xiang , Peiran Du , Zhen Lin , Weishu Deng , Ren Wang , Andrey Kudryavtsev , Louis Ko , Hui Lu , Jia Rao

Engineering MultiQueues: Fast Relaxed Concurrent Priority Queues

Priority queues are used in a wide range of applications, including prioritized online scheduling, discrete event simulation, and greedy algorithms. In parallel settings, classical priority queues often become a severe bottleneck, resulting…

Data Structures and Algorithms · Computer Science 2025-04-17 Marvin Williams , Peter Sanders

Scaling Ordered Stream Processing on Shared-Memory Multicores

Many modern applications require real-time processing of large volumes of high-speed data. Such data processing needs can be modeled as a streaming computation. A streaming computation is specified as a dataflow graph that exposes multiple…

Databases · Computer Science 2018-04-02 Guna Prasaad , G. Ramalingam , Kaushik Rajan

Scheduling optimization of parallel linear algebra algorithms using Supervised Learning

Linear algebra algorithms are used widely in a variety of domains, e.g., machine learning, numerical physics, and video game graphics. For all these applications, loop-level parallelism is required to achieve high performance. However,…

Machine Learning · Computer Science 2020-01-24 G. Laberge , S. Shirzad , P. Diehl , H. Kaiser , S. Prudhomme , A. Lemoine

Concurrent Double-Ended Priority Queues

This work provides the first concurrent implementation specifically designed for a double-ended priority queue (DEPQ). We do this by describing a general way to add an ExtractMax operation to any concurrent priority queue that already…

Data Structures and Algorithms · Computer Science 2025-08-20 Panagiota Fatourou , Eric Ruppert , Ioannis Xiradakis

Accelerating Concurrent Heap on GPUs

Priority queue, often implemented as a heap, is an abstract data type that has been used in many well-known applications like Dijkstra's shortest path algorithm, Prim's minimum spanning tree, Huffman encoding, and the branch-and-bound…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-18 Yanhao Chen , Fei Hua , Chaozhang Huang , Jeremy Bierema , Chi Zhang , Eddy Z. Zhang

Staggered Batch Scheduling: Co-optimizing Time-to-First-Token and Throughput for High-Efficiency LLM Inference

The evolution of Large Language Model (LLM) serving towards complex, distributed architectures--specifically the P/D-separated, large-scale DP+EP paradigm--introduces distinct scheduling challenges. Unlike traditional deployments where…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-19 Jian Tian , Shuailong Li , Yang Cao , Wenbo Cui , Minghan Zhu , Wenkang Wu , Jianming Zhang , Yanpeng Wang , Zhiwen Xiao , Zhenyu Hou , Dou Shen