Related papers: Memory Bounds for Concurrent Bounded Queues
Priority queues are abstract data structures which store a set of key/value pairs and allow efficient access to the item with the minimal (maximal) key. Such queues are an important element in various areas of computer science such as…
The concurrency literature presents a number of approaches for building non-blocking, FIFO, multiple-producer and multiple-consumer (MPMC) queues. However, only a fraction of them have high performance. In addition, many queue designs, such…
Memory-augmented neural networks consisting of a neural controller and an external memory have shown potentials in long-term sequential learning. Current RAM-like memory models maintain memory accessing every timesteps, thus they do not…
Skiplists are used in a variety of applications for storing data subject to order criteria. In this article we discuss the design, analysis and performance of a concurrent deterministic skiplist on many-core NUMA nodes. We also evaluate the…
Priority queues are fundamental abstract data structures, often used to manage limited resources in parallel programming. Several proposed parallel priority queue implementations are based on skiplists, harnessing the potential for…
Priority queues are fundamental data structures with widespread applications in various domains, including graph algorithms and network simulations. Their performance critically impacts the overall efficiency of these algorithms.…
Priority queues with parallel access are an attractive data structure for applications like prioritized online scheduling, discrete event simulation, or greedy algorithms. However, a classical priority queue constitutes a severe bottleneck…
We investigate effects of ordering in blocked matrix--matrix multiplication. We find that submatrices do not have to be stored contiguously in memory to achieve near optimal performance. Instead it is the choice of execution order of the…
Our goal is to efficiently solve the dynamic memory allocation problem in a concurrent setting where processes run asynchronously. On $p$ processes, we can support allocation and free for fixed-sized blocks with $O(1)$ worst-case time per…
Compute in-memory (CIM) is a promising technique that minimizes data transport, the primary performance bottleneck and energy cost of most data intensive applications. This has found wide-spread adoption in accelerating neural networks for…
Many computer systems for calculating the proper organization of memory are among the most critical issues. Using a tier cache memory (along with branching prediction) is an effective means of increasing modern multi-core processors'…
Bandwidth-starved multicore chips have become ubiquitous. It is well known that the performance of stencil codes can be improved by temporal blocking, lessening the pressure on the memory interface. We introduce a new pipelined approach…
One of the biggest open problems in external memory data structures is the priority queue problem with DecreaseKey operations. If only Insert and ExtractMin operations need to be supported, one can design a comparison-based priority queue…
Priority queues are one of the most fundamental and widely used data structures in computer science. Their primary objective is to efficiently support the insertion of new elements with assigned priorities and the extraction of the highest…
Priority queues with parallel access are an attractive data structure for applications like prioritized online scheduling, discrete event simulation, or branch-and-bound. However, a classical priority queue constitutes a severe bottleneck…
Priority queues are used in a wide range of applications, including prioritized online scheduling, discrete event simulation, and greedy algorithms. In parallel settings, classical priority queues often become a severe bottleneck, resulting…
Concurrent hash tables are one of the most important concurrent data structures with numerous applications. Since hash table accesses can dominate the execution time of the overall application, we need implementations that achieve good…
We consider durable data structures for non-volatile main memory, such as the new Intel Optane memory architecture. Substantial recent work has concentrated on making concurrent data structures durable with low overhead, by adding a minimal…
Concurrent priority queues are widely used in important workloads, such as graph applications and discrete event simulations. However, designing scalable concurrent priority queues for NUMA architectures is challenging. Even though several…
The recent advent of programmable switches makes distributed algorithms readily deployable in real-world datacenter networks. However, there are still gaps between theory and practice that prevent the smooth adaptation of CONGEST algorithms…