Related papers: Fissile Locks
Modern multi-socket architectures exhibit non-uniform memory access (NUMA) behavior, where access by a core to data cached locally on a socket is much faster than access to data cached on a remote socket. Prior work offers several efficient…
We present "Reciprocating Locks", a novel mutual exclusion locking algorithm, targeting cache-coherent shared memory (CC), that enjoys a number of desirable properties. The doorway arrival phase and the release operation both run in…
We present Hapax Locks, a novel locking algorithm that is simple, enjoys constant-time arrival and unlock paths, provides FIFO admission order, and which is also space efficient and generates relatively little coherence traffic under…
The pursuit of power-efficiency is popularizing asymmetric multicore processors (AMP) such as ARM big.LITTLE, Apple M1 and recent Intel Alder Lake with big and little cores. However, we find that existing scalable locks fail to scale on AMP…
Developing concurrent software is challenging, especially if it has to run on modern architectures with Weak Memory Models (WMMs) such as ARMv8, Power, or RISC-V. For the sake of performance, WMMs allow hardware and compilers to…
Saturated locks often degrade the performance of a multithreaded application, leading to a so-called scalability collapse problem. This problem arises when a growing number of threads circulating through a saturated lock causes the overall…
We present Hemlock, a novel mutual exclusion locking algorithm that is extremely compact, requiring just one word per thread plus one word per lock, but which still provides local spinning in most circumstances, high throughput under…
Priority queues are abstract data structures which store a set of key/value pairs and allow efficient access to the item with the minimal (maximal) key. Such queues are an important element in various areas of computer science such as…
We present a new lock-free multiple-producer and multiple-consumer (MPMC) FIFO queue design which is scalable and, unlike existing high-performant queues, very memory efficient. Moreover, the design is ABA safe and does not require any…
The classic ticket lock consists of ticket and grant fields. Arriving threads atomically fetch-and-increment ticket and then wait for grant to become equal to the value returned by the fetch-and-increment primitive, at which point the…
This paper presents a new and practical approach to lock-free locks based on helping, which allows the user to write code using fine-grained locks, but run it in a lock-free manner. Although lock-free locks have been suggested in the past,…
Uplink (UL) dominated sporadic transmission and stringent latency requirement of massive machine type communication (mMTC) forces researchers to abandon complicated grant-acknowledgment based legacy networks. UL grant-free non-orthogonal…
With the fast evolvement of embedded deep-learning computing systems, applications powered by deep learning are moving from the cloud to the edge. When deploying neural networks (NNs) onto the devices under complex environments, there are…
Traditionally, multithreaded data structures have been designed for access by the threads of Operating Systems (OS). However, implementations for access by programmable alternatives known as lightweight threads (also referred to as…
A standard design pattern found in many concurrent data structures, such as hash tables or ordered containers, is alternation of parallelizable sections that incur no data conflicts and critical sections that must run sequentially and are…
We present Trust<T>, a general, type- and memory-safe alternative to locking in concurrent programs. Instead of synchronizing multi-threaded access to an object of type T with a lock, the programmer may place the object in a Trust<T>. The…
Disaggregated memory (DM) separates compute and memory resources, allowing flexible scaling to achieve high resource utilization. To ensure atomic and consistent data access on DM, distributed transaction systems have been adapted, where…
In this article we present Mutable Locks, a synchronization construct with the same execution semantic of traditional locks (such as spin locks or sleep locks), but with a self-tuned optimized trade off between responsiveness---in the…
The lock is a building-block synchronization primitive that enables mutually exclusive access to shared data in shared-memory parallel programs. Mutual exclusion is typically achieved by guarding the code that accesses the shared data with…
A standard design pattern found in many concurrent data structures, such as hash tables or ordered containers, is an alternation of parallelizable sections that incur no data conflicts and critical sections that must run sequentially and…