Related papers: Write-and-f-array: implementation and an applicati…

Efficient Wait-Free Linearizable Implementations of Approximate Bounded Counters Using Read-Write Registers

Relaxing the sequential specification of a shared object is a way to obtain an implementation with better performance compared to implementing the original specification. We apply this approach to the Counter object, under the assumption…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-02-23 Colette Johnen, Adnane Khattabi, Alessia Milani, Jennifer L. Welch

Storage-Efficient Shared Memory Emulation

We study the design of storage-efficient algorithms for emulating atomic shared memory over an asynchronous, distributed message-passing system. Our first algorithm is an atomic single-writer multi-reader algorithm based on a novel…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-27 Marwen Zorgui, Robert Mateescu, Filip Blagojevic, Cyril Guyot, Zhiying Wang

In-Place Initializable Arrays

An initializable array is an array that supports the read and write operations for any element and the initialization of the entire array. This paper proposes a simple in-place algorithm to implement an initializable array of length $N$…

Data Structures and Algorithms · Computer Science 2022-03-31 Takashi Katoh, Keisuke Goto

C-AND: Mixed Writing Scheme for Disturb Reduction in 1T Ferroelectric FET Memory

Ferroelectric field effect transistor (FeFET) memory has shown the potential to meet the requirements of the growing need for fast, dense, low-power, and non-volatile memories. In this paper, we propose a memory architecture named…

Systems and Control · Electrical Eng. & Systems 2022-05-25 Mor M. Dahan, Evelyn T. Breyer, Stefan Slesazeck, Thomas Mikolajick, Shahar Kvatinsky

Enlightening Flash Storage to Stream Writes by Objects

For a write request, today's flash storage cannot distinguish the logical object it comes from. In such object-oblivious flash devices, concurrent writes from different objects are simply packed in their arrival order to flash memory blocks;…

Databases · Computer Science 2022-01-13 Jong-Hyeok Park, Gihwan Oh, Sang-Won Lee

Read-Modify-Writable Snapshots from Read/Write operations

In the context of asynchronous concurrent shared-memory systems, a snapshot algorithm allows failure-prone processes to concurrently and atomically write on the entries of a shared array MEM, and also atomically read the whole array.…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-20 Armando Castañeda, Braulio Ramses Hernández Martínez

Durable Algorithms for Writable LL/SC and CAS with Dynamic Joining

We present durable implementations for two well-known universal primitives -- CAS (compare-and-swap), and its ABA-free counterpart LLSC (load-linked, store-conditional). All our implementations are: writable, meaning they support a Write()…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-02 Prasad Jayanti, Siddhartha Jayanti, Sucharita Jayanti

Efficient Partial Snapshot Implementations

In this work, we propose the $\lambda$-scanner snapshot, a variation of the snapshot object, which supports any fixed amount of $0 < \lambda \leq n$ different $SCAN$ operations being active at any given time. Whenever $\lambda$ is equal to…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-12 Nikolaos D. Kallimanis, Eleni Kanellou, Charidimos Kiosterakis

Aggregating Funnels for Faster Fetch&Add and Queues

Many concurrent algorithms require processes to perform fetch-and-add operations on a single memory location, which can be a hot spot of contention. We present a novel algorithm called Aggregating Funnels that reduces this contention by…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-04 Younghun Roh, Yuanhao Wei, Eric Ruppert, Panagiota Fatourou, Siddhartha Jayanti, Julian Shun

Constant-Time Snapshots with Applications to Concurrent Data Structures

We present an approach for efficiently taking snapshots of the state of a collection of CAS objects. Taking a snapshot allows later operations to read the value that each CAS object had at the time the snapshot was taken. Taking a snapshot…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-01 Yuanhao Wei, Naama Ben-David, Guy E. Blelloch, Panagiota Fatourou, Eric Ruppert, Yihan Sun

Hash in a Flash: Hash Tables for Solid State Devices

In recent years, information retrieval algorithms have taken center stage for extracting important data in ever larger datasets. Advances in hardware technology have led to the increasingly widespread use of flash storage devices. Such…

Databases · Computer Science 2012-11-20 Tyler Clemons, S. M. Faisal, Shirish Tatikonda, Charu Aggarawl, Srinivasan Parthasarathy

Jiffy: A Lock-free Skip List with Batch Updates and Snapshots

In this paper we introduce Jiffy, the first lock-free, linearizable ordered key-value index that offers both (1) batch updates, which are put and remove operations that are executed atomically, and (2) consistent snapshots used by, e.g.,…

Data Structures and Algorithms · Computer Science 2021-02-02 Tadeusz Kobus, Maciej Kokociński, Paweł T. Wojciechowski

A Wait-free Queue with Polylogarithmic Step Complexity

We present a novel linearizable wait-free queue implementation using single-word CAS instructions. Previous lock-free queue implementations from CAS all have amortized step complexity of $\Omega(p)$ per operation in worst-case executions,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-15 Hossein Naderibeni, Eric Ruppert

Persistent Non-Blocking Binary Search Trees Supporting Wait-Free Range Queries

This paper presents the first implementation of a search tree data structure in an asynchronous shared-memory system that provides a wait-free algorithm for executing range queries on the tree, in addition to non-blocking algorithms for…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-05-15 Panagiota Fatourou, Eric Ruppert

A framework of constructing placement delivery arrays for centralized coded caching

In a caching system, it is desirable to design a coded caching scheme with the transmission load $R$ and subpacketization $F$ as small as possible, in order to improve the efficiency of transmission during peak traffic times and to decrease…

Information Theory · Computer Science 2021-06-01 Minquan Cheng, Jinyu Wang, Xi Zhong, Qiang Wang

Flat-Combining-Based Persistent Data Structures for Non-Volatile Memory

Flat combining (FC) is a synchronization paradigm in which a single thread, holding a global lock, collects requests by multiple threads for accessing a concurrent data structure and applies their combined requests to it. Although FC is…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-10 Matan Rusanovsky, Hagit Attiya, Ohad Ben-Baruch, Tom Gerby, Danny Hendler, Pedro Ramalhete

An Attention Free Transformer

We introduce Attention Free Transformer (AFT), an efficient variant of Transformers that eliminates the need for dot product self attention. In an AFT layer, the key and value are first combined with a set of learned position biases, the…

Machine Learning · Computer Science 2021-09-23 Shuangfei Zhai, Walter Talbott, Nitish Srivastava, Chen Huang, Hanlin Goh, Ruixiang Zhang, Josh Susskind

Fine-grained Analysis on Fast Implementations of Distributed Multi-writer Atomic Registers

Distributed multi-writer atomic registers are at the heart of a large number of distributed algorithms. While enjoying the benefits of atomicity, researchers further explore fast implementations of atomic registers which are optimal in…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-26 Kaile Huang, Yu Huang, Hengfeng Wei

A Wait-free Multi-word Atomic (1,N) Register for Large-scale Data Sharing on Multi-core Machines

We present a multi-word atomic (1,N) register for multi-core machines exploiting Read-Modify-Write (RMW) instructions to coordinate the writer and the readers in a wait-free manner. Our proposal, called Anonymous Readers Counting (ARC),…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-07-25 Mauro Ianni, Alessandro Pellegrini, Francesco Quaglia

Towards Reduced Instruction Sets for Synchronization

Contrary to common belief, a recent work by Ellen, Gelashvili, Shavit, and Zhu has shown that computability does not require multicore architectures to support "strong" synchronization instructions like compare-and-swap, as opposed to…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-05-09 Rati Gelashvili, Idit Keidar, Alexander Spiegelman, Roger Wattenhofer