Related papers: Asynchronous Persistence with ASAP
Emerging workloads, such as graph processing and machine learning are approximate because of the scale of data involved and the stochastic nature of the underlying algorithms. These algorithms are often distributed over multiple machines…
Write-ahead logs (WALs) are a fundamental fault-tolerance technique found in many areas of computer science. WALs must be reliable while maintaining high performance, because all operations will be written to the WAL to ensure their…
Partitioning applications between NDP and host CPU cores causes inter-segment data movement overhead, which is caused by moving data generated from one segment (e.g., instructions, functions) and used in consecutive segments. Prior works…
Embedded devices are increasingly ubiquitous and their importance is hard to overestimate. While they often support safety-critical functions (e.g., in medical devices and sensor-alarm combinations), they are usually implemented under…
Phase-change memory (PCM) devices have multiple banks to serve memory requests in parallel. Unfortunately, if two requests go to the same bank, they have to be served one after another, leading to lower system performance. We observe that a…
The proliferation of high-throughput sequencing machines ensures rapid generation of up to billions of short nucleotide fragments in a short period of time. This massive amount of sequence data can quickly overwhelm today's storage and…
Persistent Memory (PM) technologies enable program recovery to a consistent state in a case of failure. To ensure this crash-consistent behavior, programs need to enforce persist ordering by employing mechanisms, such as logging and…
Persistent memory provides high-performance data persistence at main memory. Memory writes need to be performed in strict order to satisfy storage consistency requirements and enable correct recovery from system crashes. Unfortunately,…
Over the past decade, a number of languages for functional reactive programming (FRP) have been suggested, which use modal types to ensure properties like causality, productivity and lack of space leaks. So far, almost all of these…
We present Locksynth, a tool that automatically derives synchronization needed for destructive updates to concurrent data structures that involve a constant number of shared heap memory write operations. Locksynth serves as the…
We consider a parallel computational model that consists of $P$ processors, each with a fast local ephemeral memory of limited size, and sharing a large persistent memory. The model allows for each processor to fault with bounded…
The growing memory demands of modern applications have driven the adoption of far memory technologies in data centers to provide cost-effective, high-capacity memory solutions. However, far memory presents new performance challenges because…
Persistent memory (pmem) products bring the persistence domain up to the memory level. Intel recently introduced the eADR feature that guarantees to flush data buffered in CPU cache to pmem on a power outage, thereby making the CPU cache a…
To implement a linearizable shared memory in synchronous message-passing systems it is necessary to wait for a time linear to the uncertainty in the latency of the network for both read and write operations. Waiting only for one of them…
The performance properties of byte-addressable persistent memory (PMEM) have the potential to significantly improve system performance over a wide spectrum of applications. But persistent memory brings considerable new challenges to the…
Enforcing state and input constraints during reinforcement learning (RL) in continuous state spaces is an open but crucial problem which remains a roadblock to using RL in safety-critical applications. This paper leverages invariant sets to…
The automated assembly of complex products requires a system that can automatically plan a physically feasible sequence of actions for assembling many parts together. In this paper, we present ASAP, a physics-based planning approach for…
It has been proved that to implement a linearizable shared memory in synchronous message-passing systems it is necessary to wait for a time proportional to the uncertainty in the latency of the network for both read and write operations,…
We present LARK (Linearizability Algorithms for Replicated Keys), a synchronous replication protocol that achieves linearizability while minimizing latency and infrastructure cost, at significantly higher availability than traditional…
Parallel computing is omnipresent in today's scientific computer landscape, starting at multicore processors in desktop computers up to massively parallel clusters. While domain decomposition methods have a long tradition in computational…