Related papers: RDMA vs. RPC for Implementing Distributed Data Str…
In this work, we aim to evaluate different Distributed Lock Management service designs with Remote Direct Memory Access (RDMA). In specific, we implement and evaluate the centralized and the RDMA-enabled lock manager designs for fast…
Deep learning emerges as an important new resource-intensive workload and has been successfully applied in computer vision, speech, natural language processing, and so on. Distributed deep learning is becoming a necessity to cope with…
RDMA is an exciting technology that enables a host to access the memory of a remote host without involving the remote CPU. Prior work shows how to use RDMA to improve the performance of distributed in-memory storage systems. However, RDMA…
High-performance clusters and datacenters pose increasingly demanding requirements on storage systems. If these systems do not operate at scale, applications are doomed to become I/O bound and waste compute cycles. To accelerate the data…
The drive towards exascale computing is opening an enormous opportunity for more realistic and precise simulations of natural phenomena. The process of simulation, however, involves not only the numerical computation of predictions but also…
Coordinating concurrent access to a shared resource using mutual exclusion is a fundamental problem in computation. In this paper, we present a novel approach to mutual exclusion designed specifically for distributed systems leveraging a…
Remote Direct Memory Access (RDMA) is an efficient way to improve the performance of traditional client-server systems. Currently, there are two main design paradigms for RDMA-accelerated systems. The first allows the clients to directly…
Remote Direct Memory Access (RDMA) is a technology that allows direct memory access from the memory of one computer into that of another without involving either one's operating system. This enables high-throughput, low-latency networking,…
Remote memory access (RMA) is an emerging high-performance programming model that uses RDMA hardware directly. Yet, accessing remote memories cannot invoke activities at the target which complicates implementation and limits performance of…
Memory disaggregation is being considered as a strong alternative to traditional architecture to deal with the memory under-utilization in data centers. Disaggregated memory can adapt to dynamically changing memory requirements for the data…
Overheads in Operating System kernel network stacks and sockets have been hindering OSes from managing networking operations efficiently for years. Moreover, when building Remote Procedure Calls over TCP, certain TCP features do not match…
In this paper, we conduct systematic measurement studies to show that the high memory bandwidth consumption of modern distributed applications can lead to a significant drop of network throughput and a large increase of tail latency in…
We can use a hybrid memory system consisting of DRAM and Intel Optane DC Persistent Memory (We call it DCPM in this paper) as DCPM is now commercially available since April 2019. Even if the latency for DCPM is several times higher than…
Remote Procedure Call (RPC) is a widely used abstraction for cloud computing. The programmer specifies type information for each remote procedure, and a compiler generates stub code linked into each application to marshal and unmarshal…
Grid Computing is a type of parallel and distributed systems that is designed to provide reliable access to data and computational resources in wide area networks. These resources are distributed in different geographical locations, however…
With Dynamic Resource Management (DRM) the resources assigned to a job can be changed dynamically during its execution. From the system's perspective, DRM opens a new level of flexibility in resource allocation and job scheduling and…
Modern interconnects offer remote direct memory access (RDMA) features. Yet, most applications rely on explicit message passing for communications albeit their unwanted overheads. The MPI-3.0 standard defines a programming interface for…
Paper presents and evaluates various mechanisms for remote access to memory in distributed systems based on two distinct HPC clusters. We are comparing solutions based on the shared storage and MPI (over Infiniband and Slingshot) to the…
This work-in-progress report presents both the design and partial evaluation of distributed execution indexing, a technique for microservice applications that precisely identifies dynamic instances of inter-service remote procedure calls…
Conventional wisdom holds that an efficient interface between an OS running on a CPU and a high-bandwidth I/O device should use Direct Memory Access (DMA) to offload data transfer, descriptor rings for buffering and queuing, and interrupts…