Related papers: High-Performance Distributed RMA Locks

200 papers

Designers of modern reader-writer locks confront a difficult trade-off related to reader scalability. Locks that have a compact memory representation for active readers will typically suffer under high intensity read-dominated workloads…

Operating Systems · Computer Science 2019-07-11 David Dice, Alex Kogan

In this paper, we consider a hierarchical distributed multi-task learning (MTL) system where distributed users wish to jointly learn different models orchestrated by a central server with the help of a layer of multiple relays. Since the…

Information Theory · Computer Science 2022-12-19 Haoyang Hu, Songze Li, Minquan Cheng, Youlong Wu

Despite the increasing adoption of Field-Programmable Gate Arrays (FPGAs) in compute clouds, there remains a significant gap in programming tools and abstractions which can leverage network-connected, cloud-scale, multi-die FPGAs to…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-02-05 Neha Prakriya, Yuze Chi, Suhail Basalama, Linghao Song, Jason Cong

Triangle count and local clustering coefficient are two core metrics for graph analysis. They find broad application in analyses such as community detection and link recommendation. Current state-of-the-art solutions suffer from…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-02 András Strausz, Flavio Vella, Salvatore Di Girolamo, Maciej Besta, Torsten Hoefler

The relaxed semantics and rich functionality of one-sided communication primitives of MPI-3 make MPI an attractive candidate for the implementation of PGAS models. However, the performance of such an implementation suffers from the fact that…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-03-08 Huan Zhou, Kamran Idrees, José Gracia

Distributed locking mechanisms are fundamental to ensuring data consistency and integrity in distributed systems. This paper presents a comprehensive analysis of distributed locking algorithms, focusing on their performance characteristics…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-07 Andre Rodriguez, William Osborn

Designing a rate limiter that is simultaneously accurate, available, and scalable presents a fundamental challenge in distributed systems, primarily due to the trade-offs between algorithmic precision, availability, consistency, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-13 Bo Guan

Collective communication operations such as MPI_Alltoallv are central to many HPC applications, particularly those with irregular message sizes. We design, implement, and evaluate persistent MPI RMA variants of Alltoallv based on fence and…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-08 Evelyn Namugwanya

Memory disaggregation architecture physically separates CPU and memory into independent components, which are connected via high-speed RDMA networks, greatly improving resource utilization of databases. However, such an architecture poses…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-21 Qing Wang, Youyou Lu, Jiwu Shu

Surrogate models can play a pivotal role in enhancing performance in contemporary High-Performance Computing applications. Cache-based surrogates use already calculated simulation results to interpolate or extrapolate further simulation…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-22 Max Lübke, Marco De Lucia, Steffen Christgau, Stefan Petri, Bettina Schnor

In this work, we aim to evaluate different Distributed Lock Management service designs with Remote Direct Memory Access (RDMA). Specifically, we implement and evaluate the centralized and the RDMA-enabled lock manager designs for fast…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-07-21 Yeounoh Chung, Erfan Zamanian

A distributed multi-writer multi-reader (MWMR) atomic register is an important primitive that enables a wide range of distributed algorithms. Hence, improving its performance can have large-scale consequences. Since the seminal work of ABD…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-20 Lewis Tseng, Neo Zhou, Cole Dumas, Tigran Bantikyan, Roberto Palmieri

In this paper, a novel distributed scheduling algorithm is proposed, which aims to efficiently schedule both the uplink and downlink backhaul traffic in the relay-assisted mmWave backhaul network with a tree topology. The handshaking of…

Networking and Internet Architecture · Computer Science 2022-02-17 Qiang Hu, Yuchen Liu, Yan Yan, Miao Liu, Jun Zheng, Douglas M. Blough

Common implementations of core memory allocation components, like the Linux buddy system, handle concurrent allocation/release requests by synchronizing threads via spin-locks. This approach is clearly unlikely to scale with large thread…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-05-22 Romolo Marotta, Mauro Ianni, Alessandro Pellegrini, Andrea Scarselli, Francesco Quaglia

Distributed resource allocation (DRA) is fundamental to modern networked systems, spanning applications from economic dispatch in smart grids to CPU scheduling in data centers. Conventional DRA approaches require reliable communication, yet…

Systems and Control · Electrical Eng. & Systems 2025-10-22 Mohammadreza Doostmohammadian, Sergio Pequito

Remote memory access (RMA) is an emerging high-performance programming model that uses RDMA hardware directly. Yet, accessing remote memories cannot invoke activities at the target which complicates implementation and limits performance of…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-20 Maciej Besta, Torsten Hoefler

Fully provisioned Message Passing Interface (MPI) parallelism achieves near-optimal wall-clock time for Computational Fluid Dynamics (CFD) solvers. This work addresses a complementary question for shared, cloud-managed clusters: can…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-25 Tianfang Xie

The memory capacity in edge devices is often limited due to constraints on cost, size, and power. Consequently, memory competition leads to inevitable page swapping in memory-constrained mixed-criticality edge devices, causing slow storage…

Operating Systems · Computer Science 2025-11-26 Meng-Chia Lee, Wen Sheng Lim, Yuan-Hao Chang, Tei-Wei Kuo

Efficient multi-robot task allocation (MRTA) is fundamental to various time-sensitive applications such as disaster response, warehouse operations, and construction. This paper tackles a particular class of these problems that we call…

Multiagent Systems · Computer Science 2023-08-21 Steve Paul, Wenyuan Li, Brian Smyth, Yuzhou Chen, Yulia Gel, Souma Chowdhury

Efficient task scheduling in large-scale distributed systems presents significant challenges due to dynamic workloads, heterogeneous resources, and competing quality-of-service requirements. Traditional centralized approaches face…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-27 Daniel Benniah John