Related papers: Distributed Execution Indexing

Enhancing Trace Visualizations for Microservices Performance Analysis

Performance analysis of microservices can be a challenging task, as a typical request to these systems involves multiple Remote Procedure Calls (RPC) spanning across independent services and machines. Practitioners primarily rely on…

Software Engineering · Computer Science 2023-02-27 Jessica Leone , Luca Traini

Detecting Latency Degradation Patterns in Service-based Systems

Performance in heterogeneous service-based systems shows non-deterministic trends. Even for the same request type, latency may vary from one request to another. These variations can occur due to several reasons on different levels of the…

Software Engineering · Computer Science 2020-04-14 Vittorio Cortellessa , Luca Traini

Distributed Symbolic Execution using Test-Depth Partitioning

Symbolic execution is a classic technique for systematic bug finding, which has seen many applications in recent years but remains hard to scale. Recent work introduced ranged symbolic execution to distribute the symbolic execution task…

Software Engineering · Computer Science 2021-06-07 Shikhar Singh , Sarfraz Khurshid

Integrating Abstractions to Enhance the Execution of Distributed Applications

One of the factors that limits the scale, performance, and sophistication of distributed applications is the difficulty of concurrently executing them on multiple distributed computing resources. In part, this is due to a poor understanding…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-02-22 Matteo Turilli , Feng Liu , Zhao Zhang , Andre Merzky , Michael Wilde , Jon Weissman , Daniel S. Katz , Shantenu Jha

RDMA vs. RPC for Implementing Distributed Data Structures

Distributed data structures are key to implementing scalable applications for scientific simulations and data analysis. In this paper we look at two implementation styles for distributed data structures: remote direct memory access (RDMA)…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-16 Benjamin Brock , Yuxin Chen , Jiakun Yan , John D. Owens , Aydın Buluç , Katherine Yelick

Distributed Inference and Query Processing for RFID Tracking and Monitoring

In this paper, we present the design of a scalable, distributed stream processing system for RFID tracking and monitoring. Since RFID data lacks containment and location information that is key to query processing, we propose to combine…

Databases · Computer Science 2011-03-24 Zhao Cao , Charles Sutton , Yanlei Diao , Prashant Shenoy

Learning Execution Contexts from System Call Distributions for Intrusion Detection in Embedded Systems

Existing techniques used for intrusion detection do not fully utilize the intrinsic properties of embedded systems. In this paper, we propose a lightweight method for detecting anomalous executions using a distribution of system call…

Cryptography and Security · Computer Science 2015-08-04 Man-Ki Yoon , Sibin Mohan , Jaesik Choi , Mihai Christodorescu , Lui Sha

CrossTrace: Efficient Cross-Thread and Cross-Service Span Correlation in Distributed Tracing for Microservices

Distributed tracing has become an essential technique for debugging and troubleshooting modern microservice-based applications, enabling software engineers to detect performance bottlenecks, identify failures, and gain insights into system…

Networking and Internet Architecture · Computer Science 2025-08-18 Linh-An Phan , MingXue Wang , Guangyu Wu , Wang Dawei , Chen Liqun , Li Jin

Spatter: A Tool for Evaluating Gather / Scatter Performance

This paper describes a new benchmark tool, Spatter, for assessing memory system architectures in the context of a specific category of indexed accesses known as gather and scatter. These types of operations are increasingly used to express…

Performance · Computer Science 2020-07-09 Patrick Lavin , Jeffrey Young , Jason Riedy , Richard Vuduc , Aaron Vose , Dan Ernst

Distributed Speculative Execution for Resilient Cloud Applications

Fault-tolerance is critically important in highly-distributed modern cloud applications. Solutions such as Temporal, Azure Durable Functions, and Beldi hide fault-tolerance complexity from developers by persisting execution state and…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-19 Tianyu Li , Badrish Chandramouli , Philip A. Bernstein , Samuel Madden

Remote Procedure Call as a Managed System Service

Remote Procedure Call (RPC) is a widely used abstraction for cloud computing. The programmer specifies type information for each remote procedure, and a compiler generates stub code linked into each application to marshal and unmarshal…

Networking and Internet Architecture · Computer Science 2023-04-18 Jingrong Chen , Yongji Wu , Shihan Lin , Yechen Xu , Xinhao Kong , Thomas Anderson , Matthew Lentz , Xiaowei Yang , Danyang Zhuo

NetRPC: Enabling In-Network Computation in Remote Procedure Calls

People have shown that in-network computation (INC) significantly boosts performance in many application scenarios, including distributed training, MapReduce, agreement, and network monitoring. However, existing INC programming is unfriendly…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-12-19 Bohan Zhao , Wenfei Wu , Wei Xu

Energy-aware Distributed Microservice Request Placement at the Edge

Microservice is a way of splitting the logic of an application into small blocks that can be run on different computing units and used by other applications. It has been successful for cloud applications and is now increasingly used for…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-27 Klervie Toczé , Simin Nadjm-Tehrani

MicroRes: Versatile Resilience Profiling in Microservices via Degradation Dissemination Indexing

Microservice resilience, the ability of microservices to recover from failures and continue providing reliable and responsive services, is crucial for cloud vendors. However, the current practice relies on manually configured rules specific…

Software Engineering · Computer Science 2024-03-22 Tianyi Yang , Cheryl Lee , Jiacheng Shen , Yuxin Su , Yongqiang Yang , Michael R. Lyu

Reliability and Fault-Tolerance by Choreographic Design

Distributed programs are hard to get right because they are required to be open, scalable, long-running, and tolerant to faults. In particular, the recent approaches to distributed software based on (micro-)services where different services…

Programming Languages · Computer Science 2017-08-25 Ian Cassar , Adrian Francalanza , Claudio Antares Mezzina , Emilio Tuosto

Characterization and Comparison of Application Resilience for Serial and Parallel Executions

Soft errors in exascale applications are a challenging problem in modern HPC. In order to quantify an application's resilience and vulnerability, the application-level fault injection method is widely adopted by HPC users. However, it is not…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-08-06 Kai Wu , Qiang Guan , Nathan DeBardeleben , Dong Li

A Remote Procedure Call Approach for Extreme-scale Services

When working at exascale, the various constraints imposed by the extreme scale of the system bring new challenges for application users and software/middleware developers. In that context, and to provide best performance, resiliency and…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-10-09 Jerome Soumagne , Philip H. Carns , Dries Kimpe , Quincey Koziol , Robert B. Ross

Detection of Configuration Vulnerabilities in Distributed (Web) Environments

Many tools and libraries are readily available to build and operate distributed Web applications. While the setup of operational environments is comparatively easy, practice shows that their continuous secure operation is more difficult to…

Cryptography and Security · Computer Science 2012-07-13 Matteo Maria Casalino , Michele Mangili , Henrik Plate , Serena Elisa Ponta

How to Evaluate Distributed Coordination Systems? -- A Survey and Analysis

Coordination services and protocols are critical components of distributed systems and are essential for providing consistency, fault tolerance, and scalability. However, due to the lack of standard benchmarking and evaluation tools for…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-28 Bekir Turkkan , Elvis Rodrigues , Tevfik Kosar , Aleksey Charapko , Ailidani Ailijiang , Murat Demirbas

HiCR, an Abstract Model for Distributed Heterogeneous Programming

We present HiCR, a model to represent the semantics of distributed heterogeneous applications and runtime systems. The model describes a minimal set of abstract operations to enable hardware topology discovery, kernel execution, memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-03 Sergio Miguel Martin , Luca Terracciano , Kiril Dichev , Noah Baumann , Jiashu Lin , Albert-Jan Yzelman