Related papers: Debugging OpenStack Problems Using a State Graph A…

Operational Runtime Behavior Mining for Open-Source Supply Chain Security

Open-source software (OSS) is a critical component of modern software systems, yet supply chain security remains challenging in practice due to unavailable or obfuscated source code. Consequently, security teams often rely on runtime…

Cryptography and Security · Computer Science 2026-05-27 Zhuoran Tan, Ke Xiao, Jeremy Singer, Christos Anagnostopoulos

A Survey of Graph-based Deep Learning for Anomaly Detection in Distributed Systems

Anomaly detection is a crucial task in complex distributed systems. A thorough understanding of the requirements and challenges of anomaly detection is pivotal to the security of such systems, especially for real-world deployment. While…

Machine Learning · Computer Science 2023-06-05 Armin Danesh Pazho, Ghazal Alinezhad Noghre, Arnab A Purkayastha, Jagannadh Vempati, Otto Martin, Hamed Tabkhi

Open Graph Benchmark: Datasets for Machine Learning on Graphs

We present the Open Graph Benchmark (OGB), a diverse set of challenging and realistic benchmark datasets to facilitate scalable, robust, and reproducible graph machine learning (ML) research. OGB datasets are large-scale, encompass multiple…

Machine Learning · Computer Science 2021-02-26 Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, Jure Leskovec

Autoregressive GNN-ODE GRU Model for Network Dynamics

Revealing the continuous dynamics on the networks is essential for understanding, predicting, and even controlling complex systems, but it is hard to learn and model the continuous network dynamics because of complex and unknown governing…

Machine Learning · Computer Science 2022-11-22 Bo Liang, Lin Wang, Xiaofan Wang

Static Graph Challenge: Subgraph Isomorphism

The rise of graph analytic systems has created a need for ways to measure and compare the capabilities of these systems. Graph analytics present unique scalability difficulties. The machine learning, high performance computing, and visual…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-03-07 Siddharth Samsi, Vijay Gadepally, Michael Hurley, Michael Jones, Edward Kao, Sanjeev Mohindra, Paul Monticciolo, Albert Reuther, Steven Smith, William Song, Diane Staheli, Jeremy Kepner

Towards Runtime Verification via Event Stream Processing in Cloud Computing Infrastructures

Software bugs in cloud management systems often cause erratic behavior, hindering detection, and recovery of failures. As a consequence, the failures are not timely detected and notified, and can silently propagate through the system. To…

Software Engineering · Computer Science 2022-03-09 Domenico Cotroneo, Luigi De Simone, Pietro Liguori, Roberto Natella, Angela Scibelli

DeepGG: a Deep Graph Generator

Learning distributions of graphs can be used for automatic drug discovery, molecular design, complex network analysis, and much more. We present an improved framework for learning generative models of graphs based on the idea of deep state…

Machine Learning · Computer Science 2021-12-07 Julian Stier, Michael Granitzer

Small Graph Is All You Need: DeepStateGNN for Scalable Traffic Forecasting

We propose a novel Graph Neural Network (GNN) model, named DeepStateGNN, for analyzing traffic data, demonstrating its efficacy in two critical tasks: forecasting and reconstruction. Unlike typical GNN methods that treat each traffic sensor…

Machine Learning · Computer Science 2025-02-21 Yannick Wölker, Arash Hajisafi, Cyrus Shahabi, Matthias Renz

A Graphical Interactive Debugger for Distributed Systems

Designing and debugging distributed systems is notoriously difficult. The correctness of a distributed system is largely determined by its handling of failure scenarios. The sequence of events leading to a bug can be long and complex, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-15 Doug Woos, Zachary Tatlock, Michael D. Ernst, Thomas E. Anderson

Towards Learning Self-Organized Criticality of Rydberg Atoms using Graph Neural Networks

Self-Organized Criticality (SOC) is a ubiquitous dynamical phenomenon believed to be responsible for the emergence of universal scale-invariant behavior in many, seemingly unrelated systems, such as forest fires, virus spreading or atomic…

Atomic Physics · Physics 2022-07-20 Simon Ohler, Daniel Brady, Winfried Lötzsch, Michael Fleischhauer, Johannes S. Otterbach

Large Graph Analysis in the GMine System

Current applications have produced graphs on the order of hundreds of thousands of nodes and millions of edges. To take advantage of such graphs, one must be able to find patterns, outliers and communities. These tasks are better performed…

Social and Information Networks · Computer Science 2015-05-29 Jose F. Rodrigues, Hanghang Tong, Jia-Yu Pan, Agma J. M. Traina, Caetano Traina, Christos Faloutsos

Fault Injection Analytics: A Novel Approach to Discover Failure Modes in Cloud-Computing Systems

Cloud computing systems fail in complex and unexpected ways due to unexpected combinations of events and interactions between hardware and software components. Fault injection is an effective means to bring out these failures in a…

Software Engineering · Computer Science 2020-10-02 Domenico Cotroneo, Luigi De Simone, Pietro Liguori, Roberto Natella

Run-time Failure Detection via Non-intrusive Event Analysis in a Large-Scale Cloud Computing Platform

Cloud computing systems fail in complex and unforeseen ways due to unexpected combinations of events and interactions among hardware and software components. These failures are especially problematic when they are silent, i.e., not…

Software Engineering · Computer Science 2023-01-19 Domenico Cotroneo, Luigi De Simone, Pietro Liguori, Roberto Natella

Discovering Job Preemptions in the Open Science Grid

The Open Science Grid (OSG) is a world-wide computing system which facilitates distributed computing for scientific research. It can distribute a computationally intensive job to geo-distributed clusters and process the job's tasks in parallel.…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-07-19 Zhe Zhang, Brian Bockelman, Derek Weitzel, David Swanson

A Brief Study of Open Source Graph Databases

With the proliferation of large irregular sparse relational datasets, new storage and analysis platforms have arisen to fill gaps in performance and capability left by conventional approaches built on traditional database technologies and…

Databases · Computer Science 2013-09-12 Rob McColl, David Ediger, Jason Poovey, Dan Campbell, David Bader

On Software Ageing Indicators in OpenStack

Distributed systems in general, and cloud systems in particular, are susceptible to failures that can lead to substantial economic and data losses, security breaches, and even potential threats to human safety. Software ageing is an example…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-26 Yevhen Yazvinskyi, Jasmin Bogatinovski, Jorge Cardoso, Odej Kao

On-Stack Replacement à la Carte

On-stack replacement (OSR) dynamically transfers execution between different code versions. This mechanism is used in mainstream runtime systems to support adaptive and speculative optimizations by running code tailored to provide the best…

Programming Languages · Computer Science 2017-08-09 Daniele Cono D'Elia, Camil Demetrescu

Families of Distributed Memory Parallel Graph Algorithms from Self-Stabilizing Kernels: An SSSP Case Study

Self-stabilizing algorithms are important because of their robustness and guaranteed convergence. Starting from any arbitrary state, a self-stabilizing algorithm is guaranteed to converge to a legitimate state. Those algorithms are not…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-06-20 Thejaka Kanewala, Marcin Zalewski, Martina Barnas, Andrew Lumsdaine

Enhancing Failure Propagation Analysis in Cloud Computing Systems

In order to plan for failure recovery, the designers of cloud systems need to understand how their system can potentially fail. Unfortunately, analyzing the failure behavior of such systems can be very difficult and time-consuming, due to…

Software Engineering · Computer Science 2022-03-09 Domenico Cotroneo, Luigi De Simone, Pietro Liguori, Roberto Natella, Nematollah Bidokhti

Learning to Identify Graphs from Node Trajectories in Multi-Robot Networks

The graph identification problem consists of discovering the interactions among nodes in a network given their state/feature trajectories. This problem is challenging because the behavior of a node is coupled to all the other nodes by the…

Systems and Control · Electrical Eng. & Systems 2023-10-24 Eduardo Sebastian, Thai Duong, Nikolay Atanasov, Eduardo Montijano, Carlos Sagues