Related papers: Datalog-based Scalable Semantic Diffing of Concurr…

Scalable Alignment of Process Models and Event Logs: An Approach Based on Automata and S-Components

Given a model of the expected behavior of a business process and an event log recording its observed behavior, the problem of business process conformance checking is that of identifying and describing the differences between the model and…

Software Engineering · Computer Science 2020-03-06 Daniel Reißner, Abel Armas-Cervantes, Raffaele Conforti, Marlon Dumas, Dirk Fahland, Marcello La Rosa

Equivalence of Dataflow Graphs via Rewrite Rules Using a Graph-to-Sequence Neural Model

In this work we target the problem of provably computing the equivalence between two programs represented as dataflow graphs. To this end, we formalize the problem of equivalence between two programs as finding a set of semantics-preserving…

Machine Learning · Computer Science 2021-06-07 Steve Kommrusch, Théo Barollet, Louis-Noël Pouchet

Statistical Program Slicing: a Hybrid Slicing Technique for Analyzing Deployed Software

Dynamic program slicing can significantly reduce the code developers need to inspect by narrowing it down to only a subset of relevant program statements. However, despite an extensive body of research showing its usefulness, dynamic…

Software Engineering · Computer Science 2022-01-04 Bogdan Alexandru Stoica, Swarup K. Sahoo, James R. Larus, Vikram S. Adve

Illuminating Patterns of Divergence: DataDios SmartDiff for Large-Scale Data Difference Analysis

Data engineering workflows require reliable differencing across files, databases, and query outputs, yet existing tools falter under schema drift, heterogeneous types, and limited explainability. SmartDiff is a unified system that combines…

Databases · Computer Science 2025-09-03 Aryan Poduri, Yashwant Tailor

Thread-Modular Static Analysis for Relaxed Memory Models

We propose a memory-model-aware static program analysis method for accurately analyzing the behavior of concurrent software running on processors with weak consistency models such as x86-TSO, SPARC-PSO, and SPARC-RMO. At the center of our…

Programming Languages · Computer Science 2017-09-29 Markus Kusano, Chao Wang

Lifting C Semantics for Dataflow Optimization

C is the lingua franca of programming and almost any device can be programmed using C. However, programming modern heterogeneous architectures such as multi-core CPUs and GPUs requires explicitly expressing parallelism as well as…

Programming Languages · Computer Science 2022-05-25 Alexandru Calotoiu, Tal Ben-Nun, Grzegorz Kwasniewski, Johannes de Fine Licht, Timo Schneider, Philipp Schaad, Torsten Hoefler

So You Want to Analyze Scheme Programs With Datalog?

Static analysis approximates the results of a program by examining only its syntax. For example, control-flow analysis (CFA) determines which syntactic lambdas (for functional languages) or (for object-oriented) methods may be invoked at…

Programming Languages · Computer Science 2021-07-28 Davis Ross Silverman, Yihao Sun, Kristopher Micinski, Thomas Gilray

Automated Synthesis of Distributed Controllers

Synthesis is a particularly challenging problem for concurrent programs. At the same time it is a very promising approach, since concurrent programs are difficult to get right, or to analyze with traditional verification techniques. This…

Formal Languages and Automata Theory · Computer Science 2015-06-09 Anca Muscholl

Discovering Software Parallelization Points Using Deep Neural Networks

This study proposes a deep learning-based approach for discovering loops in programming code according to their potential for parallelization. Two genetic algorithm-based code generators were developed to produce two distinct types of code:…

Machine Learning · Computer Science 2025-10-03 Izavan dos S. Correia, Henrique C. T. Santos, Tiago A. E. Ferreira

ConPredictor: Concurrency Defect Prediction in Real-World Applications

Concurrent programs are difficult to test due to their inherent non-determinism. To address this problem, testing often requires the exploration of thread schedules of a program; this can be time-consuming when applied to real-world…

Software Engineering · Computer Science 2018-04-11 Tingting Yu, Wei Wen, Xue Han, Jane Hayes

Support for Debugging Automatically Parallelized Programs

We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the…

Software Engineering · Computer Science 2007-05-23 Robert Hood, Gabriele Jost

Evaluating Datalog over Semirings: A Grounding-based Approach

Datalog is a powerful yet elegant language that allows expressing recursive computation. Although Datalog evaluation has been extensively studied in the literature, so far, only loose upper bounds are known on how fast a Datalog program can…

Databases · Computer Science 2024-03-20 Hangdong Zhao, Shaleen Deep, Paraschos Koutris, Sudeepa Roy, Val Tannen

Automatic Identification of Parallelizable Loops Using Transformer-Based Source Code Representations

Automatic parallelization remains a challenging problem in software engineering, particularly in identifying code regions where loops can be safely executed in parallel on modern multi-core architectures. Traditional static analysis…

Software Engineering · Computer Science 2026-04-01 Izavan dos S. Correia, Henrique C. T. Santos, Tiago A. E. Ferreira

Formalizing and Checking Thread Refinement for Data-Race-Free Execution Models (Extended Version)

When optimizing a thread in a concurrent program (either done manually or by the compiler), it must be guaranteed that the resulting thread is a refinement of the original thread. Most theories of valid optimizations are formulated in terms…

Programming Languages · Computer Science 2015-10-27 Daniel Poetzl, Daniel Kroening

Fault Localization in Multi-Threaded C Programs using Bounded Model Checking (extended version)

Software debugging is a very time-consuming process, which is even worse for multi-threaded programs, due to the non-deterministic behavior of thread-scheduling algorithms. However, the debugging time may be greatly reduced, if automatic…

Logic in Computer Science · Computer Science 2015-09-09 Erickson H. da S. Alves, Lucas C. Cordeiro, Eddie B. de Lima Filho

A Graph-Based Semantics Workbench for Concurrent Asynchronous Programs

A number of novel programming languages and libraries have been proposed that offer simpler-to-use models of concurrency than threads. It is challenging, however, to devise execution models that successfully realise their abstractions…

Software Engineering · Computer Science 2016-03-24 Claudio Corrodi, Alexander Heußner, Christopher M. Poskitt

A Practical Dynamic Programming Approach to Datalog Provenance Computation

We establish a translation between a formalism for dynamic programming over hypergraphs and the computation of semiring-based provenance for Datalog programs. The benefit of this translation is a new method for computing provenance for a…

Databases · Computer Science 2021-12-03 Yann Ramusat, Silviu Maniu, Pierre Senellart

Model Checking with Program Slicing Based on Variable Dependence Graphs

In embedded control systems, the potential risks of software defects have been increasing because of software complexity which leads to, for example, timing related problems. These defects are rarely found by tests or simulations. To detect…

Logic in Computer Science · Computer Science 2013-01-03 Masahiro Matsubara, Kohei Sakurai, Fumio Narisawa, Masushi Enshoiwa, Yoshio Yamane, Hisamitsu Yamanaka

Synthesizing Datalog Programs Using Numerical Relaxation

The problem of learning logical rules from examples arises in diverse fields, including program synthesis, logic programming, and machine learning. Existing approaches either involve solving computationally difficult combinatorial problems,…

Artificial Intelligence · Computer Science 2019-06-26 Xujie Si, Mukund Raghothaman, Kihong Heo, Mayur Naik

"What is Different Between These Datasets?" A Framework for Explaining Data Distribution Shifts

The performance of machine learning models relies heavily on the quality of input data, yet real-world applications often face significant data-related challenges. A common issue arises when curating training data or deploying models: two…

Machine Learning · Computer Science 2025-09-24 Varun Babbar, Zhicheng Guo, Cynthia Rudin