Related papers: Datalog-based Scalable Semantic Diffing of Concurr…
Given a model of the expected behavior of a business process and an event log recording its observed behavior, the problem of business process conformance checking is that of identifying and describing the differences between the model and…
In this work we target the problem of provably computing the equivalence between two programs represented as dataflow graphs. To this end, we formalize the problem of equivalence between two programs as finding a set of semantics-preserving…
Dynamic program slicing can significantly reduce the code developers need to inspect by narrowing it down to only a subset of relevant program statements. However, despite an extensive body of research showing its usefulness, dynamic…
Data engineering workflows require reliable differencing across files, databases, and query outputs, yet existing tools falter under schema drift, heterogeneous types, and limited explainability. SmartDiff is a unified system that combines…
We propose a memory-model-aware static program analysis method for accurately analyzing the behavior of concurrent software running on processors with weak consistency models such as x86-TSO, SPARC-PSO, and SPARC-RMO. At the center of our…
C is the lingua franca of programming and almost any device can be programmed using C. However, programming mod-ern heterogeneous architectures such as multi-core CPUs and GPUs requires explicitly expressing parallelism as well as…
Static analysis approximates the results of a program by examining only its syntax. For example, control-flow analysis (CFA) determines which syntactic lambdas (for functional languages) or (for object-oriented) methods may be invoked at…
Synthesis is a particularly challenging problem for concurrent programs. At the same time it is a very promising approach, since concurrent programs are difficult to get right, or to analyze with traditional verification techniques. This…
This study proposes a deep learning-based approach for discovering loops in programming code according to their potential for parallelization. Two genetic algorithm-based code generators were developed to produce two distinct types of code:…
Concurrent programs are difficult to test due to their inherent non-determinism. To address this problem, testing often requires the exploration of thread schedules of a program; this can be time-consuming when applied to real-world…
We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the…
Datalog is a powerful yet elegant language that allows expressing recursive computation. Although Datalog evaluation has been extensively studied in the literature, so far, only loose upper bounds are known on how fast a Datalog program can…
Automatic parallelization remains a challenging problem in software engineering, particularly in identifying code regions where loops can be safely executed in parallel on modern multi-core architectures. Traditional static analysis…
When optimizing a thread in a concurrent program (either done manually or by the compiler), it must be guaranteed that the resulting thread is a refinement of the original thread. Most theories of valid optimizations are formulated in terms…
Software debugging is a very time-consuming process, which is even worse for multi-threaded programs, due to the non-deterministic behavior of thread-scheduling algorithms. However, the debugging time may be greatly reduced, if automatic…
A number of novel programming languages and libraries have been proposed that offer simpler-to-use models of concurrency than threads. It is challenging, however, to devise execution models that successfully realise their abstractions…
We establish a translation between a formalism for dynamic programming over hypergraphs and the computation of semiring-based provenance for Datalog programs. The benefit of this translation is a new method for computing provenance for a…
In embedded control systems, the potential risks of software defects have been increasing because of software complexity which leads to, for example, timing related problems. These defects are rarely found by tests or simulations. To detect…
The problem of learning logical rules from examples arises in diverse fields, including program synthesis, logic programming, and machine learning. Existing approaches either involve solving computationally difficult combinatorial problems,…
The performance of machine learning models relies heavily on the quality of input data, yet real-world applications often face significant data-related challenges. A common issue arises when curating training data or deploying models: two…