Related papers: Scalable Comparison of JavaScript V8 Bytecode Trac…
Graph comparison is a fundamental operation in data mining and information retrieval. Due to the combinatorial nature of graphs, it is hard to balance the expressiveness of the similarity measure and its scalability. Spectral analysis…
Stream processing applications extract value from raw data through Directed Acyclic Graphs of data analysis tasks. Shared-nothing (SN) parallelism is the de-facto standard to scale stream processing applications. Given an application, SN…
Long traces and large event logs that originate from sensors and prediction models are becoming more common in our data-rich world. In such circumstances, conformance checking, a key task in process mining, can become computationally…
Source code similarity are increasingly used in application development to identify clones, isolate bugs, and find copy-rights violations. Similar code fragments can be very problematic due to the fact that errors in the original code must…
Asynchrony has become an inherent element of JavaScript, as an effort to improve the scalability and performance of modern web applications. To this end, JavaScript provides programmers with a wide range of constructs and features for…
Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, taking aside the memory requirements. In this paper, we revisit these approaches by considering, additionally, the memory…
When an evolving program is modified to address issues related to thread synchronization, there is a need to confirm the change is correct, i.e., it does not introduce unexpected behavior. However, manually comparing two programs to…
String similarity models are vital for record linkage, entity resolution, and search. In this work, we present STANCE --a learned model for computing the similarity of two strings. Our approach encodes the characters of each string, aligns…
One of the main challenges of debugging is to understand why the program fails for certain inputs but succeeds for others. This becomes especially difficult if the fault is caused by an interaction of multiple inputs. To debug such…
In this paper, we study seven well-known trace analysis techniques both from the hardware and software domain and discuss their performance on communication-centric system-on-chip (SoC) traces. SoC traces are usually huge in size and…
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety. However, the continual learning aspect of these aligned LLMs has been largely overlooked. Existing…
Efficient code retrieval is critical for developer productivity, yet existing benchmarks largely focus on Python and rarely stress-test robustness beyond superficial lexical cues. To address the gap, we introduce an automated pipeline for…
Browser fingerprinting enables persistent cross-site user tracking via subtle techniques that often evade conventional defenses or cause website breakage when script-level blocking countermeasures are applied. Addressing these challenges…
Alignments provide sophisticated diagnostics that pinpoint deviations in a trace with respect to a process model and their severity. However, approaches based on trace alignments use crisp process models as reference and recent…
Graphs, consisting of vertices and edges, are vital for representing complex relationships in fields like social networks, finance, and blockchain. Visualizing these graphs helps analysts identify structural patterns, with readability…
Adequate consideration is crucial to ensure that services in a distributed application context are running satisfactorily with the resources available. Due to the asynchronous nature of tasks and the need to work with multiple layers that…
Similarity-based vector search facilitates many important applications such as search and recommendation but is limited by the memory capacity and bandwidth of a single machine due to large datasets and intensive data read. In this paper,…
The existence of trace links between artifacts of the software development life cycle can improve the efficiency of many activities during software development, maintenance and operations. Unfortunately, the creation and maintenance of…
Recent advances in web technologies make it more difficult than ever to detect and block web tracking systems. In this work, we propose ASTrack, a novel approach to web tracking detection and removal. ASTrack uses an abstraction of the code…
Generative retrieval has emerged as a powerful paradigm for LLM-based recommendation. However, industrial recommender systems often benefit from restricting the output space to a constrained subset of items based on business logic (e.g.…