Related papers: PM4Py-GPU: a High-Performance General-Purpose Libr…
Process mining, i.e., a sub-field of data science focusing on the analysis of event data generated during the execution of (business) processes, has seen a tremendous change over the past two decades. Starting off in the early 2000s, with…
Process-mining techniques have emerged as powerful tools for analyzing event data to gain insights into business processes. In this paper, we present a comprehensive analysis of road traffic fine management processes using the pm4py library…
pm4py is a process mining library for Python implementing several process mining (PM) artifacts and algorithms. It also offers methods to integrate PM with large language models (LLMs). This paper examines how the current paradigms of PM on…
Object-centric process mining provides a more holistic view of processes where we analyze processes with multiple case notions. However, most object-centric process mining techniques consider the whole event log rather than the comparison…
Process mining enables business owners to discover and analyze their actual processes using event data that are widely available in information systems. Event data contain detailed information which is incredibly valuable for providing…
The most commonly used open-source process mining software tools today are ProM and PM4Py, written in Java and Python, respectively. Such high-level, often interpreted, programming languages trade off performance with memory safety and…
Python is rapidly becoming the lingua franca of machine learning and scientific computing. With the broad use of frameworks such as Numpy, SciPy, and TensorFlow, scientific computing and machine learning are seeing a productivity boost on…
To execute scientific computing programs such as deep learning at high speed, GPU acceleration is a powerful option. With the recent advancements in web technologies, interfaces like WebGL and WebGPU, which utilize GPUs on the client side…
Process mining techniques such as process discovery and conformance checking provide insights into actual processes by analyzing event data that are widely available in information systems. These data are very valuable, but often contain…
Processing large-scale graph datasets is computationally intensive and time-consuming. Processor-centric CPU and GPU architectures, commonly used for graph applications, often face bottlenecks caused by extensive data movement between the…
PaPy, which stands for parallel pipelines in Python, is a highly flexible framework that enables the construction of robust, scalable workflows for either generating or processing voluminous datasets. A workflow is created from user-written…
Creating high-quality, large-scale datasets for large language models (LLMs) often relies on resource-intensive, GPU-accelerated models for quality filtering, making the process time-consuming and costly. This dependence on GPUs limits…
Performance analysis is a critical step in the oft-repeated, iterative process of performance tuning of parallel programs. Per-process, per-thread traces (detailed logs of events with timestamps) enable in-depth analysis of parallel program…
Data preparation is a trial-and-error process that typically involves countless iterations over the data to define the best pipeline of operators for a given task. With tabular data, practitioners often perform that burdensome activity on…
In this paper, we present a new Python library called mPyPl, which is intended to simplify complex data processing tasks using a functional approach. This library defines operations on lazy data streams of named dictionaries represented as…
Process mining provides techniques to improve the performance and compliance of operational processes. Although sometimes the term "workflow mining" is used, the application in the context of Workflow Management (WFM) and Business Process…
Process Mining is a branch of Data Science that aims to extract process-related information from event data contained in information systems, whose volume is steadily increasing. Many algorithms, and a general-purpose open source…
Process mining represents an important field in BPM and data mining research. Recently, it has gained importance also for practitioners: more and more companies are creating business process intelligence solutions. The evaluation of process…
The vast amount of processing power and memory bandwidth provided by modern Graphics Processing Units (GPUs) makes them a platform for data-intensive applications. The database community identified GPUs as effective co-processors for data…
In-memory database query processing frequently involves substantial data transfers between the CPU and memory, leading to inefficiencies due to the von Neumann bottleneck. Processing-in-Memory (PIM) architectures offer a viable solution to…