Related papers: Towards More Data-Aware Application Integration (e…
Although message-based (business) application integration is based on orchestrated message flows, current modeling languages exclusively cover (parts of) the control flow, while under-specifying the data flow. Especially for more…
During the last two decades, it has been increasingly acknowledged that the engineering of information systems usually requires a huge effort in integrating master data and business processes. This has led to a plethora of proposals, both…
The growth in variety and volume of OLTP (Online Transaction Processing) applications poses a challenge to OLTP systems to meet performance and cost demands in the existing hardware landscape. These applications are highly interactive…
Object-centric process mining is emerging as a promising paradigm across diverse industries, drawing substantial academic attention. To support its data requirements, existing object-centric data formats primarily facilitate the exchange of…
Many ML applications and products train on medium amounts of input data but get bottlenecked in real-time inference. When implementing ML systems, conventional wisdom favors segregating ML code into services queried by product code via…
Database engines have historically absorbed many of the innovations in data processing, adding features to process graph data, XML, object oriented, and text among many others. In this paper, we make the case that it is time to do the same…
Separate programming models for data transformation (declarative) and computation (procedural) impact programmer ergonomics, code reusability and database efficiency. To eliminate the necessity for two models or paradigms, we propose a…
In this research project, we investigate an alternative to the standard cloud-centralized data architecture. Specifically, we aim to leave part of the application data under the control of the individual data owners in decentralized…
This paper explores a new opportunity to improve the performance of transaction processing at the application side by merging structurely similar statements or transactions. Concretely, we re-write transactions to 1) merge similar…
The article suggests a description of a system of tables with a set of special lists absorbing a semantics of data and reflects a fullness of data. It shows how their parallel processing can be constructed based on the descriptions. The…
Enterprise application integration (EAI) solutions are the centrepiece of current enterprise IT architectures (e.g., cloud and mobile computing, business networks), however, require the formalization of their building blocks, represented by…
The use of large-scale machine learning methods is becoming ubiquitous in many applications ranging from business intelligence to self-driving cars. These methods require a complex computation pipeline consisting of various types of…
Business process management is increasingly practiced using data-driven approaches. Still, classical imperative process models, which are typically formalized using Petri nets, are not straightforwardly applicable to the relational…
Big data applications have fast arriving data that must be quickly ingested. At the same time, they have specific needs to preprocess and transform the data before it could be put to use. The current practice is to do these preparatory…
Today's Cloud applications are dominated by composite applications comprising multiple computing and data components with strong communication correlations among them. Although Cloud providers are deploying large number of computing and…
We are living in the era of Big Data and witnessing the explosion of data. Given that the limitation of CPU and I/O in a single computer, the mainstream approach to scalability is to distribute computations among a large number of…
We introduce an interleaving operational semantics for describing the client-observable behaviour of atomic transactions on distributed key-value stores. Our semantics builds on abstract states comprising centralised, global key-value…
Transformer models have continuously expanded into all machine learning domains convertible to the underlying sequence-to-sequence representation, including tabular data. However, while ubiquitous, this representation restricts their…
Many organizations rely on data from government and third-party sources, and those sources rarely follow the same data formatting. This introduces challenges in integrating data from multiple sources or aligning external sources with…
Tabular data comprising rows (samples) with the same set of columns (attributes, is one of the most widely used data-type among various industries, including financial services, health care, research, retail, and logistics, to name a few.…