Related papers: Implementing Performance Competitive Logical Recov…
A new type of logs, the command log, is being employed to replace the traditional data log (e.g., ARIES log) in the in-memory databases. Instead of recording how the tuples are updated, a command log only tracks the transactions being…
Data redundancy techniques have been tested in several different applications to provide fault tolerance and performance gains. The use of these techniques is mostly seen at the hardware, device driver, or file system level. In practice,…
In the evolving IT landscape, stability and reliability of systems are essential, yet their growing complexity challenges DevOps teams in implementation and maintenance. Log analysis, a core element of AIOps, provides critical insights into…
Datalog reasoning based on the semina\"ive evaluation strategy evaluates rules using traditional join plans, which often leads to redundancy and inefficiency in practice, especially when the rules are complex. Hypertree decompositions help…
The traditional architecture for a DBMS engine has the recovery, concurrency control and access method code tightly bound together in a storage engine for records. We propose a different approach, where the storage engine is factored into…
DGCC protocol has been shown to achieve good performance on multi-core in-memory system. However, distributed transactions complicate the dependency resolution, and therefore, an effective transaction partitioning strategy is essential to…
Retrieval-Augmented Generation (RAG) effectively enhances Large Language Models (LLMs) by incorporating retrieved external knowledge into the generation process. Reasoning models improve LLM performance in multi-hop QA tasks, which require…
Traditional retrieval methods rely on transforming user queries into vector representations and retrieving documents based on cosine similarity within an embedding space. While efficient and scalable, this approach often fails to handle…
We present a series of novel techniques and algorithms for transaction commit, logging, recovery, and propagation control. In combination, they provide a recovery component that maintains the persistent state of the database (both log and…
In this paper, we present the design and architecture of REI, a novel system for indexing log data for regular expression queries. Our main contribution is an $n$-gram-based indexing strategy and an efficient storage mechanism that results…
Artificial Intelligence (AI) is beginning to transform the research process by automating the discovery of new solutions. This shift depends on the availability of reliable verifiers, which AI-driven approaches require to validate candidate…
The stakeholders of a system are legitimately interested in whether and how its architecture reflects their respective concerns at each point of its development and maintenance processes. Having such knowledge available at all times would…
Developers increasingly use function-as-a-service (FaaS) platforms for data-centric applications that perform low-latency and transactional operations on data, such as for microservices or web serving. Unfortunately, existing FaaS platforms…
Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead-logging is a common approach…
Incrementalization speeds up computations by avoiding unnecessary recomputations and by efficiently reusing previous results. While domain-specific techniques achieve impressive speedups, e.g., in the context of database queries, they are…
Runtime resource management for many-core systems is increasingly complex. The complexity can be due to diverse workload characteristics with conflicting demands, or limited shared resources such as memory bandwidth and power. Resource…
Multi-tenancy for latency-critical applications leads to re-source interference and unpredictable performance. Core reconfiguration opens up more opportunities for colocation,as it allows the hardware to adjust to the dynamic performance…
Based on the two observations that diverse applications perform better on different multicore architectures, and that different phases of an application may have vastly different resource requirements, Pal et al. proposed a novel…
Recent advancements in QA with LLM, like GPT-4, have shown limitations in handling complex multi-hop queries. We propose AT-RAG, a novel multistep RAG incorporating topic modeling for efficient document retrieval and reasoning. Using…
Supply chain optimization models frequently become infeasible because of modeling errors. Diagnosis and repair require scarce OR expertise: analysts must interpret solver diagnostics, trace root causes across echelons, and fix formulations…