Databases - Scifaro

Semi-interval Comparison Constraints in Query Containment and Their Impact on Certain Answer Computation

We consider conjunctive queries with arithmetic comparisons (CQAC) and investigate the computational complexity of the problem: Given two CQAC queries, $Q$ and $Q'$, is $Q'$ contained in $Q$? We know that, for CQAC queries, the problem of…

Databases · Computer Science 2026-05-26 Foto N. Afrati, Matthew Damigos

Subspace Aggregation Query and Index Generation for Multidimensional Resource Space Model

Organizing large-scale resources in a multidimensional semantic space is an approach to efficiently managing and querying resources from different semantic dimensions. To support advanced applications, this paper proposes a resource space…

Databases · Computer Science 2026-05-26 Xiaoping Sun, Hai Zhuge

GPU-Accelerated OLTP: An In-Depth Analysis of Concurrency Control Schemes

Over the past decade, GPUs have demonstrated significant potential in accelerating Online Analytical Processing (OLAP) operations. However, there remains a substantial gap in their application to Online Transaction Processing (OLTP), as…

Databases · Computer Science 2026-05-26 Zihan Sun, Yuyu Luo, Yong Zhang, Chao Li, Chunxiao Xing

CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolving Data Marketplaces

Temporal knowledge-graph data marketplaces face three coupled failures in static designs: stale hybrid index shortcuts reduce recall as edges evolve, stationary Shapley pricing misattributes value after distribution shifts, and…

Databases · Computer Science 2026-05-25 Joydeep Chandra

A Pragmatic Approach to Learned Indexing in RocksDB: Targeted Optimizations with Minimal System Modification

Learned indexes have emerged as a promising alternative to traditional index structures, offering higher throughput and lower memory usage by approximating the cumulative key distribution function with lightweight models. Despite these…

Databases · Computer Science 2026-05-25 Shubham Vashisth, Olivier Michaud, Bettina Kemme, Oana Balmau

BCTuner: LLM-Guided Monte Carlo Tree Search for Efficient Blockchain Knob Tuning

Knob tuning plays a critical role in improving the performance of permissioned blockchains. However, efficient tuning remains challenging due to the architectural complexity of blockchains and the semantic gap between knob-specific logic…

Databases · Computer Science 2026-05-25 Yaoyi Deng, Chongyang Tao, Mingxuan Li, Xuelian Lin, Han Sun, Mingchao Wan, Shuai Ma

Measuring Database Unfairness via Dependency Quantification Under Differential Privacy

Differential privacy (DP) has become the de facto standard for protecting sensitive data, providing strong guarantees that published statistics or models reveal limited information about any individual. However, privacy noise and restricted…

Databases · Computer Science 2026-05-25 Mariia Vologdin, Yuchao Tao, Amir Gilad

Expressive Power of Deep Homomorphism Networks over Relational Databases

The expressive limitations of message-passing Graph Neural Networks (GNNs) have motivated a wide range of more powerful graph learning architectures. We advocate Deep Homomorphism Networks (DHNs) as a model particularly well-suited for…

Databases · Computer Science 2026-05-25 Moritz Schönherr, Balder ten Cate, Maurice Funk, Benny Kimelfeld, Carsten Lutz, Arie Soeteman

MojoFrame: Dataframe Library in Mojo Language

Mojo is an emerging programming language built on MLIR (Multi-Level Intermediate Representation) and supports JIT (Just-in-Time) compilation. It enables transparent hardware-specific optimizations (e.g., for CPUs and GPUs), while allowing…

Databases · Computer Science 2026-05-25 Shengya Huang, Zhaoheng Li, Derek Warner, Yongjoo Park

GS-QA: A Benchmark for Geospatial Question Answering

Recent advances in Large Language Models (LLMs) have led to dramatic improvements in question answering (QA). To address the challenge of evaluating QA systems, standardized benchmarks have been introduced. This work focuses on the problem…

Databases · Computer Science 2026-05-22 Majid Saeedan, Muhammad Shihab Rashid, Ahmed Eldawy, Vagelis Hristidis

optimade-maker: Automated generation of interoperable materials APIs from static datasets

Atomistic structural data are central to materials science, condensed matter physics, and chemistry, and are increasingly digitised across diverse repositories and databases. Interoperable access to these heterogeneous data sources enables…

Databases · Computer Science 2026-05-22 Kristjan Eimre, Matthew L. Evans, Bud Macaulay, Xing Wang, Jusong Yu, Nicola Marzari, Gian-Marco Rignanese, Giovanni Pizzi

OSM+: Billion-Level OpenStreetMap Dataset for City-wide Experiments

Road network data provides rich information about cities, but processing worldwide OpenStreetMap (OSM) data is computationally intensive, and the resulting graphs are often difficult to unify for benchmarking downstream tasks. Existing…

Databases · Computer Science 2026-05-22 Guanjie Zheng, Ziyang Su, Yiheng Wang, Yuhang Luo, Hongwei Zhang, Xuanhe Zhou, Linghe Kong, Fan Wu, Wen Ling

ScaleDoc: Scaling LLM-based Predicates over Large Document Collections

Predicates are foundational components in data analysis systems. However, modern workloads increasingly involve unstructured documents, which demand semantic understanding beyond traditional value-based predicates. Given enormous…

Databases · Computer Science 2026-05-22 Hengrui Zhang, Yulong Hui, Yihao Liu, Huanchen Zhang

Fifty Years of Transaction Processing Research (extended)

In this short paper, I recount some early history of transaction research (including some of my own), explain why transaction research continues to this day (even though it seems to be a solved problem), and speculate about its future. This…

Databases · Computer Science 2026-05-21 Philip A. Bernstein

PystachIO: Efficient Distributed GPU Query Processing with PyTorch over Fast Networks & Fast Storage

The AI hardware boom has led modern data centers to adopt HPC-style architectures centered on distributed, GPU-centric computation. Large GPU clusters interconnected by fast RDMA networks and backed by high-bandwidth NVMe storage enable…

Databases · Computer Science 2026-05-21 Jigao Luo, Nils Boeschen, Muhammad El-Hindi, Carsten Binnig

Access Paths for Efficient Ordering with Large Language Models

In this work, we present the LLM ORDER BY semantic operator as a logical abstraction and conduct a systematic study of its physical implementations. First, we propose several improvements to existing semantic sorting algorithms and…

Databases · Computer Science 2026-05-21 Fuheng Zhao, Jiayue Chen, Yiming Pan, Tahseen Rabbani, Sohaib, Divyakant Agrawal, Amr El Abbadi, Paritosh Aggarwal, Anupam Datta, Dimitris Tsirogiannis

Towards Serverless Processing of Spatiotemporal Big Data Queries

Spatiotemporal data are being produced in continuously growing volumes by a variety of data sources, and a variety of application fields rely on rapid analysis of such data. Existing systems such as PostGIS or MobilityDB usually build on…

Databases · Computer Science 2026-05-21 Diana Baumann, Tim C. Rese, David Bermbach

Optimizing Navigational Graph Queries

We study the optimization of navigational graph queries, i.e., queries which combine recursive and pattern-matching fragments. Current approaches to their evaluation are not effective in practice. Towards addressing this, we present a…

Databases · Computer Science 2026-05-21 Thomas Mulder, George Fletcher, Nikolay Yakovets

Leveraging I/O Stalls for Efficient Scheduling in ANNS

Disk-based graph indexes for approximate nearest neighbor search (ANNS) must serve latency-sensitive queries and throughput-demanding updates concurrently. We observe that over 40% of search-thread CPU time is spent stalling on disk I/O;…

Databases · Computer Science 2026-05-20 Juncheng Zhang, Yuanming Ren, Yongkun Li, Patrick P. C. Lee

Example-Driven Intent Synthesis for Constrained Data Bundle Retrieval: Focused Text Snippet Extraction and Beyond

Selecting a bundle of items that collectively satisfies constraints is a fundamental task across databases, recommender systems, and text summarization. Unlike traditional retrieval that returns individual or top-k items, bundle retrieval…

Databases · Computer Science 2026-05-20 Whanhee Cho, Kuangfei Long, Mahmood Jasim, Matteo Brucato, Alexandra Meliou, Peter J. Haas, Anna Fariha