Databases
With streaming floating-point numbers being increasingly prevalent, effective and efficient compression of such data is critical. Compression schemes must be able to exploit the similarity, or smoothness, of consecutive numbers and must be…
Real-time log analysis is the cornerstone of observability for modern infrastructure. However, existing online parsers are architecturally unsuited for the dynamism of production environments. Built on fundamentally static template models,…
We provide a comprehensive overview of current approaches and systems for combining graphs and time series data. We categorize existing systems into four architectural categories and analyze how these systems meet different requirements and…
A significant impediment to high performance in key-value stores is the high cost of thread switching or stalls. While there are many sources for this, a major one is contention for resources, and this cost increases with load as…
Knowledge Graphs (KGs) bear great potential for ecology and biodiversity researchers in their ability to support synthesis and integration efforts, meta-analyses, reasoning tasks, and overall machine interoperability of research data.…
Entity Resolution (ER) is a critical data cleaning task for identifying records that refer to the same real-world entity. In the era of Big Data, traditional batch ER is often infeasible due to volume and velocity constraints, necessitating…
Pattern matching of core GQL, the new ISO standard for querying property graphs, cannot check whether edge values are increasing along a path, as established in recent work. We present a constructive translation that overcomes this…
Query rewrite transforms SQL queries into semantically equivalent forms that run more efficiently. Existing approaches mainly rely on predefined rewrite rules, but they handle a limited subset of queries and can cause performance…
Multi-criteria decision making in large databases is very important in real-world applications. Recently, the interactive query has been studied extensively in the database literature, as it combines the advantages of both the top-k query (with limited…
Modern data analytics pipelines increasingly combine relational queries, graph processing, and tensor computation within a single application, but existing systems remain fragmented across paradigms, execution models, and research…
Cardinality estimation is a cornerstone of cost-based optimizers (CBOs), yet real-world workloads often violate the assumptions behind static statistics, degrading decision stability and increasing plan flip rates. We empirically…
We study the problem of processing continuous k nearest neighbor (CkNN) queries over moving objects on road networks, which is an essential operation in a variety of applications. We are particularly concerned with scenarios where the…
The k Nearest Neighbor (kNN) query over moving objects on road networks is essential for location-based services. Recently, this problem has been studied under road networks with distance as the metric, overlooking fluctuating travel costs.…
The advancement of Text-to-SQL systems is currently hindered by the scarcity of high-quality training data and the limited reasoning capabilities of models in complex scenarios. In this paper, we propose a holistic framework that addresses…
Reachability in hypergraphs is essential for modeling complex groupwise interactions in real-world applications such as co-authorship, social network, and biological analysis, where relationships go beyond pairwise interactions. In this…
With the rise of Large Language Models (LLMs), tourists increasingly use them for route planning by entering keywords for attractions, instead of relying on traditional manual map services. LLMs provide generally reasonable suggestions, but…
The Reverse $k$-Nearest Neighbor (R$k$NN) query over moving objects on road networks seeks to find all moving objects that consider the specified query point as one of their $k$ nearest neighbors. In location-based services, many users…
With the proliferation of temporal graph data, there is a growing demand for analyzing information propagation patterns during graph evolution. Existing graph analysis systems, mostly based on static snapshots, struggle to effectively…
Buffer management remains a critical component of database and operating system performance, serving as the primary mechanism for bridging the persistent latency gap between CPU processing speeds and storage access times. This paper…
This work addresses a route planning problem constrained by a bus road network that includes the schedules of all buses. Given a query with a starting bus stop and a set of Points of Interest (POIs) to visit, our goal is to find an optimal…