Related papers: Implementing Semantic Join Operators Efficiently

200 papers

Large Language Models (LLMs) are being increasingly used within data systems to process large datasets with text fields. A broad class of such tasks involves a semantic join: joining two tables based on a natural language predicate per pair…

Databases · Computer Science 2025-12-08 Sepanta Zeighami, Shreya Shankar, Aditya Parameswaran

Context graphs are essential for modern AI applications including question answering, pattern discovery, and data analysis. Building accurate context graphs from structured databases requires inferring join relationships between entities.…

Databases · Computer Science 2026-03-05 Shivani Tripathi, Ravi Shetye, Shi Qiao, Alekh Jindal

Recent database systems have introduced semantic operators that leverage large language models (LLMs) to filter, join, and project over structured data using natural language predicates. In practice, these operators are combined with…

Conventional operating system scheduling algorithms are largely content-ignorant, making decisions based on factors such as latency or fairness without considering the actual intents or semantics of processes. Consequently, these algorithms…

Machine Learning · Computer Science 2025-06-17 Wenyue Hua, Dujian Ding, Yile Gu, Yujie Ren, Kai Mei, Minghua Ma, William Yang Wang

Large language models (LLMs) are increasingly used for semantic query processing over large corpora. A set of semantic operators derived from relational algebra has been proposed to provide a unified interface for expressing such queries,…

Databases · Computer Science 2026-03-06 Nan Hou, Kangfei Zhao, Jiadong Xie, Jeffrey Xu Yu

Evaluating the relational join is one of the central algorithmic and most well-studied problems in database systems. A staggering number of variants have been considered including Block-Nested loop join, Hash-Join, Grace, Sort-merge for…

Databases · Computer Science 2013-10-17 Hung Q. Ngo, Christopher Ré, Atri Rudra

Large language models (LLMs) have shown promise as generators of symbolic control policies, producing interpretable program-like representations through iterative search. However, these models are not capable of separating the functional…

Machine Learning · Computer Science 2025-10-02 Carlo Bosio, Matteo Guarrera, Alberto Sangiovanni-Vincentelli, Mark W. Mueller

Large language models (LLMs) have revolutionized the landscape of Natural Language Processing systems, but are computationally expensive. To reduce the cost without sacrificing performance, previous studies have explored various approaches…

Computation and Language · Computer Science 2024-10-01 Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf

This paper addresses the limitations of traditional keyword-based search in understanding user intent and introduces a novel hybrid search approach that leverages the strengths of non-semantic search engines, Large Language Models (LLMs),…

Information Retrieval · Computer Science 2024-09-09 Aman Ahluwalia, Bishwajit Sutradhar, Karishma Ghosh, Indrapal Yadav, Arpan Sheetal, Prashant Patil

We present a benchmark targeting a novel class of systems: semantic query processing engines. Those systems rely inherently on generative and reasoning capabilities of state-of-the-art large language models (LLMs). They extend SQL with…

Semantic operators have increasingly become integrated within data systems to enable processing data using Large Language Models (LLMs). Despite significant recent effort in improving these operators, their accuracy is limited due to a…

Databases · Computer Science 2026-04-06 Youran Sun, Sepanta Zeighami, Bhavya Chopra, Shreya Shankar, Aditya G. Parameswaran

Semantic Parsing aims to capture the meaning of a sentence and convert it into a logical, structured form. Previous studies show that semantic parsing enhances the performance of smaller models (e.g., BERT) on downstream tasks. However, it…

Computation and Language · Computer Science 2025-05-28 Kaikai An, Shuzheng Si, Helan Hu, Haozhe Zhao, Yuchi Wang, Qingyan Guo, Baobao Chang

The integration of Large Language Models (LLMs) into data analytics has unlocked powerful capabilities for reasoning over bulk structured and unstructured data. However, existing systems typically rely on either DataFrame primitives, which…

Databases · Computer Science 2026-03-13 Kangkang Qi, Dongyang Xie, Wenbo Li, Hao Zhang, Yuanyuan Zhu, Jeffrey Xu Yu, Kangfei Zhao

Entity matching (EM) is a critical step in entity resolution (ER). Recently, entity matching based on large language models (LLMs) has shown great promise. However, current LLM-based entity matching approaches typically follow a binary…

Computation and Language · Computer Science 2024-12-13 Tianshu Wang, Xiaoyang Chen, Hongyu Lin, Xuanang Chen, Xianpei Han, Hao Wang, Zhenyu Zeng, Le Sun

In this work, we present the LLM ORDER BY semantic operator as a logical abstraction and conduct a systematic study of its physical implementations. First, we propose several improvements to existing semantic sorting algorithms and…

Query optimization is essential for efficient SQL query execution in DBMS, and remains attractive over time due to the growth of data volumes and advances in hardware. Existing traditional optimizers struggle with the cumbersome hand-tuning…

Databases · Computer Science 2025-07-08 Suchen Liu, Jun Gao, Yinjun Han, Yang Lin

Collecting data, extracting value, and combining insights from relational and context-rich multi-modal sources in data processing pipelines presents a challenge for traditional relational DBMS. While relational operators allow declarative…

Databases · Computer Science 2025-02-14 Viktor Sanca, Manos Chatzakis, Anastasia Ailamaki

Large Language Models (LLMs) are revolutionizing how users interact with information systems, yet their high inference cost poses serious scalability and sustainability challenges. Caching inference responses, allowing them to be retrieved…

Machine Learning · Computer Science 2026-02-16 Xutong Liu, Baran Atalar, Xiangxiang Dai, Jinhang Zuo, Siwei Wang, John C. S. Lui, Wei Chen, Carlee Joe-Wong

Joinable Column Discovery is a critical challenge in automating enterprise data analysis. While existing approaches focus on syntactic overlap and semantic similarity, there remains limited understanding of which methods perform best for…
