Related papers: Implementing Semantic Join Operators Efficiently

Featurized-Decomposition Join: Low-Cost Semantic Joins with Guarantees

Large Language Models (LLMs) are being increasingly used within data systems to process large datasets with text fields. A broad class of such tasks involves a semantic join: joining two tables based on a natural language predicate per pair…

Databases · Computer Science 2025-12-08 Sepanta Zeighami, Shreya Shankar, Aditya Parameswaran

Scalable Join Inference for Large Context Graphs

Context graphs are essential for modern AI applications including question answering, pattern discovery, and data analysis. Building accurate context graphs from structured databases requires inferring join relationships between entities.…

Databases · Computer Science 2026-03-05 Shivani Tripathi, Ravi Shetye, Shi Qiao, Alekh Jindal

PLOP: Cost-Based Placement of Semantic Operators in Hybrid Query Plans

Recent database systems have introduced semantic operators that leverage large language models (LLMs) to filter, join, and project over structured data using natural language predicates. In practice, these operators are combined with…

Databases · Computer Science 2026-04-28 Qiuyang Mang, Yufan Xiang, Hangrui Zhou, Runyuan He, Jiaxiang Yu, Hanchen Li, Aditya Parameswaran, Alvin Cheung

Semantic Scheduling for LLM Inference

Conventional operating system scheduling algorithms are largely content-ignorant, making decisions based on factors such as latency or fairness without considering the actual intents or semantics of processes. Consequently, these algorithms…

Machine Learning · Computer Science 2025-06-17 Wenyue Hua, Dujian Ding, Yile Gu, Yujie Ren, Kai Mei, Minghua Ma, William Yang Wang

Semantic Search At LinkedIn

Semantic search with large language models (LLMs) enables retrieval by meaning rather than keyword overlap, but scaling it requires major inference efficiency advances. We present LinkedIn's LLM-based semantic search framework for AI Job…

Information Retrieval · Computer Science 2026-02-10 Fedor Borisyuk, Sriram Vasudevan, Muchen Wu, Guoyao Li, Benjamin Le, Shaobo Zhang, Qianqi Kay Shen, Yuchin Juan, Kayhan Behdin, Liming Dong, Kaixu Yang, Shusen Jing, Ravi Pothamsetty, Rajat Arora, Sophie Yanying Sheng, Vitaly Abdrashitov, Yang Zhao, Lin Su, Xiaoqing Wang, Chujie Zheng, Sarang Metkar, Rupesh Gupta, Igor Lapchuk, David N. Racca, Madhumitha Mohan, Yanbo Li, Haojun Li, Saloni Gandhi, Xueying Lu, Chetan Bhole, Ali Hooshmand, Xin Yang, Raghavan Muthuregunathan, Jiajun Zhang, Mathew Teoh, Adam Coler, Abhinav Gupta, Xiaojing Ma, Sundara Raman Ramachandran, Morteza Ramezani, Yubo Wang, Lijuan Zhang, Richard Li, Jian Sheng, Chanh Nguyen, Yen-Chi Chen, Chuanrui Zhu, Claire Zhang, Jiahao Xu, Deepti Kulkarni, Qing Lan, Arvind Subramaniam, Ata Fatahibaarzi, Steven Shimizu, Yanning Chen, Zhipeng Wang, Ran He, Zhengze Zhou, Qingquan Song, Yun Dai, Caleb Johnson, Ping Liu, Shaghayegh Gharghabi, Gokulraj Mohanasundaram, Juan Bottaro, Santhosh Sachindran, Qi Guo, Yunxiang Ren, Chengming Jiang, Di Mo, Luke Simon, Jianqiang Shen, Jingwei Wu, Wenjing Zhang

Beyond Linear LLM Invocation: An Efficient and Effective Semantic Filter Paradigm

Large language models (LLMs) are increasingly used for semantic query processing over large corpora. A set of semantic operators derived from relational algebra has been proposed to provide a unified interface for expressing such queries,…

Databases · Computer Science 2026-03-06 Nan Hou, Kangfei Zhao, Jiadong Xie, Jeffrey Xu Yu

Skew Strikes Back: New Developments in the Theory of Join Algorithms

Evaluating the relational join is one of the central algorithmic and most well-studied problems in database systems. A staggering number of variants have been considered including Block-Nested loop join, Hash-Join, Grace, Sort-merge for…

Databases · Computer Science 2013-10-17 Hung Q. Ngo, Christopher Re, Atri Rudra

Combining Large Language Models and Gradient-Free Optimization for Automatic Control Policy Synthesis

Large language models (LLMs) have shown promise as generators of symbolic control policies, producing interpretable program-like representations through iterative search. However, these models are not capable of separating the functional…

Machine Learning · Computer Science 2025-10-02 Carlo Bosio, Matteo Guarrera, Alberto Sangiovanni-Vincentelli, Mark W. Mueller

OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking

Large language models (LLMs) have revolutionized the landscape of Natural Language Processing systems, but are computationally expensive. To reduce the cost without sacrificing performance, previous studies have explored various approaches…

Computation and Language · Computer Science 2024-10-01 Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf

Hybrid Semantic Search: Unveiling User Intent Beyond Keywords

This paper addresses the limitations of traditional keyword-based search in understanding user intent and introduces a novel hybrid search approach that leverages the strengths of non-semantic search engines, Large Language Models (LLMs),…

Information Retrieval · Computer Science 2024-09-09 Aman Ahluwalia, Bishwajit Sutradhar, Karishma Ghosh, Indrapal Yadav, Arpan Sheetal, Prashant Patil

SemBench: A Benchmark for Semantic Query Processing Engines

We present a benchmark targeting a novel class of systems: semantic query processing engines. Those systems rely inherently on generative and reasoning capabilities of state-of-the-art large language models (LLMs). They extend SQL with…

Databases · Computer Science 2026-03-17 Jiale Lao, Andreas Zimmerer, Olga Ovcharenko, Tianji Cong, Matthew Russo, Gerardo Vitagliano, Michael Cochez, Fatma Özcan, Gautam Gupta, Thibaud Hottelier, H. V. Jagadish, Kris Kissel, Sebastian Schelter, Andreas Kipf, Immanuel Trummer

Semantic Data Processing with Holistic Data Understanding

Semantic operators have increasingly become integrated within data systems to enable processing data using Large Language Models (LLMs). Despite significant recent effort in improving these operators, their accuracy is limited due to a…

Databases · Computer Science 2026-04-06 Youran Sun, Sepanta Zeighami, Bhavya Chopra, Shreya Shankar, Aditya G. Parameswaran

Rethinking Semantic Parsing for Large Language Models: Enhancing LLM Performance with Semantic Hints

Semantic Parsing aims to capture the meaning of a sentence and convert it into a logical, structured form. Previous studies show that semantic parsing enhances the performance of smaller models (e.g., BERT) on downstream tasks. However, it…

Computation and Language · Computer Science 2025-05-28 Kaikai An, Shuzheng Si, Helan Hu, Haozhe Zhao, Yuchi Wang, Qingyan Guo, Baobao Chang

Sema: A High-performance System for LLM-based Semantic Query Processing

The integration of Large Language Models (LLMs) into data analytics has unlocked powerful capabilities for reasoning over bulk structured and unstructured data. However, existing systems typically rely on either DataFrame primitives, which…

Databases · Computer Science 2026-03-13 Kangkang Qi, Dongyang Xie, Wenbo Li, Hao Zhang, Yuanyuan Zhu, Jeffrey Xu Yu, Kangfei Zhao

Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching

Entity matching (EM) is a critical step in entity resolution (ER). Recently, entity matching based on large language models (LLMs) has shown great promise. However, current LLM-based entity matching approaches typically follow a binary…

Computation and Language · Computer Science 2024-12-13 Tianshu Wang, Xiaoyang Chen, Hongyu Lin, Xuanang Chen, Xianpei Han, Hao Wang, Zhenyu Zeng, Le Sun

Access Paths for Efficient Ordering with Large Language Models

In this work, we present the LLM ORDER BY semantic operator as a logical abstraction and conduct a systematic study of its physical implementations. First, we propose several improvements to existing semantic sorting algorithms and…

Databases · Computer Science 2026-05-21 Fuheng Zhao, Jiayue Chen, Yiming Pan, Tahseen Rabbani, Sohaib, Divyakant Agrawal, Amr El Abbadi, Paritosh Aggarwal, Anupam Datta, Dimitris Tsirogiannis

LLM4Hint: Leveraging Large Language Models for Hint Recommendation in Offline Query Optimization

Query optimization is essential for efficient SQL query execution in DBMS, and remains attractive over time due to the growth of data volumes and advances in hardware. Existing traditional optimizers struggle with the cumbersome hand-tuning…

Databases · Computer Science 2025-07-08 Suchen Liu, Jun Gao, Yinjun Han, Yang Lin

Optimizing Context-Enhanced Relational Joins

Collecting data, extracting value, and combining insights from relational and context-rich multi-modal sources in data processing pipelines presents a challenge for traditional relational DBMS. While relational operators allow declarative…

Databases · Computer Science 2025-02-14 Viktor Sanca, Manos Chatzakis, Anastasia Ailamaki

Semantic Caching for Low-Cost LLM Serving: From Offline Learning to Online Adaptation

Large Language Models (LLMs) are revolutionizing how users interact with information systems, yet their high inference cost poses serious scalability and sustainability challenges. Caching inference responses, allowing them to be retrieved…

Machine Learning · Computer Science 2026-02-16 Xutong Liu, Baran Atalar, Xiangxiang Dai, Jinhang Zuo, Siwei Wang, John C. S. Lui, Wei Chen, Carlee Joe-Wong

Evaluating Joinable Column Discovery Approaches for Context-Aware Search

Joinable Column Discovery is a critical challenge in automating enterprise data analysis. While existing approaches focus on syntactic overlap and semantic similarity, there remains limited understanding of which methods perform best for…

Databases · Computer Science 2025-10-29 Harsha Kokel, Aamod Khatiwada, Tejaswini Pedapati, Haritha Ananthakrishnan, Oktie Hassanzadeh, Horst Samulowitz, Kavitha Srinivas