Related papers: Semantic Data Processing with Holistic Data Unders…

200 papers

Unstructured text has long been difficult to automatically analyze at scale. Large language models (LLMs) now offer a way forward by enabling semantic data processing, where familiar data processing operators (e.g., map, reduce,…

Human-Computer Interaction · Computer Science 2025-04-22 Shreya Shankar, Bhavya Chopra, Mawil Hasan, Stephen Lee, Björn Hartmann, Joseph M. Hellerstein, Aditya G. Parameswaran, Eugene Wu

The rapid increase in textual information means we need more efficient methods to sift through, organize, and understand it all. While retrieval-augmented generation (RAG) models excel in accessing information from large document…

Computation and Language · Computer Science 2025-03-14 Seiji Maekawa, Hayate Iso, Nikita Bhutani

Scaling test-time computation--generating and analyzing multiple or sequential outputs for a single input--has become a promising strategy for improving the reliability and quality of large language models (LLMs), as evidenced by advances…

Computation and Language · Computer Science 2025-06-03 Sungjae Lee, Hoyoung Kim, Jeongyeon Hwang, Eunhyeok Park, Jungseul Ok

Optimizing an experimental system can be extremely challenging when each experiment is expensive, time-consuming, or difficult to perform. Existing optimizers for expensive black-box problems, such as Bayesian optimization, are typically…

Semantic query processing engines often support semantic joins, enabling users to match rows that satisfy conditions specified in natural language. Such join conditions can be evaluated using large language models (LLMs) that solve novel…

Databases · Computer Science 2025-10-10 Immanuel Trummer

Developers insert logging statements in source code to capture relevant runtime information essential for maintenance and debugging activities. Log level choice is an integral, yet tricky part of the logging activity as it controls log…

Software Engineering · Computer Science 2025-08-13 Youssef Esseddiq Ouatiti, Mohammed Sayagh, Bram Adams, Ahmed E. Hassan

Structured data offers a sophisticated mechanism for the organization of information. Existing methodologies for the text-serialization of structured data in the context of large language models fail to adequately address the heterogeneity…

Computation and Language · Computer Science 2024-02-20 YiQiu Guo, Yuchen Yang, Ya Zhang, Yu Wang, Yanfeng Wang

Large language models (LLMs) with extended context windows enable tasks requiring extensive information integration but are limited by the scarcity of high-quality, diverse datasets for long-context instruction tuning. Existing data…

Computation and Language · Computer Science 2025-02-25 Jiaxi Li, Xingxing Zhang, Xun Wang, Xiaolong Huang, Li Dong, Liang Wang, Si-Qing Chen, Wei Lu, Furu Wei

LLMs enable an exciting new class of data processing applications over large collections of unstructured documents. Several new programming frameworks have enabled developers to build these applications by composing them out of semantic…

The semantic capabilities of large language models (LLMs) have the potential to enable rich analytics and reasoning over vast knowledge corpora. Unfortunately, existing systems either empirically optimize expensive LLM-powered operations…

Databases · Computer Science 2025-03-04 Liana Patel, Siddharth Jha, Melissa Pan, Harshit Gupta, Parth Asawa, Carlos Guestrin, Matei Zaharia

Recent work demonstrated great promise in the idea of orchestrating collaborations between LLMs, human input, and various tools to address the inherent limitations of LLMs. We propose a novel perspective called semantic decoding, which…

Computation and Language · Computer Science 2025-04-30 Maxime Peyrard, Martin Josifoski, Robert West

With advances in large language models (LLMs), researchers are creating new systems that can perform AI-driven analytics over large unstructured datasets. Recent work has explored executing such analytics queries using semantic operators --…

Artificial Intelligence · Computer Science 2025-09-04 Matthew Russo, Tim Kraska

Log parsing is a critical step for automated log analysis in complex systems. Traditional heuristic-based methods offer high efficiency but are limited in accuracy due to overlooking semantic context. In contrast, recent LLM-based parsers…

Computation and Language · Computer Science 2026-03-31 Dongyi Fan, Suqiong Zhang, Lili He, Ming Liu, Yifan Huo

The rise of large language models (LLMs) has significantly impacted various domains, including natural language processing (NLP) and image generation, by making complex computational tasks more accessible. While LLMs demonstrate impressive…

Databases · Computer Science 2024-10-15 Ananya Rahaman, Anny Zheng, Mostafa Milani, Fei Chiang, Rachel Pottinger

String processing, which mainly involves the analysis and manipulation of strings, is a fundamental component of modern computing. Despite the significant advancements of large language models (LLMs) in various natural language processing…

Computation and Language · Computer Science 2025-01-28 Xilong Wang, Hao Fu, Jindong Wang, Neil Zhenqiang Gong

In the field of machine learning, data understanding is the practice of getting initial insights in unknown datasets. Such knowledge-intensive tasks require a lot of documentation, which is necessary for data scientists to grasp the meaning…

Databases · Computer Science 2018-06-14 Markus Schröder, Christian Jilek, Jörn Hees, Andreas Dengel

With the increasing complexity and rapid expansion of the scale of AI systems in cloud platforms, the log data generated during system operation is massive, unstructured, and semantically ambiguous, which brings great challenges to fault…

Artificial Intelligence · Computer Science 2025-06-24 Cheng Ji, Huaiying Luo

Large Language Models (LLMs) are revolutionizing how users interact with information systems, yet their high inference cost poses serious scalability and sustainability challenges. Caching inference responses, allowing them to be retrieved…

Machine Learning · Computer Science 2026-02-16 Xutong Liu, Baran Atalar, Xiangxiang Dai, Jinhang Zuo, Siwei Wang, John C. S. Lui, Wei Chen, Carlee Joe-Wong

The process mining community has recently recognized the potential of large language models (LLMs) for tackling various process mining tasks. Initial studies report the capability of LLMs to support process analysis and even, to some…

Computation and Language · Computer Science 2024-07-03 Adrian Rebmann, Fabian David Schmidt, Goran Glavaš, Han van der Aa

We introduce HAMLET, a holistic and automated framework for evaluating the long-context comprehension of large language models (LLMs). HAMLET structures source texts into a three-level key-fact hierarchy at root-, branch-, and leaf-levels,…

Computation and Language · Computer Science 2025-08-28 Jiaqi Deng, Yuho Lee, Nicole Hee-Yeon Kim, Hyangsuk Min, Taewon Yun, Minjeong Ban, Kim Yul, Hwanjun Song