Related papers: On Efficient Approximate Queries over Machine Lear…

200 papers

Due to the falling costs of data acquisition and storage, researchers and industry analysts often want to find all instances of rare events in large datasets. For instance, scientists can cheaply capture thousands of hours of video, but are…

Databases · Computer Science 2022-01-05 Daniel Kang, Edward Gan, Peter Bailis, Tatsunori Hashimoto, Matei Zaharia

Several data warehouse and database providers have recently introduced extensions to SQL called AI Queries, enabling users to specify functions and conditions in SQL that are evaluated by LLMs, thereby broadening significantly the kinds of…

We consider accelerating machine learning (ML) inference queries on unstructured datasets. Expensive operators such as feature extractors and classifiers are deployed as user-defined functions (UDFs), which are not penetrable with classic…

Databases · Computer Science 2022-01-04 Zhihui Yang, Zuozhi Wang, Yicong Huang, Yao Lu, Chen Li, X. Sean Wang

Researchers and industry analysts are increasingly interested in computing aggregation queries over large, unstructured datasets with selective predicates that are computed using expensive deep neural networks (DNNs). As these DNNs are…

Databases · Computer Science 2021-08-16 Daniel Kang, John Guibas, Peter Bailis, Tatsunori Hashimoto, Yi Sun, Matei Zaharia

This work studies the applicability of expensive external oracles such as large language models in answering top-k queries over predicted scores. Such scores are incurred by user-defined functions to answer personalized queries over…

Databases · Computer Science 2025-02-19 Sohrab Namazi Nia, Subhodeep Ghosh, Senjuti Basu Roy, Sihem Amer-Yahia

Query expansion has been employed for a long time to improve the accuracy of query retrievers. Earlier works relied on pseudo-relevance feedback (PRF) techniques, which augment a query with terms extracted from documents retrieved in a…

Information Retrieval · Computer Science 2024-06-12 Muhammad Shihab Rashid, Jannat Ara Meem, Yue Dong, Vagelis Hristidis

Evaluating query predicates on data samples is the only way to estimate their selectivity in certain scenarios. Finding a guaranteed optimal query plan is not a reasonable optimization goal in those cases as it might require an infinite…

Databases · Computer Science 2015-11-06 Immanuel Trummer, Christoph Koch

Analysts and scientists are interested in querying streams of video, audio, and text to extract quantitative insights. For example, an urban planner may wish to measure congestion by querying the live feed from a traffic camera. Prior work…

Databases · Computer Science 2023-08-21 Matthew Russo, Tatsunori Hashimoto, Daniel Kang, Yi Sun, Matei Zaharia

Query optimizers in RDBMSs search for execution plans expected to be optimal for given queries. They use parameter estimates, often inaccurate, and make assumptions that may not hold in practice. Consequently, they may select plans that are…

Databases · Computer Science 2025-05-27 Amin Kamali, Verena Kantere, Calisto Zuzarte, Vincent Corvinelli

Uncertainty quantification (UQ) is essential for safe deployment of generative AI models such as large language models (LLMs), especially in high stakes applications. Conformal prediction (CP) offers a principled uncertainty quantification…

Machine Learning · Computer Science 2025-06-09 Sima Noorani, Shayan Kiyani, George Pappas, Hamed Hassani

The goal of multi-objective query optimization (MOQO) is to find query plans that realize a good compromise between conflicting objectives such as minimizing execution time and minimizing monetary fees in a Cloud scenario. A previously…

Databases · Computer Science 2014-04-02 Immanuel Trummer, Christoph Koch

Nowadays, query optimization has been highly concerned in big data management, especially in NoSQL databases. Approximate queries boost query performance by loss of accuracy, for example, sampling approaches trade off query completeness for…

Databases · Computer Science 2019-01-03 Jie Song, Yichuan Zhang, Yubin Bao, Ge Yu

As more and more organizations rely on data-driven decision making, large-scale analytics become increasingly important. However, an analyst is often stuck waiting for an exact result. As such, organizations turn to Cloud providers that…

Databases · Computer Science 2020-03-17 Fotis Savva, Christos Anagnostopoulos, Peter Triantafillou

This paper proposes a new approach for approximate evaluation of #P-hard queries with probabilistic databases. In our approach, every query is evaluated entirely in the database engine by evaluating a fixed number of query plans, each…

Databases · Computer Science 2014-12-03 Wolfgang Gatterbauer, Dan Suciu

Operational consistent query answering (CQA) is a recent framework for CQA based on revised definitions of repairs, which are built by applying a sequence of operations (e.g., fact deletions) starting from an inconsistent database until we…

Databases · Computer Science 2025-08-25 Marco Calautti, Ester Livshits, Andreas Pieris, Markus Schneider

Operational consistent query answering (CQA) is a recent framework for CQA based on revised definitions of repairs, which are built by applying a sequence of operations (e.g., fact deletions) starting from an inconsistent database until we…

Databases · Computer Science 2023-12-14 Marco Calautti, Ester Livshits, Andreas Pieris, Markus Schneider

Increasing amounts of available data have led to a heightened need for representing large-scale probabilistic knowledge bases. One approach is to use a probabilistic database, a model with strong assumptions that allow for efficiently…

Artificial Intelligence · Computer Science 2019-04-04 Tal Friedman, Guy Van den Broeck

Schema matching is a central challenge for data integration systems. Inspired by the popularity and the success of crowdsourcing platforms, we explore the use of crowdsourcing to reduce the uncertainty of schema matching. Since…

Databases · Computer Science 2018-09-12 Chen Jason Zhang, Lei Chen, H. V. Jagadish, Mengchen Zhang, Yongxin Tong

Open-domain Question Answering models which directly leverage question-answer (QA) pairs, such as closed-book QA (CBQA) models and QA-pair retrievers, show promise in terms of speed and memory compared to conventional models which retrieve…

Computation and Language · Computer Science 2021-02-16 Patrick Lewis, Yuxiang Wu, Linqing Liu, Pasquale Minervini, Heinrich Küttler, Aleksandra Piktus, Pontus Stenetorp, Sebastian Riedel

Retrieval-augmented generation (RAG) improves the reliability of large language model (LLM) answers by integrating external knowledge. However, RAG increases the end-to-end inference time since looking for relevant documents from large…
