Related papers: Efficient Approximate Query Answering over Sensor …
Plato provides fast approximate analytics on time series, by precomputing and storing compressed time series. Plato's key novelty is the delivery of tight deterministic error guarantees for time series analytics. Plato evaluates any time…
Despite 25 years of research in academia, approximate query processing (AQP) has had little industrial adoption. One of the major causes of this slow adoption is the reluctance of traditional vendors to make radical changes to their legacy…
We present EntropyDB, an interactive data exploration system that uses a probabilistic approach to generate a small, query-able summary of a dataset. Departing from traditional summarization techniques, we use the Principle of Maximum…
Certain answers are a principled method for coping with the uncertainty that arises in many practical data management tasks. Unfortunately, this method is expensive and may exclude useful (if uncertain) answers. Prior work introduced…
Certain answers are a principled method for coping with uncertainty that arises in many practical data management tasks. Unfortunately, this method is expensive and may exclude useful (if uncertain) answers. Thus, users frequently resort to…
After decades of research in approximate query processing (AQP), its adoption in the industry remains limited. Existing methods struggle to simultaneously provide user-specified error guarantees, eliminate maintenance overheads, and avoid…
In today's databases, previous query answers rarely benefit answering future queries. For the first time, to the best of our knowledge, we change this paradigm in an approximate query processing (AQP) context. We make the following…
Deterministic databases enable scalable replicated systems by executing transactions in a predetermined order. However, existing designs fail to capture transaction dependencies, leading to insufficient scheduling, high abort rates, and…
The question of answering queries over ML predictions has been gaining attention in the database community. This question is challenging because the cost of finding high quality answers corresponds to invoking an oracle such as a human…
Wireless sensor networks offer the potential to span and monitor large geographical areas inexpensively. Sensor network databases like TinyDB are the dominant architectures to extract and manage data in such networks. Since sensors have…
Accurate and efficient entity resolution is an open challenge of particular relevance to intelligence organisations that collect large datasets from disparate sources with differing levels of quality and standard. Starting from a…
We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the…
Over the past few years, research and development have made significant progress in big data analytics. A fundamental issue for big data analytics is efficiency. If the optimal solution is unattainable, not required, or has a…
Consistent query answering is the problem of computing the answers from a database that are consistent with respect to certain integrity constraints that the database as a whole may fail to satisfy. Those answers are characterized as those…
Range aggregate queries (RAQs) are an integral part of many real-world applications, where, often, fast and approximate answers for the queries are desired. Recent work has studied answering RAQs using machine learning models, where a model…
Past research on probabilistic databases has studied the problem of answering queries on a static database. Application scenarios of probabilistic databases however often involve the conditioning of a database using additional information…
Classical algorithms for query optimization presuppose the absence of inconsistencies or uncertainties in the database and exploit only valid semantic knowledge provided, e.g., by integrity constraints. Data inconsistency or uncertainty,…
Approximate computing techniques have been successful in reducing computation and power costs in several domains. However, error sensitive applications in high-performance computing are unable to benefit from existing approximate computing…
Beowulf clusters are very popular and deployed worldwide in support of scientific computing because of their high computational power and performance. However, they also pose several challenges, and yet they need to provide high…
Recently, deep learning-based language models have significantly enhanced text-to-SQL tasks, with promising applications in retrieving patient records within the medical domain. One notable challenge in such applications is discerning…