Related papers: Combining Aggregation and Sampling (Nearly) Optima…
Ad-hoc queries over frequently updated data in a flat schema are common in real-time data analysis applications and often require very low latency. Online aggregation can achieve so by providing approximate aggregation answers with…
Querying on big data is a challenging task due to the rapid growth of data amount. Approximate query processing (AQP) is a way to meet the requirement of fast response. In this paper, we propose a learning-based AQP method called the LAQP.…
In the current world, OLAP (Online Analytical Processing) is used intensively by modern organizations to perform ad hoc analysis of data, providing insight for better decision making. Thus, the performance for OLAP is crucial; however, it…
The rapid growth of spatial data urges the research community to find efficient processing techniques for interactive queries on large volumes of data. Approximate Query Processing (AQP) is the most prominent technique that can provide…
Data is generated at an unprecedented rate surpassing our ability to analyze them. The database community has pioneered many novel techniques for Approximate Query Processing (AQP) that could give approximate results in a fraction of time…
Despite continuous investments in data technologies, the latency of querying data still poses a significant challenge. Modern analytic solutions require near real-time responsiveness both to make them interactive and to support automated…
Error assessment for Approximate Query Processing (AQP) is a challenging problem. Bootstrap sampling can produce error assessment even when the population data distribution is unknown. However, bootstrap sampling needs to produce a large…
Nowadays, sampling-based Approximate Query Processing (AQP) is widely regarded as a promising way to achieve interactivity in big data analytics. To build such an AQP system, finding the minimal sample size for a query regarding given error…
Interactive visualizations are arguably the most important tool to explore, understand and convey facts about data. In the past years, the database community has been working on different techniques for Approximate Query Processing (AQP)…
The Group-By query is an important kind of query, which is common and widely used in data warehouses, data analytics, and data visualization. Approximate query processing is an effective way to increase the querying efficiency on big data.…
Despite decades of research on approximate query processing (AQP), our understanding of sample-based joins has remained limited and, to some extent, even superficial. The common belief in the community is that joining random samples is…
Many big-data clusters store data in large partitions that support access at a coarse, partition-level granularity. As a result, approximate query processing via row-level sampling is inefficient, often requiring reads of many partitions.…
We study the hardness of Approximate Query Processing (AQP) of various types of queries involving joins over multiple tables of possibly different sizes. In the case where the query result is a single value (e.g., COUNT, SUM, and…
As more and more organizations rely on data-driven decision making, large-scale analytics become increasingly important. However, an analyst is often stuck waiting for an exact result. As such, organizations turn to Cloud providers that…
The current surge of interest in graph-based data models mirrors the usage of increasingly complex reachability queries, as witnessed by recent analytical studies on real-world graph query logs. Despite the maturity of graph DBMS…
Aggregating data is fundamental to data analytics, data exploration, and OLAP. Approximate query processing (AQP) techniques are often used to accelerate computation of aggregates using samples, for which confidence intervals (CIs) are…
Database flexible querying is an alternative to the classic one for users. The use of Formal Concepts Analysis (FCA) makes it possible to make approximate answers that those turned over by a classic DataBase Management System (DBMS). Some…
We study the problem of efficiently estimating counts for queries involving complex filters, such as user-defined functions, or predicates involving self-joins and correlated subqueries. For such queries, traditional sampling techniques may…
TAPAS is a novel adaptive sampling method for the softmax model. It uses a two pass sampling strategy where the examples used to approximate the gradient of the partition function are first sampled according to a squashed population…
Approximate query processing (AQP) is an interesting alternative for exact query processing. It is a tool for dealing with the huge data volumes where response time is more important than perfect accuracy (this is typically the case during…