Related papers: Probabilistic Query Evaluation with Bag Semantics
We study here fundamental issues involved in top-k query evaluation in probabilistic databases. We consider simple probabilistic databases in which probabilities are associated with individual tuples, and general probabilistic databases in…
The reliability of a Boolean Conjunctive Query (CQ) over a tuple-independent probabilistic database is the probability that the CQ is satisfied when the tuples of the database are sampled one by one, independently, with their associated…
Existence of incomplete and imprecise data has moved the database paradigm from deterministic to proba- babilistic information. Probabilistic databases contain tuples that may or may not exist with some probability. As a result, the number…
A probabilistic database with attribute-level uncertainty consists of relations where cells of some attributes may hold probability distributions rather than deterministic content. Such databases arise, implicitly or explicitly, in the…
In their classical 1993 paper [CV93] Chaudhuri and Vardi notice that some fundamental database theory results and techniques fail to survive when we try to see query answers as bags (multisets) of tuples rather than as sets of tuples. But…
Probabilistic databases (PDBs) model uncertainty in data. The current standard is to view PDBs as finite probability spaces over relational database instances. Since many attributes in typical databases have infinite domains, such as…
Queries with aggregation and arithmetic operations, as well as incomplete data, are common in real-world database, but we lack a good understanding of how they should interact. On the one hand, systems based on SQL provide ad-hoc rules for…
In this work, we study the problem of computing a tuple's expected multiplicity over probabilistic databases with bag semantics (where each tuple is associated with a multiplicity) exactly and approximately. We consider bag-TIDBs where we…
We show that for every conjunctive query, the complexity of evaluating it on a probabilistic database is either \PTIME or #\P-complete, and we give an algorithm for deciding whether a given conjunctive query is \PTIME or #\P-complete. The…
Query Containment Problem (QCP) is one of the most fundamental decision problems in database query processing and optimization. Complexity of QCP for conjunctive queries (QCP-CQ) has been fully understood since 1970s. But, as Chaudhuri and…
We address the issue of incorporating a particular yet expressive form of integrity constraints (namely, denial constraints) into probabilistic databases. To this aim, we move away from the common way of giving semantics to probabilistic…
In this tutorial, we will survey known results on the complexity of conjunctive query evaluation in different settings, ranging from Boolean queries over counting to more complex models like enumeration and direct access. A particular focus…
Increasing amounts of available data have led to a heightened need for representing large-scale probabilistic knowledge bases. One approach is to use a probabilistic database, a model with strong assumptions that allow for efficiently…
We consider a bag (multiset) monad on the category of standard Borel spaces, and show that it gives a free measurable commutative monoid. Firstly, we show that a recent measurability result for probabilistic database queries (Grohe and…
We study the $generalized~model~counting~problem$, defined as follows: given a database, and a set of deterministic tuples, count the number of subsets of the database that include all deterministic tuples and satisfy the query. This…
We study the complexity of various fundamental counting problems that arise in the context of incomplete databases, i.e., relational databases that can contain unknown values in the form of labeled nulls. Specifically, we assume that the…
Query evaluation in tuple-independent probabilistic databases is the problem of computing the probability of an answer to a query given independent probabilities of the individual tuples in a database instance. There are two main approaches…
We study the complexity of various fundamental counting problems that arise in the context of incomplete databases, i.e., relational databases that can contain unknown values in the form of labeled nulls. Specifically, we assume that the…
Query evaluation over probabilistic databases is known to be intractable in many cases, even in data complexity, i.e., when the query is fixed. Although some restrictions of the queries [19] and instances [4] have been proposed to lower the…
Query answering over probabilistic data is an important task but is generally intractable. However, a new approach for this problem has recently been proposed, based on structural decompositions of input databases, following, e.g., tree…