Related papers: Probabilistic Databases with MarkoViews
We study the complexity of evaluating queries on probabilistic databases under bag semantics. We focus on self-join free conjunctive queries, and probabilistic databases where occurrences of different facts are independent, which is the…
We study here fundamental issues involved in top-k query evaluation in probabilistic databases. We consider simple probabilistic databases in which probabilities are associated with individual tuples, and general probabilistic databases in…
Existence of incomplete and imprecise data has moved the database paradigm from deterministic to probabilistic information. Probabilistic databases contain tuples that may or may not exist with some probability. As a result, the number…
Probabilistic databases play a crucial role in the management and understanding of uncertain data. However, incorporating probabilities into the semantics of incomplete databases has posed many challenges, forcing systems to sacrifice…
Probabilistic databases (PDBs) model uncertainty in data. The current standard is to view PDBs as finite probability spaces over relational database instances. Since many attributes in typical databases have infinite domains, such as…
Much time in process mining projects is spent on finding and understanding data sources and extracting the event data needed. As a result, only a fraction of time is spent actually applying techniques to discover, control and predict the…
We propose unifying techniques from probabilistic databases and relational embedding models with the goal of performing complex queries on incomplete and uncertain data. We formalize a probabilistic database model with respect to which all…
Many modern data mining applications are concerned with the analysis of datasets in which the observations are described by paired high-dimensional vectorial representations or "views". Some typical examples can be found in web mining and…
Query answering over probabilistic data is an important task but is generally intractable. However, a new approach for this problem has recently been proposed, based on structural decompositions of input databases, following, e.g., tree…
This paper introduces U-relations, a succinct and purely relational representation system for uncertain databases. U-relations support attribute-level uncertainty using vertical partitioning. If we consider positive relational algebra…
Materialized views are used in large data-centric applications to expedite query processing. The efficiency of a materialized view depends on the degree of results found against the queries over the existing materialized views. Materialized views…
In ontology-based data access (OBDA), users are provided with a conceptual view of a (relational) data source that abstracts away details about data storage. This conceptual view is realized through an ontology that is connected to the data…
In many supervised learning tasks, the entities to be labeled are related to each other in complex ways and their labels are not independent. For example, in hypertext classification, the labels of linked pages are highly correlated. A…
In this work, we study the problem of computing a tuple's expected multiplicity over probabilistic databases with bag semantics (where each tuple is associated with a multiplicity) exactly and approximately. We consider bag-TIDBs where we…
Model View Definition (MVD) is the standard methodology to define the data exchange requirements and rule constraints for Building Information Models (BIMs). In this paper, the MVDLite algorithm is proposed for the fast validation of MVD…
In this paper, a visual language, VCP, for queries on complex-value databases is proposed. The main strength of the new language is that it is purely visual: (i) It has no notion of variable, quantification, partiality, join, pattern…
Probabilistic databases (PDBs) are probability spaces over database instances. They provide a framework for handling uncertainty in databases, as occurs due to data integration, noisy data, data from unreliable sources or randomized…
Is it possible to make statistical inference broadly accessible to non-statisticians without sacrificing mathematical rigor or inference quality? This paper describes BayesDB, a probabilistic programming platform that aims to enable users…
Traditional database queries follow a simple model: they define constraints that each tuple in the result must satisfy. This model is computationally efficient, as the database system can evaluate the query conditions on each tuple…
We consider the setting of a Semantic Web database, containing both explicit data encoded in RDF triples, and implicit data, implied by the RDF semantics. Based on a query workload, we address the problem of selecting a set of views to be…