Related papers: Range Queries on Uncertain Data
We study coresets for various types of range counting queries on uncertain data. In our model each uncertain point has a probability density describing its location, sometimes defined as k distinct locations. Our goal is to construct a…
We revisit the range sampling problem: the input is a set of points where each point is associated with a real-valued weight. The goal is to store them in a structure such that given a query range and an integer $k$, we can extract $k$…
In this paper, we consider a coverage problem for uncertain points in a tree. Let $T$ be a tree containing a set $P$ of $n$ (weighted) demand points, and the location of each demand point $P_i \in P$ is uncertain but is known to appear in one of $m_i$…
Given a set $P$ of coloured points on the real line, we study the problem of answering range $\alpha$-majority (or "heavy hitter") queries on $P$. More specifically, for a query range $Q$, we want to return each colour that is assigned to…
Given a set $S$ of $n$ points in the plane, we consider the problem of answering range selection queries on $S$: that is, given an arbitrary $x$-range $Q$ and an integer $k > 0$, return the $k$-th smallest $y$-coordinate from the set of…
A mode of a multiset $S$ is an element $a \in S$ of maximum multiplicity; that is, $a$ occurs at least as frequently as any other element in $S$. Given a list $A[1:n]$ of $n$ items, we consider the problem of constructing a data structure…
The problem of recovering (count and sum) range queries over multidimensional data only on the basis of aggregate information on such data is addressed. This problem can be formalized as follows. Suppose that a transformation T producing a…
We study approximate range searching for three variants of the clustering problem with a set $P$ of $n$ points in $d$-dimensional Euclidean space and axis-parallel rectangular range queries: the $k$-median, $k$-means, and $k$-center…
We study the following range searching problem in high-dimensional Euclidean spaces: given a finite set $P\subset \mathbb{R}^d$, where each $p\in P$ is assigned a weight $w_p$, and radius $r>0$, we need to preprocess $P$ into a data…
We present a data-structure for orthogonal range searching for random points in the plane. The new data-structure uses (in expectation) $O\bigl(n \log n ( \log \log n)^2 \bigr)$ space, and answers emptiness queries in constant time. As a…
Let $P$ be a set of $n$ points in $\mathbb{R}^2$. Given a rectangle $Q = [\alpha_1, \alpha_2] \times [\beta_1, \beta_2]$, a range skyline query returns the maxima of the points in $P \cap Q$. An important variant is the so-called top-open queries, where $Q$ is a…
Let $P$ be a set of $n$ points in $\mathbb{R}^d$. We present a linear-size data structure for answering range queries on $P$ with constant-complexity semialgebraic sets as ranges, in time close to $O(n^{1-1/d})$. It essentially matches the…
Given a set of points $P\subset \mathbb{R}^{d}$ and a kernel $k$, the Kernel Density Estimate at a point $x\in\mathbb{R}^{d}$ is defined as $\mathrm{KDE}_{P}(x)=\frac{1}{|P|}\sum_{y\in P} k(x,y)$. We study the problem of designing a data…
This paper introduces a scalable approach for probabilistic top-k similarity ranking on uncertain vector data. Each uncertain object is represented by a set of vector instances that are assumed to be mutually-exclusive. The objective is to…
Location data is inherently uncertain for many reasons including 1) imprecise location measurements, 2) obsolete observations that are often interpolated, and 3) deliberate obfuscation to preserve location privacy. What makes handling…
When the cost of misclassifying a sample is high, it is useful to have an accurate estimate of uncertainty in the prediction for that sample. There are also multiple types of uncertainty which are best estimated in different ways, for…
We revisit the range minimum query problem and present a new O(n)-space data structure that supports queries in O(1) time. Although previous data structures exist whose asymptotic bounds match ours, our goal is to introduce a new solution…
Data integration is a notoriously difficult and heuristic-driven process, especially when ground-truth data are not readily available. This paper presents a measure of uncertainty by providing maximal and minimal ranges of a query outcome…
Given a set $P$ of $n$ points in the plane, we consider the problem of computing the number of points of $P$ in a query unit disk (i.e., all query disks have the same radius). We show that the main techniques for simplex range searching in…
Let $P: \mathbb{R}^d \to A$ be a query problem over $\mathbb{R}^d$ for which there exists a data structure $S$ that can compute $P(q)$ in $O(\log n)$ time for any query point $q \in \mathbb{R}^d$. Let $D$ be a probability measure over $\mathbb{R}^d$ representing a distribution of queries. We…