Related papers: Range Medians
We consider the following problem: given an unsorted array of $n$ elements, and a sequence of intervals in the array, compute the median in each of the subarrays defined by the intervals. We describe a simple algorithm which uses O(n) space…
We prove that any oblivious algorithm using space $S$ to find the median of a list of $n$ integers from $\{1,\ldots,2n\}$ requires time $\Omega(n \log\log_S n)$. This bound also applies to the problem of determining whether the median is odd…
We study the non-overlapping indexing problem: Given a text T, preprocess it so that you can answer queries of the form: given a pattern P, report the maximal set of non-overlapping occurrences of P in T. A generalization of this problem is…
Given $n$ intervals on a line $\ell$, we consider the problem of moving these intervals on $\ell$ such that no two intervals overlap and the maximum moving distance of the intervals is minimized. The difficulty of solving the problem lies…
The connected $k$-median problem is a constrained clustering problem that combines distance-based $k$-clustering with connectivity information. The problem takes as input a metric space and an unweighted undirected connectivity graph that…
Given an array A[1: n] of n elements drawn from an ordered set, the sorted range selection problem is to build a data structure that can be used to answer the following type of queries efficiently: Given a pair of indices i, j $ (1\le i\le…
We consider an interval coverage problem. Given $n$ intervals of the same length on a line $L$ and a line segment $B$ on $L$, we want to move the intervals along $L$ such that every point of $B$ is covered by at least one interval and the…
We give a polynomial-time approximation algorithm for the (not necessarily metric) $k$-Median problem. The algorithm is an $\alpha$-size-approximation algorithm for $\alpha < 1 + 2 \ln(n/k)$. That is, it guarantees a solution having size at…
Given a set $P$ of $n$ points in the plane, we consider the problem of computing the number of points of $P$ in a query unit disk (i.e., all query disks have the same radius). We show that the main techniques for simplex range searching in…
We investigate a weighted variant of the interval stabbing problem, where the goal is to design an efficient data structure for a given set $\mathcal{I}$ of weighted intervals such that, for a query point $q$ and an integer $k>0$, we can…
Clustering is a fundamental problem in unsupervised learning, and has been studied widely both as a problem of learning mixture models and as an optimization problem. In this paper, we study clustering with respect to the k-median…
We consider the two-dimensional sorted range reporting problem. Our data structure requires $O(n \lg\lg n)$ words of space and $O(\lg\lg n + k \lg\lg n)$ query time, where $k$ is the number of points in the query range. This data structure improves a…
We investigate the problem of determining a set S of k indistinguishable integers in the range [1,n]. The algorithm is allowed to query an integer $q\in [1,n]$, and receive a response comparing this integer to an integer randomly chosen…
Intervals have been generated in many applications (e.g., temporal databases), and they are often associated with weights, such as prices. This paper addresses the problem of processing top-k weighted stabbing queries on interval data.…
We consider the range mode problem where, given a sequence and a query range in it, we want to find the items with maximum frequency in the range. We give time- and space-efficient algorithms for this problem. Our algorithms are efficient for…
In the classical interval scheduling type of problems, a set of $n$ jobs, characterized by their start and end times, needs to be executed by a set of machines, under various constraints. In this paper we study a new variant in which the jobs…
The classical center based clustering problems such as $k$-means/median/center assume that the optimal clusters satisfy the locality property that the points in the same cluster are close to each other. A number of clustering problems arise…
We consider the problem of explainable $k$-medians and $k$-means introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). In this problem, our goal is to find a threshold decision tree that partitions data into $k$ clusters…
We consider two combinatorial problems. The first we call "search with wildcards": given an unknown n-bit string x, and the ability to check whether any subset of the bits of x is equal to a provided query string, the goal is to output x.…
In this paper we initiate a systematic study of exact algorithms for well-known clustering problems, namely $k$-Median and $k$-Means. In $k$-Median, the input consists of a set $X$ of $n$ points belonging to a metric space, and the task is…