Related papers: Multi-dimensional Approximate Counting
Memory becomes a limiting factor in contemporary applications, such as analyses of the Webgraph and molecular sequences, when many objects need to be counted simultaneously. Robert Morris [Communications of the ACM, 21:840--842, 1978]…
Storing a counter incremented $N$ times would naively consume $O(\log N)$ bits of memory. In 1978 Morris described the very first streaming algorithm: the "Morris Counter". His algorithm's space bound is a random variable, and it has been…
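As a rough illustration of the idea behind the Morris Counter, the sketch below stores only an exponent and increments it with probability $2^{-X}$, so the stored value grows like $\log N$ and needs roughly $O(\log\log N)$ bits. This is the textbook base-2 variant, not necessarily the variant analyzed in the abstract above; the names `MorrisCounter` and `average_estimate` are illustrative assumptions.

```python
import random


class MorrisCounter:
    """Textbook base-2 Morris counter (illustrative sketch)."""

    def __init__(self, rng=None):
        # Only the exponent is stored, not the count itself.
        self.exponent = 0
        self.rng = rng if rng is not None else random.Random()

    def increment(self):
        # Bump the exponent with probability 2^(-exponent).
        if self.rng.random() < 2.0 ** -self.exponent:
            self.exponent += 1

    def estimate(self):
        # 2^X - 1 is an unbiased estimate of the number of increments.
        return 2 ** self.exponent - 1


def average_estimate(n_increments, trials, seed=0):
    """Average the estimate over independent counters to reduce variance."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        counter = MorrisCounter(rng)
        for _ in range(n_increments):
            counter.increment()
        total += counter.estimate()
    return total / trials
```

A single counter has high variance, which is why the sketch averages many independent copies; with `average_estimate(100, 2000)` the result should land close to 100.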
In this paper, we study the problem of computing the diameter of a set of $n$ points in $d$-dimensional Euclidean space for a fixed dimension $d$, and propose a new $(1+\varepsilon)$-approximation algorithm with $O(n+ 1/\varepsilon^{d-1})$…
We consider the popular $k$-means problem in $d$-dimensional Euclidean space. Recently Friggstad, Rezapour, Salavatipour [FOCS'16] and Cohen-Addad, Klein, Mathieu [FOCS'16] showed that the standard local search algorithm yields a…
We study the problem of $2$-dimensional orthogonal range counting with additive error. Given a set $P$ of $n$ points drawn from an $n\times n$ grid and an error parameter $\varepsilon$, the goal is to build a data structure such that for any…
Metric embeddings are a widely used method in algorithm design, where generally a "complex" metric is embedded into a simpler, lower-dimensional one. Historically, the theoretical computer science community has focused on bi-Lipschitz…
For a given metric measure space $(X,d,\mu)$ we consider finite samples of points, calculate the matrix of distances between them and then reconstruct the points in some finite-dimensional space using the multidimensional scaling (MDS)…
We propose a new $(1+O(\varepsilon))$-approximation algorithm with $O(n+ 1/\varepsilon^{\frac{(d-1)}{2}})$ running time for computing the diameter of a set of $n$ points in the $d$-dimensional Euclidean space for a fixed dimension $d$,…
Experimental design is a classical statistics problem whose aim is to estimate an unknown $m$-dimensional vector $\beta$ from linear measurements, where Gaussian noise is introduced in each measurement. For the combinatorial experimental…
We study the sublinear multivariate mean estimation problem in $d$-dimensional Euclidean space. Specifically, we aim to find the mean $\mu$ of a ground point set $A$, which minimizes the sum of squared Euclidean distances of the points in…
We study the fundamental problem of estimating the mean of a $d$-dimensional distribution with covariance $\Sigma \preccurlyeq \sigma^2 I_d$ given $n$ samples. When $d = 1$, \cite{catoni} showed an estimator with error $(1+o(1)) \cdot…
We study the witness-counting problem: given a set of vectors $V$ in the $d$-dimensional vector space over $\mathbb{F}_2$, a target vector $t$, and an integer $k$, count all ways to sum up exactly $k$ different vectors from $V$ to reach…
Several researchers proposed using non-Euclidean metrics on point sets in Euclidean space for clustering noisy data. Almost always, a distance function is desired that recognizes the closeness of the points in the same cluster, even if the…
We consider the problem of computing the smallest possible distortion for embedding a given $n$-point metric space into $\mathbb{R}^d$, where $d$ is fixed (and small). For $d=1$, it was known that approximating the minimum distortion with a factor better…
Let $(\{1,2,\ldots,n\},d)$ be a metric space. We analyze the expected value and the variance of $\sum_{i=1}^{\lfloor n/2\rfloor}\,d({\boldsymbol{\pi}}(2i-1),{\boldsymbol{\pi}}(2i))$ for a uniformly random permutation ${\boldsymbol{\pi}}$ of…
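For intuition about the expected value in the entry above, linearity of expectation gives $\lfloor n/2\rfloor$ times the average distance over ordered pairs of distinct points. The following is a small sanity check of that identity by brute force, not the analysis from the paper; the function names and the example point set are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations


def expected_cost_bruteforce(dist):
    """Exact expectation by enumerating every permutation (small n only)."""
    n = len(dist)
    total = Fraction(0)
    count = 0
    for perm in permutations(range(n)):
        # Pair up consecutive positions: (0,1), (2,3), ...
        total += sum(dist[perm[2 * i]][perm[2 * i + 1]] for i in range(n // 2))
        count += 1
    return total / count


def expected_cost_formula(dist):
    """floor(n/2) times the mean distance over ordered pairs i != j."""
    n = len(dist)
    pair_sum = sum(dist[i][j] for i in range(n) for j in range(n) if i != j)
    return Fraction(n // 2) * Fraction(pair_sum) / (n * (n - 1))


# Hypothetical example: four points on a line at positions 0, 1, 3, 6.
positions = [0, 1, 3, 6]
line_metric = [[abs(a - b) for b in positions] for a in positions]
```

The check covers only the expectation; the variance depends on the structure of the metric and is not captured by this sketch.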
Given a matrix $D$ describing the pairwise dissimilarities of a data set, a common task is to embed the data points into Euclidean space. The classical multidimensional scaling (cMDS) algorithm is a widespread method to do this. However,…
Many streaming algorithms provide only a high-probability relative approximation. These two relaxations, of allowing approximation and randomization, seem necessary -- for many streaming problems, both relaxations must be employed…
For any finite point set in $D$-dimensional space equipped with the 1-norm, we present random linear embeddings to $k$-dimensional space, with a new metric, having the following properties. For any pair of points from the point set that are…
Multidimensional scaling (MDS) is the act of embedding proximity information about a set of $n$ objects in $d$-dimensional Euclidean space. As originally conceived by the psychometric community, MDS was concerned with embedding a fixed set…
For a probability measure on a real separable Hilbert space, we are interested in "volume-based" approximations of the d-dimensional least squares error of it, i.e., least squares error with respect to a best fit d-dimensional affine…