Related papers: Statistics of Partial Minima
Suppose the data consist of a set $S$ of points $x_j$, $1\leq j \leq J$, distributed in a bounded domain $D\subset \mathbb{R}^N$, where $N$ is a large number. An algorithm is given for finding the sets $L_k$ of dimension $k\ll N$, $k=1,2,\dots,K$, in a…
Recent years have witnessed an increasing popularity of algorithm design for distributed data, largely due to the fact that massive datasets are often collected and stored in different locations. In the distributed setting, communication…
Consider the hyperplanes at a fixed distance $t$ from the center of the hypercube $[0,1]^d$. Significant attention has been given to determining the hyperplanes $H$ among these such that the $(d-1)$-dimensional volume of $H\cap[0,1]^d$ is…
We consider random energy landscapes constructed from d-dimensional lattices or trees. The distribution of the number of local minima in such landscapes follows a large deviation principle and we derive the associated law exactly for…
We study conserved one-dimensional models of particle diffusion, attachment and detachment from clusters, where the detachment rates decrease with increasing cluster size as gamma(m) ~ m^{-k}, k>0. Heuristic scaling arguments based on…
The k-means algorithm is a well-known method for partitioning n points that lie in the d-dimensional space into k clusters. Its main features are simplicity and speed in practice. Theoretically, however, the best known upper bound on its…
Nondominated sorting arranges a set of points in Euclidean space into layers by repeatedly removing the coordinatewise minimal elements. It was recently shown that nondominated sorting of random points has a Hamilton-Jacobi equation…
Let $\varphi_{n,K}$ denote the largest angle in all the triangles with vertices among the $n$ points selected at random in a compact convex subset $K$ of $\mathbb{R}^d$ with nonempty interior, where $d\ge2$. It is shown that the…
Consider a string of $n$ positions, i.e. a discrete string of length $n$. Units of length $k$ are placed at random on this string in such a way that they do not overlap, and as often as possible, i.e. until all spacings between neighboring…
In this manuscript we introduce and study an extended version of the minimal dispersion of point sets, which has recently attracted considerable attention. Given a set $\mathscr P_n=\{x_1,\dots,x_n\}\subset [0,1]^d$ and…
Consider an i.i.d. sample X^*_1,X^*_2,...,X^*_n from a location-scale family, and assume that the only available observations consist of the partial maxima (or minima) sequence, X^*_{1:1},X^*_{2:2},...,X^*_{n:n}, where…
Nondominated sorting is a discrete process that sorts points in Euclidean space according to the coordinatewise partial order, and is used to rank feasible solutions to multiobjective optimization problems. It was previously shown that…
We study the problems of learning and testing junta distributions on $\{-1,1\}^n$ with respect to the uniform distribution, where a distribution $p$ is a $k$-junta if its probability mass function $p(x)$ depends on a subset of at most $k$…
Let $\mathcal{S}$ be a dataset of $n$ 2-dimensional points. The top-$k$ dominating query aims to report the $k$ points that dominate the most points in $\mathcal{S}$. A point $p$ dominates a point $q$ iff all coordinates of $p$ are smaller…
We study the fundamental problem of estimating an unknown discrete distribution $p$ over $d$ symbols, given $n$ i.i.d. samples from the distribution. We are interested in minimizing the KL divergence between the true distribution and the…
This paper discusses the topic of dimensionality reduction for $k$-means clustering. We prove that any set of $n$ points in $d$ dimensions (rows in a matrix $A \in \mathbb{R}^{n \times d}$) can be projected into $t = \Omega(k / \varepsilon^2)$…
We study the sizes of delta-additive sets of unit vectors in a d-dimensional normed space: the sum of any two vectors has norm at most delta. One-additive sets originate in finding upper bounds of vertex degrees of Steiner Minimum Trees in…
Distributed algorithms for solving additive or consensus optimization problems commonly rely on first-order or proximal splitting methods. These algorithms generally come with restrictive assumptions and at best enjoy a linear convergence…
We derive exact statistical properties of a class of recursive fragmentation processes. We show that introducing a fragmentation probability 0<p<1 leads to a purely algebraic size distribution in one dimension, P(x) ~ x^{-2p}. In d…
In this paper, we investigate the distribution of the maximum of partial sums of certain cubic exponential sums, commonly known as "Birch sums". Our main theorem gives upper and lower bounds (of nearly the same order of magnitude) for the…