Related papers: On Approximating the Lp Distances for p>2
We provide a simple method and relevant theoretical analysis for efficiently estimating higher-order lp distances. While the analysis mainly focuses on l4, our methodology extends naturally to p = 6, 8, 10, ... (i.e., when p is even).…
Random projections (RP) are a popular tool for reducing dimensionality while preserving local geometry. In many applications the data set to be projected is given to us in advance, yet the current RP techniques do not make use of…
Random projections are random linear maps, sampled from appropriate distributions, that approximately preserve certain geometrical invariants so that the approximation improves as the dimension of the space grows. The well-known…
In recent years, large high-dimensional data sets have become commonplace in a wide range of applications in science and commerce. Techniques for dimension reduction are of primary concern in statistical analysis. Projection methods play an…
A set of piecewise linear functions, called polylines, $P_1,\ldots,P_L$ each with at most $n$ vertices can be simplified into a polyline $M$ with $k$ vertices, such that the Fr\'echet distances $\epsilon_1,\ldots,\epsilon_L$ to each of…
Many applications using large datasets require efficient methods for minimizing a proximable convex function subject to satisfying a set of linear constraints within a specified tolerance. For this task, we present a proximal projection…
Several important algorithms for machine learning and data analysis use pairwise distances as input. On Riemannian manifolds these distances may be prohibitively costly to compute, in particular for large datasets. To tackle this problem,…
We consider the problem of computing L1-distances between every pair of probability densities from a given family. We point out that the technique of Cauchy random projections (Indyk'06) in this context turns into stochastic integrals with…
The method of stable random projections is popular for efficiently computing the Lp distances in high dimension (where 0<p<=2), using small space. Because it adopts nonadaptive linear projections, this method is naturally suitable when the…
As a typical dimensionality reduction technique, random projection can be simply implemented with linear projection, while maintaining the pairwise distances of high-dimensional data with high probability. Considering this technique is…
Recent technical advances in collecting spatial data have been increasing the demand for methods to analyze large spatial datasets. The statistical analysis for these types of datasets can provide useful knowledge in various fields.…
The transportation $\mathrm{L}^p$ distance, denoted $\mathrm{TL}^p$, has been proposed as a generalisation of Wasserstein $\mathrm{W}^p$ distances motivated by the property that it can be applied directly to colour or multi-channelled…
When performing classification tasks, raw high dimensional features often contain redundant information, and lead to increased computational complexity and overfitting. In this paper, we assume the data samples lie on a single underlying…
Computing the infinity Wasserstein distance and retrieving projections of a probability measure onto a closed subset of probability measures are critical sub-problems in various applied fields. However, the practical applicability of these…
Constructing a similarity graph is key to graph-oriented subspace learning and clustering. In a similarity graph, each vertex denotes a data point and the edge weight represents the similarity between two points. There are two popular…
Let $\mathbf{P}=\{ p_1, p_2, \ldots p_n \}$ and $\mathbf{Q} = \{ q_1, q_2 \ldots q_m \}$ be two point sets in an arbitrary metric space. Let $\mathbf{A}$ represent the $m\times n$ pairwise distance matrix with $\mathbf{A}_{i,j} = d(p_i,…
In algorithms for finite metric spaces, it is common to assume that the distance between two points can be computed in constant time, and complexity bounds are expressed only in terms of the number of points of the metric space. We…
This paper, broadly speaking, covers the use of randomness in two main areas: low-rank approximation and kernel methods. Low-rank approximation is very important in numerical linear algebra. Many applications depend on matrix decomposition…
Matrices with low numerical rank are omnipresent in many signal processing and data analysis applications. The pivoted QLP (p-QLP) algorithm constructs a highly accurate approximation to an input low-rank matrix. However, it is…
Optimal transport and its related problems, including optimal partial transport, have proven to be valuable tools in machine learning for computing meaningful distances between probability or positive measures. This success has led to a…
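Several of the abstracts listed above rest on the same basic idea: a random linear map to a lower dimension approximately preserves pairwise distances. The following is a minimal sketch of that general idea for the l2 distance, using a dense Gaussian projection. It is not the method of any particular paper in this list, and the function names, the dimensions `d` and `k`, and the seed are all illustrative assumptions.

```python
import math
import random


def gaussian_projection_matrix(d, k, rng):
    """Return a k x d matrix with i.i.d. N(0, 1/k) entries (assumed scaling)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Apply the linear map: one dot product per output coordinate."""
    return [sum(r_j * x_j for r_j, x_j in zip(row, x)) for row in matrix]


def l2_distance(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


rng = random.Random(0)   # fixed seed so the run is repeatable
d, k = 1000, 200         # illustrative original and reduced dimensions

# Two synthetic high-dimensional points.
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
y = [rng.gauss(0.0, 1.0) for _ in range(d)]

R = gaussian_projection_matrix(d, k, rng)
original = l2_distance(x, y)
estimate = l2_distance(project(R, x), project(R, y))
ratio = estimate / original
print(f"original={original:.3f} estimate={estimate:.3f} ratio={ratio:.3f}")
```

With the 1/sqrt(k) scaling the squared projected distance is an unbiased estimate of the squared original distance, so `ratio` should land close to 1, and its spread shrinks as `k` grows. Estimating lp distances for other values of p requires different projection distributions or estimators, as the abstracts above discuss.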