Related papers: Approximating Nearest Neighbor Distances
Distance measuring is a very important task in digital geometry and digital image processing. Due to our natural approach to geometry we think of the set of points that are equally far from a given point as a Euclidean circle. Using the…
We present a new approach to approximate nearest-neighbor queries in fixed dimension under a variety of non-Euclidean distances. We are given a set $S$ of $n$ points in $\mathbb{R}^d$, an approximation parameter $\varepsilon > 0$, and a…
We consider the problem of learning the nearest neighbor graph of a dataset of n items. The metric is unknown, but we can query an oracle to obtain a noisy estimate of the distance between any pair of items. This framework applies to…
The distance metric plays an important role in nearest neighbor (NN) classification. Usually the Euclidean distance metric is assumed or a Mahalanobis distance metric is optimized to improve the NN performance. In this paper, we study the…
The k-nearest neighbors (k-NN) is a basic machine learning (ML) algorithm, and several quantum versions of it, employing different distance metrics, have been presented in the last few years. Although the Euclidean distance is one of the…
We present several quantum algorithms for performing nearest-neighbor learning. At the core of our algorithms are fast and coherent quantum methods for computing distance metrics such as the inner product and Euclidean distance. We prove…
We show new applications of the nearest-neighbor chain algorithm, a technique that originated in agglomerative hierarchical clustering. We apply it to a diverse class of geometric problems: we construct the greedy multi-fragment tour for…
Data-sensitive metrics adapt distances locally based the density of data points with the goal of aligning distances and some notion of similarity. In this paper, we give the first exact algorithm for computing a data-sensitive metric called…
Many statistical and machine learning approaches rely on pairwise distances between data points. The choice of distance metric has a fundamental impact on performance of these procedures, raising questions about how to appropriately…
In algorithms for finite metric spaces, it is common to assume that the distance between two points can be computed in constant time, and complexity bounds are expressed only in terms of the number of points of the metric space. We…
Metric based comparison operations such as finding maximum, nearest and farthest neighbor are fundamental to studying various clustering techniques such as $k$-center clustering and agglomerative hierarchical clustering. These techniques…
We provide an efficient reduction from the problem of querying approximate multiplicatively weighted farthest neighbors in a metric space to the unweighted problem. Combining our techniques with core-sets for approximate unweighted farthest…
Recent literature has shown that symbolic data, such as text and graphs, is often better represented by points on a curved manifold, rather than in Euclidean space. However, geometrical operations on manifolds are generally more complicated…
Nearest neighbor methods are a popular class of nonparametric estimators with several desirable properties, such as adaptivity to different distance scales in different regions of space. Prior work on convergence rates for nearest neighbor…
Proximity graph-based methods have emerged as a leading paradigm for approximate nearest neighbor (ANN) search in the system community. This paper presents fresh insights into the theoretical foundation of these methods. We describe an…
Many methods have been developed for data clustering, such as k-means, expectation maximization and algorithms based on graph theory. In this latter case, graphs are generally constructed by taking into account the Euclidian distance as a…
The problem of Approximate Nearest Neighbor (ANN) search is fundamental in computer science and has benefited from significant progress in the past couple of decades. However, most work has been devoted to pointsets whereas complex shapes…
Fair clustering is a constrained variant of clustering where the goal is to partition a set of colored points, such that the fraction of points of any color in every cluster is more or less equal to the fraction of points of this color in…
Nearest Neighbors Algorithm is a Lazy Learning Algorithm, in which the algorithm tries to approximate the predictions with the help of similar existing vectors in the training dataset. The predictions made by the K-Nearest Neighbors…
Crowdsourced, or human computation based clustering algorithms usually rely on relative distance comparisons, as these are easier to elicit from human workers than absolute distance information. A relative distance comparison is a statement…