Related papers: A Heterogeneous High Dimensional Approximate Neare…
The nearest neighbor problem is defined as follows: Given a set $P$ of $n$ points in some metric space $(X,D)$, build a data structure that, given any point $q$, returns a point in $P$ that is closest to $q$ (its "nearest neighbor" in $P$).…
We study the Approximate Nearest Neighbor problem for metric spaces where the query points are constrained to lie on a subspace of low doubling dimension, while the data is high-dimensional. We show that this problem can be solved…
We introduce a new variant of the nearest neighbor search problem, which allows for some coordinates of the dataset to be arbitrarily corrupted or unknown. Formally, given a dataset of $n$ points $P=\{ x_1,\ldots, x_n\}$ in high-dimensions,…
Consider the problem of finding high dimensional approximate nearest neighbors, where the data is generated by some known probabilistic model. We will investigate a large natural class of algorithms which we call bucketing codes. We will…
Similarity search is a fundamental algorithmic primitive, widely used in many computer science disciplines. Given a set of points $S$ and a radius parameter $r>0$, the $r$-near neighbor ($r$-NN) problem asks for a data structure that, given…
Similarity search is a fundamental algorithmic primitive, widely used in many computer science disciplines. There are several variants of the similarity search problem, and one of the most relevant is the $r$-near neighbor ($r$-NN) problem:…
In this paper we study the problem of finding the approximate nearest neighbor of a query point in the high dimensional space, focusing on the Euclidean space. The earlier approaches use locality-preserving hash functions (that tend to map…
We consider the $(1+\epsilon)$-approximate nearest neighbor search problem: given a set $X$ of $n$ points in a $d$-dimensional space, build a data structure that, given any query point $y$, finds a point $x \in X$ whose distance to $y$ is…
Nearest Neighbors Algorithm is a Lazy Learning Algorithm, in which the algorithm tries to approximate the predictions with the help of similar existing vectors in the training dataset. The predictions made by the K-Nearest Neighbors…
K-nearest neighbor classification algorithm is one of the most basic algorithms in machine learning, which determines the sample's category by the similarity between samples. In this paper, we propose a quantum K-nearest neighbor…
We study the fundamental problem of approximate nearest neighbor search in $d$-dimensional Hamming space $\{0,1\}^d$. We study the complexity of the problem in the famous cell-probe model, a classic model for data structures. We consider…
Existing methods for retrieving k-nearest neighbours suffer from the curse of dimensionality. We argue this is caused in part by inherent deficiencies of space partitioning, which is the underlying strategy used by most existing methods. We…
We consider machine learning in a comparison-based setting where we are given a set of points in a metric space, but we have no access to the actual distances between the points. Instead, we can only ask an oracle whether the distance…
We present a new approach to approximate nearest-neighbor queries in fixed dimension under a variety of non-Euclidean distances. We are given a set $S$ of $n$ points in $\mathbb{R}^d$, an approximation parameter $\varepsilon > 0$, and a…
Suppose $V$ is an $n$-element set where for each $x \in V$, the elements of $V \setminus \{x\}$ are ranked by their similarity to $x$. The $K$-nearest neighbor graph is a directed graph including an arc from each $x$ to the $K$ points of $V…
We address the problem of converting large-scale high-dimensional image data into binary codes so that approximate nearest-neighbor search over them can be efficiently performed. Different from most of the existing unsupervised approaches…
Algorithms often carry out equally many computations for "easy" and "hard" problem instances. In particular, algorithms for finding nearest neighbors typically have the same running time regardless of the particular problem instance. In…
We start with a review of the pervasiveness of the nearest neighbor search problem and techniques used to solve it along with some experimental results. In the second chapter, we show reductions between two different classes of geo- metric…
Dimensionality reduction is crucial both for visualization and preprocessing high dimensional data for machine learning. We introduce a novel method based on a hierarchy built on 1-nearest neighbor graphs in the original space which is used…
We study the Closest Pair Problem in Hamming metric, which asks to find the pair with the smallest Hamming distance in a collection of binary vectors. We give a new randomized algorithm for the problem on uniformly random input…