Related papers: Efficient K-Nearest Neighbor Join Algorithms for H…
k nearest neighbor join (kNN join), designed to find k nearest neighbors from a dataset S for every object in another dataset R, is a primitive operation widely adopted by many data mining applications. As a combination of the k nearest…
K Nearest Neighbor (KNN) joins are used in scientific domains for data analysis, and are building blocks of several well-known algorithms. KNN-joins find the KNN of all points in a dataset. This paper focuses on a hybrid CPU/GPU approach…
k-nearest neighbor graph is a fundamental data structure in many disciplines such as information retrieval, data-mining, pattern recognition, and machine learning, etc. In the literature, considerable research has been focusing on how to…
Big data mining is well known to be an important task for data science, because it can provide useful observations and new knowledge hidden in given large datasets. Proximity-based data analysis is particularly utilized in many real-life…
The k Nearest Neighbors (kNN) method has received much attention in the past decades, where some theoretical bounds on its performance were identified and where practical optimizations were proposed for making it work fairly well in high…
Nearest neighbor search is known as a challenging issue that has been studied for several decades. Recently, this issue becomes more and more imminent in viewing that the big data problem arises from various fields. In this paper, a…
Approximate k-Nearest Neighbour (ANN) methods are often used for mining information and aiding machine learning on large scale high-dimensional datasets. ANN methods typically differ in the index structure used for accelerating searches,…
Bearings are among the most failure-prone components in rotating machinery, and their condition directly impacts overall performance. Therefore, accurately diagnosing bearing faults is essential for ensuring system stability. However,…
Data-driven neighborhood definitions and graph constructions are often used in machine learning and signal processing applications. k-nearest neighbor~(kNN) and $\epsilon$-neighborhood methods are among the most common methods used for…
The $k$-nearest neighbor graph (KNNG) on high-dimensional data is a data structure widely used in many applications such as similarity search, dimension reduction and clustering. Due to its increasing popularity, several methods under the…
The problem of finding K-nearest neighbors in the given dataset for a given query point has been worked upon since several years. In very high dimensional spaces the K-nearest neighbor search (KNNS) suffers in terms of complexity in…
Approximate K nearest neighbor (AKNN) search is a fundamental and challenging problem. We observe that in high-dimensional space, the time consumption of nearly all AKNN algorithms is dominated by that of the distance comparison operations…
Non-negative signals form an important class of sparse signals. Many algorithms have already beenproposed to recover such non-negative representations, where greedy and convex relaxed algorithms are among the most popular methods. One fast…
KNN has the reputation to be the word simplest but efficient supervised learning algorithm used for either classification or regression. KNN prediction efficiency highly depends on the size of its training data but when this training data…
Approximate $k$ nearest neighbor (AKNN) search in high-dimensional space is a foundational problem in vector databases with widespread applications. Among the numerous AKNN indexes, Proximity Graph-based indexes achieve state-of-the-art…
Probabilistic k-nearest neighbour (PKNN) classification has been introduced to improve the performance of original k-nearest neighbour (KNN) classification algorithm by explicitly modelling uncertainty in the classification of each feature…
Calculation of near-neighbor interactions among high dimensional, irregularly distributed data points is a fundamental task to many graph-based or kernel-based machine learning algorithms and applications. Such calculations, involving…
Existing methods for retrieving k-nearest neighbours suffer from the curse of dimensionality. We argue this is caused in part by inherent deficiencies of space partitioning, which is the underlying strategy used by most existing methods. We…
In this paper we describe a new brute force algorithm for building the $k$-Nearest Neighbor Graph ($k$-NNG). The $k$-NNG algorithm has many applications in areas such as machine learning, bio-informatics, and clustering analysis. While…
High-dimensional approximate $K$ nearest neighbor search (AKNN) is a fundamental task for various applications, including information retrieval. Most existing algorithms for AKNN can be decomposed into two main components, i.e., candidate…