Related papers: Robust nearest-neighbor methods for classifying hi…
Learning a robust classifier from a few samples remains a key challenge in machine learning. A major thrust of research has been focused on developing $k$-nearest neighbor ($k$-NN) based algorithms combined with metric learning that…
When data is of an extraordinarily large size or physically stored in different locations, the distributed nearest neighbor (NN) classifier is an attractive tool for classification. We propose a novel distributed adaptive NN classifier for…
Learning classifiers that are robust to adversarial examples has received a great deal of recent attention. A major drawback of the standard robust learning framework is there is an artificial robustness radius $r$ that applies to all…
In nearest-neighbor classification problems, a set of $d$-dimensional training points are given, each with a known classification, and are used to infer unknown classifications of other points by using the same classification as the nearest…
In scenarios involving text classification where the number of classes is large (in multiples of 10000s) and training samples for each class are few and often verbose, nearest neighbor methods are effective but very slow in computing a…
Nearest neighbor is a popular class of classification methods with many desirable properties. For a large data set which cannot be loaded into the memory of a single machine due to computation, communication, privacy, or ownership…
We propose a novel strategy for extracting features in supervised learning that can be used to construct a classifier which is more robust to small perturbations in the input space. Our method builds upon the idea of the information…
Classification of high-dimensional low sample size (HDLSS) data poses a challenge in a variety of real-world situations, such as gene expression studies, cancer research, and medical imaging. This article presents the development and…
Perhaps the most straightforward classifier in the arsenal or machine learning techniques is the Nearest Neighbour Classifier -- classification is achieved by identifying the nearest neighbours to a query example and using those neighbours…
Learning from a few examples is an important practical aspect of training classifiers. Various works have examined this aspect quite well. However, all existing approaches assume that the few examples provided are always correctly labeled.…
The k-Nearest Neighbor (k-NN) classification algorithm is one of the most widely-used lazy classifiers because of its simplicity and ease of implementation. It is considered to be an effective classifier and has many applications. However,…
Classification models are very sensitive to data uncertainty, and finding robust classifiers that are less sensitive to data uncertainty has raised great interest in the machine learning literature. This paper aims to construct robust…
Nearest neighbor methods are a popular class of nonparametric estimators with several desirable properties, such as adaptivity to different distance scales in different regions of space. Prior work on convergence rates for nearest neighbor…
Nearest neighbor (kNN) methods have been gaining popularity in recent years in light of advances in hardware and efficiency of algorithms. There is a plethora of methods to choose from today, each with their own advantages and…
KNN has the reputation to be the word simplest but efficient supervised learning algorithm used for either classification or regression. KNN prediction efficiency highly depends on the size of its training data but when this training data…
Nearest neighbor has always been one of the most appealing non-parametric approaches in machine learning, pattern recognition, computer vision, etc. Previous empirical studies partly shows that nearest neighbor is resistant to noise, yet…
Low-rank tensor models are widely used in statistics. However, most existing methods rely heavily on the assumption that data follows a sub-Gaussian distribution. To address the challenges associated with heavy-tailed distributions…
Distance-based classification is among the most competitive classification methods for time series data. The most critical component of distance-based classification is the selected distance function. Past research has proposed various…
Feature selection is an important problem in high-dimensional data analysis and classification. Conventional feature selection approaches focus on detecting the features based on a redundancy criterion using learning and feature searching…
Neural networks are not learning optimal decision boundaries. We show that decision boundaries are situated in areas of low training data density. They are impacted by few training samples which can easily lead to overfitting. We provide a…