Related papers: On pattern classification with weighted dimensions

Supervised Pattern Recognition Involving Skewed Feature Densities

Pattern recognition constitutes a particularly important task underlying a great deal of scientific and technological activities. At the same time, pattern recognition involves several challenges, including the choice of features to…

Machine Learning · Computer Science 2024-09-04 Alexandre Benatti, Luciano da F. Costa

Dimensionality Invariant Similarity Measure

This paper presents a new similarity measure to be used for general tasks including supervised learning, which is represented by the K-nearest neighbor classifier (KNN). The proposed similarity measure is invariant to large differences in…

Machine Learning · Computer Science 2014-09-04 Ahmad Basheer Hassanat

DW-KNN: A Transparent Local Classifier Integrating Distance Consistency and Neighbor Reliability

K-Nearest Neighbors (KNN) is one of the most used ML classifiers. However, if we observe closely, standard distance-weighted KNN and relative variants assume all 'k' neighbors are equally reliable. In heterogeneous feature space, this…

Machine Learning · Computer Science 2025-12-11 Kumarjit Pathak, Karthik K, Sachin Madan, Jitin Kapila

A Weighted Mutual k-Nearest Neighbour for Classification Mining

kNN is a very effective Instance based learning method, and it is easy to implement. Due to heterogeneous nature of data, noises from different possible sources are also widespread in nature especially in case of large-scale databases. For…

Machine Learning · Computer Science 2020-05-19 Joydip Dhar, Ashaya Shukla, Mukul Kumar, Prashant Gupta

k-Nearest Neighbour Classification of Datasets with a Family of Distances

The $k$-nearest neighbour ($k$-NN) classifier is one of the oldest and most important supervised learning algorithms for classifying datasets. Traditionally the Euclidean norm is used as the distance for the $k$-NN classifier. In this…

Machine Learning · Statistics 2015-12-02 Stan Hatko

AEkNN: An AutoEncoder kNN-based classifier with built-in dimensionality reduction

High dimensionality, i.e. data having a large number of variables, tends to be a challenge for most machine learning tasks, including classification. A classifier usually builds a model representing how a set of inputs explain the outputs.…

Machine Learning · Computer Science 2018-03-12 Francisco J. Pulgar, Francisco Charte, Antonio J. Rivera, María J. del Jesus

Missing Data Imputation for Classification Problems

Imputation of missing data is a common application in various classification problems where the feature training matrix has missingness. A widely used solution to this imputation problem is based on the lazy learning technique, $k$-nearest…

Machine Learning · Statistics 2020-02-26 Arkopal Choudhury, Michael R. Kosorok

Adaptive Explicit Kernel Minkowski Weighted K-means

The K-means algorithm is among the most commonly used data clustering methods. However, the regular K-means can only be applied in the input space and it is applicable when clusters are linearly separable. The kernel K-means, which extends…

Machine Learning · Computer Science 2020-12-08 Amir Aradnia, Maryam Amir Haeri, Mohammad Mehdi Ebadzadeh

Variance-Adjusted Cosine Distance as Similarity Metric

Cosine similarity is a popular distance measure that measures the similarity between two vectors in the inner product space. It is widely used in many data classification algorithms like K-Nearest Neighbors, Clustering etc. This study…

Machine Learning · Statistics 2025-02-05 Satyajeet Sahoo, Jhareswar Maiti

The K-nearest neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with the most complex classifiers in the literature. The core of this classifier depends mainly on measuring the…

Machine Learning · Computer Science 2019-10-01 V. B. Surya Prasath, Haneen Arafat Abu Alfeilat, Ahmad B. A. Hassanat, Omar Lasassmeh, Ahmad S. Tarawneh, Mahmoud Bashir Alhasanat, Hamzeh S. Eyal Salman

Minkowski distances and standardisation for clustering and classification of high dimensional data

There are many distance-based methods for classification and clustering, and for data with a high number of dimensions and a lower number of observations, processing distances is computationally advantageous compared to the raw data matrix.…

Methodology · Statistics 2020-06-25 Christian Hennig

A review on distance based time series classification

Time series classification is an increasing research topic due to the vast amount of time series data that are being created over a wide variety of fields. The particularity of the data makes it a challenging task and different approaches…

Machine Learning · Statistics 2018-06-13 Amaia Abanda, Usue Mori, Jose A. Lozano

Deep Weighted Averaging Classifiers

Recent advances in deep learning have achieved impressive gains in classification accuracy on a variety of types of data, including images and text. Despite these gains, however, concerns have been raised about the calibration, robustness,…

Machine Learning · Computer Science 2018-11-20 Dallas Card, Michael Zhang, Noah A. Smith

Learned k-NN Distance Estimation

Big data mining is well known to be an important task for data science, because it can provide useful observations and new knowledge hidden in given large datasets. Proximity-based data analysis is particularly utilized in many real-life…

Databases · Computer Science 2022-11-29 Daichi Amagata, Yusuke Arai, Sumio Fujita, Takahiro Hara

Dimensionality Reduction on Complex Vector Spaces for Euclidean Distance with Dynamic Weights

The weighted Euclidean norm $\|x\|_w$ of a vector $x\in \mathbb{R}^d$ with weights $w\in \mathbb{R}^d$ is the Euclidean norm where the contribution of each dimension is scaled by a given weight. Approaches to dimensionality reduction that…

Data Structures and Algorithms · Computer Science 2026-03-23 Simone Moretti, Paolo Pellizzoni, Francesco Silvestri

High-dimensional sparse trigonometric approximation in the uniform norm and consequences for sampling recovery

Recent findings by Jahn, T. Ullrich, Voigtlaender [10] relate non-linear sampling numbers for the square norm to quantities involving trigonometric best $m-$term approximation errors in the uniform norm. Here we establish new results for…

Numerical Analysis · Mathematics 2024-07-24 Moritz Moeller, Serhii Stasyuk, Tino Ullrich

High-dimensional Semi-supervised Classification via the Fermat Distance

Semi-supervised classification, where unlabeled data are massive but labeled data are limited, often arises in machine learning applications. We address this challenge under high-dimensional data by leveraging the manifold and cluster…

Machine Learning · Statistics 2026-04-28 Ruoxu Tan, Yiming Zang

On high-dimensional modifications of some graph-based two-sample tests

Testing for the equality of two high-dimensional distributions is a challenging problem, and this becomes even more challenging when the sample size is small. Over the last few decades, several graph-based two-sample tests have been…

Methodology · Statistics 2019-11-22 Soham Sarkar, Rahul Biswas, Anil K. Ghosh

An enhanced statistical feature fusion approach using an improved distance evaluation algorithm and weighted K-nearest neighbor for bearing fault diagnosis

Bearings are among the most failure-prone components in rotating machinery, and their condition directly impacts overall performance. Therefore, accurately diagnosing bearing faults is essential for ensuring system stability. However,…

Signal Processing · Electrical Eng. & Systems 2025-09-26 Amir Eshaghi Chaleshtori, Abdollah Aghaie

Adaptive Nearest Neighbor: A General Framework for Distance Metric Learning

$K$-NN classifier is one of the most famous classification algorithms, whose performance is crucially dependent on the distance metric. When we consider the distance metric as a parameter of $K$-NN, learning an appropriate distance metric…

Machine Learning · Computer Science 2019-11-26 Kun Song