Related papers: Multiclass histogram-based thresholding using kern…
Several disciplines, like the social sciences, epidemiology, sentiment analysis, or market research, are interested in knowing the distribution of the classes in a population rather than the individual labels of the members thereof.…
A novel nonparametric clustering algorithm is proposed using the interpoint distances between the members of the data to reveal the inherent clustering structure existing in the given set of data, where we apply the classical nonparametric…
Biclustering algorithms partition data and covariates simultaneously, providing new insights in several domains, such as analyzing gene expression to discover new biological functions. This paper develops a new model-free biclustering…
Imbalanced response variable distribution is a common occurrence in data science. In fields such as fraud detection, medical diagnostics, system intrusion detection, and many others where abnormal behavior is rarely observed, the data under…
We propose a hybrid quantum approach to threshold and binarize a grayscale image through unsharp measurements (UM) relying on the image histogram. Generally, the histograms are characterized by multiple overlapping normal distributions…
Distance-based clustering and classification are widely used in various fields to group mixed numeric and categorical data. In many algorithms, a predefined distance measurement is used to cluster data points based on their dissimilarity.…
Kernel Density Estimation (KDE) is a cornerstone of nonparametric statistics, yet it remains sensitive to bandwidth choice, boundary bias, and computational inefficiency. This study revisits KDE through a principled convolutional framework,…
Kernel density estimation (KDE) is one of the most widely used nonparametric density estimation methods. The fact that it is a memory-based method, i.e., it uses the entire training data set for prediction, makes it unsuitable for most…
Kernel density estimation is a key component of a wide variety of algorithms in machine learning, Bayesian inference, stochastic dynamics and signal processing. However, the unsupervised density estimation technique requires tuning a…
The traditional kernel density estimator of an unknown density is by construction completely nonparametric, in the sense that it has no preferences and will work reasonably well for all shapes. The present paper develops a class of…
We introduce kernel density machines (KDM), an agnostic kernel-based framework for learning the Radon-Nikodym derivative (density) between probability measures under minimal assumptions. KDM applies to general measurable spaces and avoids…
This study proposes multivariate kernel density estimation by a stagewise minimization algorithm based on $U$-divergence and a simple dictionary. The dictionary consists of an appropriate scalar bandwidth matrix and a part of the original…
Most density-based clustering methods largely rely on how well the underlying density is estimated. However, density estimation itself is also a challenging problem, especially the determination of the kernel bandwidth. A large bandwidth…
Conditional density estimation (CDE) goes beyond regression by modeling the full conditional distribution, providing a richer understanding of the data than just the conditional mean in regression. This makes CDE particularly useful in…
In this abstract paper, we introduce a new kernel learning method by a nonparametric density estimator. The estimator consists of a group of k-centroids clusterings. Each clustering randomly selects data points with randomly selected…
We present a nonparametric method for selecting informative features in high-dimensional clustering problems. We start with a screening step that uses a test for multimodality. Then we apply kernel density estimation and mode clustering to…
A new clustering accuracy measure is proposed to determine the unknown number of clusters and to assess the quality of clustering of a data set given in any dimensional space. Our validity index applies the classical nonparametric…
HDBSCAN is a density-based clustering algorithm that constructs a cluster hierarchy tree and then uses a specific stability measure to extract flat clusters from the tree. We show how the application of an additional threshold value can…
Developing deep learning models to analyze histology images has been computationally challenging, as the massive size of the images causes excessive strain on all parts of the computing pipeline. This paper proposes a novel deep…
Modern large-scale kernel-based tests such as maximum mean discrepancy (MMD) and kernelized Stein discrepancy (KSD) optimize kernel hyperparameters on a held-out sample via data splitting to obtain the most powerful test statistics. While…