Two-sample test based on Self-Organizing Maps
Abstract
Machine-learning classifiers can be leveraged as a two-sample statistical test. Suppose each sample is assigned a different label and a classifier discriminates between them with better-than-chance accuracy. In that case, we can infer that the two samples originate from different populations. However, many types of models, such as neural networks, behave as black boxes for the user: they can reject the hypothesis that both samples originate from the same population, but they offer no insight into how the samples differ. Self-Organizing Maps are a dimensionality reduction technique initially devised as a data visualization tool that displays emergent properties, and they are also useful for classification tasks. Since they can be used as classifiers, they can also be used as a two-sample statistical test. And since their original purpose is visualization, they can also offer insight into how the samples differ.
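
The classifier-based test described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: a nearest-centroid classifier stands in for the Self-Organizing Map, and the function names (`classifier_two_sample_test`, `binomial_p_value`, `gaussian_sample`), the 50/50 train/test split, and the one-sided binomial test on held-out accuracy are all assumptions made for illustration. The sketch also assumes both samples have the same size, so that chance accuracy under the null hypothesis is 0.5.

```python
import math
import random


def binomial_p_value(successes, trials, p=0.5):
    """One-sided P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        math.comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def classifier_two_sample_test(sample_a, sample_b, seed=0):
    """Label the samples 0/1, train on half, test accuracy on the rest.

    Returns (held-out accuracy, p-value against chance accuracy 0.5).
    Assumes len(sample_a) == len(sample_b).
    """
    rng = random.Random(seed)
    data = [(x, 0) for x in sample_a] + [(x, 1) for x in sample_b]
    rng.shuffle(data)
    half = len(data) // 2
    train, test = data[:half], data[half:]

    # Illustrative stand-in classifier: nearest centroid (not a SOM).
    def centroid(label):
        pts = [x for x, y in train if y == label]
        return [sum(p[d] for p in pts) / len(pts) for d in range(len(pts[0]))]

    c0, c1 = centroid(0), centroid(1)

    def predict(x):
        return 0 if math.dist(x, c0) <= math.dist(x, c1) else 1

    correct = sum(predict(x) == y for x, y in test)
    return correct / len(test), binomial_p_value(correct, len(test))


def gaussian_sample(n, mean, rng, dim=2):
    """Hypothetical helper: n points from an isotropic unit-variance Gaussian."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(42)
a = gaussian_sample(200, 0.0, rng)
b = gaussian_sample(200, 3.0, rng)  # clearly shifted population
accuracy, p_value = classifier_two_sample_test(a, b)
print(f"held-out accuracy = {accuracy:.3f}, p-value = {p_value:.3g}")
```

A small p-value rejects the hypothesis that both samples come from the same population, but, as the abstract notes, a classifier of this kind gives no indication of how the samples differ.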
Cite
@article{arxiv.2212.08960,
title = {Two-sample test based on Self-Organizing Maps},
author = {Alejandro Álvarez-Ayllón and Manuel Palomo-Duarte and Juan-Manuel Dodero},
journal = {arXiv preprint arXiv:2212.08960},
year = {2022}
}
Comments
27 pages