Related papers: Two-sample test based on Self-Organizing Maps

Statistical tools to assess the reliability of self-organizing maps

Results of neural network learning are always subject to some variability, due to the sensitivity to initial conditions, to convergence to local minima, and, sometimes more dramatically, to sampling variability. This paper presents a set of…

Statistics Theory · Mathematics 2007-06-13 Eric De Bodt , Marie Cottrell , Michel Verleysen

Class maps for visualizing classification results

Classification is a major tool of statistics and machine learning. A classification method first processes a training set of objects with given classes (labels), with the goal of afterward assigning new objects to one of these classes. When…

Machine Learning · Statistics 2024-07-08 Jakob Raymaekers , Peter J. Rousseeuw , Mia Hubert

Revisiting Classifier Two-Sample Tests

The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary…

Machine Learning · Statistics 2018-03-14 David Lopez-Paz , Maxime Oquab

Two-sample Testing Using Deep Learning

We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are…

Machine Learning · Statistics 2020-03-11 Matthias Kirchler , Shahryar Khorasani , Marius Kloft , Christoph Lippert

Classification Logit Two-sample Testing by Neural Networks

The recent success of generative adversarial networks and variational learning suggests training a classifier network may work well in addressing the classical two-sample problem. Network-based tests have the computational advantage that…

Machine Learning · Statistics 2022-06-01 Xiuyuan Cheng , Alexander Cloninger

A Semi-Supervised Self-Organizing Map for Clustering and Classification

There has been an increasing interest in semi-supervised learning in the recent years because of the great number of datasets with a large number of unlabeled data but only a few labeled samples. Semi-supervised learning algorithms can work…

Machine Learning · Computer Science 2020-03-26 Pedro H. M. Braga , Hansenclever F. Bassani

Two-Sample Test Based on Classification Probability

Robust classification algorithms have been developed in recent years with great success. We take advantage of this development and recast the classical two-sample test problem in the framework of classification. Based on the estimates of…

Statistics Theory · Mathematics 2019-09-18 Haiyan Cai , Bryan Goggin , Qingtang Jiang

Self-Organizing Maps as a Storage and Transfer Mechanism in Reinforcement Learning

The idea of reusing information from previously learned tasks (source tasks) for the learning of new tasks (target tasks) has the potential to significantly improve the sample efficiency reinforcement learning agents. In this work, we…

Machine Learning · Computer Science 2018-07-21 Thommen George Karimpanal , Roland Bouffanais

Box Drawings for Learning with Imbalanced Data

The vast majority of real world classification problems are imbalanced, meaning there are far fewer data from the class of interest (the positive class) than from other classes. We propose two machine learning algorithms to handle highly…

Machine Learning · Statistics 2014-06-10 Siong Thye Goh , Cynthia Rudin

Near Optimal Stratified Sampling

The performance of a machine learning system is usually evaluated by using i.i.d.\ observations with true labels. However, acquiring ground truth labels is expensive, while obtaining unlabeled samples may be cheaper. Stratified sampling can…

Machine Learning · Computer Science 2019-07-29 Tiancheng Yu , Xiyu Zhai , Suvrit Sra

A Bipartite Ranking Approach to the Two-Sample Problem

The two-sample problem, which consists in testing whether independent samples on $\mathbb{R}^d$ are drawn from the same (unknown) distribution, finds applications in many areas. Its study in high-dimension is the subject of much attention,…

Statistics Theory · Mathematics 2023-02-09 Stephan Clémençon , Myrto Limnios , Nicolas Vayatis

A Deterministic Self-Organizing Map Approach and its Application on Satellite Data based Cloud Type Classification

A self-organizing map (SOM) is a type of competitive artificial neural network, which projects the high-dimensional input space of the training samples into a low-dimensional space with the topology relations preserved. This makes SOMs…

Machine Learning · Computer Science 2018-11-02 Wenbin Zhang , Jianwu Wang , Daeho Jin , Lazaros Oreopoulos , Zhibo Zhang

Supervised Classification: Quite a Brief Overview

The original problem of supervised classification considers the task of automatically assigning objects to their respective classes on the basis of numerical measurements derived from these objects. Classifiers are the tools that implement…

Machine Learning · Computer Science 2017-10-26 Marco Loog

Self-organizing maps and generalization: an algorithmic description of Numerosity and Variability Effects

Category, or property generalization is a central function in the human cognition. It plays a crucial role in a variety of domains, such as learning, everyday reasoning, specialized reasoning, and decision making. Judging the content of a…

Artificial Intelligence · Computer Science 2018-02-27 Valentina Gliozzi , Kim Plunkett

Evaluating Black-Box Classifiers via Stable Adaptive Two-Sample Inference

We consider the problem of evaluating black-box multi-class classifiers. In the standard setup, we observe class labels $Y\in \{0,1,\ldots,M-1\}$ generated according to the conditional distribution $ Y|X \sim \text{…

Methodology · Statistics 2026-04-08 Yuchen Chen , Jing Lei

Automatic Classification using Self-Organising Neural Networks in Astrophysical Experiments

Self-Organising Maps (SOMs) are effective tools in classification problems, and in recent years the even more powerful Dynamic Growing Neural Networks, a variant of SOMs, have been developed. Automatic Classification (also called…

Neural and Evolutionary Computing · Computer Science 2007-05-23 P. Boinee , A. De Angelis , E. Milotti

Advanced Tutorial: Label-Efficient Two-Sample Tests

Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical…

Machine Learning · Computer Science 2025-01-08 Weizhi Li , Visar Berisha , Gautam Dasarathy

Improving Self-Organizing Maps with Unsupervised Feature Extraction

The Self-Organizing Map (SOM) is a brain-inspired neural model that is very promising for unsupervised learning, especially in embedded applications. However, it is unable to learn efficient prototypes when dealing with complex datasets. We…

Neural and Evolutionary Computing · Computer Science 2020-09-07 Lyes Khacef , Laurent Rodriguez , Benoit Miramond

A label-efficient two-sample test

Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are…

Machine Learning · Computer Science 2022-07-20 Weizhi Li , Gautam Dasarathy , Karthikeyan Natesan Ramamurthy , Visar Berisha

Self-Organizing Maps for Storage and Transfer of Knowledge in Reinforcement Learning

The idea of reusing or transferring information from previously learned tasks (source tasks) for the learning of new tasks (target tasks) has the potential to significantly improve the sample efficiency of a reinforcement learning agent. In…

Artificial Intelligence · Computer Science 2022-09-28 Thommen George Karimpanal , Roland Bouffanais