Collaborative non-parametric two-sample testing

Machine Learning 2024-02-09 v1

Abstract

This paper addresses the multiple two-sample test problem in a graph-structured setting, a common scenario in fields such as Spatial Statistics and Neuroscience. Each node $v$ of a fixed graph faces a two-sample testing problem between two node-specific probability density functions (pdfs), $p_v$ and $q_v$. The goal is to identify the nodes where the null hypothesis $p_v = q_v$ should be rejected, under the assumption that connected nodes yield similar test outcomes. We propose the non-parametric collaborative two-sample testing (CTST) framework, which efficiently leverages the graph structure and minimizes the assumptions on $p_v$ and $q_v$. Our methodology integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning. We use synthetic experiments and a real sensor network detecting seismic activity to demonstrate that CTST outperforms state-of-the-art non-parametric statistical tests that are applied at each node independently and hence disregard the geometry of the problem.
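To make the problem setting concrete, the sketch below shows the kind of baseline the abstract contrasts CTST with: a non-parametric two-sample test run at each node independently, ignoring the graph. It is not the CTST method. It assumes a Gaussian-kernel MMD statistic with a permutation p-value as the per-node test, and the function names, bandwidth, and toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between two 1-D samples,
    using a Gaussian kernel (an assumed choice, not taken from the paper)."""
    z = np.concatenate([x, y])
    sq_dists = (z[:, None] - z[None, :]) ** 2
    k = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    n = len(x)
    return k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()


def permutation_pvalue(x, y, n_perm=200, rng=None):
    """Permutation p-value for H0: both samples come from the same pdf."""
    rng = np.random.default_rng(rng)
    observed = mmd2(x, y)
    pooled = np.concatenate([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if mmd2(perm[:n], perm[n:]) >= observed:
            exceed += 1
    # The +1 terms keep the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


def independent_node_tests(samples_p, samples_q, alpha=0.05, seed=0):
    """Test p_v = q_v at every node separately; the graph is never used."""
    rng = np.random.default_rng(seed)
    pvals = np.array([
        permutation_pvalue(x, y, rng=rng)
        for x, y in zip(samples_p, samples_q)
    ])
    return pvals, pvals < alpha


# Toy data for 4 nodes: p_v = q_v at nodes 0 and 1,
# q_v is shifted by 2 at nodes 2 and 3.
data_rng = np.random.default_rng(42)
shifts = [0.0, 0.0, 2.0, 2.0]
samples_p = [data_rng.normal(0.0, 1.0, size=60) for _ in shifts]
samples_q = [data_rng.normal(s, 1.0, size=60) for s in shifts]
pvals, rejects = independent_node_tests(samples_p, samples_q)
```

Because each node is tested in isolation, nothing in this baseline exploits the assumption that connected nodes tend to share the same outcome, which is the information the abstract says CTST is designed to use.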

Cite

@article{arxiv.2402.05715,
  title  = {Collaborative non-parametric two-sample testing},
  author = {Alejandro de la Concha and Nicolas Vayatis and Argyris Kalogeratos},
  journal= {arXiv preprint arXiv:2402.05715},
  year   = {2024}
}