E-Valuating Classifier Two-Sample Tests

Methodology 2024-05-01 v2 Machine Learning

Abstract

We introduce a powerful deep classifier two-sample test for high-dimensional data based on E-values, called the E-value Classifier Two-Sample Test (E-C2ST). Our test combines ideas from existing work on split likelihood ratio tests and predictive independence tests. The resulting E-values are suitable for anytime-valid sequential two-sample tests, which allows for more effective use of data in constructing test statistics. Through simulations and real-data applications, we empirically demonstrate that E-C2ST achieves enhanced statistical power by partitioning datasets into multiple batches, going beyond the conventional two-split (training and testing) approach of standard classifier two-sample tests. This strategy increases the power of the test while keeping the type I error well below the desired significance level.
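The batching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a small gradient-descent logistic regression stands in for the deep classifier, and all function names (`fit_logistic`, `batch_log_e_value`, `sequential_e_c2st`, `make_two_sample_data`) are hypothetical. The sketch assumes that each batch's E-value is a split likelihood ratio, with the classifier's predicted label probabilities in the numerator and the batch's empirical label frequency (the maximum-likelihood fit under the null) in the denominator, that E-values are multiplied across batches, and that the null is rejected once the running product reaches 1/alpha.

```python
import numpy as np


def _sigmoid_scores(w, X):
    # Append a bias column and return P(label = 1 | x) under weights w.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))


def fit_logistic(X, y, steps=500, lr=0.1, l2=1e-2):
    """Gradient-descent logistic regression (stand-in for a deep classifier)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * (Xb.T @ (p - y) / len(y) + l2 * w)
    return w


def batch_log_e_value(w, X, y):
    """Log E-value of one held-out batch (assumed split likelihood ratio)."""
    # Numerator: log-likelihood of the labels under the trained classifier.
    p1 = np.clip(_sigmoid_scores(w, X), 1e-6, 1 - 1e-6)
    log_alt = np.sum(np.where(y == 1, np.log(p1), np.log1p(-p1)))
    # Denominator: maximised null log-likelihood, where labels ignore x.
    n = len(y)
    n1 = int(y.sum())
    n0 = n - n1
    log_null = 0.0
    if n1 > 0:
        log_null += n1 * np.log(n1 / n)
    if n0 > 0:
        log_null += n0 * np.log(n0 / n)
    return log_alt - log_null


def sequential_e_c2st(X, y, n_batches=5, alpha=0.05):
    """Train on all earlier batches, score the next one, stop at 1/alpha."""
    batches = np.array_split(np.arange(len(y)), n_batches)
    threshold = np.log(1.0 / alpha)
    log_e = 0.0
    for k in range(1, n_batches):
        train = np.concatenate(batches[:k])
        test = batches[k]
        w = fit_logistic(X[train], y[train])
        log_e += batch_log_e_value(w, X[test], y[test])
        if log_e >= threshold:
            return True, log_e  # reject: the two samples differ
    return False, log_e


def make_two_sample_data(rng, n=1000, d=5, shift=1.0):
    """Toy data: label 1 samples are mean-shifted Gaussians (illustrative only)."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, d)) + shift * y[:, None]
    return X, y
```

With this sketch, `sequential_e_c2st(*make_two_sample_data(np.random.default_rng(0)))` returns a rejection flag and the accumulated log E-value. Working in log space avoids overflow when the product of E-values grows large, and the first batch is used only for training, so every scored batch is unseen by the classifier that evaluates it.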

Cite

@article{arxiv.2210.13027,
  title  = {E-Valuating Classifier Two-Sample Tests},
  author = {Teodora Pandeva and Tim Bakker and Christian A. Naesseth and Patrick Forré},
  journal= {arXiv preprint arXiv:2210.13027},
  year   = {2024}
}