Classification accuracy as a proxy for two sample testing

Machine Learning, Artificial Intelligence, Statistics Theory · 2020-02-18 · v4

Abstract

When data analysts train a classifier and check whether its accuracy is significantly different from chance, they are implicitly performing a two-sample test. We investigate the statistical properties of this flexible approach in the high-dimensional setting. We prove two results that hold for all classifiers in any dimension: if a classifier's true error remains $\epsilon$-better than chance for some $\epsilon > 0$ as $d, n \to \infty$, then (a) the permutation-based test is consistent (its power approaches one), and (b) a computationally efficient test based on a Gaussian approximation of the null distribution is also consistent. To get a finer understanding of the rates of consistency, we study a specialized setting of distinguishing Gaussians with mean-difference $\delta$ and common (known or unknown) covariance $\Sigma$, when $d/n \to c \in (0, \infty)$. We study variants of Fisher's linear discriminant analysis (LDA) such as "naive Bayes" in a nontrivial regime when $\epsilon \to 0$ (the Bayes classifier has true accuracy approaching $1/2$), and contrast their power with corresponding variants of Hotelling's test. Surprisingly, the expressions for their power match exactly in terms of $n, d, \delta, \Sigma$, and the LDA approach is only worse by a constant factor, achieving an asymptotic relative efficiency (ARE) of $1/\sqrt{\pi}$ for balanced samples. We also extend our results to high-dimensional elliptical distributions with finite kurtosis. Other results of independent interest include minimax lower bounds and the optimality of Hotelling's test when $d = o(n)$. Simulation results validate our theory, and we present practical takeaway messages along with natural open problems.
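The two tests named in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The nearest-centroid classifier, the single half/half sample split, the synthetic data, and every function name below are assumptions made for illustration. The Gaussian approximation shown here is a simplified form that assumes balanced classes, so that held-out accuracy under the null is roughly normal with mean $1/2$ and variance $1/(4m)$ for a test set of size $m$. The paper's exact statistic may differ.

```python
import math
import random


def make_two_gaussians(n_per_class, dim, shift, seed):
    """Hypothetical toy data: class 0 ~ N(0, I), class 1 ~ N(shift, I)."""
    rng = random.Random(seed)
    x = [[rng.gauss(shift * label, 1.0) for _ in range(dim)]
         for label in (0, 1) for _ in range(n_per_class)]
    y = [label for label in (0, 1) for _ in range(n_per_class)]
    return x, y


def nearest_centroid_accuracy(train_x, train_y, test_x, test_y):
    """Fit class means on the training split; return held-out accuracy."""
    dim = len(train_x[0])
    means = {}
    for label in (0, 1):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        if not rows:  # guard: a permuted split could leave a class empty
            rows = train_x
        means[label] = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    def predict(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, means[0]))
        d1 = sum((a - b) ** 2 for a, b in zip(x, means[1]))
        return 0 if d0 <= d1 else 1

    hits = sum(predict(x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


def permutation_test(x, y, n_perm=200, seed=0):
    """Permutation p-value for held-out accuracy on one fixed split."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    rng.shuffle(idx)
    half = len(y) // 2
    train, test = idx[:half], idx[half:]

    def acc(labels):
        return nearest_centroid_accuracy(
            [x[i] for i in train], [labels[i] for i in train],
            [x[i] for i in test], [labels[i] for i in test])

    observed = acc(y)
    perm_labels = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(perm_labels)  # break any link between features and labels
        if acc(perm_labels) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (1 + exceed) / (1 + n_perm), len(test)


def gaussian_pvalue(accuracy, n_test):
    """One-sided p-value assuming accuracy ~ N(1/2, 1/(4 n_test)) under the null."""
    z = (accuracy - 0.5) / math.sqrt(0.25 / n_test)
    return 0.5 * math.erfc(z / math.sqrt(2))


x, y = make_two_gaussians(n_per_class=50, dim=5, shift=1.0, seed=1)
accuracy, p_perm, n_test = permutation_test(x, y)
p_gauss = gaussian_pvalue(accuracy, n_test)
print(accuracy, p_perm, p_gauss)
```

The permutation test refits the classifier once per permutation, whereas the Gaussian approximation needs only a single fit. That is the computational trade-off the abstract alludes to.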

Cite

@article{arxiv.1602.02210,
  title  = {Classification accuracy as a proxy for two sample testing},
  author = {Ilmun Kim and Aaditya Ramdas and Aarti Singh and Larry Wasserman},
  journal= {arXiv preprint arXiv:1602.02210},
  year   = {2020}
}

Comments

71 pages, 4 figures. Accepted for publication at the Annals of Statistics (2020)
