Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data

Feng Liu; Wenkai Xu; Jie Lu; Danica J. Sutherland

Machine Learning 2022-01-06 v2 Artificial Intelligence Machine Learning Methodology

Abstract

Modern kernel-based two-sample tests have shown great success in distinguishing complex, high-dimensional distributions with appropriate learned kernels. Previous work has demonstrated that this kernel learning procedure succeeds, assuming a considerable number of observed samples from each distribution. In realistic scenarios with very limited numbers of data samples, however, it can be challenging to identify a kernel powerful enough to distinguish complex distributions. We address this issue by introducing the problem of meta two-sample testing (M2ST), which aims to exploit (abundant) auxiliary data on related tasks to find an algorithm that can quickly identify a powerful test on new target tasks. We propose two specific algorithms for this task: a generic scheme that improves over baselines, and a more tailored approach that performs even better. We provide both theoretical justification and empirical evidence that our proposed meta-testing schemes outperform learning kernel-based tests directly from scarce observations, and identify when such schemes will be successful.

Keywords

kernel density estimation; kernel methods; multi-task learning

Cite

@article{arxiv.2106.07636,
  title  = {Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data},
  author = {Feng Liu and Wenkai Xu and Jie Lu and Danica J. Sutherland},
  journal= {arXiv preprint arXiv:2106.07636},
  year   = {2022}
}

Comments

v2, as published at NeurIPS 2021 - https://proceedings.neurips.cc/paper/2021/hash/2e6d9c6052e99fcdfa61d9b9da273ca2-Abstract.html - contains various improvements, especially in the theoretical section. Code is available from https://github.com/fengliu90/MetaTesting
