A Kernel Two-sample Test for Dynamical Systems

Machine Learning, 2022-09-07, v3

Abstract

Evaluating whether data streams are drawn from the same distribution is at the heart of various machine learning problems. This is particularly relevant for data generated by dynamical systems, since such systems underlie many real-world processes in biomedicine, economics, and engineering. While kernel two-sample tests are powerful for comparing independent and identically distributed random variables, no established method exists for comparing dynamical systems. The main problem is that the independence assumption is inherently violated. We propose a two-sample test for dynamical systems by addressing three core challenges: we (i) introduce a novel notion of mixing that captures autocorrelations in a relevant metric, (ii) propose an efficient way to estimate the speed of mixing relying purely on data, and (iii) integrate these into established kernel two-sample tests. The result is a data-driven method that is straightforward to use in practice and comes with sound theoretical guarantees. In an example application to anomaly detection from human walking data, we show that the test is readily applicable without any human expert knowledge or feature engineering.
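
For orientation, the i.i.d. baseline that the abstract builds on can be sketched as follows. This is a minimal sketch of a standard kernel two-sample test (a biased squared-MMD estimate with a permutation p-value), not the method proposed in the paper. The function names, the Gaussian kernel, the default bandwidth, and the restriction to scalar samples are all illustrative assumptions. The permutation step is only valid for exchangeable samples, which is the assumption that trajectories of dynamical systems violate.

```python
import math
import random


def gaussian_gram(points, bandwidth):
    """Gram matrix of a Gaussian kernel over a list of scalar samples."""
    scale = 2.0 * bandwidth ** 2
    return [[math.exp(-((a - b) ** 2) / scale) for b in points] for a in points]


def mmd2_biased(gram, idx_x, idx_y):
    """Biased (V-statistic) estimate of squared MMD from a precomputed Gram matrix."""
    def mean_block(rows, cols):
        total = sum(gram[i][j] for i in rows for j in cols)
        return total / (len(rows) * len(cols))

    return (mean_block(idx_x, idx_x)
            + mean_block(idx_y, idx_y)
            - 2.0 * mean_block(idx_x, idx_y))


def permutation_test(x, y, bandwidth=1.0, n_permutations=200, seed=0):
    """Return (statistic, p_value) for H0: x and y share one distribution.

    Assumes i.i.d. samples; autocorrelated data would break the
    exchangeability that justifies shuffling the pooled sample.
    """
    rng = random.Random(seed)
    n = len(x)
    # Compute the kernel once on the pooled sample; permutations only reindex it.
    gram = gaussian_gram(list(x) + list(y), bandwidth)
    indices = list(range(n + len(y)))
    observed = mmd2_biased(gram, indices[:n], indices[n:])
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(indices)
        if mmd2_biased(gram, indices[:n], indices[n:]) >= observed:
            exceed += 1
    # The +1 correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_permutations + 1)
```

Calling `permutation_test(x, y)` on two lists of floats returns the statistic and a p-value; a small p-value suggests the two samples differ in distribution. Precomputing the Gram matrix on the pooled sample keeps each permutation cheap, since only index sets change.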

Cite

@article{arxiv.2004.11098,
  title  = {A Kernel Two-sample Test for Dynamical Systems},
  author = {Friedrich Solowjow and Dominik Baumann and Christian Fiedler and Andreas Jocham and Thomas Seel and Sebastian Trimpe},
  journal= {arXiv preprint arXiv:2004.11098},
  year   = {2022}
}