Model-Powered Conditional Independence Test

2017-09-20 (v1) | Machine Learning; Artificial Intelligence; Information Theory (math.IT)

Abstract

We consider the problem of non-parametric conditional independence testing (CI testing) for continuous random variables. Given i.i.d. samples from the joint distribution $f(x,y,z)$ of continuous random vectors $X$, $Y$ and $Z$, we determine whether $X \perp Y \mid Z$. We approach this by converting the conditional independence test into a classification problem. This allows us to harness very powerful classifiers like gradient-boosted trees and deep neural networks. These models can handle complex probability distributions and allow us to perform significantly better than the prior state of the art for high-dimensional CI testing. The main technical challenge in the classification problem is the need for samples from the conditional product distribution $f^{CI}(x,y,z) = f(x|z)f(y|z)f(z)$, which equals the joint distribution if and only if $X \perp Y \mid Z$, when given access only to i.i.d. samples from the true joint distribution $f(x,y,z)$. To tackle this problem we propose a novel nearest-neighbor bootstrap procedure and theoretically show that our generated samples are indeed close to $f^{CI}$ in terms of total variation distance. We then develop generalization bounds for classification in our setting, which translate into error bounds for CI testing. We provide a novel analysis of Rademacher-type classification bounds in the presence of non-i.i.d. near-independent samples. We empirically validate the performance of our algorithm on simulated and real datasets and show performance gains over previous methods.
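The reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scalar $X$, $Y$ and $Z$, a brute-force nearest-neighbor search, a three-way data split chosen for illustration, and a small k-NN classifier standing in for the gradient-boosted trees or neural networks mentioned above. All function names are hypothetical. The idea is to label original samples 1 and nearest-neighbor bootstrap samples (approximating $f^{CI}$) 0; if a classifier cannot beat chance on held-out data, the samples are consistent with $X \perp Y \mid Z$.

```python
import heapq
import random


def nn_bootstrap(u1, u2):
    """For each (x, _, z) in u1, borrow y from the u2 sample whose z is nearest.

    Assumption: z is a scalar, so |z - z'| serves as the distance.
    """
    out = []
    for x, _, z in u1:
        nearest = min(u2, key=lambda s: abs(s[2] - z))
        out.append((x, nearest[1], z))
    return out


def labeled_set(samples):
    """Split into thirds: originals get label 1, bootstrapped samples label 0."""
    n = len(samples) // 3
    originals = samples[:n]
    fake = nn_bootstrap(samples[n:2 * n], samples[2 * n:3 * n])
    return [(s, 1) for s in originals] + [(s, 0) for s in fake]


def knn_predict(train, point, k=5):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    def dist(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], point))
    votes = sum(label for _, label in heapq.nsmallest(k, train, key=dist))
    return 1 if 2 * votes > k else 0


def ci_accuracy(samples, train_frac=0.75, k=5):
    """Held-out accuracy of 'original vs. bootstrap'; near 0.5 suggests CI."""
    cut = int(len(samples) * train_frac)
    train = labeled_set(samples[:cut])   # bootstrap built from training data only
    test = labeled_set(samples[cut:])    # separate bootstrap for the test split
    hits = sum(knn_predict(train, p, k) == label for p, label in test)
    return hits / len(test)


def simulate(n, dependent, seed):
    """Toy data (illustrative only): Y depends on X directly, or only through Z."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = z + 0.5 * rng.gauss(0, 1)
        if dependent:
            y = x + 0.1 * rng.gauss(0, 1)   # X and Y dependent given Z
        else:
            y = z + 0.5 * rng.gauss(0, 1)   # X and Y independent given Z
        data.append((x, y, z))
    return data
```

On the toy data, `ci_accuracy(simulate(3000, dependent=False, seed=0))` should land near 0.5, while the `dependent=True` variant should score clearly higher. A practical test would replace the fixed comparison with a calibrated threshold and a stronger classifier.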

Cite

@article{arxiv.1709.06138,
  title  = {Model-Powered Conditional Independence Test},
  author = {Rajat Sen and Ananda Theertha Suresh and Karthikeyan Shanmugam and Alexandros G. Dimakis and Sanjay Shakkottai},
  journal= {arXiv preprint arXiv:1709.06138},
  year   = {2017}
}

Comments

19 pages, 2 figures. Accepted for publication in NIPS 2017.