
Learning Invariant Representations with Missing Data

Machine Learning, 2022-06-10, v2

Abstract

Spurious correlations allow flexible models to predict well during training but poorly on related test distributions. Recent work has shown that models that satisfy particular independencies involving correlation-inducing nuisance variables have guarantees on their test performance. Enforcing such independencies requires nuisances to be observed during training. However, nuisances, such as demographics or image background labels, are often missing. Enforcing independence on just the observed data does not imply independence on the entire population. Here we derive MMD estimators used for invariance objectives under missing nuisances. On simulations and clinical data, optimizing through these estimates achieves test performance similar to using estimators that make use of the full data.
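
For readers unfamiliar with the maximum mean discrepancy (MMD) mentioned in the abstract, the following is a minimal sketch of the standard unbiased estimator of squared MMD between two samples, for example representations grouped by a binary nuisance value. It is the ordinary full-data estimator, not the missing-data estimators derived in the paper. The function names, the Gaussian RBF kernel, and the fixed bandwidth are illustrative assumptions.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Standard unbiased estimate of squared MMD between samples x and y.

    Assumes every row is fully observed (no missing nuisance values).
    """
    n, m = len(x), len(y)
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    # Within-sample terms exclude the diagonal (i == j) entries.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (n * (n - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (m * (m - 1))
    return term_xx + term_yy - 2.0 * k_xy.mean()
```

An estimate near zero suggests the two groups of representations are similarly distributed, which is why such a quantity can serve as a penalty in an invariance objective; because the estimator is unbiased, it can come out slightly negative on finite samples.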

Cite

@article{arxiv.2112.00881,
  title  = {Learning Invariant Representations with Missing Data},
  author = {Mark Goldstein and Jörn-Henrik Jacobsen and Olina Chau and Adriel Saporta and Aahlad Puli and Rajesh Ranganath and Andrew C. Miller},
  journal= {arXiv preprint arXiv:2112.00881},
  year   = {2022}
}

Comments

CLeaR (Causal Learning and Reasoning) 2022
