Robustness to missing data: breakdown point analysis

Daniel Ober-Reynolds

doi:10.1016/j.jeconom.2025.106151

Econometrics 2025-12-29 v2

Abstract

Missing data is pervasive in econometric applications, and rarely is it plausible that the data are missing (completely) at random. This paper proposes a methodology for studying the robustness of results drawn from incomplete datasets. Selection is measured as the divergence from the distribution of complete observations to the distribution of incomplete observations. The breakdown point is defined as the minimal amount of selection needed to overturn a given result. Reporting point estimates and lower confidence intervals of the breakdown point is a simple, concise way to communicate the robustness of a result. An estimator of the breakdown point is proposed and shown to be root-n consistent and asymptotically normal. This estimator can be applied directly to conclusions drawn from any model identified with the generalized method of moments (GMM) that satisfies mild assumptions. Simulations demonstrate the finite sample performance of the breakdown point estimator on averages, linear regression, and logistic regression. The methodology is illustrated by estimating the breakdown point of conclusions drawn from several randomized controlled trials suffering from missing data due to attrition.
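
To make the definition of the breakdown point concrete, here is a minimal sketch for the simplest case mentioned in the abstract, an average. It is not the paper's estimator and makes several assumptions that do not come from the abstract: selection is measured by the Kullback-Leibler divergence KL(Q || P1), where P1 is the empirical distribution of the complete observations and Q is a hypothetical distribution of the missing outcomes; Q is restricted to reweightings of the observed values; and the conclusion under scrutiny is "the population mean exceeds `threshold`". The function names `tilted_mean` and `kl_breakdown_point` are hypothetical. Under these assumptions, the least divergent Q that overturns the conclusion is an exponential tilt of P1, so the computation reduces to a one-dimensional root search.

```python
import math


def tilted_mean(y, lam):
    """Mean of y under weights proportional to exp(lam * y_i)."""
    shift = max(lam * v for v in y)  # subtract the largest exponent for stability
    w = [math.exp(lam * v - shift) for v in y]
    return sum(wi * v for wi, v in zip(w, y)) / sum(w)


def kl_breakdown_point(y_obs, n_missing, threshold):
    """Plug-in sketch: smallest KL(Q || P1) that overturns 'mean > threshold'.

    y_obs      observed outcomes (complete cases)
    n_missing  number of units whose outcome is missing
    threshold  the conclusion being tested is: population mean > threshold
    Assumption (not from the source): KL divergence, Q supported on y_obs.
    """
    n = len(y_obs)
    mean_obs = sum(y_obs) / n
    if mean_obs <= threshold:
        return 0.0  # conclusion already fails with no selection (Q = P1)
    if n_missing == 0:
        return math.inf  # nothing is missing, so selection cannot matter

    p = n / (n + n_missing)  # share of complete observations
    # Population mean = p * mean_obs + (1 - p) * E_Q[Y]; the conclusion is
    # overturned once E_Q[Y] drops to this target value.
    target = (threshold - p * mean_obs) / (1.0 - p)
    if target <= min(y_obs):
        # Not reachable (or reachable only in the limit) by reweighting the
        # observed values; this sketch reports infinity in that case.
        return math.inf

    # tilted_mean is increasing in lam and equals mean_obs > target at lam = 0,
    # so the root is negative. Expand the bracket, then bisect.
    lo, hi = -1.0, 0.0
    while tilted_mean(y_obs, lo) > target:
        lo *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if tilted_mean(y_obs, mid) > target:
            hi = mid
        else:
            lo = mid
    lam = 0.5 * (lo + hi)

    # KL of the exponential tilt: lam * E_Q[Y] - log( mean_i exp(lam * y_i) ).
    shift = max(lam * v for v in y_obs)
    log_mgf = shift + math.log(sum(math.exp(lam * v - shift) for v in y_obs) / n)
    return lam * target - log_mgf
```

For instance, `kl_breakdown_point([0, 1], 2, 0.4)` considers two observed outcomes, two missing ones, and the claim that the population mean exceeds 0.4. The missing outcomes would need a mean of 0.3, and the returned divergence is about 0.082. A larger value indicates that more selection is required to overturn the claim, which is the sense in which the breakdown point summarizes robustness.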

Keywords

econometric inference; econometric estimation and inference; machine learning in economics

Cite

@article{arxiv.2406.06804,
  title   = {Robustness to missing data: breakdown point analysis},
  author  = {Daniel Ober-Reynolds},
  journal = {arXiv preprint arXiv:2406.06804},
  year    = {2025}
}

Comments

66 pages, 3 figures. Presented at the 2023 North American Summer Meeting of the Econometric Society. Accepted manuscript
