Non-Data-Splitting Estimator Selection for Regression in Exponential Families

Methodology 2025-02-12 v2

Abstract

We observe $n$ independent pairs of random variables $(W_{i}, Y_{i})$, where the conditional distribution of $Y_{i}$ given $W_{i}=w_{i}$ follows a one-parameter exponential family with parameter $\boldsymbol{\gamma}^{*}(w_{i})\in\mathbb{R}$. Our goal is to estimate the regression function $\boldsymbol{\gamma}^{*}$. We start with an arbitrary collection of piecewise constant candidate estimators based on our observations and, using the same data, select an estimator from this collection. Our approach is agnostic to the dependencies of the candidate estimators on the data, differing from methods like data splitting, cross-validation, and hold-out. To demonstrate its theoretical performance, we provide a non-asymptotic risk bound for the selected estimator. We then explain how to apply the procedure to changepoint detection in exponential families. The practical performance of the proposed approach is illustrated through a comparative simulation study under different scenarios and real datasets.
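
For concreteness, the regression model described above can be sketched as follows, assuming the natural parametrization of the exponential family. The symbols $T$ (sufficient statistic), $A$ (log-partition function), and $\nu$ (dominating measure) are notation introduced here for illustration and are not taken from the abstract; the paper itself may use a different parametrization.

```latex
% Sketch only: assumes the natural parametrization.
% T, A, and \nu are illustrative notation, not taken from the abstract.
\[
  Y_{i} \mid W_{i} = w_{i} \;\sim\; p_{\theta_{i}}(y)\, d\nu(y),
  \qquad
  p_{\theta}(y) = \exp\bigl(\theta\, T(y) - A(\theta)\bigr),
  \qquad
  \theta_{i} = \boldsymbol{\gamma}^{*}(w_{i}) \in \mathbb{R}.
\]
```

Familiar one-parameter instances of this form include the Bernoulli, Poisson, and Gaussian (with known variance) families.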

Cite

@article{arxiv.2212.12954,
  title  = {Non-Data-Splitting Estimator Selection for Regression in Exponential Families},
  author = {Juntong Chen},
  journal= {arXiv preprint arXiv:2212.12954},
  year   = {2025}
}