
Bayesian variable selection for high dimensional generalized linear models: convergence rates of the fitted densities

Statistics Theory 2009-09-29 v1

Abstract

Bayesian variable selection has gained much empirical success recently in a variety of applications when the number $K$ of explanatory variables $(x_1,\ldots,x_K)$ is possibly much larger than the sample size $n$. For generalized linear models, if most of the $x_j$'s have very small effects on the response $y$, we show that it is possible to use Bayesian variable selection to reduce overfitting caused by the curse of dimensionality $K\gg n$. In this approach a suitable prior can be used to choose a few out of the many $x_j$'s to model $y$, so that the posterior will propose probability densities $p$ that are "often close" to the true density $p^*$ in some sense. The closeness can be described by a Hellinger distance between $p$ and $p^*$ that scales at a power very close to $n^{-1/2}$, which is the "finite-dimensional rate" corresponding to a low-dimensional situation. These findings extend some recent work of Jiang [Technical Report 05-02 (2005) Dept. Statistics, Northwestern Univ.] on consistency of Bayesian variable selection for binary classification.
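
For concreteness, the closeness statement in the abstract can be written out roughly as follows. This is only a sketch under assumptions not stated in the abstract: the Hellinger distance is given in one common normalization (some authors include a factor of 1/2), $\nu$ is a hypothetical dominating measure for the densities, $D_n$ denotes the observed sample of size $n$, $\Pi(\cdot \mid D_n)$ the posterior, and $\xi > 0$ a small exponent standing in for "a power very close to $n^{-1/2}$". The paper's precise conditions and mode of convergence may differ.

```latex
% Hellinger distance between a proposed density p and the true density p^*
% (one common convention; \nu is an assumed dominating measure)
d(p, p^*) = \left\{ \int \left( \sqrt{p} - \sqrt{p^*} \right)^2 \, d\nu \right\}^{1/2}

% Informal reading of the rate claim: the posterior places vanishing mass
% on densities farther than \epsilon_n from the truth, with \epsilon_n
% nearly of the finite-dimensional order n^{-1/2}
\Pi\left( d(p, p^*) > \epsilon_n \mid D_n \right) \to 0,
\qquad \epsilon_n \asymp n^{-1/2 + \xi}
```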

Cite

@article{arxiv.0710.3458,
  title  = {Bayesian variable selection for high dimensional generalized linear models: convergence rates of the fitted densities},
  author = {Wenxin Jiang},
  journal= {arXiv preprint arXiv:0710.3458},
  year   = {2009}
}

Comments

Published at http://dx.doi.org/10.1214/009053607000000019 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
