A study of pre-validation
Abstract
Given a predictor of outcome derived from a high-dimensional dataset, pre-validation is a useful technique for comparing it to competing predictors on the same dataset. For microarray data, it allows one to compare a newly derived predictor for disease outcome to standard clinical predictors on the same dataset. We study pre-validation analytically to determine whether the inferences drawn from it are valid. We show that while pre-validation generally works well, the straightforward "one degree of freedom" analytical test from pre-validation can be biased, and we propose a permutation test to remedy this problem. In simulation studies, we show that the permutation test has the nominal level and achieves roughly the same power as the analytical test.
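As a rough illustration of the idea summarized above, the following Python sketch builds a pre-validated predictor by K-fold cross-fitting, then compares it to a clinical covariate in a linear model. It is a minimal sketch under stated assumptions, not the procedure from the paper: the function names (`prevalidate`, `t_stat_last`, `permutation_pvalue`) are hypothetical, the internal predictor (top-correlated features followed by least squares) is an arbitrary choice, and the permutation scheme (shuffling the rows of the expression matrix relative to the outcome and the clinical covariate) is one plausible reading of "a permutation test" that may differ from the authors' exact construction.

```python
import numpy as np


def prevalidate(X, y, n_folds=5, n_top=10, rng=None):
    """Return a pre-validated predictor z: each z[i] comes from a model
    fitted without observation i's fold (illustrative predictor choice)."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    folds = np.array_split(rng.permutation(n), n_folds)
    z = np.empty(n)
    for test_idx in folds:
        train = np.setdiff1d(np.arange(n), test_idx)
        # Feature selection uses training folds only (assumed rule:
        # largest absolute covariance with the outcome).
        Xc = X[train] - X[train].mean(axis=0)
        yc = y[train] - y[train].mean()
        top = np.argsort(-np.abs(Xc.T @ yc))[:n_top]
        # Least-squares fit on the selected features, training folds only.
        A = np.column_stack([np.ones(train.size), X[train][:, top]])
        beta, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        B = np.column_stack([np.ones(test_idx.size), X[test_idx][:, top]])
        z[test_idx] = B @ beta
    return z


def t_stat_last(design, y):
    """Ordinary least-squares t-statistic for the last design column."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta[-1] / np.sqrt(cov[-1, -1])


def permutation_pvalue(X, c, y, n_perm=200, seed=0):
    """Permutation p-value for the pre-validated predictor, adjusting for
    the clinical covariate c. Assumed scheme: permute rows of X only."""
    rng = np.random.default_rng(seed)
    n = len(y)

    def stat(X_mat):
        # Pre-validation is redone from scratch for every permuted dataset.
        z = prevalidate(X_mat, y, rng=rng)
        design = np.column_stack([np.ones(n), c, z])
        return abs(t_stat_last(design, y))

    observed = stat(X)
    null = np.array([stat(X[rng.permutation(n)]) for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (1 + n_perm)
```

The key design point is that the whole pre-validation step is repeated inside each permutation, so the reference distribution reflects the same selection and fitting procedure as the observed statistic rather than relying on a nominal t distribution.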
Cite
@article{arxiv.0807.4105,
title = {A study of pre-validation},
author = {Holger Höfling and Robert Tibshirani},
journal = {arXiv preprint arXiv:0807.4105},
year = {2008}
}
Comments
Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/07-AOAS152