SLOPE - Adaptive variable selection via convex optimization

Małgorzata Bogdan; Ewout van den Berg; Chiara Sabatti; Weijie Su; Emmanuel J. Candès

doi:10.1214/15-AOAS842


Methodology 2015-11-05 v2

Authors: Małgorzata Bogdan, Ewout van den Berg, Chiara Sabatti, Weijie Su, Emmanuel J. Candès


Abstract

We introduce a new estimator for the vector of coefficients $\beta$ in the linear model $y=X\beta+z$, where $X$ has dimensions $n\times p$ with $p$ possibly larger than $n$. SLOPE, short for Sorted L-One Penalized Estimation, is the solution to $\min_{b\in\mathbb{R}^p}\frac{1}{2}\Vert y-Xb\Vert_{\ell_2}^2+\lambda_1\vert b\vert_{(1)}+\lambda_2\vert b\vert_{(2)}+\cdots+\lambda_p\vert b\vert_{(p)},$ where $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p\ge0$ and $\vert b\vert_{(1)}\ge\vert b\vert_{(2)}\ge\cdots\ge\vert b\vert_{(p)}$ are the absolute values of the entries of $b$ sorted in decreasing order. This is a convex program and we demonstrate a solution algorithm whose computational complexity is roughly comparable to that of classical $\ell_1$ procedures such as the Lasso. Here, the regularizer is a sorted $\ell_1$ norm, which penalizes the regression coefficients according to their rank: the higher the rank (that is, the stronger the signal), the larger the penalty. This is similar to the Benjamini and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] procedure (BH), which compares more significant $p$-values with more stringent thresholds. One notable choice of the sequence $\{\lambda_i\}$ is given by the BH critical values $\lambda_{\mathrm{BH}}(i)=z(1-i\cdot q/2p)$, where $q\in(0,1)$ and $z(\alpha)$ is the $\alpha$th quantile of a standard normal distribution. SLOPE aims to provide finite sample guarantees on the selected model; of special interest is the false discovery rate (FDR), defined as the expected proportion of irrelevant regressors among all selected predictors. Under orthogonal designs, SLOPE with $\lambda_{\mathrm{BH}}$ provably controls FDR at level $q$. Moreover, it also appears to have appreciable inferential properties under more general designs $X$ while having substantial power, as demonstrated in a series of experiments run on both simulated and real data.
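To make the quantities in the abstract concrete, here is a minimal Python sketch of the BH sequence $\lambda_{\mathrm{BH}}(i)=z(1-i\cdot q/2p)$, the sorted $\ell_1$ penalty, and the SLOPE objective. The function names (`bh_lambdas`, `sorted_l1_norm`, `slope_objective`) and the example values are illustrative assumptions, not taken from the authors' software. The sketch only evaluates the objective and does not implement the solution algorithm mentioned in the abstract.

```python
from statistics import NormalDist


def bh_lambdas(p, q):
    """BH critical values lambda_BH(i) = z(1 - i*q/(2p)) for i = 1..p.

    z(alpha) is the alpha-th quantile of the standard normal distribution.
    """
    std_normal = NormalDist()
    return [std_normal.inv_cdf(1 - i * q / (2 * p)) for i in range(1, p + 1)]


def sorted_l1_norm(b, lambdas):
    """Sum of lambda_i * |b|_(i), where |b|_(1) >= ... >= |b|_(p)."""
    abs_sorted = sorted((abs(x) for x in b), reverse=True)
    return sum(lam * a for lam, a in zip(lambdas, abs_sorted))


def slope_objective(y, X, b, lambdas):
    """0.5 * ||y - Xb||_2^2 plus the sorted l1 penalty.

    X is given as a list of rows (hypothetical plain-list representation).
    """
    residuals = [
        yi - sum(xij * bj for xij, bj in zip(row, b))
        for yi, row in zip(y, X)
    ]
    return 0.5 * sum(r * r for r in residuals) + sorted_l1_norm(b, lambdas)


# Illustrative values only: p = 4 coefficients, target FDR level q = 0.1.
lambdas = bh_lambdas(p=4, q=0.1)
b = [0.5, -3.0, 0.0, 1.0]
penalty = sorted_l1_norm(b, lambdas)
```

Because the weights are matched to the sorted absolute values, the largest entry of `b` (here `-3.0`) is multiplied by the largest weight `lambdas[0]`. With all weights equal, the penalty reduces to a multiple of the ordinary $\ell_1$ norm used by the Lasso.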

Keywords

lasso; sparse regression and regularization; least squares regression

Cite

@article{arxiv.1407.3824,
  title   = {SLOPE - Adaptive variable selection via convex optimization},
  author  = {Małgorzata Bogdan and Ewout van den Berg and Chiara Sabatti and Weijie Su and Emmanuel J. Candès},
  journal = {arXiv preprint arXiv:1407.3824},
  year    = {2015}
}

Comments

Published at http://dx.doi.org/10.1214/15-AOAS842 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
