Related papers: Sparse Classification: a scalable discrete optimiz…
Sparse linear regression is a central problem in high-dimensional statistics. We study the correlated random design setting, where the covariates are drawn from a multivariate Gaussian $N(0,\Sigma)$, and we seek an estimator with small…
We present a novel binary convex reformulation of the sparse regression problem that constitutes a new duality perspective. We devise a new cutting plane method and provide evidence that it can solve to provable optimality the sparse…
We consider a discrete optimization formulation for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features. Recent work has shown that mixed integer programming (MIP) can be used to…
High-dimensional learning problems, where the number of features exceeds the sample size, often require sparse regularization for effective prediction and variable selection. While established for fully supervised data, these techniques…
The Lasso is an attractive technique for regularization and variable selection for high-dimensional data, where the number of predictor variables $p_n$ is potentially much larger than the number of samples $n$. However, it was recently…
We present the framework of slowly varying regression under sparsity, allowing sparse regression models to exhibit slow and sparse variations. The problem of parameter estimation is formulated as a mixed-integer optimization problem. We…
In this paper, we review state-of-the-art methods for feature selection in statistics with an application-oriented eye. Indeed, sparsity is a valuable property and the profusion of research on the topic might have provided little guidance…
In this paper, we propose a novel sparse learning based feature selection method that directly optimizes a large margin linear classification model sparsity with l_(2,p)-norm (0 < p < 1)subject to data-fitting constraints, rather than using…
In this paper we discuss the variable selection method from \ell0-norm constrained regression, which is equivalent to the problem of finding the best subset of a fixed size. Our study focuses on two aspects, consistency and computation. We…
Finding the sparse solution of an underdetermined system of linear equations has many applications, especially, it is used in Compressed Sensing (CS), Sparse Component Analysis (SCA), and sparse decomposition of signals on overcomplete…
In this paper, we consider a well-known sparse optimization problem that aims to find a sparse solution of a possibly noisy underdetermined system of linear equations. Mathematically, it can be modeled in a unified manner by minimizing…
We investigate fast methods that allow to quickly eliminate variables (features) in supervised learning problems involving a convex loss function and a $l_1$-norm penalty, leading to a potentially substantial reduction in the number of…
We present a sparse analogue to stochastic gradient descent that is guaranteed to perform well under similar conditions to the lasso. In the linear regression setup with irrepresentable noise features, our algorithm recovers the support set…
Simultaneous feature selection and non-linear function estimation is challenging in modeling, especially in high-dimensional settings where the number of variables exceeds the available sample size. In this article, we investigate the…
Sparse inverse covariance selection is a fundamental problem for analyzing dependencies in high dimensional data. However, such a problem is difficult to solve since it is NP-hard. Existing solutions are primarily based on convex…
We prove an L2 recovery bound for a family of sparse estimators defined as minimizers of some empirical loss functions -- which include hinge loss and logistic loss. More precisely, we achieve an upper-bound for coefficients estimation…
The sparse regression problem, also known as best subset selection problem, can be cast as follows: Given a set $S$ of $n$ points in $\mathbb{R}^d$, a point $y\in \mathbb{R}^d$, and an integer $2 \leq k \leq d$, find an affine combination…
We consider the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse. Our approach is to solve a maximum likelihood problem with an added l_1-norm…
Sparse modelling or model selection with categorical data is challenging even for a moderate number of variables, because one parameter is roughly needed to encode one category or level. The Group Lasso is a well known efficient algorithm…
For multiple index models, it has recently been shown that the sliced inverse regression (SIR) is consistent for estimating the sufficient dimension reduction (SDR) space if and only if $\rho=\lim\frac{p}{n}=0$, where $p$ is the dimension…