Related papers: The Discrete Dantzig Selector: Estimating Sparse L…
In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200 billion factor speedup in solving Mixed Integer Optimization (MIO) problems. We…
Dantzig Selector (DS) is widely used in compressed sensing and sparse learning for feature selection and sparse signal recovery. Since the DS formulation is essentially a linear programming optimization, many existing linear programming…
The Dantzig selector has received popularity for many applications such as compressed sensing and sparse modeling, thanks to its computational efficiency as a linear programming problem and its nice sampling properties. Existing results…
We consider a discrete optimization formulation for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features. Recent work has shown that mixed integer programming (MIP) can be used to…
We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification. First proposed for linear regression in the 1970s, this approach has recently…
In many applications one may acquire a composition of several signals that may be corrupted by noise, and it is a challenging problem to reliably separate the components from one another without sacrificing significant details. Adding to…
We present a new algorithmic framework for grouped variable selection that is based on discrete mathematical optimization. While there exist several appealing approaches based on convex relaxations and nonconvex heuristics, we focus on…
We consider a class of linear-programming based estimators in reconstructing a sparse signal from linear measurements. Specific formulations of the reconstruction problem considered here include Dantzig selector, basis pursuit (for the case…
We consider the problem of model selection and estimation in sparse high dimensional linear regression models with strongly correlated variables. First, we study the theoretical properties of the dual Lasso solution, and we show that joint…
The Dantzig selector is a widely used and effective method for variable selection in ultra-high-dimensional data. Feature splitting is an efficient processing technique that involves dividing these ultra-high-dimensional variable datasets…
Recent advancements in Mixed Integer Optimization (MIO) algorithms, paired with hardware enhancements, have led to significant speedups in resolving MIO problems. These strategies have been utilized for optimal subset selection,…
We study a seemingly unexpected and relatively less understood overfitting aspect of a fundamental tool in sparse linear modeling - best subset selection, which minimizes the residual sum of squares subject to a constraint on the number of…
Discrete black-box optimization problems are challenging for model-based optimization (MBO) algorithms, such as Bayesian optimization, due to the size of the search space and the need to satisfy combinatorial constraints. In particular,…
Sparse variable selection improves interpretability and generalization in high-dimensional learning by selecting a small subset of informative features. Recent advances in Mixed Integer Programming (MIP) have enabled solving large-scale…
Transductive methods are useful in prediction problems when the training dataset is composed of a large number of unlabeled observations and a smaller number of labeled observations. In this paper, we propose an approach for developing…
We propose a new algorithm for solving multistage stochastic mixed integer linear programming (MILP) problems with complete continuous recourse. In a similar way to cutting plane methods, we construct nonlinear Lipschitz cuts to build lower…
Multivariate decision trees are powerful machine learning tools for classification and regression that attract many researchers and industry professionals. An optimal binary tree has two types of vertices, (i) branching vertices which have…
We formulate the sparse classification problem of $n$ samples with $p$ features as a binary convex optimization problem and propose a cutting-plane algorithm to solve it exactly. For sparse logistic regression and sparse SVM, our algorithm…
For consistency (even oracle properties) of estimation and model prediction, almost all existing methods of variable/feature selection critically depend on sparsity of models. However, for ``large $p$ and small $n$" models sparsity…
We consider the linear regression model with observation error in the design. In this setting, we allow the number of covariates to be much larger than the sample size. Several new estimation methods have been recently introduced for this…