Related papers: Sparse-Input Neural Network using Group Concave Re…
Neural networks are usually not the tool of choice for nonparametric high-dimensional problems where the number of input features is much larger than the number of observations. Though neural networks can approximate complex multivariate…
Classification with a sparsity constraint on the solution plays a central role in many high dimensional machine learning applications. In some cases, the features can be grouped together so that entire subsets of features can be selected or…
In genomic analysis, biomarker discovery, image recognition, and other systems involving machine learning, input variables can often be organized into different groups by their source or semantic category. Eliminating some groups of…
Much work has been done recently to make neural networks more interpretable, and one obvious approach is to arrange for the network to use only a subset of the available features. In linear models, Lasso (or $\ell_1$-regularized) regression…
Sparse modelling or model selection with categorical data is challenging even for a moderate number of variables, because one parameter is roughly needed to encode one category or level. The Group Lasso is a well known efficient algorithm…
In this paper, we consider the joint task of simultaneously optimizing (i) the weights of a deep neural network, (ii) the number of neurons for each hidden layer, and (iii) the subset of active input features (i.e., feature selection).…
Feature selection is one of the most decisive tools in understanding data and machine learning models. Among other methods, sparsity induced by $L^{1}$ penalty is one of the simplest and best studied approaches to this problem. Although…
Grouping structures arise naturally in many statistical modeling problems. Several methods have been proposed for variable selection that respect grouping structure in variables. Examples include the group LASSO and several concave group…
Feature selection in learning to rank has recently emerged as a crucial issue. Whereas several preprocessing approaches have been proposed, only a few works have been focused on integrating the feature selection into the learning process.…
Feature selection is important step in machine learning since it has shown to improve prediction accuracy while depressing the curse of dimensionality of high dimensional data. The neural networks have experienced tremendous success in…
Sparse feature selection has been demonstrated to be effective in handling high-dimensional data. While promising, most of the existing works use convex methods, which may be suboptimal in terms of the accuracy of feature selection and…
Deepening and widening convolutional neural networks (CNNs) significantly increases the number of trainable weight parameters by adding more convolutional layers and feature maps per layer, respectively. By imposing inter- and intra-group…
Despite widespread adoption in practice, guarantees for the LASSO and Group LASSO are strikingly lacking in settings beyond statistical problems, and these algorithms are usually considered to be a heuristic in the context of sparse convex…
The popular Lasso approach for sparse estimation can be derived via marginalization of a joint density associated with a particular stochastic model. A different marginalization of the same probabilistic model leads to a different…
One of the most important steps toward interpretability and explainability of neural network models is feature selection, which aims to identify the subset of relevant features. Theoretical results in the field have mostly focused on the…
One main obstacle for the wide use of deep learning in medical and engineering sciences is its interpretability. While neural network models are strong tools for making predictions, they often provide little information about which features…
High-dimensional learning problems, where the number of features exceeds the sample size, often require sparse regularization for effective prediction and variable selection. While established for fully supervised data, these techniques…
We present a novel feature selection technique, Sparse Linear Centroid-Encoder (SLCE). The algorithm uses a linear transformation to reconstruct a point as its class centroid and, at the same time, uses the $\ell_1$-norm penalty to filter…
Many data sets consist of variables with an inherent group structure. The problem of group selection has been well studied, but in this paper, we seek to do the opposite: our goal is to select at least one variable from each group in the…
In a deep neural network (DNN), the number of the parameters is usually huge to get high learning performances. For that reason, it costs a lot of memory and substantial computational resources, and also causes overfitting. It is known that…