Related papers: Practical considerations for variable screening in…
The lasso is a popular method to induce shrinkage and sparsity in the solution vector (coefficients) of regression problems, particularly when there are many predictors relative to the number of observations. Solving the lasso in this…
Variable screening methods have been shown to be effective in dimension reduction under the ultra-high dimensional setting. Most existing screening methods are designed to rank the predictors according to their individual contributions to…
In high-dimensional settings, sparse structures are critical for efficiency in term of memory and computation complexity. For a linear system, to find the sparsest solution provided with an over-complete dictionary of features directly is…
This paper is a survey of dictionary screening for the lasso problem. The lasso problem seeks a sparse linear combination of the columns of a dictionary to best match a given target vector. This sparse representation has proven useful in a…
Multi-view stacking is a framework for combining information from different views (i.e. different feature sets) describing the same set of objects. In this framework, a base-learner algorithm is trained on each view separately, and their…
One main obstacle for the wide use of deep learning in medical and engineering sciences is its interpretability. While neural network models are strong tools for making predictions, they often provide little information about which features…
In several applications, input samples are more naturally represented in terms of similarities between each other, rather than in terms of feature vectors. In these settings, machine-learning algorithms can become very computationally…
The development and use of dimension reduction methods is prevalent in modern statistical literature. This paper reviews a class of dimension reduction techniques which aim to simultaneously select relevant predictors and find clusters…
The joint optimization of the reconstruction and classification error is a hard non convex problem, especially when a non linear mapping is utilized. In order to overcome this obstacle, a novel optimization strategy is proposed, in which a…
Recently dictionary screening has been proposed as an effective way to improve the computational efficiency of solving the lasso problem, which is one of the most commonly used method for learning sparse representations. To address today's…
Extracting relevant features from data sets where the number of observations ($n$) is much smaller then the number of predictors ($p$) is a major challenge in modern statistics. Sorted L-One Penalized Estimation (SLOPE), a generalization of…
In variable or graph selection problems, finding a right-sized model or controlling the number of false positives is notoriously difficult. Recently, a meta-algorithm called Stability Selection was proposed that can provide reliable…
Learning in adversarial settings is becoming an important task for application domains where attackers may inject malicious data into the training set to subvert normal operation of data-driven technologies. Feature selection has been…
Sparse learning techniques have been routinely used for feature selection as the resulting model usually has a small number of non-zero entries. Safe screening, which eliminates the features that are guaranteed to have zero coefficients for…
The Lasso has been widely used as a method for variable selection, valued for its simplicity and empirical performance. However, Lasso's selection stability deteriorates in the presence of correlated predictors. Several approaches have been…
Recent computational strategies based on screening tests have been proposed to accelerate algorithms addressing penalized sparse regression problems such as the Lasso. Such approaches build upon the idea that it is worth dedicating some…
We propose a new approach to safe variable preselection in high-dimensional penalized regression, such as the lasso. Preselection - to start with a manageable set of covariates - has often been implemented without clear appreciation of its…
Risk prediction models are widely used to guide real-world decision-making in areas such as healthcare and economics, and they also play a key role in estimating nuisance parameters in semiparametric inference. The super learner is a…
In high dimensional settings, sparse structures are crucial for efficiency, both in term of memory, computation and performance. It is customary to consider $\ell_1$ penalty to enforce sparsity in such scenarios. Sparsity enforcing methods,…
In high-dimensions, many variable selection methods, such as the lasso, are often limited by excessive variability and rank deficiency of the sample covariance matrix. Covariance sparsity is a natural phenomenon in high-dimensional…