Related papers: SWAG: A Wrapper Method for Sparse Learning
Feature selection is a critical component in predictive analytics that significantly affects the prediction accuracy and interpretability of models. Intrinsic methods for feature selection are built directly into model learning, providing a…
In dictionary learning, also known as sparse coding, the algorithm is given samples of the form $y = Ax$ where $x\in \mathbb{R}^m$ is an unknown random sparse vector and $A$ is an unknown dictionary matrix in $\mathbb{R}^{n\times m}$…
In this thesis we discuss machine learning methods performing automated variable selection for learning sparse predictive models. There are multiple reasons for promoting sparsity in the predictive models. By relying on a limited set of…
Supervised machine learning often requires large training sets to train accurate models, yet obtaining large amounts of labeled data is not always feasible. Hence, it becomes crucial to explore active learning methods for reducing the size…
Feature selection is an intractable problem, therefore practical algorithms often trade off the solution accuracy against the computation time. In this paper, we propose a novel multi-stage feature selection framework utilizing multiple…
The graph classification problem has been widely studied; however, achieving an interpretable model with high predictive performance remains a challenging issue. This paper proposes an interpretable classification algorithm for attributed…
Process Outcome Prediction entails predicting a discrete property of an unfinished process instance from its partial trace. High-capacity outcome predictors discovered with ensemble and deep learning methods have been shown to achieve top…
In high-dimensional settings, sparse structures are critical for efficiency in term of memory and computation complexity. For a linear system, to find the sparsest solution provided with an over-complete dictionary of features directly is…
Classification with a sparsity constraint on the solution plays a central role in many high dimensional machine learning applications. In some cases, the features can be grouped together so that entire subsets of features can be selected or…
We propose a new approach for metric learning by framing it as learning a sparse combination of locally discriminative metrics that are inexpensive to generate from the training data. This flexible framework allows us to naturally derive…
In this paper, a sparsity-aware adaptive algorithm for distributed learning in diffusion networks is developed. The algorithm follows the set-theoretic estimation rationale. At each time instance and at each node of the network, a closed…
Sparse additive models have attracted much attention in high-dimensional data analysis due to their flexible representation and strong interpretability. However, most existing models are limited to single-level learning under the…
In several applications, input samples are more naturally represented in terms of similarities between each other, rather than in terms of feature vectors. In these settings, machine-learning algorithms can become very computationally…
Balancing predictive power and interpretability has long been a challenging research area, particularly in powerful yet complex models like neural networks, where nonlinearity obstructs direct interpretation. This paper introduces a novel…
Sparse regression and classification estimators that respect group structures have application to an assortment of statistical and machine learning problems, from multitask learning to sparse additive modeling to hierarchical selection.…
Accuracy predictor is trained to predict the validation accuracy of an network from its architecture encoding. It can effectively assist in designing networks and improving Neural Architecture Search(NAS) efficiency. However, a…
Many modern big data applications feature large scale in both numbers of responses and predictors. Better statistical efficiency and scientific insights can be enabled by understanding the large-scale response-predictor association network…
Background and objective: Stacking is an ensemble machine learning method that averages predictions from multiple other algorithms, such as generalized linear models and regression trees. An implementation of stacking, called super…
We study the problem of minimizing the average of a very large number of smooth functions, which is of key importance in training supervised learning models. One of the most celebrated methods in this context is the SAGA algorithm. Despite…
We propose a novel sparse preference learning/ranking algorithm. Our algorithm approximates the true utility function by a weighted sum of basis functions using the squared loss on pairs of data points, and is a generalization of the kernel…