Related papers: Supersparse Linear Integer Models for Predictive S…
Scoring systems are classification models that only require users to add, subtract and multiply a few meaningful numbers to make a prediction. These models are often used because they are practical and interpretable. In this paper, we…
Scoring systems are linear classification models that only require users to add, subtract and multiply a few small numbers in order to make a prediction. These models are in widespread use by the medical community, but are difficult to…
Machine Learning models are increasingly used for decision making, in particular in high-stakes applications such as credit scoring, medicine or recidivism prediction. However, there are growing concerns about these models with respect to…
Scoring systems are linear classification models that only require users to add or subtract a few small numbers in order to make a prediction. They are used for example by clinicians to assess the risk of medical conditions. This work…
In this work, we present a novel, machine-learning approach for constructing Multiclass Interpretable Scoring Systems (MISS) - a fully data-driven methodology for generating single, sparse, and user-friendly scoring systems for multiclass…
Linear Mixed Models (LMMs) are important tools in statistical genetics. When used for feature selection, they allow to find a sparse set of genetic traits that best predict a continuous phenotype of interest, while simultaneously correcting…
In machine learning and data mining, linear models have been widely used to model the response as parametric linear functions of the predictors. To relax such stringent assumptions made by parametric linear models, additive models consider…
In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model…
We investigate a long-debated question, which is how to create predictive models of recidivism that are sufficiently accurate, transparent, and interpretable to use for decision-making. This question is complicated as these models are used…
We present ImplicitSLIM, a novel unsupervised learning approach for sparse high-dimensional data, with applications to collaborative filtering. Sparse linear methods (SLIM) and their variations show outstanding performance, but they are…
Sparse linear discriminant analysis via penalized optimal scoring is a successful tool for classification in high-dimensional settings. While the variable selection consistency of sparse optimal scoring has been established, the…
We consider a discrete optimization formulation for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features. Recent work has shown that mixed integer programming (MIP) can be used to…
Linear system identification and sparse dictionary learning can both be seen as structured matrix factorization problems. However, these two problems have historically been studied in isolation by the systems theory and machine learning…
In genomics, differential abundance and expression analyses are complicated by the compositional nature of sequence count data, which reflect only relative-not absolute-abundances or expression levels. Many existing methods attempt to…
In recent years many sparse linear discriminant analysis methods have been proposed for high-dimensional classification and variable selection. However, most of these proposals focus on binary classification and they are not directly…
While the SLIM approach obtained high ranking-accuracy in many experiments in the literature, it is also known for its high computational cost of learning its parameters from data. For this reason, we focus in this paper on variants of…
Sparse methods are the standard approach to obtain interpretable models with high prediction accuracy. Alternatively, algorithmic ensemble methods can achieve higher prediction accuracy at the cost of loss of interpretability. However, the…
We develop tools for selective inference in the setting of group sparsity, including the construction of confidence intervals and p-values for testing selected groups of variables. Our main technical result gives the precise distribution of…
In this thesis we discuss machine learning methods performing automated variable selection for learning sparse predictive models. There are multiple reasons for promoting sparsity in the predictive models. By relying on a limited set of…
Sparse coding approximates the data sample as a sparse linear combination of some basic codewords and uses the sparse codes as new presentations. In this paper, we investigate learning discriminative sparse codes by sparse coding in a…