Related papers: Explaining Classification Models Built on High-Dim…

To Explain or to Predict?

Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction, and description. In many disciplines there is near-exclusive use of statistical modeling for causal explanation and the…

Methodology · Statistics 2011-01-06 Galit Shmueli

Feature Importance Depends on Properties of the Data: Towards Choosing the Correct Explanations for Your Data and Decision Trees based Models

In order to ensure the reliability of the explanations of machine learning models, it is crucial to establish their advantages and limits and in which case each of these methods outperform. However, the current understanding of when and how…

Machine Learning · Computer Science 2025-02-12 Célia Wafa Ayad, Thomas Bonnier, Benjamin Bosch, Sonali Parbhoo, Jesse Read

Obtaining Explainable Classification Models using Distributionally Robust Optimization

Model explainability is crucial for human users to be able to interpret how a proposed classifier assigns labels to data based on its feature values. We study generalized linear models constructed using sets of feature value rules, which…

Machine Learning · Statistics 2023-11-06 Sanjeeb Dash, Soumyadip Ghosh, Joao Goncalves, Mark S. Squillante

Conditional Sparse Linear Regression

Machine learning and statistics typically focus on building models that capture the vast majority of the data, possibly ignoring a small subset of data as "noise" or "outliers." By contrast, here we consider the problem of jointly…

Machine Learning · Computer Science 2016-08-19 Brendan Juba

On Counterfactual Explanations under Predictive Multiplicity

Counterfactual explanations are usually obtained by identifying the smallest change made to an input to change a prediction made by a fixed model (hereafter called sparse methods). Recent work, however, has revitalized an old insight: there…

Machine Learning · Computer Science 2020-06-24 Martin Pawelczyk, Klaus Broelemann, Gjergji Kasneci

Factor models and variable selection in high-dimensional regression analysis

The paper considers linear regression problems where the number of predictor variables is possibly larger than the sample size. The basic motivation of the study is to combine the points of view of model selection and functional regression…

Statistics Theory · Mathematics 2012-02-24 Alois Kneip, Pascal Sarda

Sparse Learning for Variable Selection with Structures and Nonlinearities

In this thesis we discuss machine learning methods performing automated variable selection for learning sparse predictive models. There are multiple reasons for promoting sparsity in the predictive models. By relying on a limited set of…

Machine Learning · Computer Science 2019-03-27 Magda Gregorova

Bayesian inference in high-dimensional models

Models with dimension more than the available sample size are now commonly used in various applications. A sensible inference is possible using a lower-dimensional structure. In regression problems with a large number of predictors, the…

Statistics Theory · Mathematics 2025-11-25 Sayantan Banerjee, Ismaël Castillo, Subhashis Ghosal

On the (In)Significance of Feature Selection in High-Dimensional Datasets

Feature selection (FS) is assumed to improve predictive performance and identify meaningful features in high-dimensional datasets. Surprisingly, small random subsets of features (0.02-1%) match or outperform the predictive performance of…

Machine Learning · Computer Science 2025-09-22 Bhavesh Neekhra, Debayan Gupta, Partha Pratim Chakrabarti

Probabilistic Sufficient Explanations

Understanding the behavior of learned classifiers is an important task, and various black-box explanations, logical reasoning approaches, and model-specific methods have been proposed. In this paper, we introduce probabilistic sufficient…

Machine Learning · Computer Science 2021-05-24 Eric Wang, Pasha Khosravi, Guy Van den Broeck

Interpreting random forest classification models using a feature contribution method

Model interpretation is one of the key aspects of the model evaluation process. The explanation of the relationship between model variables and outputs is relatively easy for statistical models, such as linear regressions, thanks to the…

Machine Learning · Computer Science 2013-12-05 Anna Palczewska, Jan Palczewski, Richard Marchese Robinson, Daniel Neagu

Revisiting Sparse Convolutional Model for Visual Recognition

Despite strong empirical performance for image classification, deep neural networks are often regarded as "black boxes" and they are difficult to interpret. On the other hand, sparse convolutional models, which assume that a signal can be…

Computer Vision and Pattern Recognition · Computer Science 2022-10-25 Xili Dai, Mingyang Li, Pengyuan Zhai, Shengbang Tong, Xingjian Gao, Shao-Lun Huang, Zhihui Zhu, Chong You, Yi Ma

Sharing pattern submodels for prediction with missing values

Missing values are unavoidable in many applications of machine learning and present challenges both during training and at test time. When variables are missing in recurring patterns, fitting separate pattern submodels has been proposed as…

Machine Learning · Computer Science 2023-11-27 Lena Stempfle, Ashkan Panahi, Fredrik D. Johansson

On the challenges of learning with inference networks on sparse, high-dimensional data

We study parameter estimation in Nonlinear Factor Analysis (NFA) where the generative model is parameterized by a deep neural network. Recent work has focused on learning such models using inference (or recognition) networks; we identify a…

Machine Learning · Statistics 2017-10-18 Rahul G. Krishnan, Dawen Liang, Matthew Hoffman

The Hidden Assumptions Behind Counterfactual Explanations and Principal Reasons

Counterfactual explanations are gaining prominence within technical, legal, and business circles as a way to explain the decisions of a machine learning model. These explanations share a trait with the long-established "principal reason"…

Computers and Society · Computer Science 2019-12-12 Solon Barocas, Andrew D. Selbst, Manish Raghavan

Sufficient and Necessary Explanations (and What Lies in Between)

As complex machine learning models continue to find applications in high-stakes decision-making scenarios, it is crucial that we can explain and understand their predictions. Post-hoc explanation methods provide useful insights by…

Machine Learning · Statistics 2024-10-16 Beepul Bharti, Paul Yi, Jeremias Sulam

Cross-Leverage Scores for Selecting Subsets of Explanatory Variables

In a standard regression problem, we have a set of explanatory variables whose effect on some response vector is modeled. For wide binary data, such as genetic marker data, we often have two limitations. First, we have more parameters than…

Methodology · Statistics 2021-09-20 Katharina Parry, Leo N. Geppert, Alexander Munteanu, Katja Ickstadt

Bridging factor and sparse models

Factor and sparse models are two widely used methods to impose a low-dimensional structure in high-dimensions. However, they are seemingly mutually exclusive. We propose a lifting method that combines the merits of these two models in a…

Econometrics · Economics 2022-09-07 Jianqing Fan, Ricardo Masini, Marcelo C. Medeiros

Projective Inference in High-dimensional Problems: Prediction and Feature Selection

This paper discusses predictive inference and feature selection for generalized linear models with scarce but high-dimensional data. We argue that in many cases one can benefit from a decision theoretically justified two-stage approach:…

Machine Learning · Statistics 2020-11-09 Juho Piironen, Markus Paasiniemi, Aki Vehtari

IGANN Sparse: Bridging Sparsity and Interpretability with Non-linear Insight

Feature selection is a critical component in predictive analytics that significantly affects the prediction accuracy and interpretability of models. Intrinsic methods for feature selection are built directly into model learning, providing a…

Machine Learning · Computer Science 2024-03-19 Theodor Stoecker, Nico Hambauer, Patrick Zschech, Mathias Kraus