Related papers: Variable selection through CART

Variable Selection in Causal Inference Using Penalization

In the causal adjustment setting, variable selection techniques based on either the outcome or treatment allocation model can result in the omission of confounders or the inclusion of spurious variables in the propensity score. We propose a…

Statistics Theory · Mathematics 2014-06-06 Ashkan Ertefaie, Masoud Asgharian, David A. Stephens

Variable selection for partially linear single-index varying-coefficient model

This paper focuses on variable selection for a partially linear single-index varying-coefficient model. A regularized variable selection procedure by combining basis function approximations with SCAD penalty is proposed. It can…

Statistics Theory · Mathematics 2024-12-19 Lijuan Han, Liugen Xue, Junshan Xie

Optimal randomized classification trees

Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and…

Machine Learning · Statistics 2021-10-25 Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río, Dolores Romero Morales

Variable Selection in Causal Inference using a Simultaneous Penalization Method

In the causal adjustment setting, variable selection techniques based on either the outcome or the treatment allocation model alone can result in the omission of confounders, which leads to bias, or the inclusion of spurious variables, which…

Methodology · Statistics 2015-11-30 Ashkan Ertefaie, Masoud Asgharian, David Stephens

Addressing both variable selection and misclassified responses with parametric and semiparametric methods

While variable selection has received extensive attention in the literature, it remains underexplored in the presence of response measurement error. In this paper, we investigate this important problem within the context of…

Methodology · Statistics 2026-03-17 Hui Guo, Grace Y. Yi, Boyu Wang

Propensity score estimation using classification and regression trees in the presence of missing covariate data

Data mining and machine learning techniques such as classification and regression trees (CART) represent a promising alternative to conventional logistic regression for propensity score estimation. Whereas incomplete data preclude the…

Machine Learning · Statistics 2018-07-26 Bas B. L. Penning de Vries, Maarten van Smeden, Rolf H. H. Groenwold

Clustering and variable selection for categorical multivariate data

This article investigates unsupervised classification techniques for categorical multivariate data. The study employs multivariate multinomial mixture modeling, which is a type of model particularly applicable to multilocus genotypic data.…

Statistics Theory · Mathematics 2014-03-11 Dominique Bontemps, Wilson Toussile

A method for variable selection in a multivariate functional linear regression model

We propose a new variable selection procedure for a functional linear model with multiple scalar responses and multiple functional predictors. This method is based on basis expansions of the involved functional predictors and coefficients…

Statistics Theory · Mathematics 2023-11-03 Alban Mina Mbina, Guy Martial Nkiet

Variable Selection Using Bayesian Additive Regression Trees

Variable selection is an important statistical problem. This problem becomes more challenging when the candidate predictors are of mixed type (e.g. continuous and binary) and impact the response variable in nonlinear and/or non-additive…

Methodology · Statistics 2021-12-30 Chuji Luo, Michael J. Daniels

Adaptive Covariance Estimation with model selection

We provide in this paper a fully adaptive penalized procedure to select a covariance among a collection of models observing i.i.d. replications of the process at fixed observation points. For this we generalize previous results of Bigot and…

Statistics Theory · Mathematics 2012-03-05 Rolando Biscay, Hélène Lescornel, Jean-Michel Loubes

Variable selection for model-based clustering using the integrated complete-data likelihood

Variable selection in cluster analysis is important yet challenging. It can be achieved by regularization methods, which realize a trade-off between the clustering accuracy and the number of selected variables by using a lasso-type penalty.…

Methodology · Statistics 2016-12-23 Marbac Matthieu, Sedki Mohammed

Tree-Values: selective inference for regression trees

We consider conducting inference on the output of the Classification and Regression Tree (CART) [Breiman et al., 1984] algorithm. A naive approach to inference that does not account for the fact that the tree was estimated from the data…

Methodology · Statistics 2022-10-19 Anna C. Neufeld, Lucy L. Gao, Daniela M. Witten

Variable selection in measurement error models

Measurement error data or errors-in-variable data have been collected in many studies. Natural criterion functions are often unavailable for general functional measurement error models due to the lack of information on the distribution of…

Statistics Theory · Mathematics 2010-02-24 Yanyuan Ma, Runze Li

Variable Selection for Linear Regression Imputation in Surveys

Survey sampling is concerned with the estimation of finite population parameters. In practice, survey data suffer from item nonresponse, which is commonly handled through imputation, i.e., replacing missing values with predicted values. As…

Methodology · Statistics 2026-03-06 Ziming An, Mehdi Dagdoug, David Haziza

A Pathwise Algorithm for Covariance Selection

Covariance selection seeks to estimate a covariance matrix by maximum likelihood while restricting the number of nonzero inverse covariance matrix coefficients. A single penalty parameter usually controls the tradeoff between log likelihood…

Optimization and Control · Mathematics 2010-10-12 Vijay Krishnamurthy, Alexandre d'Aspremont

Selection of variables and decision boundaries for functional data via bi-level selection

Sparsity-inducing penalties are useful tools for variable selection and they are also effective for regression settings where the data are functions. We consider the problem of selecting not only variables but also decision boundaries in…

Methodology · Statistics 2020-06-01 Hidetoshi Matsui

Variable Selection with Exponential Weights and $l_0$-Penalization

In the context of a linear model with a sparse coefficient vector, exponential weights methods have been shown to achieve oracle inequalities for prediction. We show that such methods also succeed at variable selection and estimation…

Statistics Theory · Mathematics 2012-09-18 Ery Arias-Castro, Karim Lounici

Sparse learning with CART

Decision trees with binary splits are popularly constructed using Classification and Regression Trees (CART) methodology. For regression models, this approach recursively divides the data into two near-homogenous daughter nodes according to…

Machine Learning · Statistics 2020-11-20 Jason M. Klusowski

Controlling FSR in Selective Classification

Uncertainty quantification and false selection error rate (FSR) control are crucial in many high-consequence scenarios, so we need models with good interpretability. This article introduces the optimality function for the binary…

Statistics Theory · Mathematics 2023-11-08 Guanlan Zhao, Zhonggen Su

Variable selection in the joint frailty model of recurrent and terminal events using Broken Adaptive Ridge regression

We introduce a novel method to simultaneously perform variable selection and estimation in the joint frailty model of recurrent and terminal events using the Broken Adaptive Ridge Regression penalty. The BAR penalty can be summarized as an…

Methodology · Statistics 2024-09-04 Christian Chan, Fatemeh Mahmoudi, Chel Hee Lee, Quan Long, Xuewen Lu