Related papers: ControlBurn: Feature Selection by Sparse Forests

Feature Selection Methods for Cost-Constrained Classification in Random Forests

Cost-sensitive feature selection describes a feature selection problem in which features incur individual costs for inclusion in a model. These costs make it possible to incorporate disfavored aspects of features, e.g. failure rates of as measuring…

Machine Learning · Statistics 2020-08-18 Rudolf Jagdhuber, Michel Lang, Jörg Rahnenführer

High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso

The goal of supervised feature selection is to find a subset of input features that are responsible for predicting output values. The least absolute shrinkage and selection operator (Lasso) allows computationally efficient feature selection…

Machine Learning · Statistics 2019-01-07 Makoto Yamada, Wittawat Jitkrittum, Leonid Sigal, Eric P. Xing, Masashi Sugiyama

Fused Lasso for Feature Selection using Structural Information

Feature selection has been proven a powerful preprocessing step for high-dimensional data analysis. However, most state-of-the-art methods tend to overlook the structural correlation information between pairwise samples, which may…

Machine Learning · Computer Science 2019-07-02 Lu Bai, Lixin Cui, Yue Wang, Philip S. Yu, Edwin R. Hancock

ABM: an automatic supervised feature engineering method for loss based models based on group and fused lasso

A vital problem in solving classification or regression problems is applying feature engineering and variable selection to data before it is fed into models. One of the most popular feature engineering methods is to discretize continuous variable…

Applications · Statistics 2020-09-23 Weijian Luo, Yongxian Long

Forest Kernel Balancing Weights: Outcome-Guided Features for Causal Inference

While balancing covariates between groups is central for observational causal inference, selecting which features to balance remains a challenging problem. Kernel balancing is a promising approach that first estimates a kernel that captures…

Methodology · Statistics 2025-12-15 Andy A. Shen, Eli Ben-Michael, Avi Feller, Luke Keele, Jared Murray

FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data With Correlated Features

This paper proposes FREEtree, a tree-based method for high dimensional longitudinal data with correlated features. Popular machine learning approaches, like Random Forests, commonly used for variable selection do not perform well when there…

Machine Learning · Statistics 2020-06-18 Yuancheng Xu, Athanasse Zafirov, R. Michael Alvarez, Dan Kojis, Min Tan, Christina M. Ramirez

Layer Pruning with Consensus: A Triple-Win Solution

Layer pruning offers a promising alternative to standard structured pruning, effectively reducing computational costs, latency, and memory footprint. While notable layer-pruning approaches aim to detect unimportant layers for removal, they…

Machine Learning · Computer Science 2025-08-25 Leandro Giusti Mugnaini, Carolina Tavares Duarte, Anna H. Reali Costa, Artur Jordao

In many high dimensional classification or regression problems set in a biological context, the complete identification of the set of informative features is often as important as predictive accuracy, since this can provide mechanistic…

Machine Learning · Computer Science 2020-03-02 Yuxin Sun, Benny Chain, Samuel Kaski, John Shawe-Taylor

Quick and Robust Feature Selection: the Strength of Energy-efficient Sparse Training for Autoencoders

Major complications arise from the recent increase in the amount of high-dimensional data, including high computational costs and memory requirements. Feature selection, which identifies the most relevant and informative attributes of a…

Machine Learning · Computer Science 2021-09-14 Zahra Atashgahi, Ghada Sokar, Tim van der Lee, Elena Mocanu, Decebal Constantin Mocanu, Raymond Veldhuis, Mykola Pechenizkiy

End-to-end Feature Selection Approach for Learning Skinny Trees

We propose a new optimization-based approach for feature selection in tree ensembles, an important problem in statistics and machine learning. Popular tree ensemble toolkits e.g., Gradient Boosted Trees and Random Forests support feature…

Machine Learning · Computer Science 2025-04-08 Shibal Ibrahim, Kayhan Behdin, Rahul Mazumder

LCEN: A Nonlinear, Interpretable Feature Selection and Machine Learning Algorithm

Interpretable models can have advantages over black-box models, and interpretability is essential for the application of machine learning in critical settings, such as aviation or medicine. This article introduces the LASSO-Clip-EN (LCEN)…

Machine Learning · Computer Science 2025-12-02 Pedro Seber, Richard D. Braatz

Improving Performance of a Group of Classification Algorithms Using Resampling and Feature Selection

In recent years, finding a meaningful pattern in huge datasets has become both more important and more challenging. Data miners try to adopt innovative methods to face this problem by applying feature selection methods. In this paper we propose…

Machine Learning · Computer Science 2014-03-11 Mehdi Naseriparsa, Amir-masoud Bidgoli, Touraj Varaee

Cross-Cluster Weighted Forests

Adapting machine learning algorithms to better handle the presence of clusters or batch effects within training datasets is important across a wide variety of biological applications. This article considers the effect of ensembling Random…

Machine Learning · Statistics 2025-04-01 Maya Ramchandran, Rajarshi Mukherjee, Giovanni Parmigiani

Non-uniform Feature Sampling for Decision Tree Ensembles

We study the effectiveness of non-uniform randomized feature selection in decision tree classification. We experimentally evaluate two feature selection methodologies, based on information extracted from the provided dataset: $(i)$…

Machine Learning · Statistics 2014-03-25 Anastasios Kyrillidis, Anastasios Zouzias

Theoretical and Empirical Advances in Forest Pruning

Regression forests have long delivered state-of-the-art accuracy, often outperforming regression trees and even neural networks, but they suffer from limited interpretability as ensemble methods. In this work, we revisit forest pruning, an…

Machine Learning · Statistics 2025-03-10 Albert Dorador

A concise method for feature selection via normalized frequencies

Feature selection is an important part of building a machine learning model. By eliminating redundant or misleading features from data, the machine learning model can achieve better performance while reducing the demand on computing…

Machine Learning · Computer Science 2021-06-11 Song Tan, Xia He

Classification with Sparse Overlapping Groups

Classification with a sparsity constraint on the solution plays a central role in many high dimensional machine learning applications. In some cases, the features can be grouped together so that entire subsets of features can be selected or…

Machine Learning · Computer Science 2014-09-05 Nikhil Rao, Robert Nowak, Christopher Cox, Timothy Rogers

Relevant based structure learning for feature selection

Feature selection is an important task in many problems occurring in pattern recognition, bioinformatics, machine learning and data mining applications. The feature selection approach enables us to reduce the computation burden and the…

Machine Learning · Computer Science 2016-08-30 Hadi Zare, Mojtaba Niazi

Learning Feature Nonlinearities with Non-Convex Regularized Binned Regression

For various applications, the relations between the dependent and independent variables are highly nonlinear. Consequently, for large scale complex problems, neural networks and regression trees are commonly preferred over linear models…

Machine Learning · Computer Science 2017-05-23 Samet Oymak, Mehrdad Mahdavi, Jiasi Chen

Interpretable Selection and Visualization of Features and Interactions Using Bayesian Forests

It is becoming increasingly important for machine learning methods to make predictions that are interpretable as well as accurate. In many practical applications, it is of interest which features and feature interactions are relevant to the…

Machine Learning · Statistics 2016-02-09 Viktoriya Krakovna, Jiong Du, Jun S. Liu