Related papers: Faster feature selection with a Dropping Forward-B…
The explosion of data in recent years has generated an increasing need for new analysis techniques in order to extract knowledge from massive datasets. Machine learning has proved particularly useful for this task. Fully automated…
Feature selection is a critical step in the analysis of high-dimensional data, where the number of features often vastly exceeds the number of samples. Effective feature selection not only improves model performance and interpretability but…
Feature selection, as a data preprocessing strategy, has been proven to be effective and efficient in preparing data (especially high-dimensional data) for various data mining and machine learning problems. The objectives of feature…
Feature selection is frequently used as a pre-processing step to machine learning. It is a process of choosing a subset of original features so that the feature space is optimally reduced according to a certain evaluation criterion. The…
Automated variable selection is widely applied in statistical model development. Algorithms like forward, backward or stepwise selection are available in statistical software packages like R and SAS. Many researchers have criticized the use…
Forward-backward selection is one of the most basic and commonly-used feature selection algorithms available. It is also general and conceptually applicable to many different types of data. In this paper, we propose a heuristic that…
Variable selection, also known as feature selection in machine learning, plays an important role in modeling high dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential…
I consider unsupervised extensions of the fast stepwise linear regression algorithm \cite{efroymson1960multiple}. These extensions allow one to efficiently identify highly-representative feature variable subsets within a given set of…
In this paper, we present an efficient way of performing stepwise selection in the class of decomposable models. The main contribution of the paper is a simple characterization of the edges that can be added to a decomposable model while…
We propose trace pursuit for model-free variable selection under the sufficient dimension reduction paradigm. Two distinct algorithms are proposed: stepwise trace pursuit and forward trace pursuit. Stepwise trace pursuit achieves selection…
With the flourishing of the information age, massive amounts of data are generated every day. Because these data are large-scale and high-dimensional, it is often difficult to achieve better decision-making in…
Deep learning's success has been attributed to the training of large, overparameterized models on massive amounts of data. As this trend continues, model training has become prohibitively costly, requiring access to powerful computing…
The challenges in feature selection, particularly in balancing model accuracy, interpretability, and computational efficiency, remain a critical issue in advancing machine learning methodologies. To address these complexities, this study…
High-dimensional datasets pose a challenge for learning tasks in data mining and machine learning. Feature selection is an effective technique for dimensionality reduction. It is often an essential data processing step prior…
Many computer vision and medical imaging problems are faced with learning from large-scale datasets, with millions of observations and features. In this paper we propose a novel efficient learning scheme that tightens a sparsity constraint…
The amount of information in the form of features and variables available to machine learning algorithms is ever increasing. This can lead to classifiers that are prone to overfitting in high dimensions; high-dimensional models do not…
Feature selection is one of the most challenging issues in machine learning, especially while working with high dimensional data. In this paper, we address the problem of feature selection and propose a new approach called Evolving Fast and…
Machine learning models usually assume that a set of feature values used to obtain an output is fixed in advance. However, in many real-world problems, a cost is associated with measuring these features. To address the issue of reducing…
Purpose: Machine learning is broadly used for clinical data analysis. Before training a model, a machine learning algorithm must be selected. Also, the values of one or more model parameters termed hyper-parameters must be set. Selecting…
This paper discusses predictive inference and feature selection for generalized linear models with scarce but high-dimensional data. We argue that in many cases one can benefit from a decision theoretically justified two-stage approach:…
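Several of the abstracts above refer to forward, backward, or forward-backward (stepwise) selection. As a rough illustration only, the sketch below shows one common textbook form of the idea: a greedy forward phase that adds the feature with the largest score gain, followed by a backward phase that drops features whose removal costs nothing. It does not reproduce any of the listed papers' methods. The function name `forward_backward_select`, the `min_gain` threshold, and the toy score table are all hypothetical and chosen purely for the example.

```python
from typing import Callable, FrozenSet, List, Set, Tuple


def forward_backward_select(
    features: List[str],
    score: Callable[[FrozenSet[str]], float],
    min_gain: float = 1e-9,  # hypothetical threshold for a "real" improvement
) -> Tuple[Set[str], float]:
    """Greedy forward phase followed by a backward phase.

    `score` maps a feature subset to a number where higher is better.
    """
    selected: Set[str] = set()
    best = score(frozenset(selected))

    # Forward phase: repeatedly add the single feature that helps most.
    while True:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        top_score, top_f = max(
            ((score(frozenset(selected | {f})), f) for f in candidates),
            key=lambda pair: pair[0],
        )
        if top_score - best <= min_gain:
            break  # no candidate improves the score enough
        selected.add(top_f)
        best = top_score

    # Backward phase: drop features whose removal does not lower the score.
    while selected:
        top_score, top_f = max(
            ((score(frozenset(selected - {f})), f) for f in sorted(selected)),
            key=lambda pair: pair[0],
        )
        if best - top_score > min_gain:
            break  # every removal would hurt
        selected.remove(top_f)
        best = top_score

    return selected, best


# Toy score table (made up for illustration): "a" looks best alone,
# but "b" and "c" together make it redundant.
TOY_SCORES = {
    frozenset(): 0.0,
    frozenset("a"): 0.6,
    frozenset("b"): 0.5,
    frozenset("c"): 0.5,
    frozenset("ab"): 0.8,
    frozenset("ac"): 0.8,
    frozenset("bc"): 1.0,
    frozenset("abc"): 1.0,
}


def toy_score(subset: FrozenSet[str]) -> float:
    return TOY_SCORES[subset]


selected, best = forward_backward_select(["a", "b", "c"], toy_score)
print(sorted(selected), best)
```

In this toy run the forward phase picks all three features, and the backward phase then removes "a" because the pair "b", "c" already reaches the same score. In practice the score would typically be a cross-validated metric or an information criterion rather than a lookup table.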