Related papers: Generating Redundant Features with Unsupervised Mu…

Automatically Redundant Features Removal for Unsupervised Feature Selection via Sparse Feature Graph

The redundant features existing in high dimensional datasets always affect the performance of learning and mining algorithms. How to detect and remove them is an important research topic in machine learning and data mining research. In this…

Machine Learning · Computer Science 2017-07-04 Shuchu Han, Hao Huang, Hong Qin

Transforming Datasets to Requested Complexity with Projection-based Many-Objective Genetic Algorithm

The research community continues to seek increasingly more advanced synthetic data generators to reliably evaluate the strengths and limitations of machine learning methods. This work aims to increase the availability of datasets…

Machine Learning · Computer Science 2026-01-30 Joanna Komorniczak

Multi-Objective Genetic Programming Projection Pursuit for Exploratory Data Modeling

For classification problems, feature extraction is a crucial process which aims to find a suitable data representation that increases the performance of the machine learning algorithm. According to the curse of dimensionality theorem, the…

Machine Learning · Computer Science 2010-10-12 Ilknur Icke, Andrew Rosenberg

Random Subspace with Trees for Feature Selection Under Memory Constraints

Dealing with datasets of very high dimension is a major challenge in machine learning. In this paper, we consider the problem of feature selection in applications where the memory is not large enough to contain all features. In this…

Machine Learning · Statistics 2017-09-07 Antonio Sutera, Célia Châtel, Gilles Louppe, Louis Wehenkel, Pierre Geurts

A Novel Memetic Feature Selection Algorithm

Feature selection is a problem of finding efficient features among all features in which the final feature set can improve accuracy and reduce complexity. In feature selection algorithms search strategies are key aspects. Since feature…

Machine Learning · Computer Science 2016-01-27 Mohadeseh Montazeri, Hamid Reza Naji, Mitra Montazeri, Ahmad Faraahi

Neuro-Symbolic Embedding for Short and Effective Feature Selection via Autoregressive Generation

Feature selection aims to identify the optimal feature subset for enhancing downstream models. Effective feature selection can remove redundant features, save computational resources, accelerate the model learning process, and improve the…

Machine Learning · Computer Science 2024-12-19 Nanxu Gong, Wangyang Ying, Dongjie Wang, Yanjie Fu

Knowledge-Based Learning through Feature Generation

Machine learning algorithms have difficulties to generalize over a small set of examples. Humans can perform such a task by exploiting vast amount of background knowledge they possess. One method for enhancing learning algorithms with…

Machine Learning · Computer Science 2020-06-09 Michal Badian, Shaul Markovitch

A Novel Community Detection Based Genetic Algorithm for Feature Selection

The selection of features is an essential data preprocessing stage in data mining. The core principle of feature selection seems to be to pick a subset of possible features by excluding features with almost no predictive information as well…

Machine Learning · Computer Science 2020-08-11 Mehrdad Rostami, Kamal Berahmand, Saman Forouzandeh

Nonparametric Feature Selection by Random Forests and Deep Neural Networks

Random forests are a widely used machine learning algorithm, but their computational efficiency is undermined when applied to large-scale datasets with numerous instances and useless features. Herein, we propose a nonparametric feature…

Machine Learning · Computer Science 2022-01-19 Xiaojun Mao, Liuhua Peng, Zhonglei Wang

Feature Selection as Deep Sequential Generative Learning

Feature selection aims to identify the most pattern-discriminative feature subset. In prior literature, filter (e.g., backward elimination) and embedded (e.g., Lasso) methods have hyperparameters (e.g., top-K, score thresholding) and tie to…

Machine Learning · Computer Science 2024-03-07 Wangyang Ying, Dongjie Wang, Haifeng Chen, Yanjie Fu

Feature Selection for Regression Problems Based on the Morisita Estimator of Intrinsic Dimension

Data acquisition, storage and management have been improved, while the key factors of many phenomena are not well known. Consequently, irrelevant and redundant features artificially increase the size of datasets, which complicates learning…

Machine Learning · Statistics 2017-04-05 Jean Golay, Michael Leuenberger, Mikhail Kanevski

Deep Feature Selection Using a Novel Complementary Feature Mask

Feature selection has drawn much attention over the last decades in machine learning because it can reduce data dimensionality while maintaining the original physical meaning of features, which enables better interpretability than feature…

Machine Learning · Computer Science 2022-09-27 Yiwen Liao, Jochen Rivoir, Raphaël Latty, Bin Yang

Retrieval-Augmented Feature Generation for Domain-Specific Classification

Feature generation can significantly enhance learning outcomes, particularly for tasks with limited data. An effective way to improve feature generation is to expand the current feature space using existing features and enriching the…

Computation and Language · Computer Science 2025-11-11 Xinhao Zhang, Jinghan Zhang, Fengran Mo, Dakshak Keerthi Chandra, Yu-Zhong Chen, Fei Xie, Kunpeng Liu

A Novel Approach for Stable Selection of Informative Redundant Features from High Dimensional fMRI Data

Feature selection is among the most important components because it not only helps enhance the classification accuracy, but also or even more important provides potential biomarker discovery. However, traditional multivariate methods is…

Computer Vision and Pattern Recognition · Computer Science 2016-05-26 Yilun Wang, Zhiqiang Li, Yifeng Wang, Xiaona Wang, Junjie Zheng, Xujuan Duan, Huafu Chen

Consistent Feature Construction with Constrained Genetic Programming for Experimental Physics

A good feature representation is a determinant factor to achieve high performance for many machine learning algorithms in terms of classification. This is especially true for techniques that do not build complex internal representations of…

Neural and Evolutionary Computing · Computer Science 2019-08-22 Noëlie Cherrier, Jean-Philippe Poli, Maxime Defurne, Franck Sabatié

Toward better feature weighting algorithms: a focus on Relief

Feature weighting algorithms try to solve a problem of great importance nowadays in machine learning: The search of a relevance measure for the features of a given domain. This relevance is primarily used for feature selection as feature…

Machine Learning · Computer Science 2015-09-17 Gabriel Prat Masramon, Lluís A. Belanche Muñoz

On the utility of feature selection in building two-tier decision trees

Nowadays, feature selection is frequently used in machine learning when there is a risk of performance degradation due to overfitting or when computational resources are limited. During the feature selection process, the subset of features…

Machine Learning · Computer Science 2023-01-02 Sergey A. Saltykov

Recursive Feature Generation for Knowledge-based Learning

When humans perform inductive learning, they often enhance the process with background knowledge. With the increasing availability of well-formed collaborative knowledge bases, the performance of learning algorithms could be significantly…

Artificial Intelligence · Computer Science 2018-02-02 Lior Friedman, Shaul Markovitch

Generating Diverse Synthetic Datasets for Evaluation of Real-life Recommender Systems

Synthetic datasets are important for evaluating and testing machine learning models. When evaluating real-life recommender systems, high-dimensional categorical (and sparse) datasets are often considered. Unfortunately, there are not many…

Information Retrieval · Computer Science 2024-12-11 Miha Malenšek, Blaž Škrlj, Blaž Mramor, Jure Demšar

Testing for Feature Relevance: The HARVEST Algorithm

Feature selection with high-dimensional data and a very small proportion of relevant features poses a severe challenge to standard statistical methods. We have developed a new approach (HARVEST) that is straightforward to apply, albeit…

Machine Learning · Statistics 2018-03-01 Herbert Weisberg, Victor Pontes, Mathis Thoma