Related papers: Generating Redundant Features with Unsupervised Mu…

200 papers

The redundant features existing in high dimensional datasets always affect the performance of learning and mining algorithms. How to detect and remove them is an important research topic in machine learning and data mining research. In this…

Machine Learning · Computer Science 2017-07-04 Shuchu Han, Hao Huang, Hong Qin

The research community continues to seek increasingly more advanced synthetic data generators to reliably evaluate the strengths and limitations of machine learning methods. This work aims to increase the availability of datasets…

Machine Learning · Computer Science 2026-01-30 Joanna Komorniczak

For classification problems, feature extraction is a crucial process which aims to find a suitable data representation that increases the performance of the machine learning algorithm. According to the curse of dimensionality theorem, the…

Machine Learning · Computer Science 2010-10-12 Ilknur Icke, Andrew Rosenberg

Dealing with datasets of very high dimension is a major challenge in machine learning. In this paper, we consider the problem of feature selection in applications where the memory is not large enough to contain all features. In this…

Machine Learning · Statistics 2017-09-07 Antonio Sutera, Célia Châtel, Gilles Louppe, Louis Wehenkel, Pierre Geurts

Feature selection is a problem of finding efficient features among all features in which the final feature set can improve accuracy and reduce complexity. In feature selection algorithms search strategies are key aspects. Since feature…

Machine Learning · Computer Science 2016-01-27 Mohadeseh Montazeri, Hamid Reza Naji, Mitra Montazeri, Ahmad Faraahi

Feature selection aims to identify the optimal feature subset for enhancing downstream models. Effective feature selection can remove redundant features, save computational resources, accelerate the model learning process, and improve the…

Machine Learning · Computer Science 2024-12-19 Nanxu Gong, Wangyang Ying, Dongjie Wang, Yanjie Fu

Machine learning algorithms have difficulty generalizing from a small set of examples. Humans can perform such a task by exploiting the vast amount of background knowledge they possess. One method for enhancing learning algorithms with…

Machine Learning · Computer Science 2020-06-09 Michal Badian, Shaul Markovitch

The selection of features is an essential data preprocessing stage in data mining. The core principle of feature selection seems to be to pick a subset of possible features by excluding features with almost no predictive information as well…

Machine Learning · Computer Science 2020-08-11 Mehrdad Rostami, Kamal Berahmand, Saman Forouzandeh

Random forests are a widely used machine learning algorithm, but their computational efficiency is undermined when applied to large-scale datasets with numerous instances and useless features. Herein, we propose a nonparametric feature…

Machine Learning · Computer Science 2022-01-19 Xiaojun Mao, Liuhua Peng, Zhonglei Wang

Feature selection aims to identify the most pattern-discriminative feature subset. In prior literature, filter (e.g., backward elimination) and embedded (e.g., Lasso) methods have hyperparameters (e.g., top-K, score thresholding) and tie to…

Machine Learning · Computer Science 2024-03-07 Wangyang Ying, Dongjie Wang, Haifeng Chen, Yanjie Fu

Data acquisition, storage and management have been improved, while the key factors of many phenomena are not well known. Consequently, irrelevant and redundant features artificially increase the size of datasets, which complicates learning…

Machine Learning · Statistics 2017-04-05 Jean Golay, Michael Leuenberger, Mikhail Kanevski

Feature selection has drawn much attention over the last decades in machine learning because it can reduce data dimensionality while maintaining the original physical meaning of features, which enables better interpretability than feature…

Machine Learning · Computer Science 2022-09-27 Yiwen Liao, Jochen Rivoir, Raphaël Latty, Bin Yang

Feature generation can significantly enhance learning outcomes, particularly for tasks with limited data. An effective way to improve feature generation is to expand the current feature space using existing features and enriching the…

Computation and Language · Computer Science 2025-11-11 Xinhao Zhang, Jinghan Zhang, Fengran Mo, Dakshak Keerthi Chandra, Yu-Zhong Chen, Fei Xie, Kunpeng Liu

Feature selection is among the most important components because it not only helps enhance classification accuracy but also, perhaps even more importantly, enables potential biomarker discovery. However, traditional multivariate methods are…

Computer Vision and Pattern Recognition · Computer Science 2016-05-26 Yilun Wang, Zhiqiang Li, Yifeng Wang, Xiaona Wang, Junjie Zheng, Xujuan Duan, Huafu Chen

A good feature representation is a determinant factor to achieve high performance for many machine learning algorithms in terms of classification. This is especially true for techniques that do not build complex internal representations of…

Neural and Evolutionary Computing · Computer Science 2019-08-22 Noëlie Cherrier, Jean-Philippe Poli, Maxime Defurne, Franck Sabatié

Feature weighting algorithms try to solve a problem of great importance in machine learning today: the search for a relevance measure for the features of a given domain. This relevance is primarily used for feature selection as feature…

Machine Learning · Computer Science 2015-09-17 Gabriel Prat Masramon, Lluís A. Belanche Muñoz

Nowadays, feature selection is frequently used in machine learning when there is a risk of performance degradation due to overfitting or when computational resources are limited. During the feature selection process, the subset of features…

Machine Learning · Computer Science 2023-01-02 Sergey A. Saltykov

When humans perform inductive learning, they often enhance the process with background knowledge. With the increasing availability of well-formed collaborative knowledge bases, the performance of learning algorithms could be significantly…

Artificial Intelligence · Computer Science 2018-02-02 Lior Friedman, Shaul Markovitch

Synthetic datasets are important for evaluating and testing machine learning models. When evaluating real-life recommender systems, high-dimensional categorical (and sparse) datasets are often considered. Unfortunately, there are not many…

Information Retrieval · Computer Science 2024-12-11 Miha Malenšek, Blaž Škrlj, Blaž Mramor, Jure Demšar

Feature selection with high-dimensional data and a very small proportion of relevant features poses a severe challenge to standard statistical methods. We have developed a new approach (HARVEST) that is straightforward to apply, albeit…

Machine Learning · Statistics 2018-03-01 Herbert Weisberg, Victor Pontes, Mathis Thoma