Related papers: Resampling methods for parameter-free and robust f…

Advances in Feature Selection with Mutual Information

The selection of features that are relevant for a prediction or classification problem is an important problem in many domains involving high-dimensional data. Selecting features helps fight the curse of dimensionality, improving the…

Machine Learning · Computer Science 2009-09-04 Michel Verleysen, Fabrice Rossi, Damien François

Improving Performance of a Group of Classification Algorithms Using Resampling and Feature Selection

In recent years the importance of finding a meaningful pattern from huge datasets has become more challenging. Data miners try to adopt innovative methods to face this problem by applying feature selection methods. In this paper we propose…

Machine Learning · Computer Science 2014-03-11 Mehdi Naseriparsa, Amir-masoud Bidgoli, Touraj Varaee

A Cross-Entropy-based Method to Perform Information-based Feature Selection

From a machine learning point of view, identifying a subset of relevant features from a real data set can be useful to improve the results achieved by classification methods and to reduce their time and space complexity. To achieve this…

Machine Learning · Computer Science 2017-05-23 Pietro Cassara, Alessandro Rozza, Mirco Nanni

Theoretical Evaluation of Feature Selection Methods based on Mutual Information

Feature selection methods are usually evaluated by wrapping specific classifiers and datasets in the evaluation process, resulting very often in unfair comparisons between methods. In this work, we develop a theoretical framework that…

Machine Learning · Statistics 2016-10-11 Cláudia Pascoal, M. Rosário Oliveira, António Pacheco, Rui Valadas

k-fold Subsampling based Sequential Backward Feature Elimination

We present a new wrapper feature selection algorithm for human detection. This algorithm is a hybrid feature selection approach combining the benefits of filter and wrapper methods. It allows the selection of an optimal feature vector that…

Computer Vision and Pattern Recognition · Computer Science 2025-03-24 Jeonghwan Park, Kang Li, Huiyu Zhou

Simultaneous Estimation of Number of Clusters and Feature Sparsity in Clustering High-Dimensional Data

Estimating the number of clusters (K) is a critical and often difficult task in cluster analysis. Many methods have been proposed to estimate K, including some top performers using resampling approach. When performing cluster analysis in…

Methodology · Statistics 2019-09-05 Yujia Li, Xiangrui Zeng, Chien-Wei Lin, George Tseng

A Hybrid Feature Selection Method to Improve Performance of a Group of Classification Algorithms

In this paper a hybrid feature selection method is proposed which takes advantage of wrapper subset evaluation with a lower cost and improves the performance of a group of classifiers. The method uses a combination of sample domain filtering…

Machine Learning · Computer Science 2014-03-12 Mehdi Naseriparsa, Amir-Masoud Bidgoli, Touraj Varaee

Neural Estimators for Conditional Mutual Information Using Nearest Neighbors Sampling

The estimation of mutual information (MI) or conditional mutual information (CMI) from a set of samples is a long-standing problem. A recent line of work in this area has leveraged the approximation power of artificial neural networks and…

Information Theory · Computer Science 2021-10-27 Sina Molavipour, Germán Bassi, Mikael Skoglund

Model-Augmented Estimation of Conditional Mutual Information for Feature Selection

Markov blanket feature selection, while theoretically optimal, is generally challenging to implement. This is due to the shortcomings of existing approaches to conditional independence (CI) testing, which tend to struggle either with the…

Machine Learning · Computer Science 2020-06-23 Alan Yang, AmirEmad Ghassami, Maxim Raginsky, Negar Kiyavash, Elyse Rosenbaum

Mutual Information Estimation via $f$-Divergence and Data Derangements

Estimating mutual information accurately is pivotal across diverse applications, from machine learning to communications and biology, enabling us to gain insights into the inner mechanisms of complex systems. Yet, dealing with…

Machine Learning · Computer Science 2024-11-12 Nunzio A. Letizia, Nicola Novello, Andrea M. Tonello

Futility Analysis in the Cross-Validation of Machine Learning Models

Many machine learning models have important structural tuning parameters that cannot be directly estimated from the data. The common tactic for setting these parameters is to use resampling methods, such as cross-validation or the…

Machine Learning · Statistics 2014-05-28 Max Kuhn

Theoretical Foundations of Forward Feature Selection Methods based on Mutual Information

Feature selection problems arise in a variety of applications, such as microarray analysis, clinical prediction, text categorization, image classification and face recognition, multi-label learning, and classification of internet traffic.…

Machine Learning · Statistics 2018-02-15 Francisco Macedo, M. Rosário Oliveira, António Pacheco, Rui Valadas

Resampling-Based Multisplit Inference for High-Dimensional Regression

We propose a novel resampling-based method to construct an asymptotically exact test for any subset of hypotheses on coefficients in high-dimensional linear regression. It can be embedded into any multiple testing procedure to make…

Methodology · Statistics 2022-05-26 Anna Vesely, Jelle J. Goeman, Livio Finos

Fast Cross-Validation via Sequential Testing

With the increasing size of today's data sets, finding the right parameter configuration in model selection via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which…

Machine Learning · Computer Science 2016-02-05 Tammo Krueger, Danny Panknin, Mikio Braun

Feature Selection via Mutual Information: New Theoretical Insights

Mutual information has been successfully adopted in filter feature-selection methods to assess both the relevancy of a subset of features in predicting the target variable and the redundancy with respect to other variables. However,…

Machine Learning · Computer Science 2019-07-18 Mario Beraha, Alberto Maria Metelli, Matteo Papini, Andrea Tirinzoni, Marcello Restelli

Network cross-validation by edge sampling

While many statistical models and methods are now available for network analysis, resampling network data remains a challenging problem. Cross-validation is a useful general tool for model selection and parameter tuning, but is not directly…

Methodology · Statistics 2020-05-04 Tianxi Li, Elizaveta Levina, Ji Zhu

Estimating Conditional Mutual Information for Dynamic Feature Selection

Dynamic feature selection, where we sequentially query features to make accurate predictions with a minimal budget, is a promising paradigm to reduce feature acquisition costs and provide transparency into a model's predictions. The problem…

Machine Learning · Computer Science 2024-09-10 Soham Gadgil, Ian Covert, Su-In Lee

Meta-Learning for Resampling Recommendation Systems

One possible approach to tackle the class imbalance in classification tasks is to resample a training dataset, i.e., to drop some of its elements or to synthesize new ones. There exist several widely-used resampling methods. Recent research…

Machine Learning · Computer Science 2018-09-18 Smolyakov Dmitry, Alexander Korotin, Pavel Erofeev, Artem Papanov, Evgeny Burnaev

Estimation of mutual information for real-valued data with error bars and controlled bias

Estimation of mutual information between (multidimensional) real-valued variables is used in analysis of complex systems, biological systems, and recently also quantum systems. This estimation is a hard problem, and universally good…

Quantitative Methods · Quantitative Biology 2019-08-14 Caroline M. Holmes, Ilya Nemenman

MEET: A Monte Carlo Exploration-Exploitation Trade-off for Buffer Sampling

Data selection is essential for any data-based optimization technique, such as Reinforcement Learning. State-of-the-art sampling strategies for the experience replay buffer improve the performance of the Reinforcement Learning agent.…

Machine Learning · Computer Science 2023-11-28 Julius Ott, Lorenzo Servadei, Jose Arjona-Medina, Enrico Rinaldi, Gianfranco Mauro, Daniela Sánchez Lopera, Michael Stephan, Thomas Stadelmayer, Avik Santra, Robert Wille