Related papers: The mRMR variable selection method: a comparative …
This article describes the R package varrank. It has a flexible implementation of heuristic approaches which perform variable ranking based on mutual information. The package is particularly suitable for exploring multivariate datasets…
Dimensionality reduction is one of the key issues in the design of effective machine learning methods for automatic induction. In this work, we introduce recursive maxima hunting (RMH) for variable selection in classification problems with…
How to accurately measure the relevance and redundancy of features is an age-old challenge in the field of feature selection. However, existing filter-based feature selection methods cannot directly measure redundancy for continuous data.…
In machine learning applications for online product offerings and marketing strategies, there are often hundreds or thousands of features available to build such models. Feature selection is one essential method in such applications for…
When building machine learning models, feature selection (FS) stands out as an essential preprocessing step used to handle the uncertainty and vagueness in the data. Recently, the minimum Redundancy and Maximum Relevance (mRMR) approach…
With the abundance of machine learning methods available and the temptation of using them all in an ensemble method, having a model-agnostic method of feature selection is incredibly alluring. Principal component analysis was developed in…
Variable selection is considered in the setting of supervised binary classification with functional data $\{X(t),\ t\in[0,1]\}$. By "variable selection" we mean any dimension-reduction method that replaces the whole trajectory…
The trade-off between relevance and fairness in personalized recommendations has been explored in recent works, with the goal of minimizing learned discrimination towards certain demographics while still producing relevant results. We…
Bilinear matrix inequality (BMI) problems in system and control designs are investigated in this paper. A solution method of reduction of variables (MRVs) is proposed. This method consists of a principle of variable classification, a…
We propose a new variable selection procedure for a functional linear model with multiple scalar responses and multiple functional predictors. This method is based on basis expansions of the involved functional predictors and coefficients…
Feature selection and reducing the dimensionality of data is an essential step in data analysis. In this work, we propose a new criterion for feature selection that is formulated as conditional information between features given the labeled…
Feature selection has attracted significant attention in data mining and machine learning in the past decades. Many existing feature selection methods eliminate redundancy by measuring pairwise inter-correlation of features, whereas the…
Feature selection is used to eliminate redundant features and keep relevant ones; it can enhance a machine learning algorithm's performance and accelerate computation. Among the various methods, mutual information has attracted increasingly…
A common way to extend the memory of large language models (LLMs) is by retrieval augmented generation (RAG), which inserts text retrieved from a larger memory into an LLM's context window. However, the context window is typically limited…
Mutual Information (MI) based feature selection makes use of MI to evaluate each feature and eventually shortlists a relevant feature subset, in order to address issues associated with high-dimensional datasets. Despite the effectiveness of…
Multi-criteria decision-making often requires finding a small representative set from the database. A recently proposed method is the regret minimization set (RMS) query. RMS returns a size $r$ subset $S$ of dataset $D$ that minimizes the…
Reciprocal recommender systems (RRS), conducting bilateral recommendations between two involved parties, have gained increasing attention for enhancing matching efficiency. However, the majority of existing methods in the literature still…
We propose a feature screening method that integrates both feature-feature and feature-target relationships. Inactive features are identified via a penalized minimum Redundancy Maximum Relevance (mRMR) procedure, which is the continuous…
This paper considers the problem of variable selection in regression models in the case of functional variables that may be mixed with other types of variables (scalar, multivariate, directional, etc.). Our proposal begins with a simple null…
Efficient benchmarking techniques aim to lower the computational cost of evaluating LLMs by predicting full benchmark scores using only a subset of a benchmark's questions. By reframing this problem as an instance of multiple regression…