Related papers: Feature-Weighted Maximum Representative Subsamplin…

Unsupervised Feature Selection Algorithm Based on Dual Manifold Re-ranking

High-dimensional data is commonly encountered in numerous data analysis tasks. Feature selection techniques aim to identify the most representative features from the original high-dimensional data. Due to the absence of class label…

Machine Learning · Computer Science 2024-10-29 Yunhui Liang, Jianwen Gan, Yan Chen, Peng Zhou, Liang Du

Unsupervised Learning of Debiased Representations with Pseudo-Attributes

Dataset bias is a critical challenge in machine learning since it often leads to a negative impact on a model due to the unintended decision rules captured by spurious correlations. Although existing works often handle this issue based on…

Machine Learning · Computer Science 2022-04-05 Seonguk Seo, Joon-Young Lee, Bohyung Han

Deep Feature Screening: Feature Selection for Ultra High-Dimensional Data via Deep Neural Networks

The applications of traditional statistical feature selection methods to high-dimension, low sample-size data often struggle and encounter challenging problems, such as overfitting, curse of dimensionality, computational infeasibility, and…

Machine Learning · Statistics 2023-12-19 Kexuan Li, Fangfang Wang, Lingli Yang, Ruiqi Liu

Learning Debiased Representation via Disentangled Feature Augmentation

Image classification models tend to make decisions based on peripheral attributes of data items that have strong correlation with a target variable (i.e., dataset bias). These biased models suffer from the poor generalization capability…

Machine Learning · Computer Science 2021-10-26 Jungsoo Lee, Eungyeup Kim, Juyoung Lee, Jihyeon Lee, Jaegul Choo

Feature importance (FI) statistics provide a prominent and valuable method of insight into the decision process of machine learning (ML) models, but their effectiveness has well-known limitations when correlation is present among the…

Machine Learning · Statistics 2025-08-11 Benedikt Fröhlich, Alison Durst, Merle Behr

Large-scale Multi-objective Feature Selection: A Multi-phase Search Space Shrinking Approach

Feature selection is a crucial step in machine learning, especially for high-dimensional datasets, where irrelevant and redundant features can degrade model performance and increase computational costs. This paper proposes a novel…

Neural and Evolutionary Computing · Computer Science 2024-10-30 Azam Asilian Bidgoli, Shahryar Rahnamayan

Unsupervised Hypergraph Feature Selection via a Novel Point-Weighting Framework and Low-Rank Representation

Feature selection methods are widely used to solve the 'curse of dimensionality' problem. Many proposed feature selection frameworks treat all data points equally, neglecting their different representation power and importance. In…

Machine Learning · Computer Science 2018-10-04 Ammar Gilani, Maryam Amirmazlaghani

Common-Sense Bias Modeling for Classification Tasks

Machine learning model bias can arise from dataset composition: correlated sensitive features can distort the downstream classification model's decision boundary and lead to performance differences along these features. Existing de-biasing…

Computer Vision and Pattern Recognition · Computer Science 2025-01-22 Miao Zhang, Zee Fryer, Ben Colman, Ali Shahriyari, Gaurav Bharaj

Positive region preserved random sampling: an efficient feature selection method for massive data

Selecting relevant features is an important and necessary step for intelligent machines to maximize their chances of success. However, intelligent machines generally do not have enough computing resources when faced with huge volumes of data.…

Machine Learning · Computer Science 2025-07-04 Hexiang Bai, Deyu Li, Jiye Liang, Yanhui Zhai

Subsampling Winner Algorithm for Feature Selection in Large Regression Data

Feature selection from a large number of covariates (aka features) in a regression analysis remains a challenge in data science, especially in terms of its potential of scaling to ever-enlarging data and finding a group of scientifically…

Machine Learning · Statistics 2020-02-10 Yiying Fan, Jiayang Sun

Distributionally Robust Feature Selection

We study the problem of selecting limited features to observe such that models trained on them can perform well simultaneously across multiple subpopulations. This problem has applications in settings where collecting each feature is…

Machine Learning · Computer Science 2025-10-27 Maitreyi Swaroop, Tamar Krishnamurti, Bryan Wilder

Fast and Accurate Importance Weighting for Correcting Sample Bias

Bias in datasets can be very detrimental for appropriate statistical estimation. In response to this problem, importance weighting methods have been developed to match any biased distribution to its corresponding target unbiased…

Machine Learning · Computer Science 2022-09-12 Antoine de Mathelin, Francois Deheeger, Mathilde Mougeot, Nicolas Vayatis

Fairness-Aware Unsupervised Feature Selection

Feature selection is a prevalent data preprocessing paradigm for various learning tasks. Due to the high cost of acquiring supervision information, unsupervised feature selection has sparked great interest recently. However, existing…

Machine Learning · Computer Science 2021-06-07 Xiaoying Xing, Hongfu Liu, Chen Chen, Jundong Li

Feature space reduction method for ultrahigh-dimensional, multiclass data: Random forest-based multiround screening (RFMS)

In recent years, numerous screening methods have been published for ultrahigh-dimensional data that contain hundreds of thousands of features; however, most of these features cannot handle data with thousands of classes. Prediction models…

Machine Learning · Computer Science 2026-02-06 Gergely Hanczár, Marcell Stippinger, Dávid Hanák, Marcell T. Kurbucz, Olivér M. Törteli, Ágnes Chripkó, Zoltán Somogyvári

Benign Shortcut for Debiasing: Fair Visual Recognition via Intervention with Shortcut Features

Machine learning models often learn to make predictions that rely on sensitive social attributes like gender and race, which poses significant fairness risks, especially in societal applications, such as hiring, banking, and criminal…

Machine Learning · Computer Science 2023-08-25 Yi Zhang, Jitao Sang, Junyang Wang, Dongmei Jiang, Yaowei Wang

Training Debiased Subnetworks with Contrastive Weight Pruning

Neural networks are often biased to spuriously correlated features that provide misleading statistical evidence that does not generalize. This raises an interesting question: "Does an optimal unbiased functional subnetwork exist in a…

Machine Learning · Computer Science 2023-06-27 Geon Yeong Park, Sangmin Lee, Sang Wan Lee, Jong Chul Ye

Debiased Recommendation with User Feature Balancing

Debiased recommendation has recently attracted increasing attention from both industry and academic communities. Traditional models mostly rely on the inverse propensity score (IPS), which can be hard to estimate and may suffer from the…

Information Retrieval · Computer Science 2022-01-19 Mengyue Yang, Guohao Cai, Furui Liu, Zhenhua Dong, Xiuqiang He, Jianye Hao, Jun Wang, Xu Chen

An iterative scheme for feature based positioning using a weighted dissimilarity measure

We propose an iterative scheme for feature-based positioning using a new weighted dissimilarity measure with the goal of reducing the impact of large errors among the measured or modeled features. The weights are computed from the…

Machine Learning · Computer Science 2019-05-31 Caifa Zhou, Andreas Wieser

How to be Fair and Diverse?

Due to the recent cases of algorithmic bias in data-driven decision-making, machine learning methods are being put under the microscope in order to understand the root cause of these biases and how to correct them. Here, we consider a basic…

Machine Learning · Computer Science 2016-10-25 L. Elisa Celis, Amit Deshpande, Tarun Kathuria, Nisheeth K. Vishnoi

Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair

In the image classification task, deep neural networks frequently rely on bias attributes that are spuriously correlated with a target class in the presence of dataset bias, resulting in degraded performance when applied to data without…

Computer Vision and Pattern Recognition · Computer Science 2024-06-18 Jeonghoon Park, Chaeyeon Chung, Juyoung Lee, Jaegul Choo