Related papers: Distributionally Robust Feature Selection

Robust variable selection for model-based learning in presence of adulteration

The problem of identifying the most discriminating features when performing supervised learning has been extensively investigated. In particular, several methods for variable selection in model-based classification have been proposed.…

Applications · Statistics 2020-12-16 Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy

Robust Sampling in Deep Learning

Deep learning requires regularization mechanisms to reduce overfitting and improve generalization. We address this problem by a new regularization method based on distributional robust optimization. The key idea is to modify the…

Machine Learning · Computer Science 2020-06-08 Aurora Cobo Aguilera, Antonio Artés-Rodríguez, Fernando Pérez-Cruz, Pablo Martínez Olmos

Bayesian Model Selection on Random Networks

A general Bayesian framework for model selection on random network models regarding their features is considered. The goal is to develop a principled Bayesian model selection approach to compare different fittable, not necessarily nested,…

Methodology · Statistics 2020-04-30 Papamichalis Marios

Feature Selection Facilitates Learning Mixtures of Discrete Product Distributions

Feature selection can facilitate the learning of mixtures of discrete random variables as they arise, e.g. in crowdsourcing tasks. Intuitively, not all workers are equally reliable but, if the less reliable ones could be eliminated, then…

Machine Learning · Statistics 2017-11-28 Vincent Zhao, Steven W. Zucker

Supervised Learning Under Distributed Features

This work studies the problem of learning under both large datasets and large-dimensional feature space scenarios. The feature information is assumed to be spread across agents in a network, where each agent observes some of the features.…

Multiagent Systems · Computer Science 2020-05-26 Bicheng Ying, Kun Yuan, Ali H. Sayed

Sampling-guided exploration of active feature selection policies

Determining the most appropriate features for machine learning predictive models is challenging regarding performance and feature acquisition costs. In particular, global feature choice is limited given that some features will only benefit…

Machine Learning · Computer Science 2026-03-17 Gabriel Bernardino, Anders Jonsson, Patrick Clarysse, Nicolas Duchateau

COMBSS: Best Subset Selection via Continuous Optimization

The problem of best subset selection in linear regression is considered with the aim to find a fixed size subset of features that best fits the response. This is particularly challenging when the total available number of features is very…

Methodology · Statistics 2023-11-28 Sarat Moka, Benoit Liquet, Houying Zhu, Samuel Muller

Efficient Sampling Policy for Selecting a Good Enough Subset

The note studies the problem of selecting a good enough subset out of a finite number of alternatives under a fixed simulation budget. Our work aims to maximize the posterior probability of correctly selecting a good subset. We formulate…

Optimization and Control · Mathematics 2023-05-09 Gongbo Zhang, Bin Chen, Qing-shan Jia, Yijie Peng

Robust subset selection

The best subset selection (or "best subsets") estimator is a classic tool for sparse regression, and developments in mathematical optimization over the past decade have made it more computationally tractable than ever. Notwithstanding its…

Methodology · Statistics 2022-01-11 Ryan Thompson

Hierarchically Robust Representation Learning

With the tremendous success of deep learning in visual tasks, the representations extracted from intermediate layers of learned models, that is, deep features, attract much attention from researchers. Previous empirical analysis shows that…

Computer Vision and Pattern Recognition · Computer Science 2020-03-31 Qi Qian, Juhua Hu, Hao Li

Beyond Discrete Selection: Continuous Embedding Space Optimization for Generative Feature Selection

The goal of Feature Selection - comprising filter, wrapper, and embedded approaches - is to find the optimal feature subset for designated downstream tasks. Nevertheless, current feature selection methods are limited by: 1) the selection…

Machine Learning · Computer Science 2023-09-18 Meng Xiao, Dongjie Wang, Min Wu, Pengfei Wang, Yuanchun Zhou, Yanjie Fu

Robust and Parallel Bayesian Model Selection

Effective and accurate model selection is an important problem in modern data analysis. One of the major challenges is the computational burden required to handle large data sets that cannot be stored or processed on one machine. Another…

Machine Learning · Statistics 2018-06-26 Michael Minyi Zhang, Henry Lam, Lizhen Lin

Robust Forecasting

We use a decision-theoretic framework to study the problem of forecasting discrete outcomes when the forecaster is unable to discriminate among a set of plausible forecast distributions because of partial identification or concerns about…

Econometrics · Economics 2020-12-18 Timothy Christensen, Hyungsik Roger Moon, Frank Schorfheide

Obtaining Explainable Classification Models using Distributionally Robust Optimization

Model explainability is crucial for human users to be able to interpret how a proposed classifier assigns labels to data based on its feature values. We study generalized linear models constructed using sets of feature value rules, which…

Machine Learning · Statistics 2023-11-06 Sanjeeb Dash, Soumyadip Ghosh, Joao Goncalves, Mark S. Squillante

Learning Models with Uniform Performance via Distributionally Robust Optimization

A common goal in statistics and machine learning is to learn models that can perform well against distributional shifts, such as latent heterogeneous subpopulations, unknown covariate shifts, or unmodeled temporal effects. We develop and…

Machine Learning · Statistics 2020-07-21 John Duchi, Hongseok Namkoong

Subset Selection for Multiple Linear Regression via Optimization

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that trade off fitting error (explanatory power) against model complexity (number of variables selected). We build mathematical programming…

Machine Learning · Statistics 2020-09-04 Young Woong Park, Diego Klabjan

An Ensemble Approach toward Automated Variable Selection for Network Anomaly Detection

While variable selection is essential to optimize the learning complexity by prioritizing features, automating the selection process is preferred, since manual selection otherwise requires laborious effort and intensive analysis. However, it is not an…

Machine Learning · Computer Science 2019-10-29 Makiya Nakashima, Alex Sim, Youngsoo Kim, Jonghyun Kim, Jinoh Kim

Distributional Robustness and Transfer Learning Through Empirical Bayes

We consider the problem of statistical inference on parameters of a target population when auxiliary observations are available from related populations. We propose a flexible empirical Bayes approach that can be applied on top of any…

Statistics Theory · Mathematics 2023-12-15 Michael Law, Peter Bühlmann, Ya'acov Ritov

Robust Feature Selection by Mutual Information Distributions

Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. In order to address questions such as the reliability of the empirical value, one must…

Artificial Intelligence · Computer Science 2008-06-26 Marco Zaffalon, Marcus Hutter

Robust Feature Selection by Mutual Information Distributions

Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. In order to address questions such as the reliability of the empirical value, one must…

Artificial Intelligence · Computer Science 2014-08-08 Marco Zaffalon, Marcus Hutter