Related papers: Random threshold for linear model selection, revis…

Quantile universal threshold for model selection

Efficient recovery of a low-dimensional structure from high-dimensional data has been pursued in various settings including wavelet denoising, generalized linear models and low-rank matrix estimation. By thresholding some parameters to…

Methodology · Statistics 2017-08-14 Caroline Giacobino, Sylvain Sardy, Jairo Diaz-Rodriguez, Nick Hengartner

Variable selection via thresholding

Variable selection comprises an important step in many modern statistical inference procedures. In the regression setting, when estimators cannot shrink irrelevant signals to zero, covariates without relationships to the response often…

Statistics Theory · Mathematics 2025-03-28 Ka Long Keith Ho, Hien Duy Nguyen

Threshold Selection in Univariate Extreme Value Analysis

Threshold selection plays a key role for various aspects of statistical inference of rare events. Most classical approaches tackling this problem for heavy-tailed distributions crucially depend on tuning parameters or critical values to be…

Methodology · Statistics 2019-03-07 Laura Fee Schneider, Andrea Krajina, Tatyana Krivobokova

Large-scale Nonlinear Variable Selection via Kernel Random Features

We propose a new method for input variable selection in nonlinear regression. The method is embedded into a kernel regression machine that can model general nonlinear functions, not being a priori limited to additive models. This is the…

Machine Learning · Computer Science 2018-09-05 Magda Gregorová, Jason Ramapuram, Alexandros Kalousis, Stéphane Marchand-Maillet

Automated threshold selection and associated inference uncertainty for univariate extremes

Threshold selection is a fundamental problem in any threshold-based extreme value analysis. While models are asymptotically motivated, selecting an appropriate threshold for finite samples is difficult and highly subjective through standard…

Methodology · Statistics 2024-10-30 Conor Murphy, Jonathan A. Tawn, Zak Varty

Quantile universal threshold: model selection at the detection edge for high-dimensional linear regression

To estimate a sparse linear model from data with Gaussian noise, the consilience from the lasso and compressed sensing literatures is that thresholding estimators like the lasso and the Dantzig selector have the ability in some situations to identify…

Machine Learning · Statistics 2017-08-14 Jairo Diaz-Rodriguez, Sylvain Sardy

Online Active Linear Regression via Thresholding

We consider the problem of online active learning to collect data for regression modeling. Specifically, we consider a decision maker with a limited experimentation budget who must efficiently learn an underlying linear population model.…

Machine Learning · Statistics 2016-12-22 Carlos Riquelme, Ramesh Johari, Baosen Zhang

Variable selection in semiparametric regression modeling

In this paper, we are concerned with how to select significant variables in semiparametric modeling. Variable selection for semiparametric regression models consists of two components: model selection for nonparametric components and…

Statistics Theory · Mathematics 2008-12-18 Runze Li, Hua Liang

Importance sampling for weighted binary random matrices with specified margins

A sequential importance sampling algorithm is developed for the distribution that results when a matrix of independent, but not identically distributed, Bernoulli random variables is conditioned on a given sequence of row and column sums.…

Computation · Statistics 2013-01-18 Matthew T. Harrison, Jeffrey W. Miller

Randomised Algorithm for Feature Selection and Classification

We here introduce a novel classification approach adopted from the nonlinear model identification framework, which jointly addresses the feature selection and classifier design tasks. The classifier is constructed as a polynomial expansion…

Machine Learning · Computer Science 2016-07-29 Aida Brankovic, Alessandro Falsone, Maria Prandini, Luigi Piroddi

A robust approach to model-based classification based on trimming and constraints

In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations,…

Applications · Statistics 2019-11-20 Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy

Methods of Selective Inference for Linear Mixed Models: a Review and Empirical Comparison

Selective inference aims at providing valid inference after a data-driven selection of models or hypotheses. It is essential to avoid overconfident results and replicability issues. While significant advances have been made in this area for…

Methodology · Statistics 2025-03-14 Matteo D'Alessandro, Magne Thoresen

Variable selection for general index models via sliced inverse regression

Variable selection, also known as feature selection in machine learning, plays an important role in modeling high dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential…

Methodology · Statistics 2014-09-24 Bo Jiang, Jun S. Liu

Selective inference with a randomized response

Inspired by sample splitting and the reusable holdout introduced in the field of differential privacy, we consider selective inference with a randomized response. We discuss two major advantages of using a randomized response for model…

Statistics Theory · Mathematics 2016-12-01 Xiaoying Tian, Jonathan E. Taylor

Functional Estimation of the Marginal Likelihood

We propose a framework for computing, optimizing and integrating with respect to a smooth marginal likelihood in statistical models that involve high-dimensional parameters/latent variables and continuous low-dimensional hyperparameters.…

Methodology · Statistics 2026-02-10 Omiros Papaspiliopoulos, Timothée Stumpf-Fétizon, Jonathan Weare

Random Partitioning and Distribution-based Thresholding for Iterative Variable Screening in High Dimensions

In big data analysis, a simple task such as linear regression can become very challenging as the variable dimension $p$ grows. As a result, variable screening is inevitable in many scientific studies. In recent years, randomized algorithms…

Methodology · Statistics 2019-02-13 Yu-Hsiang Cheng, Tzee-Ming Huang, Su-Yun Huang

Selective Inference in Graphical Models via Maximum Likelihood

The graphical lasso is a widely used algorithm for fitting undirected Gaussian graphical models. However, for inference on functionals of edge values in the learned graph, standard tools lack formal statistical guarantees, such as control…

Methodology · Statistics 2025-04-01 Sofia Guglielmini, Gerda Claeskens, Snigdha Panigrahi

On the Asymptotics of Importance Weighted Variational Inference

For complex latent variable models, the likelihood function is not available in closed form. In this context, a popular method to perform parameter estimation is Importance Weighted Variational Inference. It essentially maximizes the…

Statistics Theory · Mathematics 2025-01-16 Badr-Eddine Cherief-Abdellatif, Randal Douc, Arnaud Doucet, Hugo Marival

Marginal empirical likelihood and sure independence feature screening

We study a marginal empirical likelihood approach in scenarios when the number of variables grows exponentially with the sample size. The marginal empirical likelihood ratios as functions of the parameters of interest are systematically…

Statistics Theory · Mathematics 2013-11-07 Jinyuan Chang, Cheng Yong Tang, Yichao Wu

Recursive Pathways to Marginal Likelihood Estimation with Prior-Sensitivity Analysis

We investigate the utility to computational Bayesian analyses of a particular family of recursive marginal likelihood estimators characterized by the (equivalent) algorithms known as "biased sampling" or "reverse logistic regression" in the…

Methodology · Statistics 2014-10-16 Ewan Cameron, Anthony Pettitt