Related papers: Selective Sequential Model Selection

Linear regression model selection using p-values when the model dimension grows

We consider a new criterion-based approach to model selection in linear regression. Properties of selection criteria based on p-values of a likelihood ratio statistic are studied for families of linear regression models. We prove that such…

Statistics Theory · Mathematics 2012-05-21 Piotr Pokarowski, Jan Mielniczuk, Paweł Teisseyre

Optimized Conformal Selection: Powerful Selective Inference After Conformity Score Optimization

Model selection/optimization in conformal inference is challenging, since it may break the exchangeability between labeled and unlabeled data. We study this problem in the context of conformal selection, which uses conformal p-values to…

Methodology · Statistics 2024-11-28 Tian Bai, Ying Jin

Model-free selective inference under covariate shift via weighted conformal p-values

This paper introduces novel weighted conformal p-values and methods for model-free selective inference. The problem is as follows: given test units with covariates $X$ and missing responses $Y$, how do we select units for which the…

Methodology · Statistics 2023-09-27 Ying Jin, Emmanuel J. Candès

Selection by Prediction with Conformal p-values

Decision making or scientific discovery pipelines such as job hiring and drug discovery often involve multiple stages: before any resource-intensive step, there is often an initial screening that uses predictions from a machine learning…

Methodology · Statistics 2023-05-30 Ying Jin, Emmanuel J. Candès

Selective inference after feature selection via multiscale bootstrap

It is common to show the confidence intervals or $p$-values of selected features, or predictor variables in regression, but they often involve selection bias. The selective inference approach solves this bias by conditioning on the…

Methodology · Statistics 2022-06-02 Yoshikazu Terada, Hidetoshi Shimodaira

Selective inference is easier with p-values

Selective inference is a subfield of statistics that enables valid inference after selection of a data-dependent question. In this paper, we introduce selectively dominant p-values, a class of p-values that allow practitioners to easily…

Methodology · Statistics 2024-11-22 Anav Sood

Sequential Specification Tests to Choose a Model: A Change-Point Approach

Researchers faced with a sequence of candidate model specifications must often choose the best specification that does not violate a testable identification assumption. One option in this scenario is sequential specification tests:…

Methodology · Statistics 2023-07-25 Adam C. Sales

A new multiple testing method in the dependent case

The most popular multiple testing procedures are stepwise procedures based on $P$-values for individual test statistics. Included among these are the false discovery rate (FDR) controlling procedures of Benjamini–Hochberg [J. Roy. Statist.…

Statistics Theory · Mathematics 2009-06-18 Arthur Cohen, Harold B. Sackrowitz, Minya Xu

Testing Many Zero Restrictions in a High Dimensional Linear Regression Setting

We propose a test of many zero parameter restrictions in a high dimensional linear iid regression model with $k \gg n$ regressors. The test statistic is formed by estimating key parameters one at a time based on many low dimension…

Statistics Theory · Mathematics 2023-12-12 Jonathan B. Hill

A Mathematical Programming Approach for Integrated Multiple Linear Regression Subset Selection and Validation

Subset selection for multiple linear regression aims to construct a regression model that minimizes errors by selecting a small number of explanatory variables. Once a model is built, various statistical tests and diagnostics are conducted…

Machine Learning · Statistics 2020-09-04 Seokhyun Chung, Young Woong Park, Taesu Cheong

Model Selection for independent not identically distributed observations based on Rényi's pseudodistances

Model selection criteria are rules used to select the best statistical model among a set of candidate models, striking a trade-off between goodness of fit and model complexity. Most popular model selection criteria measure the goodness of…

Statistics Theory · Mathematics 2023-04-13 Angel Felipe, Maria Jaenada, Pedro Miranda, Leandro Pardo

FDR Control via Neural Networks under Covariate-Dependent Symmetric Nulls

In modern multiple hypothesis testing, the availability of covariate information alongside the primary test statistics has motivated the development of more powerful and adaptive inference methods. However, most existing approaches rely on…

Methodology · Statistics 2025-11-20 Taehyoung Kim, Seohwa Hwang, Junyong Park

False Discovery Rate Control via Debiased Lasso

We consider the problem of variable selection in high-dimensional statistical models where the goal is to report a set of variables, out of many predictors $X_1, \dotsc, X_p$, that are relevant to a response of interest. For linear…

Methodology · Statistics 2019-03-20 Adel Javanmard, Hamid Javadi

Factor-Adjusted Regularized Model Selection

This paper studies model selection consistency for high dimensional sparse regression when data exhibits both cross-sectional and serial dependency. Most commonly-used model selection methods fail to consistently recover the true model when…

Methodology · Statistics 2018-09-12 Jianqing Fan, Yuan Ke, Kaizheng Wang

Linear Regression, Covariate Selection and the Failure of Modelling

It is argued that all model based approaches to the selection of covariates in linear regression have failed. This applies to frequentist approaches based on P-values and to Bayesian approaches although for different reasons. In the first…

Methodology · Statistics 2022-02-23 Laurie Davies

Segmenting High-dimensional Matrix-valued Time Series via Sequential Transformations

Modeling matrix-valued time series is an interesting and important research topic. In this paper, we extend the method of Chang et al. (2017) to matrix-valued time series. For any given $p\times q$ matrix-valued time series, we look for…

Methodology · Statistics 2020-02-11 Zhaoxing Gao

Covariate Selection Based on a Model-free Approach to Linear Regression with Exact Probabilities

In this paper we give a completely new approach to the problem of covariate selection in linear regression. A covariate or a set of covariates is included only if it is better in the sense of least squares than the same number of Gaussian…

Methodology · Statistics 2022-02-25 Laurie Davies, Lutz Dümbgen

Efficient Test-based Variable Selection for High-dimensional Linear Models

Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…

Methodology · Statistics 2018-02-01 Siliang Gong, Kai Zhang, Yufeng Liu

P-values for high-dimensional regression

Assigning significance in high-dimensional regression is challenging. Most computationally efficient selection algorithms cannot guard against inclusion of noise variables. Asymptotically valid p-values are not available. An exception is a…

Methodology · Statistics 2009-06-12 Nicolai Meinshausen, Lukas Meier, Peter Bühlmann

Revamping Conformal Selection With Optimal Power: A Neyman–Pearson Perspective

This paper introduces a novel conformal selection procedure, inspired by the Neyman–Pearson paradigm, to maximize the power of selecting qualified units while maintaining false discovery rate (FDR) control. Existing conformal selection…

Methodology · Statistics 2025-02-25 Jing Qin, Yukun Liu, Moming Li, Chiung-Yu Huang