Related papers: Model-free controlled variable selection via data …
Selecting relevant features associated with a given response variable is an important issue in many scientific fields. Quantifying quality and uncertainty of a selection result via false discovery rate (FDR) control has been of recent…
Simultaneously performing variable selection and inference in high-dimensional regression models is an open challenge in statistics and machine learning. The increasing availability of vast amounts of variables requires the adoption of…
Selecting a handful of truly relevant variables in supervised machine learning algorithms is challenging in terms of the untestable assumptions that must hold and the unavailability of theoretical assurances that selection…
Simultaneously performing variable selection and inference in high-dimensional models is an open challenge in statistics and machine learning. The increasing availability of vast amounts of variables requires the adoption of specific…
Algorithms that ensure reproducible findings from large-scale, high-dimensional data are pivotal in numerous signal processing applications. In recent years, multivariate false discovery rate (FDR) controlling methods have emerged,…
Controlling the false discovery rate (FDR) is a popular approach to multiple testing, variable selection, and related problems of simultaneous inference. In many contemporary applications, models are not specified by discrete variables,…
Controlling the False Discovery Rate (FDR) in a variable selection procedure is critical for reproducible discoveries, and it has been extensively studied in sparse linear models. However, it remains largely open in scenarios where the…
Balancing false discovery rate (FDR) control with high statistical power remains a central challenge in high-dimensional variable selection. While several FDR-controlling methods have been proposed, many degrade the original data -- by…
Generalized linear models (GLMs) have been widely used in practice to model non-Gaussian response variables. When the number of explanatory features is relatively large, scientific researchers are interested in performing controlled…
We consider the problem of variable selection in high-dimensional statistical models where the goal is to report a set of variables, out of many predictors $X_1, \dotsc, X_p$, that are relevant to a response of interest. For linear…
Variable selection has been widely used in data analysis for the past decades, and it becomes increasingly important in the Big Data era as there are usually hundreds of variables available in a dataset. To enhance interpretability of a…
There has been recent interest in extending the ideas of False Discovery Rates (FDR) to variable selection in regression settings. Traditionally the FDR in these settings has been defined in terms of the coefficients of the full regression…
Controlling the false discovery rate (FDR) in high-dimensional variable selection requires balancing rigorous error control with statistical power. Existing methods with provable guarantees are often overly conservative, creating a…
False discovery rate (FDR) control is a popular approach for maintaining the integrity of statistical analyses, especially in high-dimensional data settings, where multiple comparisons increase the risk of false positives. FDR control has…
Testing for differences in features between clusters in various applications often leads to inflated false positives when practitioners use the same dataset to identify clusters and then test features, an issue commonly known as ``double…
This paper is concerned with false discovery rate (FDR) control in large-scale multiple testing problems. We first propose a new data-driven testing procedure for controlling the FDR in large-scale t-tests for the one-sample mean problem. The…
Controlling the false discovery rate (FDR) in variable selection becomes challenging when predictors are correlated, as existing methods often exclude all members of correlated groups and consequently perform poorly for prediction. We…
In many fields of science, we observe a response variable together with a large number of potential explanatory variables, and would like to be able to discover which variables are truly associated with the response. At the same time, we…
Effectively controlling the false discovery rate (FDR) in high-dimensional variable selection is a fundamental statistical problem that has garnered significant research interest. In this paper, we propose a novel, user-friendly, and…
In the context of high-dimensional Gaussian linear regression for ordered variables, we study the variable selection procedure via the minimization of the penalized least-squares criterion. We focus on model selection where the penalty…