Related papers: False Variable Selection Rates in Regression
While data-driven confounder selection requires careful consideration, it is frequently employed in observational studies. Widely recognized criteria for confounder selection include the minimal-set approach, which involves selecting…
Controlling the false discovery rate (FDR) in high-dimensional variable selection requires balancing rigorous error control with statistical power. Existing methods with provable guarantees are often overly conservative, creating a…
False discovery rates (FDR) are an essential component of statistical inference, representing the propensity for an observed result to be mistaken. FDR estimates should accompany observed results to help the user contextualize the relevance…
The False Discovery Rate (FDR) is a new statistical procedure to control the number of mistakes made when performing multiple hypothesis tests, i.e. when comparing many data against a given model hypothesis. The key advantage of FDR is that…
The false discovery rate (FDR) and false nondiscovery rate (FNDR) have received considerable attention in the literature on multiple testing. These performance measures are also appropriate for classification, and in this work we develop…
Controlling the false discovery rate (FDR) is a popular approach to multiple testing, variable selection, and related problems of simultaneous inference. In many contemporary applications, models are not specified by discrete variables,…
Controlling the false discovery rate (FDR) in variable selection becomes challenging when predictors are correlated, as existing methods often exclude all members of correlated groups and consequently perform poorly for prediction. We…
Many approaches for multiple testing begin with the assumption that all tests in a given study should be combined into a global false-discovery-rate analysis. But this may be inappropriate for many of today's large-scale screening problems,…
Variable selection has been widely used in data analysis for decades, and it has become increasingly important in the Big Data era, as datasets often contain hundreds of variables. To enhance interpretability of a…
Selecting a handful of truly relevant variables in supervised machine learning algorithms is challenging because of the untestable assumptions that must hold and the unavailability of theoretical assurances that selection…
Simultaneously performing variable selection and inference in high-dimensional models is an open challenge in statistics and machine learning. The increasing availability of vast amounts of variables requires the adoption of specific…
Multiple hypothesis testing has been widely applied to problems dealing with high-dimensional data, e.g., selecting significant variables and controlling the selection error rate. The most prevailing measure of error rate used in the…
Addressing the simultaneous identification of contributory variables while controlling the false discovery rate (FDR) in high-dimensional data is a crucial statistical challenge. In this paper, we propose a novel model-free variable…
There is recent interest in estimating the false discovery rate (FDR) with published p-values. However, there is little formal research that addresses the manner and extent to which the presumed selection, or publication, bias model impacts…
We consider the problem of variable selection in high-dimensional statistical models where the goal is to report a set of variables, out of many predictors $X_1, \dotsc, X_p$, that are relevant to a response of interest. For linear…
Simultaneously performing variable selection and inference in high-dimensional regression models is an open challenge in statistics and machine learning. The increasing availability of vast amounts of variables requires the adoption of…
In many fields of science, we observe a response variable together with a large number of potential explanatory variables, and would like to be able to discover which variables are truly associated with the response. At the same time, we…
Selecting relevant features associated with a given response variable is an important issue in many scientific fields. Quantifying quality and uncertainty of a selection result via false discovery rate (FDR) control has been of recent…
The false discovery rate (FDR) measures the share of false positives in a set of statistical tests. I develop simple and intuitive bounds on the FDR in cross-sectional predictability publications. The simplest bound requires just a few…
We propose the use of a new false discovery rate (FDR) controlling procedure as a model selection penalized method, and compare its performance to that of other penalized methods over a wide range of realistic settings: nonorthogonal design…