Related papers: Multiple Testing under High-dimensional Dynamic Fa…
In the high dimensional regression analysis when the number of predictors is much larger than the sample size, an important question is to select the important variable which are relevant to the response variable of interest. Variable…
In large scale multiple testing problems, a two-class empirical Bayes approach can be used to control the false discovery rate (Fdr) for the entire array of hypotheses under study. A sample splitting step is incorporated to modify that…
This paper studies the problem of high-dimensional multiple testing and sparse recovery from the perspective of sequential analysis. In this setting, the probability of error is a function of the dimension of the problem. A simple…
Many methods have been developed to estimate the set of relevant variables in a sparse linear model Y= XB+e where the dimension p of B can be much higher than the length n of Y. Here we propose two new methods based on multiple hypotheses…
This paper introduces a novel multi-stage decision-making model that integrates hypothesis testing and dynamic programming algorithms to address complex decision-making scenarios.Initially,we develop a sampling inspection scheme that…
In this paper, we introduce an innovative testing procedure for assessing individual hypotheses in high-dimensional linear regression models with measurement errors. This method remains robust even when either the X-model or Y-model is…
The most popular multiple testing procedures are stepwise procedures based on $P$-values for individual test statistics. Included among these are the false discovery rate (FDR) controlling procedures of Benjamini--Hochberg [J. Roy. Statist.…
We propose a dynamic multiplicative factor model for process data, which arise from complex problem-solving items, an emerging testing mode in large-scale educational assessment. The proposed model can be viewed as an extension of the…
It is frequently of interest to jointly analyze multiple sequences of multiple tests in order to identify simultaneous signals, defined as features tested in multiple studies whose test statistics are non-null in each. In many problems,…
This paper aims to develop an effective model-free inference procedure for high-dimensional data. We first reformulate the hypothesis testing problem via sufficient dimension reduction framework. With the aid of new reformulation, we…
This paper proposes a new procedure to validate the multi-factor pricing theory by testing the presence of alpha in linear factor pricing models with a large number of assets. Because the market's inefficient pricing is likely to occur to a…
In many applied sciences a popular analysis strategy for high-dimensional data is to fit many multivariate generalized linear models in parallel. This paper presents a novel approach to address the resulting multiple testing problem by…
Many important tasks of large-scale recommender systems can be naturally cast as testing multiple linear forms for noisy matrix completion. These problems, however, present unique challenges because of the subtle bias-and-variance tradeoff…
In modern scientific experiments, we frequently encounter data that have large dimensions, and in some experiments, such high dimensional data arrive sequentially rather than full data being available all at a time. We develop multiple…
This paper proposes a moving sum methodology for detecting multiple change points in high-dimensional time series under a factor model, where changes are attributed to those in loadings as well as emergence or disappearance of factors. We…
In many large scale multiple testing applications, the hypotheses often have a known graphical structure, such as gene ontology in gene expression data. Exploiting this graphical structure in multiple testing procedures can improve power as…
We investigate one/two-sample mean tests for high-dimensional compositional data when the number of variables is comparable with the sample size, as commonly encountered in microbiome research. Existing methods mainly focus on max-type test…
Statistical dependence between hypotheses poses a significant challenge to the stability of large scale multiple hypotheses testing. Ignoring it often results in an unacceptably large spread in the false positive proportion even though the…
The $\gamma$-FDP and $k$-FWER multiple testing error metrics, which are tail probabilities of the respective error statistics, have become popular recently as less-stringent alternatives to the FDR and FWER. We propose general and flexible…
Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…