Related papers: Distributed Bootstrap for Simultaneous Inference U…

200 papers

In this paper, we propose a bootstrap method applied to massive data processed distributedly in a large number of machines. This new method is computationally efficient in that we bootstrap on the master machine without over-resampling,…

Machine Learning · Statistics 2020-02-21 Yang Yu, Shih-Kang Chao, Guang Cheng

We propose a distributed method for simultaneous inference for datasets with sample size much larger than the number of covariates, i.e., N >> p, in the generalized linear models framework. When such datasets are too big to be analyzed…

Methodology · Statistics 2020-07-23 Lu Tang, Ling Zhou, Peter X.-K. Song

In this paper, we address the problem of conducting statistical inference in settings involving large-scale data that may be high-dimensional and contaminated by outliers. The high volume and dimensionality of the data require distributed…

Machine Learning · Statistics 2022-11-30 Emadaldin Mozafari-Majd, Visa Koivunen

Simultaneous inference for high-dimensional non-Gaussian time series is always considered to be a challenging problem. Such tasks require not only robust estimation of the coefficients in the random process, but also deriving limiting…

Methodology · Statistics 2021-11-03 Linbo Liu, Danna Zhang

This paper proposes a bootstrap-assisted procedure to conduct simultaneous inference for high dimensional sparse linear models based on the recent de-sparsifying Lasso estimator (van de Geer et al. 2014). Our procedure allows the dimension…

Statistics Theory · Mathematics 2016-03-07 Xianyang Zhang, Guang Cheng

We study simultaneous inference for multiple matrix-variate Gaussian graphical models in high-dimensional settings. Such models arise when spatiotemporal data are collected across multiple sample groups or experimental sessions, where each…

Methodology · Statistics 2026-01-21 Zongge Liu, Heejong Bong, Zhao Ren, Matthew A. Smith, Robert E. Kass

This article introduces an iterative distributed computing estimator for the multinomial logistic regression model with large choice sets. Compared to the maximum likelihood estimator, the proposed iterative distributed estimator achieves…

Econometrics · Economics 2024-12-03 Yanqin Fan, Yigit Okar, Xuetao Shi

The bootstrap is a popular data-driven method to quantify statistical uncertainty, but for modern high-dimensional problems, it could suffer from huge computational costs due to the need to repeatedly generate resamples and refit models. We…

Methodology · Statistics 2023-06-21 Henry Lam, Zhenyuan Liu

This paper considers distributed statistical inference for general symmetric statistics that encompass the U-statistics and the M-estimators in the context of massive data where the data can be stored at multiple platforms in different…

Statistics Theory · Mathematics 2018-05-30 Song Xi Chen, Liuhua Peng

Inference for functional linear models in the presence of heteroscedastic errors has received insufficient attention given its practical importance; in fact, even a central limit theorem has not been studied in this case. At issue,…

Statistics Theory · Mathematics 2024-05-27 Hyemin Yeon, Xiongtao Dai, Daniel John Nordman

This article reviews recent progress in high-dimensional bootstrap. We first review high-dimensional central limit theorems for distributions of sample mean vectors over the rectangles, bootstrap consistency results in high dimensions, and…

Statistics Theory · Mathematics 2022-05-20 Victor Chernozhukov, Denis Chetverikov, Kengo Kato, Yuta Koike

Bootstrapping is often applied to get confidence limits for semiparametric inference of a target parameter in the presence of nuisance parameters. Bootstrapping with replacement can be computationally expensive and problematic when…

Due to rapid data growth, statistical analysis of massive datasets often has to be carried out in a distributed fashion, either because several datasets stored in separate physical locations are all relevant to a given problem, or simply to…

Computation · Statistics 2016-02-08 Matthias Katzfuss, Dorit Hammerling

We propose a residual and wild bootstrap methodology for individual and simultaneous inference in high-dimensional linear models with possibly non-Gaussian and heteroscedastic errors. We establish asymptotic consistency for simultaneous…

Methodology · Statistics 2016-06-14 Ruben Dezeure, Peter Bühlmann, Cun-Hui Zhang

We propose a methodology for constructing confidence regions with partially identified models of general form. The region is obtained by inverting a test of internal consistency of the econometric structure. We develop a dilation bootstrap…

Econometrics · Economics 2021-02-10 Alfred Galichon, Marc Henry

We propose a double bootstrap procedure for reducing coverage error in the confidence intervals of descriptive statistics for independent and identically distributed functional data. Through a series of Monte Carlo simulations, we compare…

Methodology · Statistics 2021-02-03 Han Lin Shang

As the size of datasets used in statistical learning continues to grow, distributed training of models has attracted increasing attention. These methods partition the data and exploit parallelism to reduce memory and runtime, but suffer…

Machine Learning · Computer Science 2024-07-10 Fred Lu, Ryan R. Curtin, Edward Raff, Francis Ferraro, James Holt

In this paper, we address the problem of performing statistical inference for large-scale data sets, i.e., Big Data. The volume and dimensionality of the data may be so high that it cannot be processed or stored in a single computing node. We…

Methodology · Statistics 2016-04-20 Shahab Basiri, Esa Ollila, Visa Koivunen

In multicenter biomedical research, integrating data from multiple decentralized sites provides more robust and generalizable findings due to its larger sample size and the ability to account for the between-site heterogeneity. However,…

Methodology · Statistics 2025-12-29 Xiaokang Liu, Yuchen Yang, Yifei Sun, Jiang Bian, Yanyuan Ma, Raymond J. Carroll, Yong Chen

The bootstrap is a versatile inference method that has proven powerful in many statistical problems. However, when applied to modern large-scale models, it could face substantial computation demand from repeated data resampling and model…

Methodology · Statistics 2022-02-02 Henry Lam