Related papers: Sample Fit Reliability

Finite Sample Valid Inference via Calibrated Bootstrap

While widely used as a general method for uncertainty quantification, the bootstrap method encounters difficulties that raise concerns about its validity in practical applications. This paper introduces a new resampling-based method, termed…

Methodology · Statistics 2024-08-30 Yiran Jiang, Chuanhai Liu, Heping Zhang

Bootstrapping and Sample Splitting For High-Dimensional, Assumption-Free Inference

Several new methods have been proposed for performing valid inference after model selection. An older method is sample splitting: use part of the data for model selection and part for inference. In this paper we revisit sample splitting…

Statistics Theory · Mathematics 2018-04-04 Alessandro Rinaldo, Larry Wasserman, Max G'Sell, Jing Lei

A Bootstrap Method for Goodness of Fit and Model Selection with a Single Observed Network

Network models are applied in numerous domains where data can be represented as a system of interactions among pairs of actors. While both statistical and mechanistic network models are increasingly capable of capturing various dependencies…

Methodology · Statistics 2018-07-02 Sixing Chen, Jukka-Pekka Onnela

Bootstrap for neural model selection

Bootstrap techniques (also called resampling computation techniques) have introduced new advances in modeling and model evaluation. Using resampling methods to construct a series of new samples which are based on the original data set,…

Statistics Theory · Mathematics 2007-06-13 Riadh Kallel, Marie Cottrell, Vincent Vigneron

Improving prediction accuracy by choosing resampling distribution via cross-validation

In a regression model, prediction is typically performed after model selection. The large variability in the model selection makes the prediction unstable. Thus, it is essential to reduce the variability in model selection and improve…

Computation · Statistics 2024-04-11 Wataru Yoshida, Kei Hirose

Probability and Non-Probability Samples: Improving Regression Modeling by Using Data from Different Sources

Non-probability sampling, for example in the form of online panels, has become a fast and cheap method to collect data. While reliable inference tools are available for classical probability samples, non-probability samples can yield…

Methodology · Statistics 2022-04-05 Gerhard Tutz

The Lazy Bootstrap. A Fast Resampling Method for Evaluating Latent Class Model Fit

The latent class model is a powerful unsupervised clustering algorithm for categorical data. Many statistics exist to test the fit of the latent class model. However, traditional methods to evaluate those fit statistics are not always…

Methodology · Statistics 2018-01-30 Geert H. van Kollenburg, Joris Mulder, Jeroen K. Vermunt

Assessing Estimation Uncertainty under Model Misspecification

Model misspecification is ubiquitous in data analysis because the data-generating process is often complex and mathematically intractable. Therefore, assessing estimation uncertainty and conducting statistical inference under a possibly…

Methodology · Statistics 2023-12-19 Rong Li, Yichen Qin, Yang Li

Mastering an Accurate and Generalizable Simulation-Based Method to Obtain Bias-corrected Point Estimates and Sampling Variance for Any Effect Sizes

Meta-analyses require an effect-size estimate and its corresponding sampling variance from primary studies. In some cases, estimators for the sampling variance of a given effect size statistic may not exist, necessitating the derivation of…

Methodology · Statistics 2025-11-03 Shinichi Nakagawa, Ayumi Mizuno, Coralie Williams, Santiago Ortega, Szymon M. Drobniak, Malgorzata Lagisz, Yefeng Yang, Alistair M. Senior, Daniel W. A. Noble, Erick Lundgren

Calibrated bootstrap for uncertainty quantification in regression models

Obtaining accurate estimates of machine learning model uncertainties on newly predicted data is essential for understanding the accuracy of the model and whether its predictions can be trusted. A common approach to such uncertainty…

Materials Science · Physics 2021-05-28 Glenn Palmer, Siqi Du, Alexander Politowicz, Joshua Paul Emory, Xiyu Yang, Anupraas Gautam, Grishma Gupta, Zhelong Li, Ryan Jacobs, Dane Morgan

Analysis of Bootstrap and Subsampling in High-dimensional Regularized Regression

We investigate popular resampling methods for estimating the uncertainty of statistical models, such as subsampling, bootstrap and the jackknife, and their performance in high-dimensional supervised regression tasks. We provide a tight…

Machine Learning · Statistics 2024-11-04 Lucas Clarté, Adrien Vandenbroucque, Guillaume Dalle, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

Goodness-of-fit testing based on a weighted bootstrap: A fast large-sample alternative to the parametric bootstrap

The process comparing the empirical cumulative distribution function of the sample with a parametric estimate of the cumulative distribution function is known as the empirical process with estimated parameters and has been extensively…

Methodology · Statistics 2012-10-08 Ivan Kojadinovic, Jun Yan

Applications of the Fractional-Random-Weight Bootstrap

The bootstrap, based on resampling, has, for several decades, been a widely used method for computing confidence intervals for applications where no exact method is available and when sample sizes are not large enough to be able to rely on…

Applications · Statistics 2018-08-27 Chris Gotwalt, Li Xu, Yili Hong, William Q. Meeker

Bootstrapping the Cross-Validation Estimate

Cross-validation is a widely used technique for evaluating the performance of prediction models, ranging from simple binary classification to complex precision medicine strategies. It helps correct for optimism bias in error estimates,…

Methodology · Statistics 2025-09-05 Bryan Cai, Yuanhui Luo, Xinzhou Guo, Fabio Pellegrini, Menglan Pang, Carl de Moor, Changyu Shen, Vivek Charu, Lu Tian

Bootstrap-Based Goodness-of-Fit Test for Parametric Families of Conditional Distributions

A consistent goodness-of-fit test for distributional regression is introduced. The test statistic is based on a process that traces the difference between a nonparametric and a semi-parametric estimate of the marginal distribution function…

Methodology · Statistics 2025-10-10 Gitte Kremling, Gerhard Dikta

Bootstrapping Confidence Levels for Hypotheses about Quadratic (U-Shaped) Regression Models

Bootstrapping can produce confidence levels for hypotheses about quadratic regression models, such as whether the U-shape is inverted and where the optima are located. The method has several advantages over conventional methods: it provides…

Methodology · Statistics 2012-07-09 Michael Wood

Goodness-of-fit Testing in Linear Regression Models

Model checking plays an important role in linear regression as model misspecification seriously affects the validity and efficiency of regression analysis. In practice, model checking is often performed by subjectively evaluating the plot…

Statistics Theory · Mathematics 2019-11-19 Rok Blagus, Jakob Peterlin, Janez Stare

Bootstrapping data arrays of arbitrary order

In this paper we study a bootstrap strategy for estimating the variance of a mean taken over large multifactor crossed random effects data sets. We apply bootstrap reweighting independently to the levels of each factor, giving each…

Methodology · Statistics 2012-09-28 Art B. Owen, Dean Eckles

Scalable Efficient Inference in Complex Surveys through Targeted Resampling of Weights

Survey data often arise from complex sampling designs, such as stratified or multistage sampling, with unequal inclusion probabilities. When sampling is informative, traditional inference methods yield biased estimators and poor coverage.…

Methodology · Statistics 2025-04-17 Snigdha Das, Dipankar Bandyopadhyay, Debdeep Pati

On Determining the Distribution of a Goodness-of-Fit Test Statistic

We consider the problem of goodness-of-fit testing for a model that has at least one unknown parameter that cannot be eliminated by transformation. Examples of such problems can be as simple as testing whether a sample consists of…

Methodology · Statistics 2021-04-28 Sean van der Merwe