Related papers: Sample size effects in multivariate fitting of cor…

200 papers

Measurement error is a pervasive challenge across many disciplines, yet its impact on sample size determination and the accuracy and precision of estimators regarding the association between an exposure and an outcome remains understudied…

Methodology · Statistics 2025-05-27 Honghyok Kim

Optimization software enables the solution of problems with millions of variables and associated parameters. These parameters are, however, often uncertain and represented with an analytical description of the parameter's distribution or…

Optimization and Control · Mathematics 2025-01-17 John R. Birge

When data do not conform to the hypothesis of a known sampling-variance, the fitting of a constant to the set of measured values is a long debated problem. Given the data, the fitting would require to find which measurand value is most…

Data Analysis, Statistics and Probability · Physics 2011-09-27 Giovanni Mana, Maria Mirabela Predescu

Estimation of causal effects using machine learning methods has become an active research field in econometrics. In this paper, we study the finite sample performance of meta-learners for estimation of heterogeneous treatment effects under…

Econometrics · Economics 2022-02-01 Gabriel Okasa

We provide finite-sample distribution approximations, that are uniform in the parameter, for inference in linear mixed models. Focus is on variances and covariances of random effects in cases where existing theory fails because their…

Statistics Theory · Mathematics 2025-07-29 Karl Oskar Ekvall, Matteo Bottai

Besides the well-known effect of autocorrelations in time series of Monte Carlo simulation data resulting from the underlying Markov process, using the same data pool for computing various estimates entails additional cross correlations.…

Statistical Mechanics · Physics 2014-11-20 Martin Weigel, Wolfhard Janke

We review statistical theories and numerical methods employed to consider the sample size dependence of the failure strength distribution of disordered materials. We first overview the analytical predictions of extreme value statistics and…

Materials Science · Physics 2015-05-13 Mikko J. Alava, Phani K. V. V. Nukala, Stefano Zapperi

In Markov Chain Monte Carlo (MCMC) simulations, the thermal equilibria quantities are estimated by ensemble average over a sample set containing a large number of correlated samples. These samples are selected in accordance with the…

Data Analysis, Statistics and Probability · Physics 2015-01-08 J. Li, P. Vignal, S. Sun, V. M. Calo

This study examines effects of calibration errors on model assumptions and data-analytic tools in direct calibration assays. These effects encompass induced dependencies, inflated variances, and heteroscedasticity among the calibrated…

Statistics Theory · Mathematics 2011-03-30 D. R. Jensen, D. E. Ramirez

In the regression setting, given a set of hyper-parameters, a model-estimation procedure constructs a model from training data. The optimal hyper-parameters that minimize generalization error of the model are usually unknown. In practice…

Machine Learning · Statistics 2019-04-01 Jean Feng, Noah Simon

In statistical exercises where there are several candidate models, the traditional approach is to select one model using some data driven criterion and use that model for estimation, testing and other purposes, ignoring the variability of…

Statistics Theory · Mathematics 2008-12-18 Snigdhansu Chatterjee, Nitai D. Mukhopadhyay

Covariance matrix estimation, a classical statistical topic, poses significant challenges when the sample size is comparable to or smaller than the number of features. In this paper, we frame covariance matrix estimation as a compound…

Methodology · Statistics 2025-03-04 Huqin Xin, Sihai Dave Zhao

The effect of correlations between model parameters and nuisance parameters is discussed, in the context of fitting model parameters to data. Modifications to the usual $\chi^2$ method are required. Fake data studies, as used at present,…

Data Analysis, Statistics and Probability · Physics 2013-09-25 Byron Roe

Having a sufficient quantity of quality data is a critical enabler of training effective machine learning models. Being able to effectively determine the adequacy of a dataset prior to training and evaluating a model's performance would be…

Machine Learning · Computer Science 2026-04-28 Arya Hatamian, Lionel Levine, Haniyeh Ehsani Oskouie, Majid Sarrafzadeh

This paper discusses some problems possibly arising when approximating via Monte-Carlo simulations the distributions of goodness-of-fit test statistics based on the empirical distribution function. We argue that failing to re-estimate…

Data Analysis, Statistics and Probability · Physics 2008-04-01 Marco Capasso, Lucia Alessi, Matteo Barigozzi, Giorgio Fagiolo

We investigate popular resampling methods for estimating the uncertainty of statistical models, such as subsampling, bootstrap and the jackknife, and their performance in high-dimensional supervised regression tasks. We provide a tight…

How should researchers analyze randomized experiments in which the main outcome is latent and measured in multiple ways but each measure contains some degree of error? We first identify a critical study-specific noncomparability problem in…

Econometrics · Economics 2026-01-13 Jiawei Fu, Donald P. Green

A probabilistic model is said to be calibrated if its predicted probabilities match the corresponding empirical frequencies. Calibration is important for uncertainty quantification and decision making in safety-critical applications. While…

Machine Learning · Computer Science 2020-07-01 Anusri Pampari, Stefano Ermon

We give an analytical interpretation of how subsample-based internal covariance estimators lead to biased estimates of the covariance, due to underestimating the super-sample covariance (SSC). This includes the jackknife and bootstrap…

Cosmology and Nongalactic Astrophysics · Physics 2018-04-16 Fabien Lacasa, Martin Kunz

Empirical relationships are derived for the expected sampling error of quantile estimations using Monte Carlo experiments for two frequency distributions frequently encountered in climate sciences. The relationships found are expressed as a…

Methodology · Statistics 2016-10-12 Philippe Roy, René Laprise, Philippe Gachon