Related papers: Regression coefficient estimation from remote sens…

Local Prediction-Powered Inference

To infer a function value at a specific point $x$, it is essential to assign higher weights to the points closer to $x$, an approach known as local polynomial / multivariable regression. In many practical cases, a limited sample size may ruin…

Machine Learning · Statistics 2024-09-30 Yanwu Gu, Dong Xia

Regression-Based Proximal Causal Inference

Negative controls are increasingly used to evaluate the presence of potential unmeasured confounding in observational studies. Beyond the use of negative controls to detect the presence of residual confounding, proximal causal inference…

Methodology · Statistics 2024-06-06 Jiewen Liu, Chan Park, Kendrick Li, Eric J. Tchetgen Tchetgen

Demystifying Prediction Powered Inference

Machine learning predictions are increasingly used to supplement incomplete or costly-to-measure outcomes in fields such as biomedical research, environmental science, and social science. However, treating predictions as ground truth…

Machine Learning · Statistics 2026-01-29 Yilin Song, Dan M. Kluger, Harsh Parikh, Tian Gu

Prediction-Powered Inference with Inverse Probability Weighting

Prediction-powered inference (PPI) is a recent framework for valid statistical inference with partially labeled data, combining model-based predictions on a large unlabeled set with bias correction from a smaller labeled subset. Building on…

Machine Learning · Statistics 2026-03-25 Jyotishka Datta, Nicholas G. Polson

PPI is the Difference Estimator: Recognizing the Survey Sampling Roots of Prediction-Powered Inference

Prediction-powered inference (PPI) is a rapidly growing framework for combining machine learning predictions with a small set of gold-standard labels to conduct valid statistical inference. In this article, I argue that the core estimators…

Methodology · Statistics 2026-03-20 Reagan Mozer

Bayesian Prediction-Powered Inference

Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. Specifically, PPI methods provide tighter confidence intervals by combining small amounts of human-labeled data with…

Machine Learning · Computer Science 2024-05-13 R. Alex Hofer, Joshua Maynez, Bhuwan Dhingra, Adam Fisch, Amir Globerson, William W. Cohen

Regression for the Mean: Auto-Evaluation and Inference with Few Labels through Post-hoc Regression

The availability of machine learning systems that can effectively perform arbitrary tasks has led to synthetic labels from these systems being used in applications of statistical inference, such as data analysis or model evaluation. The…

Machine Learning · Computer Science 2025-07-09 Benjamin Eyre, David Madras

Prediction-Powered Inference Across Many Tasks for AI Evaluation & Social Science Research

Many applications require statistically valid inference across many related tasks, while using only a handful of high-quality labels per hypothesis. In AI evaluation, these tasks may correspond to model behaviors across prompts, subgroups,…

Machine Learning · Statistics 2026-05-29 Nicolas Emmenegger, Ellery Stahler, Chara Podimata

Generalized Prediction-Powered Inference, with Application to Binary Classifier Evaluation

In the partially-observed outcome setting, a recent set of proposals known as "prediction-powered inference" (PPI) involves (i) applying a pre-trained machine learning model to predict the response, and then (ii) using these predictions to…

Methodology · Statistics 2026-02-12 Runjia Zou, Daniela Witten, Brian Williamson

Multiple Regression Analysis of Unmeasured Confounding

Whereas confidence intervals are used to assess uncertainty due to unmeasured individuals, confounding intervals can be used to assess uncertainty due to unmeasured attributes. Previously, we have introduced a methodology for computing…

Methodology · Statistics 2025-08-13 Brian Knaeble, R Mitchell Hughes

Bayesian Principal Component Regression model with spatial effects for forest inventory under small field sample size

Remote sensing observations are extensively used for analysis of environmental variables. These variables often exhibit spatial correlation, which has to be accounted for in the calibration models used in predictions, either by direct…

Applications · Statistics 2017-02-14 Virpi Junttila, Marko Laine

Uncertainty-Aware Regression for Socio-Economic Estimation via Multi-View Remote Sensing

Remote sensing imagery offers rich spectral data across extensive areas for Earth observation. Many attempts have been made to leverage these data with transfer learning to develop scalable alternatives for estimating socio-economic…

Computer Vision and Pattern Recognition · Computer Science 2024-11-22 Fan Yang, Sahoko Ishida, Mengyan Zhang, Daniel Jenson, Swapnil Mishra, Jhonathan Navott, Seth Flaxman

Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation

Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. PPI achieves this by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably…

Machine Learning · Computer Science 2024-12-05 Adam Fisch, Joshua Maynez, R. Alex Hofer, Bhuwan Dhingra, Amir Globerson, William W. Cohen

Large width penalization for neural network-based prediction interval estimation

Forecasting accuracy in highly uncertain environments is challenging due to the stochastic nature of systems. Deterministic forecasting provides only point estimates and cannot capture potential outcomes. Therefore, probabilistic…

Machine Learning · Computer Science 2024-12-12 Worachit Amnuaypongsa, Jitkomut Songsiri

Comparing Spatial Regression to Random Forests for Large Environmental Data Sets

Environmental data may be "large" due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates with nonlinear relationships, whereas spatial regression,…

Applications · Statistics 2018-12-27 Eric W. Fox, Jay M. Ver Hoef, Anthony R. Olsen

Estimating regression errors without ground truth values

Regression analysis is a standard supervised machine learning method used to model an outcome variable in terms of a set of predictor variables. In most real-world applications we do not know the true value of the outcome variable being…

Machine Learning · Statistics 2019-10-10 Henri Tiittanen, Emilia Oikarinen, Andreas Henelius, Kai Puolamäki

Semi-Supervised Learning via Cross-Prediction-Powered Inference for Wireless Systems

In many wireless application scenarios, acquiring labeled data can be prohibitively costly, requiring complex optimization processes or measurement campaigns. Semi-supervised learning leverages unlabeled samples to augment the available…

Information Theory · Computer Science 2024-10-08 Houssem Sifaou, Osvaldo Simeone

Adversarial Debiasing for Unbiased Parameter Recovery

Advances in machine learning and the increasing availability of high-dimensional data have led to the proliferation of social science research that uses the predictions of machine learning models as proxies for measures of human activity or…

Machine Learning · Computer Science 2025-02-19 Luke C Sanford, Megan Ayers, Matthew Gordon, Eliana Stone

The Importance of Scale for Spatial-Confounding Bias and Precision of Spatial Regression Estimators

Residuals in regression models are often spatially correlated. Prominent examples include studies in environmental epidemiology to understand the chronic health effects of pollutants. I consider the effects of residual spatial structure on…

Methodology · Statistics 2010-11-05 Christopher J. Paciorek

Dual Accuracy-Quality-Driven Neural Network for Prediction Interval Generation

Accurate uncertainty quantification is necessary to enhance the reliability of deep learning models in real-world applications. In the case of regression tasks, prediction intervals (PIs) should be provided along with the deterministic…

Machine Learning · Computer Science 2024-03-26 Giorgio Morales, John W. Sheppard