Related papers: Robust Model-based Inference for Non-Probability S…

Robust and Efficient Bayesian Inference for Non-Probability Samples

The declining response rates in probability surveys along with the widespread availability of unstructured data have led to growing research into non-probability samples. Existing robust approaches are not well-developed for non-Gaussian…

Methodology · Statistics 2022-03-29 Ali Rafei, Michael R. Elliott, Carol A. C. Flannagan

Probability and Non-Probability Samples: Improving Regression Modeling by Using Data from Different Sources

Non-probability sampling, for example in the form of online panels, has become a fast and cheap method to collect data. While reliable inference tools are available for classical probability samples, non-probability samples can yield…

Methodology · Statistics 2022-04-05 Gerhard Tutz

Doubly Robust Inference when Combining Probability and Non-probability Samples with High-dimensional Data

Non-probability samples become increasingly popular in survey statistics but may suffer from selection biases that limit the generalizability of results to the target population. We consider integrating a non-probability sample with a…

Methodology · Statistics 2019-08-26 Shu Yang, Jae Kwang Kim, Rui Song

Doubly Robust Inference with Non-probability Survey Samples

We establish a general framework for statistical inferences with non-probability survey samples when relevant auxiliary information is available from a probability survey sample. We develop a rigorous procedure for estimating the propensity…

Methodology · Statistics 2018-05-17 Yilin Chen, Pengfei Li, Changbao Wu

Bayesian Nonparametric Weighted Sampling Inference

It has historically been a challenge to perform Bayesian inference in a design-based survey context. The present paper develops a Bayesian model for sampling inference in the presence of inverse-probability weights. We use a hierarchical…

Methodology · Statistics 2020-06-24 Yajuan Si, Natesh S. Pillai, Andrew Gelman

Inference from Non-Random Samples Using Bayesian Machine Learning

We consider inference from non-random samples in data-rich settings where high-dimensional auxiliary information is available both in the sample and the target population, with survey inference being a special case. We propose a regularized…

Methodology · Statistics 2021-04-13 Yutao Liu, Andrew Gelman, Qixuan Chen

Combining Non-probability and Probability Survey Samples Through Mass Imputation

This paper presents theoretical results on combining non-probability and probability survey samples through mass imputation, an approach originally proposed by Rivers (2007) as sample matching without rigorous theoretical justification.…

Methodology · Statistics 2020-11-24 Jae Kwang Kim, Seho Park, Yilin Chen, Changbao Wu

Bayesian Estimators in Uncertain Nested Error Regression Models

Nested error regression models are useful tools for analysis of grouped data, especially in the case of small area estimation. This paper suggests a nested error regression model using uncertain random effects in which the random effect in…

Methodology · Statistics 2017-02-28 Shonosuke Sugasawa, Tatsuya Kubokawa

Fully Bayesian Estimation Under Informative Sampling

Bayesian estimation is increasingly popular for performing model based inference to support policymaking. These data are often collected from surveys under informative sampling designs where subject inclusion probabilities are designed to…

Methodology · Statistics 2018-07-13 Luis G. Leon-Novelo, Terrance D. Savitsky

Investigating an Alternative for Estimation from a Nonprobability Sample: Matching plus Calibration

Matching a nonprobability sample to a probability sample is one strategy both for selecting the nonprobability units and for weighting them. This approach has been employed in the past to select subsamples of persons from a large panel of…

Methodology · Statistics 2021-12-03 Zhan Liu, Richard Valliant

Robust estimation of risks from small samples

Data-driven risk analysis involves the inference of probability distributions from measured or simulated data. In the case of a highly reliable system, such as the electricity grid, the amount of relevant data is often exceedingly limited,…

Methodology · Statistics 2017-07-11 Simon H. Tindemans, Goran Strbac

Methods for Combining Probability and Nonprobability Samples Under Unknown Overlaps

Nonprobability (convenience) samples are increasingly sought to stabilize estimations for one or more population variables of interest that are performed using a randomized survey (reference) sample by increasing the effective sample size.…

Methodology · Statistics 2023-06-13 Terrance D. Savitsky, Matthew R. Williams, Julie Gershunskaya, Vladislav Beresovsky, Nels G. Johnson

Amortised and provably-robust simulation-based inference

Complex simulator-based models are now routinely used to perform inference across the sciences and engineering, but existing inference methods are often unable to account for outliers and other extreme values in data which occur due to…

Machine Learning · Statistics 2026-02-18 Ayush Bharti, Charita Dellaporta, Yuga Hikida, François-Xavier Briol

Robust Bayesian Regression with Synthetic Posterior

Although linear regression models are fundamental tools in statistical science, the estimation results can be sensitive to outliers. While several robust methods have been proposed in frequentist frameworks, statistical inference is not…

Methodology · Statistics 2020-07-15 Shintaro Hashimoto, Shonosuke Sugasawa

Robust Bayesian Inference for Simulator-based Models via the MMD Posterior Bootstrap

Simulator-based models are models for which the likelihood is intractable but simulation of synthetic data is possible. They are often used to describe complex real-world phenomena, and as such can often be misspecified in practice.…

Methodology · Statistics 2022-12-20 Charita Dellaporta, Jeremias Knoblauch, Theodoros Damoulas, François-Xavier Briol

Bootstrapping and Sample Splitting For High-Dimensional, Assumption-Free Inference

Several new methods have been proposed for performing valid inference after model selection. An older method is sample splitting: use part of the data for model selection and part for inference. In this paper we revisit sample splitting…

Statistics Theory · Mathematics 2018-04-04 Alessandro Rinaldo, Larry Wasserman, Max G'Sell, Jing Lei

Robust Sampling for Active Statistical Inference

Active statistical inference is a new method for inference with AI-assisted data collection. Given a budget on the number of labeled data points that can be collected and assuming access to an AI predictive model, the basic idea is to…

Machine Learning · Statistics 2025-11-13 Puheng Li, Tijana Zrnic, Emmanuel Candès

Robust Bayesian Inference for Big Data: Combining Sensor-based Records with Traditional Survey Data

Big Data often presents as massive non-probability samples. Not only is the selection mechanism often unknown, but larger data volume amplifies the relative contribution of selection bias to total error. Existing bias adjustment approaches…

Methodology · Statistics 2022-03-29 Ali Rafei, Carol A. C. Flannagan, Brady T. West, Michael R. Elliott

Improving measurement error and representativeness in nonprobability surveys

In the age of big data, nonprobability surveys are becoming increasingly abundant. Data integration techniques involving both probability and nonprobability surveys are being extensively used for providing improved estimates for finite…

Applications · Statistics 2025-10-17 Aditi Sen, Partha Lahiri

Risk and resampling under model uncertainty

In statistical exercises where there are several candidate models, the traditional approach is to select one model using some data driven criterion and use that model for estimation, testing and other purposes, ignoring the variability of…

Statistics Theory · Mathematics 2008-12-18 Snigdhansu Chatterjee, Nitai D. Mukhopadhyay