Related papers: Estimating prediction error for complex samples

Improved Horvitz-Thompson Estimator in Survey Sampling

The Horvitz-Thompson (HT) estimator is widely used in survey sampling. However, the variance of the HT estimator becomes large when the inclusion probabilities are highly heterogeneous. To overcome this shortcoming, in this paper, a…

Methodology · Statistics 2018-04-13 Xianpeng Zong, Rong Zhu, Guohua Zou

Semiparametric adaptive estimation under informative sampling

In survey sampling, survey data do not necessarily represent the target population, and the samples are often biased. However, information on the survey weights aids in the elimination of selection bias. The Horvitz-Thompson estimator is a…

Methodology · Statistics 2024-04-05 Kosuke Morikawa, Yoshikazu Terada, Jae Kwang Kim

Estimation under cross-classified sampling with application to a childhood survey

The cross-classified sampling design consists in drawing samples from a two-dimension population, independently in each dimension. Such design is commonly used in consumer price index surveys and has been recently applied to draw a sample…

Statistics Theory · Mathematics 2015-11-23 Hélène Juillard, Guillaume Chauvet, Anne Ruiz-Gazen

Deconvolution with application to estimation of sampling probabilities and the Horvitz-Thompson estimator

We elaborate on a deconvolution method, used to estimate the empirical distribution of unknown parameters, as suggested recently by Efron (2013). It is applied to estimating the empirical distribution of the 'sampling probabilities' of m…

Statistics Theory · Mathematics 2013-11-20 Eitan Greenshtein, Theodor Itskov

Data-adaptive exposure thresholds for the Horvitz-Thompson estimator of the Average Treatment Effect in experiments with network interference

Randomized controlled trials often suffer from interference, a violation of the Stable Unit Treatment Values Assumption (SUTVA) in which a unit's treatment assignment affects the outcomes of its neighbors. This interference causes bias in…

Methodology · Statistics 2025-02-06 Vydhourie Thiyageswaran, Tyler McCormick, Jennifer Brennan

Learning from Survey Training Samples: Rate Bounds for Horvitz-Thompson Risk Minimizers

The generalization ability of minimizers of the empirical risk in the context of binary classification has been investigated under a wide variety of complexity assumptions for the collection of classifiers over which optimization is…

Statistics Theory · Mathematics 2019-01-21 Clémençon Stephan, Patrice Bertail, Guillaume Papa

Nonparametric Additive Model-assisted Estimation for Survey Data

An additive model-assisted nonparametric method is investigated to estimate the finite population totals of massive survey data with the aid of auxiliary information. A class of estimators is proposed to improve the precision of the well…

Methodology · Statistics 2019-03-19 Li Wang, Suojin Wang

Trustworthy assessment of heterogeneous treatment effect estimator

Accurate heterogeneous treatment effect (HTE) estimation is essential for personalized recommendations, making it important to evaluate and compare HTE estimators. Traditional assessment methods are inapplicable due to missing…

Methodology · Statistics 2024-12-30 Zijun Gao

Incidence weighting estimation under bipartite incidence graph sampling

Bipartite incidence graph sampling provides a unified representation of many sampling situations for the purpose of estimation, including the existing unconventional sampling methods, such as indirect, network or adaptive cluster sampling,…

Statistics Theory · Mathematics 2020-04-10 Martina Patone, Li-Chun Zhang

Improved Precision in Estimating Average Treatment Effects

The Average Treatment Effect (ATE) is a global measure of the effectiveness of an experimental treatment intervention. Classical methods of its estimation either ignore relevant covariates or do not fully exploit them. Moreover, past work…

Methodology · Statistics 2013-11-05 Emil Pitkin, Richard Berk, Lawrence Brown, Andreas Buja, Ed George, Kai Zhang, Linda Zhao

A Unified Theory of Regression Adjustment for Design-based Inference

Under the Neyman causal model, it is well-known that OLS with treatment-by-covariate interactions cannot harm asymptotic precision of estimated treatment effects in completely randomized experiments. But do such guarantees extend to…

Statistics Theory · Mathematics 2018-03-19 Joel A. Middleton

Multiply Robust Inference of Average Treatment Effects by High-dimensional Empirical Likelihood

In this paper, we develop a multiply robust inference procedure of the average treatment effect (ATE) for data with high-dimensional covariates. We consider the case where it is difficult to correctly specify a single parametric model for…

Methodology · Statistics 2025-09-03 Xintao Xia, Yumou Qiu

Fully Bayesian Estimation Under Informative Sampling

Bayesian estimation is increasingly popular for performing model based inference to support policymaking. These data are often collected from surveys under informative sampling designs where subject inclusion probabilities are designed to…

Methodology · Statistics 2018-07-13 Luis G. Leon-Novelo, Terrance D. Savitsky

A systematic investigation of classical causal inference strategies under mis-specification due to network interference

We systematically investigate issues due to mis-specification that arise in estimating causal effects when (treatment) interference is informed by a network available pre-intervention, i.e., in situations where the outcome of a unit may…

Methodology · Statistics 2018-10-22 Vishesh Karwa, Edoardo M. Airoldi

Reliable Selection of Heterogeneous Treatment Effect Estimators

We study the problem of selecting the best heterogeneous treatment effect (HTE) estimator from a collection of candidates in settings where the treatment effect is fundamentally unobserved. We cast estimator selection as a multiple testing…

Machine Learning · Statistics 2025-11-25 Jiayi Guo, Zijun Gao

Estimator of Prediction Error Based on Approximate Message Passing for Penalized Linear Regression

We propose an estimator of prediction error using an approximate message passing (AMP) algorithm that can be applied to a broad range of sparse penalties. Following Stein's lemma, the estimator of the generalized degrees of freedom, which…

Machine Learning · Statistics 2018-08-01 Ayaka Sakata

Get the Most out of Your Sample: Optimal Unbiased Estimators using Partial Information

Random sampling is an essential tool in the processing and transmission of data. It is used to summarize data too large to store or manipulate and meet resource constraints on bandwidth or battery power. Estimators that are applied to the…

Databases · Computer Science 2015-03-19 Edith Cohen, Haim Kaplan

On the admissibility of Horvitz-Thompson estimator for estimating causal effects under network interference

The Horvitz-Thompson (H-T) estimator is widely used for estimating network causal effects. We study its optimality properties by embedding it in the class of all linear estimators. We show that, under any form of interference, the H-T…

Statistics Theory · Mathematics 2025-11-25 Vishesh Karwa, Edoardo M. Airoldi

On the impossibility of constructing good population mean estimators in a realistic Respondent Driven Sampling model

Current methods for population mean estimation from data collected by Respondent Driven Sampling (RDS) are based on the Horvitz-Thompson estimator together with a set of assumptions on the sampling model under which the inclusion…

Methodology · Statistics 2014-11-10 Adityanand Guntuboyina, Russell Barbour, Robert Heimer

Performance of capture-recapture population size estimators under covariate information

Capture-recapture methods for estimating the total size of elusive populations are widely-used, however, due to the choice of estimator impacting upon the results and conclusions made, the question of performance of each estimator is…

Methodology · Statistics 2023-12-15 Layna Charlie Dennett, Dankmar Böhning