Related papers: Estimating prediction error for complex samples

200 papers

The Horvitz-Thompson (HT) estimator is widely used in survey sampling. However, the variance of the HT estimator becomes large when the inclusion probabilities are highly heterogeneous. To overcome this shortcoming, in this paper, a…

Methodology · Statistics 2018-04-13 Xianpeng Zong, Rong Zhu, Guohua Zou

In survey sampling, survey data do not necessarily represent the target population, and the samples are often biased. However, information on the survey weights aids in the elimination of selection bias. The Horvitz-Thompson estimator is a…

Methodology · Statistics 2024-04-05 Kosuke Morikawa, Yoshikazu Terada, Jae Kwang Kim

The cross-classified sampling design consists in drawing samples from a two-dimension population, independently in each dimension. Such design is commonly used in consumer price index surveys and has been recently applied to draw a sample…

Statistics Theory · Mathematics 2015-11-23 Hélène Juillard, Guillaume Chauvet, Anne Ruiz-Gazen

We elaborate on a deconvolution method, used to estimate the empirical distribution of unknown parameters, as suggested recently by Efron (2013). It is applied to estimating the empirical distribution of the 'sampling probabilities' of m…

Statistics Theory · Mathematics 2013-11-20 Eitan Greenshtein, Theodor Itskov

Randomized controlled trials often suffer from interference, a violation of the Stable Unit Treatment Values Assumption (SUTVA) in which a unit's treatment assignment affects the outcomes of its neighbors. This interference causes bias in…

Methodology · Statistics 2025-02-06 Vydhourie Thiyageswaran, Tyler McCormick, Jennifer Brennan

The generalization ability of minimizers of the empirical risk in the context of binary classification has been investigated under a wide variety of complexity assumptions for the collection of classifiers over which optimization is…

Statistics Theory · Mathematics 2019-01-21 Clémençon Stephan, Patrice Bertail, Guillaume Papa

An additive model-assisted nonparametric method is investigated to estimate the finite population totals of massive survey data with the aid of auxiliary information. A class of estimators is proposed to improve the precision of the well…

Methodology · Statistics 2019-03-19 Li Wang, Suojin Wang

Accurate heterogeneous treatment effect (HTE) estimation is essential for personalized recommendations, making it important to evaluate and compare HTE estimators. Traditional assessment methods are inapplicable due to missing…

Methodology · Statistics 2024-12-30 Zijun Gao

Bipartite incidence graph sampling provides a unified representation of many sampling situations for the purpose of estimation, including the existing unconventional sampling methods, such as indirect, network or adaptive cluster sampling,…

Statistics Theory · Mathematics 2020-04-10 Martina Patone, Li-Chun Zhang

The Average Treatment Effect (ATE) is a global measure of the effectiveness of an experimental treatment intervention. Classical methods of its estimation either ignore relevant covariates or do not fully exploit them. Moreover, past work…

Methodology · Statistics 2013-11-05 Emil Pitkin, Richard Berk, Lawrence Brown, Andreas Buja, Ed George, Kai Zhang, Linda Zhao

Under the Neyman causal model, it is well-known that OLS with treatment-by-covariate interactions cannot harm asymptotic precision of estimated treatment effects in completely randomized experiments. But do such guarantees extend to…

Statistics Theory · Mathematics 2018-03-19 Joel A. Middleton

In this paper, we develop a multiply robust inference procedure of the average treatment effect (ATE) for data with high-dimensional covariates. We consider the case where it is difficult to correctly specify a single parametric model for…

Methodology · Statistics 2025-09-03 Xintao Xia, Yumou Qiu

Bayesian estimation is increasingly popular for performing model based inference to support policymaking. These data are often collected from surveys under informative sampling designs where subject inclusion probabilities are designed to…

Methodology · Statistics 2018-07-13 Luis G. Leon-Novelo, Terrance D. Savitsky

We systematically investigate issues due to mis-specification that arise in estimating causal effects when (treatment) interference is informed by a network available pre-intervention, i.e., in situations where the outcome of a unit may…

Methodology · Statistics 2018-10-22 Vishesh Karwa, Edoardo M. Airoldi

We study the problem of selecting the best heterogeneous treatment effect (HTE) estimator from a collection of candidates in settings where the treatment effect is fundamentally unobserved. We cast estimator selection as a multiple testing…

Machine Learning · Statistics 2025-11-25 Jiayi Guo, Zijun Gao

We propose an estimator of prediction error using an approximate message passing (AMP) algorithm that can be applied to a broad range of sparse penalties. Following Stein's lemma, the estimator of the generalized degrees of freedom, which…

Machine Learning · Statistics 2018-08-01 Ayaka Sakata

Random sampling is an essential tool in the processing and transmission of data. It is used to summarize data too large to store or manipulate and meet resource constraints on bandwidth or battery power. Estimators that are applied to the…

Databases · Computer Science 2015-03-19 Edith Cohen, Haim Kaplan

The Horvitz-Thompson (H-T) estimator is widely used for estimating network causal effects. We study its optimality properties by embedding it in the class of all linear estimators. We show that, under any form of interference, the H-T…

Statistics Theory · Mathematics 2025-11-25 Vishesh Karwa, Edoardo M. Airoldi

Current methods for population mean estimation from data collected by Respondent Driven Sampling (RDS) are based on the Horvitz-Thompson estimator together with a set of assumptions on the sampling model under which the inclusion…

Methodology · Statistics 2014-11-10 Adityanand Guntuboyina, Russell Barbour, Robert Heimer

Capture-recapture methods for estimating the total size of elusive populations are widely used; however, because the choice of estimator affects the results and conclusions, the question of performance of each estimator is…

Methodology · Statistics 2023-12-15 Layna Charlie Dennett, Dankmar Böhning