Related papers: Inference with generalizable classifier prediction…

Black-box Selective Inference via Bootstrapping

Conditional selective inference requires an exact characterization of the selection event, which is often unavailable except for a few examples like the lasso. This work addresses this challenge by introducing a generic approach to estimate…

Methodology · Statistics 2023-08-22 Sifan Liu, Jelena Markovic-Voronov, Jonathan Taylor

Bootstrap inference for the finite population total under complex sampling designs

Bootstrap is a useful tool for making statistical inference, but it may provide erroneous results under complex survey sampling. Most studies about bootstrap-based inference are developed under simple random sampling and stratified random…

Statistics Theory · Mathematics 2019-01-08 Zhonglei Wang, Jae Kwang Kim, Liuhua Peng

Bayesian Inference for Correlated Human Experts and Classifiers

Applications of machine learning often involve making predictions based on both model outputs and the opinions of human experts. In this context, we investigate the problem of querying experts for class label predictions, using as few human…

Machine Learning · Computer Science 2025-06-09 Markelle Kelly, Alex Boyd, Sam Showalter, Mark Steyvers, Padhraic Smyth

Calibrating Model-Based Inferences and Decisions

As the frontiers of applied statistics progress through increasingly complex experiments we must exploit increasingly sophisticated inferential models to analyze the observations we make. In order to avoid misleading or outright erroneous…

Methodology · Statistics 2018-03-23 Michael Betancourt

Causal Inference Isn't Special: Why It's Just Another Prediction Problem

Causal inference is often portrayed as fundamentally distinct from predictive modeling, with its own terminology, goals, and intellectual challenges. But at its core, causal inference is simply a structured instance of prediction under…

Machine Learning · Computer Science 2025-07-10 Carlos Fernández-Loría

Towards a Learning Theory of Cause-Effect Inference

We pose causal inference as the problem of learning to classify probability distributions. In particular, we assume access to a collection $\{(S_i,l_i)\}_{i=1}^n$, where each $S_i$ is a sample drawn from the probability distribution of $X_i…

Machine Learning · Statistics 2015-05-20 David Lopez-Paz, Krikamol Muandet, Bernhard Schölkopf, Ilya Tolstikhin

Predictive inference for discrete-valued time series

For discrete-valued time series, predictive inference cannot be implemented through the construction of prediction intervals to some predetermined coverage level, as this is the case for real-valued time series. To address this problem, we…

Methodology · Statistics 2025-07-23 Maxime Faymonville, Carsten Jentsch, Efstathios Paparoditis

Redefining Populations of Inference for Generalizations from Small Studies

With the growth in experimental studies in education, policymakers and practitioners are interested in understanding not only what works, but for whom an intervention works. This interest in the generalizability of a study's findings has…

Methodology · Statistics 2022-05-02 Wendy Chan, Jimin Oh, Katherine J. Wilson

Learning about individuals from group statistics

We propose a new problem formulation which is similar to, but more informative than, the binary multiple-instance learning problem. In this setting, we are given groups of instances (described by feature vectors) along with estimates of the…

Machine Learning · Computer Science 2012-07-09 Hendrik Kuck, Nando de Freitas

Statistical inference in massive datasets by empirical likelihood

In this paper, we propose a new statistical inference method for massive data sets, which is very simple and efficient by combining divide-and-conquer method and empirical likelihood. Compared with two popular methods (the bag of little…

Methodology · Statistics 2020-04-21 Xuejun Ma, Shaochen Wang, Wang Zhou

Causal predictive inference and target trial emulation

Causal inference from observational data can be viewed as a missing data problem arising from a hypothetical population-scale randomized trial matched to the observational study. This links a target trial protocol with a corresponding…

Methodology · Statistics 2022-07-27 Andrew Yiu, Edwin Fong, Stephen Walker, Chris Holmes

A simple recipe for making accurate parametric inference in finite sample

Constructing tests or confidence regions that control the error rates in the long run is probably one of the most important problems in statistics. Yet the theoretical justification for most methods in statistics is asymptotic. The…

Methodology · Statistics 2019-01-23 Stéphane Guerrier, Mucyo Karemera, Samuel Orso, Maria-Pia Victoria-Feser

Confident in the Crowd: Bayesian Inference to Improve Data Labelling in Crowdsourcing

With the increased interest in machine learning and big data problems, the need for large amounts of labelled data has also grown. However, it is often infeasible to get experts to label all of this data, which leads many practitioners to…

Machine Learning · Computer Science 2021-05-31 Pierce Burke, Richard Klein

Robust Bayesian inference via coarsening

The standard approach to Bayesian inference is based on the assumption that the distribution of the data belongs to the chosen model class. However, even a small violation of this assumption can have a large impact on the outcome of a…

Methodology · Statistics 2015-06-22 Jeffrey W. Miller, David B. Dunson

Effect Inference from Two-Group Data with Sampling Bias

In many applications, different populations are compared using data that are sampled in a biased manner. Under sampling biases, standard methods that estimate the difference between the population means yield unreliable inferences. Here we…

Statistics Theory · Mathematics 2019-11-12 Dave Zachariah, Petre Stoica

Bootstrapping and Sample Splitting For High-Dimensional, Assumption-Free Inference

Several new methods have been proposed for performing valid inference after model selection. An older method is sample splitting: use part of the data for model selection and part for inference. In this paper we revisit sample splitting…

Statistics Theory · Mathematics 2018-04-04 Alessandro Rinaldo, Larry Wasserman, Max G'Sell, Jing Lei

Causal inference under mis-specification: adjustment based on the propensity score

We study Bayesian approaches to causal inference via propensity score regression. Much of the Bayesian literature on propensity score methods has relied on approaches that cannot be viewed as fully Bayesian in the context of conventional…

Methodology · Statistics 2022-02-01 David A. Stephens, Widemberg S. Nobre, Erica E. M. Moodie, Alexandra M. Schmidt

Selective inference after feature selection via multiscale bootstrap

It is common to show the confidence intervals or $p$-values of selected features, or predictor variables in regression, but they often involve selection bias. The selective inference approach solves this bias by conditioning on the…

Methodology · Statistics 2022-06-02 Yoshikazu Terada, Hidetoshi Shimodaira

Estimating the Accuracies of Multiple Classifiers Without Labeled Data

In various situations one is given only the predictions of multiple classifiers over a large unlabeled test data. This scenario raises the following questions: Without any labeled data and without any a-priori knowledge about the…

Machine Learning · Statistics 2014-10-31 Ariel Jaffe, Boaz Nadler, Yuval Kluger

Causal bootstrapping

To draw scientifically meaningful conclusions and build reliable models of quantitative phenomena, cause and effect must be taken into consideration (either implicitly or explicitly). This is particularly challenging when the measurements…

Machine Learning · Computer Science 2020-12-11 Max A. Little, Reham Badawy