Related papers: Revisiting Active Sequential Prediction-Powered Me…

Prediction-Powered Conditional Inference

We study prediction-powered conditional inference in the setting where labeled data are scarce, unlabeled covariates are abundant, and a black-box machine-learning predictor is available. The goal is to perform statistical inference on…

Machine Learning · Statistics 2026-03-09 Yang Sui, Jin Zhou, Hua Zhou, Xiaowu Dai

Empirical Likelihood Meets Prediction-Powered Inference

We study inference with a small labeled sample, a large unlabeled sample, and high-quality predictions from an external model. We link prediction-powered inference with empirical likelihood by stacking supervised estimating equations based…

Methodology · Statistics 2025-12-19 Guanghui Wang, Mengtao Wen, Changliang Zou

Sequential Information Guided Sensing

We study the value of information in sequential compressed sensing by characterizing the performance of sequential information guided sensing in practical scenarios when information is inaccurate. In particular, we assume the signal…

Information Theory · Computer Science 2015-09-02 Ruiyang Song, Yao Xie, Sebastian Pokutta

The Projected Covariance Measure for assumption-lean variable significance testing

Testing the significance of a variable or group of variables $X$ for predicting a response $Y$, given additional covariates $Z$, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test…

Statistics Theory · Mathematics 2024-05-08 Anton Rask Lundborg, Ilmun Kim, Rajen D. Shah, Richard J. Samworth

Active Statistical Inference

Inspired by the concept of active learning, we propose active inference, a methodology for statistical inference with machine-learning-assisted data collection. Assuming a budget on the number of labels that can be collected,…

Machine Learning · Statistics 2026-04-09 Tijana Zrnic, Emmanuel J. Candès

Identifying Wrongly Predicted Samples: A Method for Active Learning

State-of-the-art machine learning models require access to a significant amount of annotated data in order to achieve the desired level of performance. While unlabelled data can be largely available and even abundant, the annotation process can…

Machine Learning · Computer Science 2020-10-15 Rahaf Aljundi, Nikolay Chumerin, Daniel Olmeda Reino

Active Learning for Regression with Aggregated Outputs

Due to the privacy protection or the difficulty of data collection, we cannot observe individual outputs for each instance, but we can observe aggregated outputs that are summed over multiple instances in a set in some real-world…

Machine Learning · Statistics 2022-10-05 Tomoharu Iwata

Optimal kernel regression bounds under energy-bounded noise

Non-conservative uncertainty bounds are key for both assessing an estimation algorithm's accuracy and in view of downstream tasks, such as its deployment in safety-critical contexts. In this paper, we derive a tight, non-asymptotic…

Machine Learning · Computer Science 2026-01-16 Amon Lahr, Johannes Köhler, Anna Scampicchio, Melanie N. Zeilinger

Active and Adaptive Sequential learning

A framework is introduced for actively and adaptively solving a sequence of machine learning problems, which are changing in a bounded manner from one time step to the next. An algorithm is developed that actively queries the labels of the…

Machine Learning · Computer Science 2018-05-31 Yuheng Bu, Jiaxun Lu, Venugopal V. Veeravalli

Computationally Efficient Deep Bayesian Unit-Level Modeling of Survey Data under Informative Sampling for Small Area Estimation

The topic of deep learning has seen a surge of interest in recent years both within and outside of the field of Statistics. Deep models leverage both nonlinearity and interaction effects to provide superior predictions in many cases when…

Methodology · Statistics 2020-09-18 Paul A. Parker, Scott H. Holan

ASPEST: Bridging the Gap Between Active Learning and Selective Prediction

Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain. These predictions can then be deferred to humans for further evaluation. As an everlasting challenge for machine learning, in many…

Machine Learning · Computer Science 2024-03-04 Jiefeng Chen, Jinsung Yoon, Sayna Ebrahimi, Sercan Arik, Somesh Jha, Tomas Pfister

Prediction-Powered Linear Regression: A Balance Between Interpretation and Prediction

Unlabeled data are increasingly prevalent in contemporary economic studies, yet their effective use for improving prediction remains challenging because the outcomes are often costly or even infeasible to observe. Machine learning methods…

Methodology · Statistics 2026-05-12 Fuzhi Xu, Xingyu Yan, Xinyu Zhang

On weighted uncertainty sampling in active learning

This note explores probabilistic sampling weighted by uncertainty in active learning. This method has been previously used and authors have tangentially remarked on its efficacy. The scheme has several benefits: (1) it is computationally…

Machine Learning · Computer Science 2019-09-12 Vinay Jethava

Non-Asymptotic Performance of Social Machine Learning Under Limited Data

This paper studies the probability of error associated with the social machine learning framework, which involves an independent training phase followed by a cooperative decision-making phase over a graph. This framework addresses the…

Machine Learning · Computer Science 2024-07-10 Ping Hu, Virginia Bordignon, Mert Kayaalp, Ali H. Sayed

Semi-supervised learning using copula-based regression and model averaging

The available data in semi-supervised learning usually consists of relatively small sized labeled data and much larger sized unlabeled data. How to effectively exploit unlabeled data is the key issue. In this paper, we write the regression…

Methodology · Statistics 2024-11-13 Ziwen Gao, Huihang Liu, Xinyu Zhang

Weighted Sets of Probabilities and Minimax Weighted Expected Regret: New Approaches for Representing Uncertainty and Making Decisions

We consider a setting where an agent's uncertainty is represented by a set of probability measures, rather than a single measure. Measure-by-measure updating of such a set of measures upon acquiring new information is well-known to suffer…

Computer Science and Game Theory · Computer Science 2016-11-04 Joseph Y. Halpern, Samantha Leung

Improving Uncertainty Sampling with Bell Curve Weight Function

Typically, a supervised learning model is trained using passive learning by randomly selecting unlabelled instances to annotate. This approach is effective for learning a model, but can be costly in cases where acquiring labelled instances…

Machine Learning · Computer Science 2024-03-05 Zan-Kai Chong, Hiroyuki Ohsaki, Bok-Min Goi

Weighted Sets of Probabilities and Minimax Weighted Expected Regret: New Approaches for Representing Uncertainty and Making Decisions

We consider a setting where an agent's uncertainty is represented by a set of probability measures, rather than a single measure. Measure-by-measure updating of such a set of measures upon acquiring new information is well-known to suffer…

Computer Science and Game Theory · Computer Science 2013-02-26 Joseph Y. Halpern, Samantha Leung

Regression analysis of multiplicative hazards model with time-dependent coefficient for sparse longitudinal covariates

We study the multiplicative hazards model with intermittently observed longitudinal covariates and time-varying coefficients. For such models, the existing ad hoc approach, such as the last value carried forward, is biased. We propose a…

Methodology · Statistics 2025-03-13 Zhuowei Sun, Hongyuan Cao

Continuously updated estimation of conditional hazard functions

Motivated by the need to analyze continuously updated data sets in the context of time-to-event modeling, we propose a novel nonparametric approach to estimate the conditional hazard function given a set of continuous and discrete…

Methodology · Statistics 2025-07-03 Daphné Aurouet, Valentin Patilea