Related papers: Support Estimation with Sampling Artifacts and Err…

200 papers

The support recovery problem consists of determining a sparse subset of a set of variables that is relevant in generating a set of observations, and arises in a diverse range of settings such as compressive sensing, and subset selection in…

Information Theory · Computer Science 2016-08-31 Jonathan Scarlett, Volkan Cevher

This paper presents a theoretical analysis of sample selection bias correction. The sample bias correction technique commonly used in machine learning consists of reweighting the cost of an error on each training point of a biased sample to…

Machine Learning · Computer Science 2008-12-18 Corinna Cortes, Mehryar Mohri, Michael Riley, Afshin Rostamizadeh

We consider the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its elements. The problem occurs in many…

Machine Learning · Computer Science 2021-06-17 Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner

Selection bias arises when the probability that an observation enters a dataset depends on variables related to the quantities of interest, leading to systematic distortions in estimation and uncertainty quantification. For example, in…

We introduce a new method for estimating the support size of an unknown distribution which provably matches the performance bounds of the state-of-the-art techniques in the area and outperforms them in practice. In particular, we present…

Machine Learning · Statistics 2019-10-22 I Chien, Olgica Milenkovic

We propose a coupled bootstrap (CB) method for the test error of an arbitrary algorithm that estimates the mean in a Poisson sequence, often called the Poisson means problem. The idea behind our method is to generate two carefully-designed…

Methodology · Statistics 2024-08-20 Natalia L. Oliveira, Jing Lei, Ryan J. Tibshirani

Bayesian hierarchical Poisson models are an essential tool for analyzing count data. However, designing efficient algorithms to sample from the posterior distribution of the target parameters remains a challenging task for this class of…

Methodology · Statistics 2025-02-10 Aldo Gardini, Fedele Greco, Carlo Trivisano

Compressed sensing deals with the reconstruction of sparse signals using a small number of linear measurements. One of the main challenges in compressed sensing is to find the support of a sparse signal. In the literature, several bounds on…

Information Theory · Computer Science 2009-11-26 Ali Hormati, Amin Karbasi, Soheil Mohajer, Martin Vetterli

Intensity estimation for Poisson processes is a classical problem and has been extensively studied over the past few decades. Practical observations, however, often contain compositional noise, i.e. a nonlinear shift along the time axis,…

Methodology · Statistics 2019-09-25 Glenna Schluck, Wei Wu, Anuj Srivastava

In machine learning models, the estimation of errors is often complex due to distribution bias, particularly in spatial data such as those found in environmental studies. We introduce an approach based on the ideas of importance sampling to…

Machine Learning · Computer Science 2023-09-15 Boris Prokhorov, Diana Koldasbayeva, Alexey Zaytsev

Support points summarize a large dataset through a smaller set of representative points that can be used for data operations, such as Monte Carlo integration, without requiring access to the full dataset. In this sense, support points offer…

Machine Learning · Statistics 2025-09-01 Peiqi Zhao, Carlos E. Rodríguez, Ramsés H. Mena, Stephen G. Walker

Bayesian analysis is increasingly popular for use in social science and other application areas where the data are observations from an informative sample. An informative sampling design leads to inclusion probabilities that are correlated…

Statistics Theory · Mathematics 2016-06-07 Terrance D. Savitsky, Daniell Toth

Given a statistical model, we propose a novel estimation method that yields randomised estimators for the unknown distribution of an observed random variable. We establish non-asymptotic bounds for the performance of these estimators and…

Statistics Theory · Mathematics 2026-05-06 Yannick Baraud

Estimation of a deterministic quantity observed in non-Gaussian additive noise is explored via an order statistics approach. More specifically, we study the estimation problem when measurement noises either have positive supports or follow a…

Signal Processing · Electrical Eng. & Systems 2020-07-15 Kamiar Radnosrati, Gustaf Hendeby, Fredrik Gustafsson

Discrete biomarkers derived as cell densities or counts from tissue microarrays and immunostaining are widely used to study immune signatures in relation to survival outcomes in cancer. Although routinely collected, these signatures are not…

Implicit sampling is a weighted sampling method that is used in data assimilation, where one sequentially updates estimates of the state of a stochastic model based on a stream of noisy or incomplete data. Here we describe how to use…

Numerical Analysis · Mathematics 2016-01-20 Matthias Morzfeld, Xuemin Tu, Jon Wilkening, Alexandre J. Chorin

Confirmation bias, the tendency to interpret information in a way that aligns with one's preconceptions, can profoundly impact scientific research, leading to conclusions that reflect the researcher's hypotheses even when the observational…

Machine Learning · Statistics 2025-09-09 Amnon Balanov, Tamir Bendory, Wasim Huleihel

We consider a problem of statistical mean estimation in which the samples are not observed directly, but are instead observed by a relay ("teacher") that transmits information through a memoryless channel to the decoder ("student"), who…

Information Theory · Computer Science 2025-05-15 Yan Hao Ling, Zhouhao Yang, Jonathan Scarlett

We consider the problem of exact support recovery of sparse signals via noisy measurements. The main focus is the sufficient and necessary conditions on the number of measurements for support recovery to be reliable. By drawing an analogy…

Information Theory · Computer Science 2010-03-04 Yuzhe Jin, Young-Han Kim, Bhaskar D. Rao

Neural networks make accurate predictions but often fail to provide reliable uncertainty estimates, especially under covariate distribution shifts between training and testing. To address this problem, we propose a Bayesian framework for…

Machine Learning · Statistics 2025-12-22 Yuli Slavutsky, David M. Blei