Related papers: Probability-Based Estimation
Estimating prevalence, the fraction of a population with a certain medical condition, is fundamental to epidemiology. Traditional methods rely on classification of test samples taken at random from a population. Such approaches to…
This article presents methods for estimating extreme probabilities, beyond the range of the observations. These methods are model-free and applicable to almost any sample size. They are grounded in order statistics theory and have a wide…
Estimating win probability is one of the classic modeling tasks of sports analytics. Many widely used win probability estimators use machine learning to fit the relationship between a binary win/loss outcome variable and certain game-state…
Epidemiologists increasingly use causal inference methods that rely on machine learning, as these approaches can relax unnecessary model specification assumptions. While deriving and studying asymptotic properties of such estimators is a…
Prediction, where observed data is used to quantify uncertainty about a future observation, is a fundamental problem in statistics. Prediction sets with coverage probability guarantees are a common solution, but these do not provide…
We consider the general problem of estimating probabilities which arise as a union of dependent events. We propose a flexible series of estimators for such probabilities, and describe variance reduction schemes applied to the proposed…
Selection bias arises when the probability that an observation enters a dataset depends on variables related to the quantities of interest, leading to systematic distortions in estimation and uncertainty quantification. For example, in…
We give a probabilistic analysis of inductive knowledge and belief and explore its predictions concerning knowledge about the future, about laws of nature, and about the values of inexactly measured quantities. The analysis combines a…
The bias of an estimator is defined as the difference between its expected value and the parameter to be estimated, where the expectation is with respect to the model. Loosely speaking, small bias reflects the desire that if an experiment is…
We consider the problem of estimating the joint distribution of $n$ independent random variables. Our approach is based on a family of candidate probabilities that we shall call a model and which is chosen to either contain the true…
Percentiles and, more generally, quantiles are commonly used in various contexts to summarize data. For most distributions, there is exactly one quantile that is unbiased. For distributions like the Gaussian that have the same mean and…
We propose and analyze estimators for statistical functionals of one or more distributions under nonparametric assumptions. Our estimators are based on the theory of influence functions, which appear in the semiparametric statistics…
Estimating the unknown number of classes in a population has numerous important applications. In a Poisson mixture model, the problem is reduced to estimating the odds that a class is undetected in a sample. The discontinuity of the odds…
Probability forecasts of events are routinely used in climate predictions, in forecasting default probabilities on bank loans or in estimating the probability of a patient's positive response to treatment. Scoring rules have long been used…
We develop a new framework of uncertainty variables to model uncertainty. An uncertainty variable is characterized by an uncertainty set, in which its realization is bound to lie, while the conditional uncertainty is characterized by a set…
Probability forecasts are intended to account for the uncertainties inherent in forecasting. It is suggested that from an end-user's point of view probability is not necessarily sufficient to reflect uncertainties that are not simply the…
In this study, we introduce a new approach to statistical decision theory. Without using a loss function, we select good decision rules to choose between two hypotheses. We call them "experts". They are globally unbiased but also…
A random set is a generalisation of a random variable, i.e. a set-valued random variable. Random set theory allows a unification of other uncertainty descriptions such as interval variables, mass belief functions in Dempster-Shafer theory…
Statistical inference for extreme values of random events is difficult in practice due to low sample sizes and inaccurate models for the studied rare events. If prior knowledge for extreme values is available, Bayesian statistics can be…
Statistical machine learning theory often tries to give generalization guarantees for machine learning models. Those models are naturally subject to some fluctuation, as they are based on a data sample. If we were unlucky, and gathered a sample…