Related papers: On $p$-values
There are two distinct definitions of 'P-value' for evaluating a proposed hypothesis or model for the process generating an observed dataset. The original definition starts with a measure of the divergence of the dataset from what was…
Deciding whether a model provides a good description of data is often based on a goodness-of-fit criterion summarized by a p-value. Although there is considerable confusion concerning the meaning of p-values, leading to their misuse, they…
P-values are a mainstay in statistics but are often misinterpreted. We propose a new interpretation of p-value as a meaningful plausibility, where this is to be interpreted formally within the inferential model framework. We show that, for…
$P$-values have been the focus of considerable criticism based on various considerations. Still, the $P$-value represents one of the most commonly used statistical tools. When assessing the suitability of a single hypothesized distribution,…
The p-values are often implicitly used as a measure of evidence for the hypotheses of the tests. This practice has been analyzed with different approaches. It is generally accepted for the one-sided hypothesis problem, but it is often…
We explain the concept of p-values presupposing only rudimentary probability theory. We also use the occasion to introduce the notion of p-function, so that p-values are values of a p-function. The explanation is restricted to the discrete…
The notion of p-value is a fundamental concept in statistical inference and has been widely used for reporting outcomes of hypothesis tests. However, p-value is often misinterpreted, misused or miscommunicated in practice. Part of the issue…
Mathematics is a limited component of solutions to real-world problems, as it expresses only what is expected to be true if all our assumptions are correct, including implicit assumptions that are omnipresent and often incorrect.…
This article addresses issues of model criticism and model comparison in Bayesian contexts, and focusses on the use of the so-called posterior predictive p-values (ppp values). These involve a general discrepancy or conflict measure and…
Null hypothesis significance tests and p values are widely used despite very strong arguments against their use in many contexts. Confidence intervals are often recommended as an alternative, but these do not achieve the objective of…
We introduce the notion of p*-values (p*-variables), which generalizes p-values (p-variables) in several senses. The new notion has four natural interpretations: operational, probabilistic, Bayesian, and frequentist. A main example of a…
Calibration is a frequently invoked concept when useful label probability estimates are required on top of classification accuracy. A calibrated model is a function whose values correctly reflect underlying label probabilities. Calibration…
Several scientific fields including psychology are undergoing a replication crisis. There are many reasons for this problem, one of which is a misuse of p-values. There are several alternatives to p-values, and in this paper we describe a…
This paper develops an interpretive framework for divergence P-values and S-values within a descriptive frequentist perspective. Statistical analysis is framed as operating within idealized worlds defined by a set of assumptions and a…
Model checking and testing are two areas with a similar goal: to verify that a system satisfies a property. They start with different hypothesis on the systems and develop many techniques with different notions of approximation, when an…
In the context of supervised parametric models, we introduce the concept of e-values. An e-value is a scalar quantity that represents the proximity of the sampling distribution of parameter estimates in a model trained on a subset of…
Confidence intervals are a popular way to visualize and analyze data distributions. Unlike p-values, they can convey information both about statistical significance as well as effect size. However, very little work exists on applying…
In a regression setting with a response vector and given regressor vectors, a typical question is to what extent the response is related to these regressors, specifically, how well it can be approximated by a linear combination of the…
In many fields of research null hypothesis significance tests and p values are the accepted way of assessing the degree of certainty with which research results can be extrapolated beyond the sample studied. However, there are very serious…
Increased availability of data and accessibility of computational tools in recent years have created unprecedented opportunities for scientific research driven by statistical analysis. Inherent limitations of statistics impose constrains on…