Related papers: Generalized Error Exponents For Small Sample Unive…

Hypothesis Testing with the General Source

The asymptotically optimal hypothesis testing problem with the general sources as the null and alternative hypotheses is studied under exponential-type error constraints on the first kind of error probability. Our fundamental philosophy in…

Probability · Mathematics 2007-05-23 Te Sun Han

A Unified Study on Sequentiality in Universal Classification with Empirically Observed Statistics

In the binary hypothesis testing problem, it is well known that sequentiality in taking samples eradicates the trade-off between two error exponents, yet implementing the optimal test requires the knowledge of the underlying distributions,…

Information Theory · Computer Science 2025-01-07 Ching-Fang Li, I-Hsiang Wang

Improving Pearson's chi-squared test: hypothesis testing of distributions -- optimally

Pearson's chi-squared test, from 1900, is the standard statistical tool for "hypothesis testing on distributions": namely, given samples from an unknown distribution $Q$ that may or may not equal a hypothesis distribution $P$, we want to…

Statistics Theory · Mathematics 2023-10-17 Trung Dang, Walter McKelvie, Paul Valiant, Hongao Wang

Some Remarks on Bayesian Multiple Hypothesis Testing

We consider Bayesian multiple hypothesis problem with independent and identically distributed observations. The classical, Sanov's theorem-based, analysis of the error probability allows one to characterize the best achievable error…

Information Theory · Computer Science 2021-11-01 Hüseyin Afşer

Generic Error Bounds for the Generalized Lasso with Sub-Exponential Data

This work performs a non-asymptotic analysis of the generalized Lasso under the assumption of sub-exponential data. Our main results continue recent research on the benchmark case of (sub-)Gaussian sample distributions and thereby explore…

Statistics Theory · Mathematics 2023-01-18 Martin Genzel, Christian Kipp

Universal Outlier Hypothesis Testing via Mean- and Median-Based Tests

Universal outlier hypothesis testing refers to a hypothesis testing problem where one observes a large number of length-$n$ sequences -- the majority of which are distributed according to the typical distribution $\pi$ and a small number…

Information Theory · Computer Science 2026-01-05 Bernhard C. Geiger, Tobias Koch, Josipa Mihaljević, Maximilian Toller

Generalized Resubstitution for Regression Error Estimation

We propose generalized resubstitution error estimators for regression, a broad family of estimators, each corresponding to a choice of empirical probability measures and loss function. The usual sum of squares criterion is a special case…

Machine Learning · Computer Science 2024-10-24 Diego Marcondes, Ulisses Braga-Neto

Improved likelihood inference in generalized linear models

We address the issue of performing testing inference in generalized linear models when the sample size is small. This class of models provides a straightforward way of modeling normal and non-normal data and has been widely used in several…

Methodology · Statistics 2013-08-16 Tiago M. Vargas, Silvia L. P. Ferrari, Artur J. Lemonte

Classification with High-Dimensional Sparse Samples

The task of the binary classification problem is to determine which of two distributions has generated a length-$n$ test sequence. The two distributions are unknown; two training sequences of length $N$, one from each distribution, are…

Information Theory · Computer Science 2016-04-18 Dayu Huang, Sean Meyn

A new omnibus test of fit based on a characterisation of the uniform distribution

In this paper, we revisit the classical goodness-of-fit problems for univariate distributions; we propose a new testing procedure based on a characterisation of the uniform distribution. Asymptotic theory for the simple hypothesis case is…

Methodology · Statistics 2021-08-17 Bruno Ebner, Shawn Liebenberg, Jaco Visagie

Universal Outlier Hypothesis Testing

Outlier hypothesis testing is studied in a universal setting. Multiple sequences of observations are collected, a small subset of which are outliers. A sequence is considered an outlier if the observations in that sequence are distributed…

Information Theory · Computer Science 2014-04-02 Yun Li, Sirin Nitinawarat, Venugopal V. Veeravalli

Maximum Mean Discrepancy with Unequal Sample Sizes via Generalized U-Statistics

Existing two-sample testing techniques, particularly those based on choosing a kernel for the Maximum Mean Discrepancy (MMD), often assume equal sample sizes from the two distributions. Applying these methods in practice can require…

Machine Learning · Statistics 2025-12-17 Aaron Wei, Milad Jalali, Danica J. Sutherland

Hypothesis Testing in High-Dimensional Regression under the Gaussian Random Design Model: Asymptotic Theory

We consider linear regression in the high-dimensional regime where the number of observations $n$ is smaller than the number of parameters $p$. A very successful approach in this setting uses $\ell_1$-penalized least squares (a.k.a. the…

Methodology · Statistics 2014-02-05 Adel Javanmard, Andrea Montanari

Out-of-sample error estimate for robust M-estimators with convex penalty

A generic out-of-sample error estimate is proposed for robust $M$-estimators regularized with a convex penalty in high-dimensional linear regression where $(X,y)$ is observed and $p,n$ are of the same order. If $\psi$ is the derivative of…

Statistics Theory · Mathematics 2023-03-31 Pierre C Bellec

On the choice of the splitting ratio for the split likelihood ratio test

The recently introduced framework of universal inference provides a new approach to constructing hypothesis tests and confidence regions that are valid in finite samples and do not rely on any specific regularity assumptions on the…

Statistics Theory · Mathematics 2023-09-11 David Strieder, Mathias Drton

Finite-sample analysis of M-estimators using self-concordance

The classical asymptotic theory for parametric $M$-estimators guarantees that, in the limit of infinite sample size, the excess risk has a chi-square type distribution, even in the misspecified case. We demonstrate how self-concordance of…

Statistics Theory · Mathematics 2020-12-01 Dmitrii Ostrovskii, Francis Bach

The Minimax Risk in Testing Uniformity over Large Alphabets under Missing-Ball Alternatives

We study the problem of testing the goodness of fit of categorical count data to a Poisson distribution uniform over the categories, against a class of alternatives defined by excluding an $\ell_p$ ball, $p \leq 2$, of radius $\epsilon$…

Statistics Theory · Mathematics 2025-12-16 Alon Kipnis

A Normal Test for Independence via Generalized Mutual Information

Testing hypothesis of independence between two random elements on a joint alphabet is a fundamental exercise in statistics. Pearson's chi-squared test is an effective test for such a situation when the contingency table is relatively small.…

Statistics Theory · Mathematics 2025-03-19 Jialin Zhang, Zhiyi Zhang

On Universality and Training in Binary Hypothesis Testing

The classical binary hypothesis testing problem is revisited. We notice that when one of the hypotheses is composite, there is an inherent difficulty in defining an optimality criterion that is both informative and well-justified. For…

Statistics Theory · Mathematics 2021-03-29 Michael Bell, Yuval Kochman

Revisiting the Random Subset Sum problem

The average properties of the well-known Subset Sum Problem can be studied by the means of its randomised version, where we are given a target value $z$, random variables $X_1, \ldots, X_n$, and an error parameter $\varepsilon > 0$, and we…

Probability · Mathematics 2024-03-05 Arthur da Cunha, Francesco d'Amore, Frédéric Giroire, Hicham Lesfari, Emanuele Natale, Laurent Viennot