Statistics Theory - Scifaro

ROC curves for LDA classifiers

In this paper, we derive an analytic formula for the ROC curves of LDA classifiers. We establish elementary properties of these curves (monotonicity and concavity), provide a formula for the area under the curve (AUC) and compute the Youden…

Statistics Theory · Mathematics 2025-08-26 Mateusz Krukowski

On the attainment of the Wasserstein–Cramér–Rao lower bound

Recently, a Wasserstein analogue of the Cramér–Rao inequality has been developed using the Wasserstein information matrix (Otto metric). This inequality provides a lower bound on the Wasserstein variance of an estimator, which quantifies…

Statistics Theory · Mathematics 2025-08-26 Hayato Nishimori, Takeru Matsuda

Optimality of Right-Invariant Priors

We discuss optimal prediction for families of probability distributions with a locally compact topological group structure. Right-invariant priors were previously shown to yield a posterior predictive distribution minimizing the worst-case…

Statistics Theory · Mathematics 2025-08-26 Jannis Bolik, Thomas Hofmann

Improved estimation of the positive powers ordered restricted standard deviation of two normal populations

The present manuscript is concerned with component-wise estimation of the positive power of ordered restricted standard deviation of two normal populations with certain restrictions on the means. We propose several improved estimators under…

Statistics Theory · Mathematics 2025-08-26 Somnath Mondal, Lakshmi Kanta Patra

Identifying and bounding the probability of necessity for causes of effects with ordinal outcomes

Although the existing causal inference literature focuses on the forward-looking perspective by estimating effects of causes, the backward-looking perspective can provide insights into causes of effects. In backward-looking causal…

Statistics Theory · Mathematics 2025-08-26 Chao Zhang, Zhi Geng, Wei Li, Peng Ding

On the probability of linear separability through intrinsic volumes

A dataset with two labels is linearly separable if it can be split into its two classes with a hyperplane. This inflicts a curse on some statistical tools (such as logistic regression) but forms a blessing for others (e.g. support vector…

Statistics Theory · Mathematics 2025-08-26 Felix Kuchelmeister

Hidden Markov Models and the Bayes Filter in Categorical Probability

We use Markov categories to generalize the basic theory of Markov chains and hidden Markov models to an abstract setting. This comprises characterizations of hidden Markov models in terms of conditional independences and algorithms for…

Statistics Theory · Mathematics 2025-08-26 Tobias Fritz, Andreas Klingler, Drew McNeely, Areeb Shah-Mohammed, Yuwen Wang

Nonparametric Two-Sample Testing by Betting

We study the problem of designing consistent sequential two-sample tests in a nonparametric setting. Guided by the principle of testing by betting, we reframe this task into that of selecting a sequence of payoff functions that maximize the…

Statistics Theory · Mathematics 2025-08-26 Shubhanshu Shekhar, Aaditya Ramdas

Data Gluttony: Epistemic Risks, Dependent Testing and Data Reuse in Large Datasets

Large-scale registries have collected vast amounts of data which has enabled investigators to efficiently conduct studies of observational data. Common practice is for investigators to use all data meeting the inclusion criteria of their…

Statistics Theory · Mathematics 2025-08-25 Reid Dale, Jordan Rodu, Maria E. Currie, Mike Baiocchi

General M-estimators of location on Riemannian manifolds: existence and uniqueness

We study general M-estimators of location on Riemannian manifolds, extending classical notions such as the Fréchet mean by replacing the squared loss with a broad class of loss functions. Under minimal regularity conditions on the loss…

Statistics Theory · Mathematics 2025-08-25 Jongmin Lee, Sungkyu Jung

Smooth and rough paths in mean derivative estimation for functional data

In this paper, in a multivariate setting we derive near optimal rates of convergence in the minimax sense for estimating partial derivatives of the mean function for functional data observed under a fixed synchronous design over Hölder…

Statistics Theory · Mathematics 2025-08-25 Max Berger, Hajo Holzmann

Asymptotic Theory for Linear Functionals of Kernel Ridge Regression

An asymptotic theory is established for linear functionals of the predictive function given by kernel ridge regression, when the reproducing kernel Hilbert space is equivalent to a Sobolev space. The theory covers a wide variety of linear…

Statistics Theory · Mathematics 2025-08-25 Rui Tuo, Lu Zou

Stabilized Cross-Validation of Smoothness in Density Deconvolution

We consider density estimation under measurement error with the Smoothness-Penalized Deconvolution (SPeD) estimator. The estimator has a tuning parameter regulating the smoothness of the estimate, and proper choice of this parameter is…

Statistics Theory · Mathematics 2025-08-25 David Kent

Multiply Robust Conformal Risk Control with Coarsened Data

Conformal Prediction (CP) has recently received a tremendous amount of interest, leading to a wide range of new theoretical and methodological results for predictive inference with formal theoretical guarantees. However, the vast majority…

Statistics Theory · Mathematics 2025-08-22 Manit Paul, Arun Kumar Kuchibhotla, Eric J. Tchetgen Tchetgen

Ties, Tails and Spectra: On Rank-Based Dependency Measures in High Dimensions

This work is concerned with the limiting spectral distribution of rank-based dependency measures in high dimensions. We provide distribution-free results for multivariate empirical versions of Kendall's $\tau$ and Spearman's $\rho$ in a…

Statistics Theory · Mathematics 2025-08-22 Nina Dörnemann, Michael Fleermann, Johannes Heiny

Dynamic clustering for heterophilic stochastic block models with time-varying node memberships

We consider a time-ordered sequence of networks stemming from stochastic block models where nodes gradually change their memberships over time, and no network at any single time point contains sufficient signal strength to recover its…

Statistics Theory · Mathematics 2025-08-22 Kevin Z Lin, Jing Lei

Comparing Scale Parameter Estimators for Gaussian Process Interpolation with the Brownian Motion Prior: Leave-One-Out Cross Validation and Maximum Likelihood

Gaussian process (GP) regression is a Bayesian nonparametric method for regression and interpolation, offering a principled way of quantifying the uncertainties of predicted function values. For the quantified uncertainties to be…

Statistics Theory · Mathematics 2025-08-22 Masha Naslidnyk, Motonobu Kanagawa, Toni Karvonen, Maren Mahsereci

Ordering results for random maxima and minima from two dependent Kumaraswamy generalized distributed samples

Let $\{X_{1},\ldots,X_{N_1}\}$ and $\{Y_{1},\ldots,Y_{N_2}\}$ be two sequences of interdependent heterogeneous samples, where for $i=1,\ldots,N_{1},$ $X_{i}\sim \text{Kw-G}(x, \alpha_{i}, \gamma_{i};G)$ and for $i=1,\ldots,N_{2},$…

Statistics Theory · Mathematics 2025-08-21 Sangita Das, Narayanaswamy Balakrishnan

Better bootstrap t confidence intervals for the mean

This article explores combinations of weighted bootstraps, like the Bayesian bootstrap, with the bootstrap $t$ method for setting approximate confidence intervals for the mean of a random variable in small samples. For this problem the…

Statistics Theory · Mathematics 2025-08-21 Art B. Owen

Non-asymptotic bounds for forward processes in denoising diffusions: Ornstein-Uhlenbeck is hard to beat

Denoising diffusion probabilistic models (DDPMs) represent a recent advance in generative modelling that has delivered state-of-the-art results across many domains of applications. Despite their success, a rigorous theoretical understanding…

Statistics Theory · Mathematics 2025-08-21 Miha Brešar, Aleksandar Mijatović