Statistics Theory - Scifaro

A Bayesian sequential test for the drift of a fractional Brownian motion

We consider a fractional Brownian motion with unknown linear drift such that the drift coefficient has a prior normal distribution and construct a sequential test for the hypothesis that the drift is positive versus the alternative that it…

Statistics Theory · Mathematics 2026-01-14 Alexey Muravlev, Mikhail Zhitlukhin

Nearest-neighbour Markov point processes on graphs with Euclidean edges

We define nearest-neighbour point processes on graphs with Euclidean edges and linear networks. They can be seen as the analogues of renewal processes on the real line. We show that the Delaunay neighbourhood relation on a tree satisfies…

Statistics Theory · Mathematics 2026-01-14 M. N. M. van Lieshout

Gold standard process Markovian poisoning: a semiparametric approach

We consider in this paper a stochastic process that mixes in time, according to a nonobserved stationary Markov selection process, two separate sources of randomness: i) a stationary process whose distribution is accessible (gold standard);…

Statistics Theory · Mathematics 2026-01-13 Claire Lacour, Pierre Vandekerkhove

Wasserstein Concentration of Empirical Measures for Dependent Data via the Method of Moments

We establish a general concentration result for the 1-Wasserstein distance between the empirical measure of a sequence of random variables and its expectation. Unlike standard results that rely on independence (e.g., Sanov's theorem) or…

Statistics Theory · Mathematics 2026-01-13 Arash A. Amini, Luciano Vinas

A Note on NBUE and NWBUE Classes of Life Distributions

Non-monotonic ageing notions are looked upon as an extension of the corresponding monotonic ageing notions in this work. In particular, the New Better than Used in Expectation (NBUE) and the corresponding non-monotonic analogue New Worse…

Statistics Theory · Mathematics 2026-01-13 M. Z. Anis

Diffusion Models with Heavy-Tailed Targets: Score Estimation and Sampling Guarantees

Score-based diffusion models have become a powerful framework for generative modeling, with score estimation as a central statistical bottleneck. Existing guarantees for score estimation largely focus on light-tailed targets or rely on…

Statistics Theory · Mathematics 2026-01-13 Yifeng Yu, Lu Yu

Estimation of the intercept parameter in integrated Galton-Watson processes

We study estimation of the intercept parameter in an integrated Galton-Watson process, a basic building-block for many count-valued time series models. In this unit root setting, the ordinary least squares estimator is inconsistent, whereas…

Statistics Theory · Mathematics 2026-01-13 Yang Lu

The Feldman-Hájek Dichotomy for Countable Gaussian Mixtures and their Asymptotic Separability in High Dimensions

This paper establishes the theoretical foundations for the asymptotic separability of Gaussian Mixture Models (GMMs) in high dimensions by extending the classical Feldman-Hájek theorem. We first prove that a countable mixture of Gaussian…

Statistics Theory · Mathematics 2026-01-13 Umberto Michelucci

The resource theory of causal influence and knowledge of causal influence

Understanding and quantifying causal relationships between variables is essential for reasoning about the physical world. In this work, we develop a resource-theoretic framework to do so. Here, we focus on the simplest nontrivial setting --…

Statistics Theory · Mathematics 2026-01-13 Marina Maciel Ansanelli, Beata Zjawin, David Schmid, Yìlè Yīng, John H. Selby, Ciarán M. Gilligan-Lee, Ana Belén Sainz, Robert W. Spekkens

Robust Confidence Intervals for a Binomial Proportion: Local Optimality and Adaptivity

This paper revisits the classical problem of interval estimation of a binomial proportion under Huber contamination. Our main result derives the rate of optimal interval length when the contamination proportion is unknown under a local…

Statistics Theory · Mathematics 2026-01-13 Minjun Cho, Yuetian Luo, Chao Gao

Time-complexity of sampling from a multimodal distribution using sequential Monte Carlo

We study a sequential Monte Carlo algorithm to sample from the Gibbs measure with a non-convex energy function at a low temperature. We use the practical and popular geometric annealing schedule, and use a Langevin diffusion at each…

Statistics Theory · Mathematics 2026-01-13 Ruiyu Han, Gautam Iyer, Dejan Slepčev

Riesz representers for the rest of us

The application of semiparametric efficient estimators, particularly those that leverage machine learning, is rapidly expanding within epidemiology and causal inference. This literature is increasingly invoking the Riesz representation…

Statistics Theory · Mathematics 2026-01-13 Nicholas T. Williams, Oliver J. Hines, Kara E. Rudolph

Asymptotically well-calibrated Bayesian $p$-value using the Kolmogorov-Smirnov statistic

The posterior predictive $p$-value (ppp) is widely used in Bayesian model evaluation. However, due to double use of the data, the ppp may not be a valid $p$-value even in large samples: The asymptotic null distribution of the ppp can be…

Statistics Theory · Mathematics 2026-01-13 Yueming Shen, Surya Tokdar

Calibration Bands for Mean Estimates within the Exponential Dispersion Family

A statistical model is said to be calibrated if the resulting mean estimates perfectly match the true means of the underlying responses. Aiming for calibration is often not achievable in practice as one has to deal with finite samples of…

Statistics Theory · Mathematics 2026-01-13 Łukasz Delong, Selim Gatti, Mario V. Wüthrich

Fixed-strength spherical designs

A spherical $t$-design is a finite subset $X$ of the unit sphere such that every polynomial of degree at most $t$ has the same average over $X$ as it does over the entire sphere. Determining the minimum possible size of spherical designs,…

Statistics Theory · Mathematics 2026-01-13 Travis Dillon

Improved performance guarantees for Tukey's median

Is there a natural way to order data in dimension greater than one? The approach based on the notion of data depth, often associated with John Tukey, is among the most popular. Tukey's depth has found applications in robust statistics,…

Statistics Theory · Mathematics 2026-01-13 Stanislav Minsker, Yinan Shen

Directional testing for one-way MANOVA in divergent dimensions

Testing the equality of mean vectors across $g$ different groups plays an important role in many scientific fields. In regular frameworks, likelihood-based statistics under the normality assumption offer a general solution to this task.…

Statistics Theory · Mathematics 2026-01-13 Caizhu Huang, Claudia Di Caterina, Nicola Sartori

On the Effect of Misspecifying the Embedding Dimension in Low-rank Network Models

As network data has become ubiquitous in the sciences, there has been growing interest in network models whose structure is driven by latent node-level variables in a (typically low-dimensional) latent geometric space. These "latent…

Statistics Theory · Mathematics 2026-01-12 Roddy Taing, Keith Levin

Detecting Planted Structure in Circular Data

Hypothesis testing problems for circular data are formulated, where observations take values on the unit circle and may contain a hidden, phase-coherent structure. Under the null, the data are independent uniform on the unit circle; under…

Statistics Theory · Mathematics 2026-01-12 Taha Ameen, Bruce Hajek

What Functions Does XGBoost Learn?

This paper establishes a rigorous theoretical foundation for the function class implicitly learned by XGBoost, bridging the gap between its empirical success and our theoretical understanding. We introduce an infinite-dimensional function…

Statistics Theory · Mathematics 2026-01-12 Dohyeong Ki, Adityanand Guntuboyina