Statistics — Scifaro

Canonical Regularisation of Wide Feature-Learning Neural Networks

Wide neural networks in the feature-learning regime drive modern deep learning, and yet they remain far less studied than their kernel-regime counterparts. We consider a critical yet under-explored difference between these two regimes: the…

Machine Learning · Statistics 2026-05-19 George Whittle, Pranav Vaidhyanathan, Juliusz Ziomek, Natalia Ares, Maike A. Osborne

1-truncated C-vine copula mixed models for network meta-analysis of multiple diagnostic tests

As meta-analysis of multiple diagnostic tests impacts clinical decision making and patient health, there is growing interest in statistical models that synthesize evidence from studies comparing multiple diagnostic tests. To compare the…

Methodology · Statistics 2026-05-19 Aristidis K. Nikoloulopoulos

Optimal Sampling for Kernel Quadrature on Unbounded Domains

Kernel quadrature is widely used to approximate integrals of smooth functions, with worst-case error typically decaying at the minimax rate $n^{-\alpha/d}$ for smoothness $\alpha$ in dimension $d$. Existing rate-optimal methods often depend…

Computation · Statistics 2026-05-19 Edoardo Bandoni, Christian Robert, Julien Stoehr

Wasserstein bounds for denoising diffusion probabilistic models via the Föllmer process

This paper studies sampling error bounds for denoising diffusion probabilistic models (DDPMs) in the 2-Wasserstein distance. Our contributions are threefold. (i) Under general Lipschitz-type conditions on the score function and for a broad…

Machine Learning · Statistics 2026-05-19 Yuta Koike

A note on connections between the Föllmer process and the denoising diffusion probabilistic model

The Föllmer process is a Brownian motion conditioned to have a pre-specified distribution at time 1. This process can be interpreted as an "augmented" time-compressed version of the reverse stochastic differential equation (SDE) for the…

Machine Learning · Statistics 2026-05-19 Yuta Koike

A robust nonparametric test for spatial isotropy in lattice data

This paper proposes a robust test for assessing isotropy based on the variogram of spatial data on a two-dimensional regular grid. The test is based on the non-robust subsampling test for isotropy of Guan et al. (2004), which uses the idea…

Methodology · Statistics 2026-05-19 Jana Gierse, Roland Fried

A data-driven Fourier-mixture neural-network method for density estimation

We propose a data-driven Fourier-trained neural-network method for estimating fixed-horizon probability densities from empirical characteristic-function (CF) information. The estimator is a positive Gaussian–Laplace mixture with…

Machine Learning · Statistics 2026-05-19 Duy-Minh Dang, Volter Entoma

Conditional Predictive Inference for General Structured Data with Group Symmetries

We study distribution-free predictive inference for data with group symmetries, aiming to establish near-conditional coverage guarantees beyond exchangeability for structured data. While many predictive inference methods achieve a target…

Methodology · Statistics 2026-05-19 Yichen Shen, Mengxin Yu

Multivariate reconciliation for hierarchical time series

Some time series can be hierarchically organized into levels based on certain characteristics, such as geography or other attributes of interest. These series are referred to as hierarchical time series. Typically, forecasts are generated…

Methodology · Statistics 2026-05-19 Ana Caroline Pinheiro, Rodrigo de Souza Bulhões, Rob J. Hyndman, Paulo Canas Rodrigues

Double/Debiased Machine Learning for Continuous Treatment Effects in Panel Data with Endogeneity

We propose a double/debiased machine learning framework to estimate average derivative effects in nonparametric panel models with two-way fixed effects. It extends instrumental variable methods to panel settings, handles continuous…

Methodology · Statistics 2026-05-19 Peikai Wu, Kuan Sun, Zhiguo Xiao

Wavelet Based Time Series Models with Time-Varying Thresholds

This paper develops a threshold model with a time-varying threshold, represented using a wavelet series expansion. The model adequately captures irregular and abrupt variations, as well as smooth changes in the threshold parameter, allowing…

Methodology · Statistics 2026-05-19 Rhea Davis, N. Balakrishna

Simple Approximation and Derivative Free Inference-Time Scaling for Diffusion Models via Sequential Monte Carlo on Path Measures

Diffusion-based generative models increasingly rely on inference-time guidance, adding a drift term or reweighting mixture of experts, to improve sample quality on task-specific objectives. However, most existing techniques require repeated…

Machine Learning · Statistics 2026-05-19 Chenyang Wang, Weizhong Wang, Yinuo Ren, Jose Blanchet, Yiping Lu

Quantifying Officiating Impact in the NBA: A Referee Impact Metric Analysis Using ESPN Win-Probability Data

Over the past century, basketball analytics has moved from simple box-score rates toward complex context-aware measures that evaluate events by their expected effect on game outcomes. Officiating analysis has not made the same transition:…

Applications · Statistics 2026-05-19 Nirek Duma, Leo Benaharon

Multi-Class Neurological Disorder Prediction with Tensor Network Feature Engineering

Accurate diagnosis of neurological disorders is contingent upon advanced imaging modalities such as Magnetic Resonance Imaging (MRI), which commonly utilize sparse imaging techniques to reconstruct images from limited data, thus reducing…

Applications · Statistics 2026-05-19 Keshav Balakrishna, Aaryan Chityala, Vivan Kanna, Ishan Pathak, Harshit Ravula, Aaron Lee, Alessandro Hammond, Moemal Al-Wishah, Leo Anthony Celi

Stationary birth-death processes generating inflation-deflation distributions: Avoiding the issue of dominance

A mixture of two or more count distributions has become deeply embedded in the analysis of excess counts, often relative to the stationary (equilibrium) distributions of birth-death processes such as the geometric, Poisson, Poisson-Lindley…

Methodology · Statistics 2026-05-19 Wanrudee Skulpakdee, Mongkol Hunkrajok

Comparing Two Categorical Gini Correlations with Applications to Classification Problems

This article proposes an inferential framework for comparing predictor importance in classification problems with categorical response variables. The approach is based on the categorical Gini correlation (CGC) proposed by Dang et al.…

Methodology · Statistics 2026-05-19 Sameera Hewage, Yongli Sang

StatQAT: Statistical Quantizer Optimization for Deep Networks

Quantization is essential for reducing the computational cost and memory usage of deep neural networks, enabling efficient inference on low-precision hardware. Despite the growing adoption of uniform and floating-point quantization schemes,…

Machine Learning · Statistics 2026-05-19 Mehmet Aktukmak, Daniel Huang, Ke Ding

How does feature learning reshape the function space?

Feature learning is widely regarded as the key mechanism distinguishing neural networks from fixed-kernel methods, yet its impact on the induced function space remains poorly understood. In this work, we precisely characterize how the…

Machine Learning · Statistics 2026-05-19 João Lobo, Bruno Loureiro, Long Tran-Than, Fanghui Liu

Online Conformal Prediction for Non-Exchangeable Panel Data

Panel data, in which multiple units are repeatedly observed over time, arise throughout science and engineering. Quantifying predictive uncertainty in such settings is challenging because conformal prediction, while distribution-free and…

Machine Learning · Statistics 2026-05-19 Daohong Tu, Kay Giesecke

Do Stationarity Transformations Actually Improve Time Series Forecasts? A Controlled Experimental Evaluation

Stationarity transformations are standard preprocessing in time series forecasting, yet their actual impact on accuracy across different non-stationarity types and model families has received little controlled evaluation. We construct…

Methodology · Statistics 2026-05-19 Bhanu Suraj Malla, Yuqing Hu