Statistics
Wide neural networks in the feature-learning regime drive modern deep learning, and yet they remain far less studied than their kernel-regime counterparts. We consider a critical yet under-explored difference between these two regimes: the…
As meta-analysis of multiple diagnostic tests impacts clinical decision making and patient health, there is growing interest in statistical models that synthesize evidence from studies comparing multiple diagnostic tests. To compare the…
Kernel quadrature is widely used to approximate integrals of smooth functions, with worst-case error typically decaying at the minimax rate $n^{-\alpha/d}$ for smoothness $\alpha$ in dimension $d$. Existing rate-optimal methods often depend…
This paper studies sampling error bounds for denoising diffusion probabilistic models (DDPMs) in the 2-Wasserstein distance. Our contributions are threefold. (i) Under general Lipschitz-type conditions on the score function and for a broad…
The F\"ollmer process is a Brownian motion conditioned to have a pre-specified distribution at time 1. This process can be interpreted as an "augmented" time-compressed version of the reverse stochastic differential equation (SDE) for the…
This paper proposes a robust test for assessing isotropy based on the variogram of spatial data on a two-dimensional regular grid. The test is based on the non-robust subsampling test for isotropy of Guan et al. (2004), which uses the idea…
We propose a data-driven Fourier-trained neural-network method for estimating fixed-horizon probability densities from empirical characteristic-function (CF) information. The estimator is a positive Gaussian--Laplace mixture with…
We study distribution-free predictive inference for data with group symmetries, aiming to establish near-conditional coverage guarantees beyond exchangeability for structured data. While many predictive inference methods achieve a target…
Some time series can be hierarchically organized into levels based on certain characteristics, such as geography or other attributes of interest. These series are referred to as hierarchical time series. Typically, forecasts are generated…
We propose a double/debiased machine learning framework to estimate average derivative effects in nonparametric panel models with two-way fixed effects. It extends instrumental variable methods to panel settings, handles continuous…
This paper develops a threshold model with a time-varying threshold, represented using a wavelet series expansion. The model adequately captures irregular and abrupt variations, as well as smooth changes in the threshold parameter, allowing…
iffusion-based generative models increasingly rely on inference-time guidance, adding a drift term or reweighting mixture of experts, to improve sample quality on task-specific objectives. However, most existing techniques require repeated…
Over the past century, basketball analytics has moved from simple box-score rates toward complex context-aware measures that evaluate events by their expected effect on game outcomes. Officiating analysis has not made the same transition:…
Accurate diagnosis of neurological disorders is contingent upon advanced imaging modalities such as Magnetic Resonance Imaging (MRI), which commonly utilize sparse imaging techniques to reconstruct images from limited data, thus reducing…
A mixture of two or more count distributions has become deeply embedded in the analysis of excess counts, often relative to the stationary (equilibrium) distributions of birth-death processes such as the geometric, Poisson, Poisson-Lindley…
This article proposes an inferential framework for comparing predictor importance in classification problems with categorical response variables. The approach is based on the categorical Gini correlation (CGC) proposed by Dang et al.…
Quantization is essential for reducing the computational cost and memory usage of deep neural networks, enabling efficient inference on low-precision hardware. Despite the growing adoption of uniform and floating-point quantization schemes,…
Feature learning is widely regarded as the key mechanism distinguishing neural networks from fixed-kernel methods, yet its impact on the induced function space remains poorly understood. In this work, we precisely characterize how the…
Panel data, in which multiple units are repeatedly observed over time, arise throughout science and engineering. Quantifying predictive uncertainty in such settings is challenging because conformal prediction, while distribution-free and…
Stationarity transformations are standard preprocessing in time series forecasting, yet their actual impact on accuracy across different non-stationarity types and model families has received little controlled evaluation. We construct…