Statistics Theory
This study proposes a unified stochastic framework for approximating and computing the gradient of every smooth function evaluated at non-independent variables, using $\ell_p$-spherical distributions on $\R^d$ with $d, p\geq 1$. The…
We study the problem of estimating the score function using both implicit score matching and denoising score matching. Assuming that the data distribution exhibits a low-dimensional structure, we prove that implicit score matching is able…
Sampling based on score diffusions has led to striking empirical results and has attracted considerable attention from various research communities. It depends on the availability of (approximate) Stein score functions for various levels of…
We introduce a new goodness-of-fit test for count data on $\mathbb{N}$ for the Zeta distribution with unknown parameter. The test is built on a Stein-type characterization that uses, as Stein operator, the infinitesimal generator of a…
This paper establishes the Local Asymptotic Normality (LAN) property for the mixed fractional Brownian motion under high-frequency observations with Hurst index $H \in (0, 3/4)$. The simultaneous estimation of the volatility and the Hurst…
Consider the task of generating samples from a tilted distribution of a random vector whose underlying distribution is unknown, but samples from it are available. This finds applications in fields such as finance and climate science, and in…
We consider statistical inference for a class of continuous semimartingale regression models based on high-frequency observations subject to contamination by finite-activity jumps and spike noise. By employing density-power weighting and…
Multimarginal optimal transport (MOT) has emerged as a useful framework for many applied problems. However, compared to the well-studied classical two-marginal optimal transport theory, analysis of MOT is far more challenging and remains…
Variational inference is a fast and scalable alternative to Markov chain Monte Carlo and has been widely applied to posterior inference tasks in statistics and machine learning. A traditional approach for implementing mean-field variational…
We propose an extreme dimension reduction method extending the Extreme-PLS approach to the case where the covariate lies in a possibly infinite-dimensional Hilbert space. The ideas are partly borrowed from both Partial Least-Squares and…
Debiasing is a fundamental concept in high-dimensional statistics. While degrees-of-freedom adjustment is the state-of-the-art technique in high-dimensional linear regression, it is limited to i.i.d. samples and sub-Gaussian covariates.…
This paper develops a general approach to deep learning in a setting that includes nonparametric regression and classification. We work in a framework where the data fulfill a generalized Bernstein-type inequality, including independent,…
Conformal prediction (CP) is widely presented as distribution-free predictive inference with finite-sample marginal coverage under exchangeability. We argue that CP is best understood as a rank-calibrated descendant of the…
For learned models to be trustworthy, it is essential to verify their robustness to perturbations in the training data. Classical approaches involve uncertainty quantification via confidence intervals and bootstrap methods. In contrast,…
High-dimensional Bayesian procedures often exhibit behavior that is effectively low dimensional, even when the ambient parameter space is large or infinite-dimensional. This phenomenon underlies the success of shrinkage priors,…
The Lindley distribution was first introduced by Lindley in 1958 for Bayesian computations. Over the past years, various generalizations of this distribution have been proposed by different authors. The generalized Lindley distributions…
In this article, we study sequential change-point methods for discretely observed generalized Ornstein-Uhlenbeck processes with periodic drift. Two detection methods are proposed, and their respective performance is studied through…
We study the problem of active nonparametric sequential two-sample testing over multiple heterogeneous data sources. In each time slot, a decision-maker adaptively selects one of $K$ data sources and receives a paired sample generated from…
The paper examines the construction and analysis of a new class of mixed exponential statistical structures that combine the properties of stochastic models and linear positive operators. The relevance of the topic is driven by the growing…
Prompt learning has become a key method for adapting large language models to specific tasks with limited data. However, traditional gradient-based optimization methods for tuning prompts are computationally intensive, posing challenges for…