Statistics Theory
Elliptically symmetric distributions are a classic example of a semiparametric model where the location vector and the scatter matrix (or a parameterization of them) are the two finite-dimensional parameters of interest, while the density…
We consider diffusion of independent molecules in an insulated Euclidean domain with unknown diffusivity parameter. At a random time and position, the molecules may bind and stop diffusing, depending on a given 'binding potential'. The…
The stratified linear permutation statistic arises in various statistical problems, including stratified and post-stratified survey sampling, stratified and post-stratified experiments, and conditional permutation tests. Although we can…
We prove bounds on the variance of a function $f$ under the empirical measure of the samples obtained by the Sequential Monte Carlo (SMC) algorithm, with time complexity depending on local rather than global Markov chain mixing dynamics.…
We study distributionally robust quantile regression using type-$p$ Wasserstein ambiguity sets. We derive a closed-form expression for the worst-case quantile regression loss under general $p$-Wasserstein uncertainty. We further give a…
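For orientation, the nominal (non-robust) quantile regression loss that the worst-case analysis starts from can be sketched as below. This is only the standard check (pinball) loss at level `tau`; it is not the closed-form worst-case expression derived in the paper, and the function names are illustrative.

```python
def pinball_loss(y, pred, tau):
    """Check (pinball) loss at quantile level tau for one observation."""
    u = y - pred  # residual
    # Under-predictions are weighted by tau, over-predictions by (1 - tau).
    return tau * u if u >= 0 else (tau - 1.0) * u


def mean_pinball_loss(ys, preds, tau):
    """Average check loss over paired observations and predictions."""
    return sum(pinball_loss(y, p, tau) for y, p in zip(ys, preds)) / len(ys)
```

With `tau = 0.9`, an under-prediction by 2 costs far more than an over-prediction by 2, which is what pushes the fitted value toward the upper quantile.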
Recently, Efron and Tibshirani (Annals of Statistics, 1996) proposed a semiparametric density estimator, which works by multiplying an initial kernel-type estimate by a parametric exponential-type correction factor, chosen so as to match…
Time-to-event forecasts are essential when decisions depend on event timing. This article develops a framework for evaluating such forecasts when the event has not yet occurred or is not predicted within the forecast horizon. We introduce a…
We consider the problem of constructing confidence intervals (CIs) for the population mean of $N$ values $\{x_1, \ldots, x_N\} \subset \Sigma^N$ based on a random sample of size $n$, denoted by $X^n \equiv (X_1, \ldots, X_n)$, drawn…
This paper deals with a copies-based continuously differentiable and strictly decreasing estimator of the drift function for stochastic differential equations defining recurrent diffusion processes. The first part of our paper deals with…
We introduce the notion of a bivariate random discrete copula on an equidistant mesh and explore its stochastic properties. A random discrete copula is a discrete random field, hence, its value at a given point on the mesh is a random…
We develop a nonparametric two-sample test for distributions supported on the cone of symmetric positive definite matrices. The procedure relies on the Wishart kernel density estimator (KDE) introduced by Belzile et al. (2025), whose…
Towards understanding the fundamental limits of estimation from data of varied quality, we study the problem of estimating a mean parameter from heteroskedastic Gaussian observations where the variances are unknown and may vary arbitrarily…
In random matrix theory, the spectral distribution of the covariance matrix has been well studied under the large-dimensional asymptotic regime, in which the dimensionality and the sample size tend to infinity at the same rate. However, most…
In this work we extend the results developed in 2022 for a sequential change detection algorithm making use of Page's CUSUM statistic, the empirical distribution as an estimate of the pre-change distribution, and a universal code as a tool…
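As background, Page's CUSUM recursion can be sketched as follows. This sketch assumes that the pre- and post-change densities are known, so that a log-likelihood ratio is available, and uses a Gaussian mean shift as a hypothetical example. The work summarized above instead replaces these with the empirical distribution and a universal code, which is not reproduced here.

```python
def cusum_stopping_time(xs, llr, threshold):
    """Return the first 1-based index at which the CUSUM statistic
    reaches `threshold`, or None if it never does."""
    w = 0.0
    for t, x in enumerate(xs, start=1):
        # Page's recursion: accumulate log-likelihood ratios, reset at zero.
        w = max(0.0, w + llr(x))
        if w >= threshold:
            return t
    return None


def gaussian_mean_shift_llr(mu0, mu1, sigma=1.0):
    """Log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2).
    Hypothetical helper for illustration only."""
    def llr(x):
        return (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma ** 2
    return llr
```

For example, `cusum_stopping_time([0, 0, 0, 2, 2, 2], gaussian_mean_shift_llr(0.0, 2.0), 3.5)` raises an alarm at the fifth observation, two samples after the mean shifts.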
We study the problem of estimating the spectral density of a centered stationary Gaussian time series under local differential privacy constraints. Specifically, we propose new interactive privacy mechanisms for three tasks: recovering a…
In the context of binary classification of trajectories generated by time-homogeneous stochastic differential equations, we consider a mixture of two diffusion processes characterized by a stochastic differential equation (SDE) whose drift…
We consider the observations of an unknown $s$-sparse vector ${\boldsymbol \theta}$ corrupted by Gaussian noise with zero mean and unknown covariance matrix ${\boldsymbol \Sigma}$. We propose minimax optimal methods of estimating the…
The recent seminal work of Chernozhukov, Chetverikov and Kato has shown that bootstrap approximation for the maximum of a sum of independent random vectors is justified even when the dimension is much larger than the sample size. In this…
Statistical data by their very nature are indeterminate in the sense that if one repeats the process of collecting the data the new data set will be different from the original. But two data sets generated in the same way should ``tell the…
We consider the problem of estimating the missing mass, partition function or evidence and its probability distribution in the case that for each sample point in the discrete sample space its (unnormalized) probability mass is revealed.…