Statistics Theory
We address two important statistical problems: that of estimating mixtures of multivariate normal distributions and mixtures of $t$-distributions based on univariate projections, and that of quantifying a discrepancy between mixture…
We study the statistical properties of the entropic optimal (self-)transport problem for smooth probability measures. We provide an accurate description of the limit distribution for entropic (self-)potentials and plans as the…
This article extends weak convergence bounds of Markov transition kernels to convergence bounds on the variance of the Markov kernel applied to Lipschitz functions. In the reversible case, weak convergence rates of the transition kernels…
Based on discrete observations, we develop a test to infer if the volatility function $\sigma(\cdot)$ within the nonparametric Gaussian white noise model $dY_t = \sigma(t)dW_t$ is constant. The testing procedure is shown to be…
Based on discrete observations $X_0,X_{\Delta},\dots, X_{n\Delta}$ for $\Delta=n^{-\gamma}$ with $\gamma\in [0,1)$ of the null-recurrent dynamic $dX_t = \sigma(X_t)dW_t$ with a Brownian motion $W$ and $\sigma(x)=\alpha\mathbb{1}\{x<\rho\} +…
This paper characterizes the best possible rate of growth of wealth in a Kelly betting game when repeatedly betting against a general i.i.d. null hypothesis $\mathscr{P}$, but the data are drawn i.i.d. from an arbitrary alternative $Q$. We…
We study nonasymptotic minimax estimation of the linear functional $L(\theta)=\eta^\top \theta$ for a high-dimensional $s$-sparse mean vector with an arbitrary loading vector $\eta$. For symmetric noise with exponentially decaying tails, we…
A parametric theory of statistical inference is developed for the moderate deviation probability zone. The new approach to the proofs is based on the Taylor series expansion of the logarithm of the likelihood ratio based on the Hellinger…
Uncertainty is ubiquitous in real-world data, and the assumptions underlying classical linear regression models are often violated in practice. Inspired by the theory of sublinear expectation, we consider a linear regression model where the…
Measures of association in contingency tables, such as odds ratios and their generalizations, are often studied under different sampling schemes that either fix or leave random the margins of the table. While classical results show that…
We study multivariate tail-dependence compatibility for complete and partial signed tail families, treating lower-tail, upper-tail, and mixed configurations in one geometric witness representation indexed by active coordinate sets and sign…
In this paper, we establish sharp upper and lower bounds on the convergence rate of the empirical measures of point processes under the Wasserstein distance. To this end, we first introduce a new metric on the space of counting measures…
Polyak-Ruppert averaging yields an asymptotically normal estimator with sandwich covariance $H^{-1}SH^{-1}$, the foundation of online inference. When the gradient step is preconditioned by a data-driven matrix $P_t$, we ask how fast $P_t$…
We investigate the Conway–Maxwell multivariate Bernoulli distributions, a family of multivariate Bernoulli distributions derived from the Conway–Maxwell-binomial distribution. We show that it is possible to set the parametrization such…
Deterministic-scan and random-scan component-wise Markov chain Monte Carlo algorithms, such as Gibbs samplers and conditional Metropolis-Hastings, are popular approaches for sampling from multivariate distributions. A long-standing open…
In randomized experiments, regression adjustment can improve the precision of average treatment effect (ATE) estimation using covariates without requiring a correctly specified outcome model. Although well studied in low-dimensional…
The analysis step of the ensemble Kalman filter, called the ensemble Kalman update (EnKU), is widely used for approximating posterior distributions in inverse problems and data assimilation. The EnKU approximates the posterior distribution…
Let $S$ be a finite set, and $X_1,\ldots,X_n$ an i.i.d. uniform sample from $S$. To estimate the size $|S|$, without further structure, one can wait for repeats and use the birthday problem. This requires a sample size of the order…
We establish bounds on the conductance for the systematic-scan and random-scan Gibbs samplers when the target distribution satisfies a Poincaré or log-Sobolev inequality and possesses sufficiently regular conditional distributions. These…
We study the effects of missingness on the estimation of population parameters. Moving beyond restrictive missing completely at random (MCAR) assumptions, we first formulate a missing data analogue of Huber's arbitrary…