Statistics Theory
Integer-valued time series are widely present in many fields, such as finance, economics, disease transmission, and traffic flow. With data dimensions surging, the traditional multivariate generalized integer autoregressive (MGINAR) model…
Finite mixture models have long been used across a variety of fields in engineering and the sciences. Recently there has been a great deal of interest in quantifying the convergence behavior of the \emph{mixing measure}, a fundamental object…
In statistics, generalized linear models (GLMs) are widely used for modeling data and can expressively capture potential nonlinear dependence of the model's outcomes on its covariates. Within the broad family of GLMs, those with binary…
This paper proposes a multimarginal optimal transport (MOT) approach for simultaneously comparing $k\geq 2$ measures supported on finite subsets of $\mathbb{R}^d$, $d \geq 1$. We derive asymptotic distributions of the optimal value of the…
Robust Bayesian analysis has been mainly devoted to detecting and measuring robustness w.r.t. the prior distribution. Many contributions in the literature aim to define suitable classes of priors which allow the computation of variations of…
We take another look at using Stein's method to establish uniform Berry-Esseen bounds for Studentized nonlinear statistics, highlighting variable censoring and an exponential randomized concentration inequality for a sum of censored…
We study optimal sample allocation between treatment and control groups under Bayesian linear models. We derive an analytic expression for the Bayes risk, which depends jointly on sample size and covariate mean balance across groups. Under…
We study minimax testing in a statistical inverse problem when the associated operator is unknown. In particular, we consider observations from an inverse Gaussian regression model where the associated operator is unknown but contained in a…
We motivate a new nonparametric test for the one-sided two-sample problem, which is based on a transform T of the Vincze-statistic (R,D). The exact and asymptotic distribution of T is derived. The fundamental idea can also be applied to the…
Expectiles are statistical parameters which also provide a class of sublinear risk measures in finance. They are solutions of continuous optimization problems. The corresponding first order condition provides two different fixed point…
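As a rough illustration of the expectile abstract above: for a finite sample, the $\tau$-expectile $e$ minimizes $\sum_i |\tau - \mathbf{1}\{x_i \le e\}|\,(x_i - e)^2$, and the first-order condition can be rearranged into a weighted-mean fixed point. The sketch below iterates that rearrangement. It is a standard textbook iteration and is not claimed to be either of the fixed-point formulations the abstract refers to; the function name `expectile` and its parameters are hypothetical.

```python
def expectile(xs, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile via a weighted-mean fixed-point iteration.

    Assumption: this is a generic illustration, not the method of the paper.
    The first-order condition
        tau * sum((x - e)+) = (1 - tau) * sum((e - x)+)
    is rewritten as e = sum(w_i * x_i) / sum(w_i), where w_i = tau for
    observations above e and w_i = 1 - tau otherwise.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if not xs:
        raise ValueError("xs must be non-empty")

    e = sum(xs) / len(xs)  # start from the mean, which is the 0.5-expectile
    for _ in range(max_iter):
        weights = [tau if x > e else 1.0 - tau for x in xs]
        new_e = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(new_e - e) <= tol:
            return new_e
        e = new_e
    return e
```

With `tau = 0.5` all weights are equal, so the iteration returns the sample mean immediately; values of `tau` above 0.5 weight the upper tail more heavily and move the result upward.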
Inferring network structures remains an interesting question because of its importance for understanding and controlling the collective dynamics of complex systems. Existing shrinkage methods such as Lasso-type estimation cannot suitably…
In the context of likelihood ratio testing with parameters on the boundary, we revisit two situations for which there are some discrepancies in the literature: the case of two parameters of interest on the boundary, with all other…
We address the problem of finding worst-case nonparametric bounds for the T-statistic by considering the extremal problem of maximising the mid-quantile (a special case of 'smoothed quantile' as discussed in \cite{St77} and \cite{W11}) $\tilde…
We study rates of convergence for estimation of the Gromov-Wasserstein (GW) distance. For two marginals supported on compact subsets of $\R^{d_x}$ and $\R^{d_y}$, respectively, with $\min \{ d_x,d_y \} > 4$, prior work established the rate…
Understanding the effects of the choice of the tree on the joint distribution of a tree-structured Markov random field (MRF) is crucial for fully exploiting the intelligibility of such probabilistic graphical models. Tools must be developed…
We address the issue of computing the non-linear shrinkage formulas for the weighted sample covariance in high dimension. We use theoretical properties of the asymptotic sample spectrum in order to derive the \textit{WeSpeR} algorithm and…
We obtain minimax-optimal convergence rates in the supremum norm, including information-theoretic lower bounds, for estimating the covariance kernel of a stochastic process which is repeatedly observed at discrete, synchronous design…
The hyperbolic secant distribution has several generalizations with applications in finance. In this study, we explore the dual geometric structure of one such generalization, namely the beta-logistic distribution. Recent findings also…
High-dimensional data has become ubiquitous across the sciences but presents computational and statistical challenges. A common approach to addressing these challenges is through sparsity. In this paper, we introduce a new concept of…
The false discovery rate (FDR) and the false non-discovery rate (FNR), defined as the expected false discovery proportion (FDP) and the false non-discovery proportion (FNP), are the most popular benchmarks for multiple testing. Despite the…
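To make the quantities in the last abstract concrete, the sketch below computes the false discovery proportion (FDP) and false non-discovery proportion (FNP) for a single set of test decisions; the FDR and FNR are the expectations of these proportions over repeated experiments. The function name `fdp_fnp` and its arguments are hypothetical, and the convention that a proportion with a zero denominator equals 0 is an assumption (a common one, but not stated in the abstract).

```python
def fdp_fnp(rejected, is_null):
    """Return (FDP, FNP) for one realization of a multiple-testing procedure.

    rejected[i] is True when hypothesis i was rejected (a "discovery").
    is_null[i]  is True when hypothesis i is a true null.
    Assumption: 0/0 is treated as 0, i.e. denominators are max(count, 1).
    """
    if len(rejected) != len(is_null):
        raise ValueError("rejected and is_null must have the same length")

    n_rejected = sum(1 for r in rejected if r)
    n_accepted = len(rejected) - n_rejected

    # False discoveries: true nulls that were rejected.
    false_disc = sum(1 for r, h0 in zip(rejected, is_null) if r and h0)
    # False non-discoveries: false nulls that were not rejected.
    false_nondisc = sum(1 for r, h0 in zip(rejected, is_null) if not r and not h0)

    fdp = false_disc / max(n_rejected, 1)
    fnp = false_nondisc / max(n_accepted, 1)
    return fdp, fnp
```

Averaging the two returned values over many simulated data sets would give Monte Carlo estimates of the FDR and FNR of whatever procedure produced `rejected`.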