Statistics Theory
This paper tackles a fundamental inference problem: given $n$ observations from a distribution $P$ over $\mathbb{R}^d$ with unknown mean $\boldsymbol{\mu}$, we must form a confidence set for the index (or indices) corresponding to the…
Inference on the parametric part of a semiparametric model is no trivial task. If one approximates the infinite dimensional part of the semiparametric model by a parametric function, one obtains a parametric model that is in some sense…
In this short note, we consider posterior simulation for a linear regression model when the error distribution is given by a scale mixture of multivariate normals. We first show that the sampler of Backlund and Hobert (2020) for the case of…
Two regeneration-based bootstrap methods, namely, the \textit{Regeneration-based bootstrap} \cite{AthreyaFuh1992, Somnat-1993} and the \textit{Regenerative Block bootstrap} \cite{Bertail2006} are shown to be valid for the problem of…
It is the purpose of this paper to investigate the issue of estimating the regularity index $\beta>0$ of a discrete heavy-tailed r.v. $S$, \textit{i.e.} a r.v. $S$ valued in $\mathbb{N}^*$ such that $\mathbb{P}(S>n)=L(n)\cdot n^{-\beta}$…
Recent works have shown an interest in investigating the frequentist asymptotic properties of Bayesian procedures for high-dimensional linear models under sparsity constraints. However, there exists a gap in the literature regarding…
The field of machine learning has seen rising applications of the equivariance criterion. However, there is no systematic way to justify its usage, including why it works, whether there is an optimal solution and, if so, what form it takes. In this…
In this paper we establish functional Erd\H{o}s-R\'enyi laws for L\'evy processes, i.e. limit theorems for sets of functions on $[0,1]$ associated to their increments. First, we determine precise conditions under which, in a general framework,…
In this paper, we consider directly estimating the eigenvalues of the precision matrix, without inverting the corresponding estimator for the eigenvalues of the covariance matrix. We focus on a general asymptotic regime, i.e., the large dimensional…
In this paper, we establish the convergence rate in the central limit theorem (CLT) for linearly extended negative quadrant dependent (LENQD) random variables (rv's). Under some weak conditions, the rate of normal approximation is shown as…
The distribution regression problem encompasses many important statistics and machine learning tasks, and arises in a large range of applications. Among various existing approaches to tackle this problem, kernel methods have become a method…
We study the calculation of exact p-values for a large class of non-sharp null hypotheses about treatment effects in a setting with data from experiments involving members of a single connected network. The class includes null hypotheses…
Following the Student $t$-statistic, normalization has been a widely used method in statistics and other disciplines including economics, ecology and machine learning. We focus on statistics taking the form of a ratio over (some power of) the…
We study parameter estimation for interacting particle systems (IPSs) consisting of $N$ weakly interacting multivariate hypoelliptic SDEs. We propose a locally Gaussian approximation of the transition dynamics, carefully designed to address…
We consider a high-dimensional sparse normal means model where the goal is to estimate the mean vector assuming the proportion of non-zero means is unknown. We model the mean vector by a one-group global-local shrinkage prior belonging to a…
Statistical tools that satisfy rigorous privacy guarantees are necessary for modern data analysis. It is well known that robustness against contamination is linked to differential privacy. Despite this fact, using multivariate medians for…
We introduce a general framework for testing statistical hypotheses for probability measures supported on finite spaces, which is based on optimal transport (OT). These tests are inspired by the analysis of variance (ANOVA) and its…
We present a novel framework for variable selection in Fr\'echet regression with responses in general metric spaces, a setting increasingly relevant for analyzing non-Euclidean data such as probability distributions and covariance matrices.…
Estimating the mean of a random vector from i.i.d. data has received considerable attention, and the optimal accuracy one may achieve with a given confidence is fairly well understood by now. When the data take values in more general metric…
Statistical analyses of multipopulation studies often use the data to select a particular population as the target of inference. For example, a confidence interval may be constructed for a population only in the event that its sample mean…