Statistics Theory
The nonparametric estimation of integrated diffusion processes has been extensively studied, with most existing research focusing on pointwise convergence. This paper is the first to establish uniform convergence rates for the…
The contributions of this paper are twofold. First, we derive Bernstein-type inequalities for both geometric and algebraic irregularly-spaced NED random fields, which contain time series as a special case.…
In this paper, we study the minimizers of U-processes and their domains of attraction. U-processes arise in various statistical contexts, particularly in M-estimation, where estimators are defined as minimizers of certain objective…
The log-logistic distribution is a versatile parametric family widely used across various applied fields, including survival analysis, reliability engineering, and econometrics. When estimating parameters of the log-logistic distribution,…
Turing's estimator allows one to estimate the probabilities of outcomes that either do not appear or only rarely appear in a given random sample. We perform a simulation study to understand the finite sample performance of several related…
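As a rough illustration of the idea behind Turing's estimator, the sketch below computes the classical Turing formula for the total probability of unseen outcomes, namely the fraction of the sample made up of outcomes observed exactly once. This is only the basic textbook form; the related estimators compared in the simulation study are not specified in the truncated abstract, and the function name `turing_missing_mass` is a hypothetical choice for this example.

```python
from collections import Counter


def turing_missing_mass(sample):
    """Estimate the total probability of outcomes absent from `sample`.

    Uses the basic Turing formula N1 / n, where N1 is the number of
    distinct outcomes seen exactly once and n is the sample size.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)
    # N1: number of distinct outcomes that appear exactly once
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / n


# Toy sample: 'c', 'd', 'e' are singletons, so N1 = 3 and n = 8.
sample = list("aabbbcde")
estimate = turing_missing_mass(sample)
print(estimate)  # 0.375
```

A large share of singletons signals that much of the probability mass is likely still unobserved, which is why the estimate grows with N1.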
Previously [Journal of Causal Inference, 10, 90-105 (2022)], we computed the variance of two estimators of causal effects for a v-structure of binary variables. Here we show that a linear combination of these estimators has lower variance…
In supervised learning, including regression and classification, conformal methods provide prediction sets for the outcome/label with finite sample coverage for any machine learning predictor. We consider here the case where such prediction…
This paper addresses the problem of identifying and estimating the causal effect of a treatment in the presence of unmeasured confounding and various types of right-censoring. Examples of these censoring mechanisms are administrative…
The focus of this work is the convergence of non-stationary and deep Gaussian process regression. More precisely, we follow a Bayesian approach to regression or interpolation, where the prior placed on the unknown function $f$ is a…
This article introduces the class of continuous time locally stationary wavelet processes. Continuous time models enable us to properly provide scale-based time series models for irregularly-spaced observations for the first time, while…
We investigate the statistical behavior of gradient descent iterates with dropout in the linear regression model. In particular, non-asymptotic bounds for the convergence of expectations and covariance matrices of the iterates are derived.…
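To make the setting concrete, the following minimal sketch runs gradient descent with dropout on a least-squares objective. It assumes one common formulation in which a fresh Bernoulli mask is drawn at every step and applied both to the coefficients inside the residual and to the gradient update; the exact scheme analysed in the paper is not visible in the truncated abstract, and the names `dropout_gd`, `keep_prob`, and `lr` are hypothetical.

```python
import random


def dropout_gd(X, y, keep_prob=0.8, lr=0.01, steps=2000, seed=0):
    """Gradient descent with dropout for linear regression (illustrative sketch).

    Each step draws a mask D with independent Bernoulli(keep_prob) entries and
    updates beta <- beta + lr * D X^T (y - X D beta).
    This formulation is an assumption made for the example.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    for _ in range(steps):
        # Fresh dropout mask: 1.0 keeps a coordinate, 0.0 drops it
        mask = [1.0 if rng.random() < keep_prob else 0.0 for _ in range(d)]
        # Residual computed with the dropped-out coefficients: y - X D beta
        resid = [
            y[i] - sum(X[i][j] * mask[j] * beta[j] for j in range(d))
            for i in range(n)
        ]
        # Only the coordinates kept by the mask are updated
        for j in range(d):
            if mask[j]:
                beta[j] += lr * sum(X[i][j] * resid[i] for i in range(n))
    return beta


# With keep_prob=1.0 the mask is all ones and this is plain gradient descent.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, -1.0]
beta_hat = dropout_gd(X, y, keep_prob=1.0, lr=0.1, steps=500)
print(beta_hat)
```

Running the same function with `keep_prob` below 1 and several seeds gives a simple way to look at the spread of the iterates empirically.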
Testing the hypothesis of independence between two random elements on a joint alphabet is a fundamental exercise in statistics. Pearson's chi-squared test is an effective test for such a situation when the contingency table is relatively small.…
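For reference, the classical Pearson statistic mentioned above can be computed as in the sketch below, which compares observed cell counts with the counts expected under independence. It covers only the standard small-table test; whatever alternative the truncated abstract goes on to propose is not reproduced here, and the function name `pearson_chi2` is a hypothetical choice.

```python
def pearson_chi2(table):
    """Return (statistic, degrees_of_freedom) for Pearson's chi-squared test
    of independence on a contingency table given as a list of rows.
    (Illustrative helper; not taken from the paper.)
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a positive total")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: (row total * column total) / n
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# 2x2 example: every expected count is 15, so each cell contributes 25/15.
stat, dof = pearson_chi2([[10, 20], [20, 10]])
print(stat, dof)
```

Under the null hypothesis the statistic is compared with a chi-squared distribution with `dof` degrees of freedom, an approximation that is usually considered reliable only when the expected cell counts are not too small.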
The evaluation of G-Wishart normalising constants is a core component of Bayesian analyses of Gaussian graphical models, but remains a computationally intensive task in general. Based on empirical evidence, Roverato [Scandinavian Journal…
This paper proposes new ANOVA-based approximations of functions and emulators of high-dimensional models using either available derivatives or local stochastic evaluations of such models. Our approach makes use of sensitivity indices to…
Empirical Bayes estimators are based on minimizing the average risk with the hyper-parameters in the weighting function being estimated from observed data. The performance of an empirical Bayes estimator is typically evaluated by its mean…
Recently, Approximate Message Passing (AMP) has been integrated with stochastic localization (diffusion model) by providing a computationally efficient estimator of the posterior mean. Existing (rigorous) analysis typically proves the…
To address the challenges of reliable statistical inference in high-dimensional models, we introduce the Synthetic-data Regularized Estimator (SRE). Unlike traditional regularization methods, the SRE regularizes the complex target model via…
This paper is concerned with estimating the column subspace of a low-rank matrix $\boldsymbol{X}^\star \in \mathbb{R}^{n_1\times n_2}$ from contaminated data. How to obtain optimal statistical accuracy while accommodating the widest range…
Many important dynamic systems, time series models or even algorithms exhibit non-strong mixing properties. In this paper, we introduce the general concept of $\mathcal{C}_{p,\mathcal{F}}$-mixing to cover such cases, where assumptions on…
This paper studies how to construct confidence regions for principal component analysis (PCA) in high dimension, a problem that has been vastly under-explored. While computing measures of uncertainty for nonlinear/nonconvex estimators is in…