Statistics Theory
This paper considers a semiparametric approach within the general Bayesian linear model where the innovations consist of a stationary, mean zero Gaussian time series. While a parametric prior is specified for the linear model coefficients,…
The article concerns hybrid combinations of empirical and parametric likelihood functions. Combining the two allows classical parametric likelihood to be crucially modified via the nonparametric counterpart, making possible model…
We incorporate the conditional value-at-risk (CVaR) quantity into a generalized class of Pickands estimators. By introducing CVaR, the newly developed estimators not only retain the desirable properties of consistency, location, and scale…
We consider a novel multivariate nonparametric two-sample testing problem where, under the alternative, distributions $P$ and $Q$ are separated in an integral probability metric over functions of bounded total variation (TV IPM). We propose…
We provide a simple proof of the Johnson-Lindenstrauss lemma for sub-Gaussian variables. We extend the analysis to identify how sparse projections can be, and what the cost of sparsity is on the target dimension. The Johnson-Lindenstrauss…
The Fisher-Rao distance is the geodesic distance between probability distributions in a statistical manifold equipped with the Fisher metric, which is a natural choice of Riemannian metric on such manifolds. It has recently been applied to…
We address the problem of testing conditional mean and conditional variance for non-stationary data. We build e-values and p-values for four types of non-parametric composite hypotheses with specified mean and variance as well as other…
Logistic regression is a key method for modeling the probability of a binary outcome based on a collection of covariates. However, the classical formulation of logistic regression relies on the independent sampling assumption, which is often…
This review paper, written for the second edition of the Handbook of Markov Chain Monte Carlo, provides an introduction to the study of convergence analysis for Markov chain Monte Carlo (MCMC), aimed at researchers new to the field. We…
We consider the problem of estimating the asymptotic variance of a function defined on a Markov chain, an important step for statistical inference of the stationary mean. We design a novel recursive estimator that requires $O(1)$…
Estimation of the mean and covariance parameters for functional data is a critical task, with local linear smoothing being a popular choice. In recent years, many scientific domains are producing multivariate functional data for which $p$,…
This paper addresses the problem of approximating an unknown probability distribution with density $f$ -- which can only be evaluated up to an unknown scaling factor -- with the help of a sequential algorithm that produces at each iteration…
When modeling a vector of risk variables, extreme scenarios are often of special interest. The peaks-over-thresholds method hinges on the notion that, asymptotically, the excesses over a vector of high thresholds follow a multivariate…
We propose two families of asymptotically local minimax lower bounds on parameter estimation performance. The first family of bounds applies to any convex, symmetric loss function that depends solely on the difference between the estimate…
The remedian uses a $k\times b$ matrix to approximate the median of $n\leq b^{k}$ streaming input values by recursively replacing buffers of $b$ values with their medians, thereby ignoring its $200(\lceil b/2\rceil / b)^{k}\%$ most extreme…
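The buffer-replacement step described in this abstract can be illustrated with a minimal sketch. The class name `Remedian`, its methods, and the choice to refuse more than $b^{k}$ inputs are illustrative assumptions, not the paper's implementation; the sketch also assumes exactly $b^{k}$ values are supplied before the estimate is read.

```python
from statistics import median


class Remedian:
    """Hypothetical sketch: k buffers of size b, each holding medians of the level below."""

    def __init__(self, b, k):
        self.b = b
        self.k = k
        self.count = 0
        self.buffers = [[] for _ in range(k)]

    def add(self, x):
        # Assumption: the sketch handles at most b**k streaming values.
        if self.count >= self.b ** self.k:
            raise ValueError("capacity b**k exceeded")
        self.count += 1
        self.buffers[0].append(x)
        level = 0
        # When a buffer fills, replace it by its median, stored one level up.
        while level < self.k - 1 and len(self.buffers[level]) == self.b:
            m = median(self.buffers[level])
            self.buffers[level] = []
            level += 1
            self.buffers[level].append(m)

    def estimate(self):
        # Assumption: exactly b**k values were added, so the top buffer is full.
        if self.count != self.b ** self.k:
            raise ValueError("estimate requires exactly b**k values in this sketch")
        return median(self.buffers[-1])


# Usage: b = 3, k = 2, so nine streaming values are summarised with 2 * 3 cells.
r = Remedian(b=3, k=2)
for value in [9, 1, 5, 2, 8, 3, 7, 4, 6]:
    r.add(value)
result = r.estimate()
```

Only $k \cdot b$ values are stored at any time, which is the point of the construction; the price is that the result is an approximation of the median rather than the exact value in general.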
In this thesis, we consider an $N$-dimensional Ornstein-Uhlenbeck (OU) process satisfying the linear stochastic differential equation $d\mathbf x(t) = - \mathbf B\mathbf x(t) dt + \boldsymbol \Sigma d \mathbf w(t).$ Here, $\mathbf B$ is a…
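To make the stochastic differential equation above concrete, here is a minimal Euler-Maruyama simulation sketch. It is not taken from the thesis: the function name `simulate_ou`, the step size, and the assumption that $\mathbf w(t)$ is an $N$-dimensional standard Brownian motion with $\mathbf B$ and $\boldsymbol\Sigma$ given as $N\times N$ nested lists are all illustrative choices.

```python
import math
import random


def simulate_ou(B, Sigma, x0, dt, n_steps, rng=None):
    """Euler-Maruyama sketch for dx = -B x dt + Sigma dw (hypothetical helper)."""
    rng = rng or random.Random(0)
    n = len(x0)
    x = list(x0)
    for _ in range(n_steps):
        # Brownian increments: independent N(0, dt) draws (assumed N-dimensional w).
        dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
        drift = [-sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        noise = [sum(Sigma[i][j] * dw[j] for j in range(n)) for i in range(n)]
        x = [x[i] + drift[i] * dt + noise[i] for i in range(n)]
    return x


# Usage: with Sigma = 0 the scheme reduces to the deterministic decay x <- (I - B dt) x.
B = [[1.0, 0.0], [0.0, 2.0]]
zero = [[0.0, 0.0], [0.0, 0.0]]
x_final = simulate_ou(B, zero, [1.0, 1.0], dt=0.01, n_steps=100)
```

The noise-free case is a convenient sanity check because each coordinate then follows a known geometric decay; with a nonzero $\boldsymbol\Sigma$ the output depends on the random seed.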
For heteroskedastic general linear hypothesis testing (GLHT) in high-dimensional settings, we propose a random integration method based on the reference L2-norm. The asymptotic…
Correspondence analysis, multiple correspondence analysis and their discriminant counterparts (i.e., discriminant simple correspondence analysis and discriminant multiple correspondence analysis) are methods of choice for analyzing…
We study the problem of approximately transforming a sample from a source statistical model to a sample from a target statistical model without knowing the parameters of the source model, and construct several computationally efficient such…
Most linear experimental design problems assume homogeneous variance although heteroskedastic noise is present in many realistic settings. Let a learner have access to a finite set of measurement vectors $\mathcal{X}\subset \mathbb{R}^d$…