Statistics Theory
Variable selection is the statistical problem of finding the subset of the $N$ candidate explanatory variables that are truly related to the process generating the response variable. In high-dimensional setups, where…
The purpose of this article is to extend the notion of statistical depth to the case of sample paths of a Markov chain. Initially introduced to define a center-outward ordering of points in the support of a multivariate distribution, depth…
Standardness is a popular assumption in the literature on set estimation. It also appears in statistical approaches to topological data analysis, where it is common to assume that the data were sampled from a probability measure that…
Nonparametric mean function regression with repeated measurements serves as a cornerstone for many statistical branches, such as longitudinal/panel/functional data analysis. In this work, we investigate this problem using fully connected…
A novel method for sequential outlier detection in non-stationary time series is proposed. The method tests the null hypothesis of ``no outlier'' at each time point, addressing the multiple testing problem by bounding the error probability…
Motivated by learning dynamical structures from static snapshot data, this paper presents a distribution-on-scalar regression approach for estimating the density evolution of a stochastic process from its noisy temporal point clouds. We…
Consider the nonparametric logistic regression problem. In logistic regression, one usually considers the maximum likelihood estimator, and the excess risk is the expectation of the Kullback-Leibler (KL) divergence between the true and…
The spiked Fisher matrix is a significant topic for two-sample problems in multivariate statistical inference. This paper is dedicated to testing the number of spikes in a high-dimensional generalized spiked Fisher matrix that relaxes the…
Let $\Gamma$ be an LCA group and $(\mu_n)$ be a sequence of bounded regular Borel measures on $\Gamma$ tending to a measure $\mu_0$. Let $G$ be the dual group of $\Gamma$, $S$ be a non-empty subset of $G \setminus \{ 0 \}$, and $[{\mathcal…
We address the problem of producing a lower bound for the mean of a discrete probability distribution, with known support over a finite set of real numbers, from an iid sample of that distribution. Up to a constant, this is equivalent to…
We propose R\'enyi inaccuracy measures based on the multivariate copula and the multivariate survival copula, dubbed respectively the multivariate cumulative copula R\'enyi inaccuracy measure and the multivariate survival copula R\'enyi inaccuracy…
In this paper, we consider parameter estimation and quasi-likelihood ratio tests for multidimensional jump-diffusion processes defined by stochastic differential equations. In general, simultaneous estimation faces challenges such as an…
We study monotonicity testing of high-dimensional distributions on $\{-1,1\}^n$ in the model of subcube conditioning, suggested and studied by Canonne, Ron, and Servedio~\cite{CRS15} and Bhattacharyya and Chakraborty~\cite{BC18}. Previous…
Essentially all anytime-valid methods hinge on Ville's inequality to gain validity across time without incurring a union bound. Ville's inequality is a proper generalisation of Markov's inequality. It states that a non-negative…
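For orientation, the standard textbook form of Ville's inequality says that a non-negative supermartingale $(M_t)$ with $M_0 = 1$ satisfies $P(\exists t : M_t \ge 1/\alpha) \le \alpha$ for any $\alpha \in (0, 1)$. This may not match the exact formulation in the truncated abstract above. The Python sketch below checks the bound empirically on a toy martingale. The doubling/halving process and the names `running_max_of_martingale` and `crossing_frequency` are illustrative assumptions, not taken from the paper.

```python
import random


def running_max_of_martingale(n_steps, rng):
    """Return the running maximum of a non-negative martingale started at 1.

    Toy process (an assumption for illustration): each step doubles the
    value with probability 1/3 and halves it with probability 2/3, so the
    conditional mean of the multiplier is (1/3) * 2 + (2/3) * 0.5 = 1.
    """
    value = 1.0
    running_max = 1.0
    for _ in range(n_steps):
        value *= 2.0 if rng.random() < 1.0 / 3.0 else 0.5
        running_max = max(running_max, value)
    return running_max


def crossing_frequency(alpha, n_paths=5000, n_steps=100, seed=0):
    """Fraction of simulated paths whose running maximum reaches 1/alpha."""
    rng = random.Random(seed)
    threshold = 1.0 / alpha
    hits = sum(
        running_max_of_martingale(n_steps, rng) >= threshold
        for _ in range(n_paths)
    )
    return hits / n_paths


if __name__ == "__main__":
    alpha = 0.05
    print(f"empirical crossing frequency: {crossing_frequency(alpha):.4f}")
    print(f"Ville bound (alpha):          {alpha:.4f}")
```

The threshold is checked against the running maximum over the whole path, not at a single fixed time, which is the point of the inequality: the bound $\alpha$ holds uniformly in time without a union bound. A Monte Carlo run only illustrates the bound over a finite horizon; it does not prove it.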
The local false discovery rate (lfdr) of Efron et al. (2001) enjoys major conceptual and decision-theoretic advantages over the false discovery rate (FDR) as an error criterion in multiple testing, but is only well-defined in Bayesian…
We consider the problem of estimating the number of communities in a weighted balanced Stochastic Block Model. We construct hypothesis tests based on semidefinite programming and with a statistic coming from a GOE matrix to distinguish…
We consider the problem of optimal linear estimation of the functional $$A_N \vec{\xi} =\sum_{j = 0}^{N} (\vec{a}(j))^{\top} \vec{\xi}(j)$$ that depends on the unknown values $\vec{\xi}(j),j=0,1,\dots,N,$ of a vector-valued harmonizable…
We study the problem of estimating the barycenter of a distribution given i.i.d. data in a geodesic space. Assuming an upper curvature bound in Alexandrov's sense and a support condition ensuring the strong geodesic convexity of the…
In this paper we consider the construction of simultaneous confidence bands for the spectral density of a stationary time series using a Gaussian approximation for classical lag-window spectral density estimators evaluated at the set of all…
We propose a R\'enyi information generating function and discuss its properties. A connection between the R\'enyi information generating function and the diversity index is proposed for discrete random variables. The relation between the…