Statistics Theory
We revisit the following problem, proposed by Kolmogorov: given prescribed marginal distributions $F$ and $G$ for random variables $X,Y$ respectively, characterize the set of compatible distribution functions for the sum $Z=X+Y$. Bounds on…
We study a hypothesis testing problem in the context of high-dimensional changepoint detection. Given a matrix $X \in \mathbb{R}^{p \times n}$ with independent Gaussian entries, the goal is to determine whether or not a sparse, non-null fraction of…
We establish a definition of ordinal patterns for multivariate data sets based on the concept of Tukey's halfspace depth. Given the definition of these \emph{depth patterns}, we are interested in the probabilities of observing specific…
At present, in the theory of stochastic process modeling, the problem of assessing the reliability and accuracy of a stochastic process model in the space $C(T)$ has not been studied for the case of an implicit decomposition of the process in the form of a…
Permutation tests are a popular choice for distinguishing distributions and testing independence, due to their exact, finite-sample control of false positives and their minimax optimality when paired with U-statistics. However, standard…
The block maxima method is a standard approach for analyzing the extremal behavior of a potentially multivariate time series. It has recently been found that the classical approach based on disjoint block maxima may be universally improved…
We study two adaptive importance sampling schemes for estimating the probability of a rare event in the high-dimensional regime $d \to \infty$ with $d$ the dimension. The first scheme is the prominent cross-entropy (CE) method, and the…
The quantification of treatment effects plays an important role in a wide range of applications, including policy making and bio-pharmaceutical research. In this article, we study the quantile treatment effect (QTE) while addressing two…
This paper introduces a quasi-likelihood ratio testing procedure for diffusion processes observed under nonsynchronous sampling schemes. High-frequency data, particularly in financial econometrics, are often recorded at irregular time…
We consider a quotient of a complete Riemannian manifold modulo an isometrically and properly acting Lie group and lifts of the quotient to the manifolds in optimal position to a reference point on the manifold. With respect to the pushed…
In this paper, models that approximate stochastic processes from the space $Sub_\varphi(\Omega)$ with given reliability and accuracy in $L_p(T)$ are considered for some specific functions $\varphi(t)$. For processes that are decomposed in…
The Generalized Mallows Model (GMM) is a well-known family of models for ranking data. A GMM is a distribution over $\mathbb{S}_n$, the set of permutations of $n$ objects, characterized by a location parameter $\sigma \in \mathbb{S}_n$, known…
We extend the celebrated Glivenko-Cantelli theorem, sometimes called the fundamental theorem of statistics, from its standard setting of total variation distance to all $f$-divergences. A key obstacle in this endeavor is to define…
Variance based global sensitivity analysis measures the relevance of inputs to a single output using Sobol' indices. This paper extends the definition in a natural way to multiple outputs, directly measuring the relevance of inputs to the…
This paper studies the estimation of large precision matrices and Cholesky factors obtained by observing a Gaussian process at many locations. Under general assumptions on the precision and the observations, we show that the sample…
We consider estimating the shared mean of a sequence of heavy-tailed random variables taking values in a Banach space. In particular, we revisit and extend a simple truncation-based mean estimator first proposed by Catoni and Giulini. While…
The study of survival data often requires taking proper care of the censoring mechanism that prohibits complete observation of the data. Under right censoring, only the first occurring event is observed: either the event of interest, or a…
This paper is concerned with a Bayesian approach to testing hypotheses in statistical inverse problems. Based on the posterior distribution $\Pi \left(\cdot |Y = y\right)$, we want to infer whether a feature $\langle\varphi,…
We introduce a general methodology for post hoc inference in a large-scale multiple testing framework. The approach is called "user-agnostic" in the sense that the statistical guarantee on the number of correct rejections holds for any set…
We consider estimating the proportion of random variables for two types of composite null hypotheses: (i) the means or medians of the random variables belonging to a non-empty, bounded interval; (ii) the means or medians of the random…