Statistics Theory
A Cartan-geometric, jet bundle formulation of curvature-aware variance bounds in parametric statistical estimation is developed. Building on our earlier extrinsic Hilbert space approach to the Cram\'er-Rao and Bhattacharyya-type…
Over the last few months, AI models including large language models have improved greatly. There are now several documented examples where they have helped professional mathematical scientists prove new results, sometimes even helping…
It is well known that, in Gaussian two-group separation, the optimally discriminating projection direction can be estimated without any knowledge of the group labels. In this work, we gather several such unsupervised estimators…
Random Forests have become a widely used tool in machine learning since their introduction in 2001, known for their strong performance in classification and regression tasks. One key feature of Random Forests is the Random Forest…
For a set of $p$-variate data points $\boldsymbol y_1,\ldots,\boldsymbol y_n$, several versions of the multivariate median and the related multivariate sign test have been proposed and studied in the literature. In this paper we consider the…
Score estimation has recently emerged as a key modern statistical challenge, due to its pivotal role in generative modelling via diffusion models. Moreover, it is an essential ingredient in a new approach to linear regression via convex…
To understand how hidden information can be extracted from statistical networks, planted models in random graphs have been the focus of intensive study in recent years. In this work, we consider the detection of a planted matching, i.e., an…
Privacy-preserving data analysis has become a central challenge in modern statistics. At the same time, a long-standing goal in statistics is the development of adaptive procedures -- methods that achieve near-optimal performance across…
Codifference is a commonly used measure of dependence for stable vectors and processes for which covariance is infinite. However, we argue that it can also be used for other heavy-tailed distributions and that it provides useful information for…
We study a class of robust mean estimators $\widehat{\mu}$ obtained by adaptively shrinking the weights of sample points far from a base estimator $\widehat{\kappa}$. Given a data-dependent scaling factor $\widehat{\alpha}$ and a weighting…
We observe an unknown function of $d$ variables $f(\boldsymbol{t})$, $\boldsymbol{t} \in[0,1]^d$, in the Gaussian white noise model of intensity $\varepsilon>0$. We assume that the function $f$ is regular and that it is a sum of $k$-variate…
(This is the third version of a working paper.) We develop a family of self-normalized concentration inequalities for marginal mean under martingale-difference structure and $\phi/\tilde{\phi}$-mixing conditions, where the latter includes…
Loss functions determine what it means for an estimator to be optimal, yet the ways in which different losses impose structurally incompatible optimality requirements are not captured by existing decision-theoretic frameworks. This paper…
We investigate the estimation of an optimal transport map between probability measures on an infinite-dimensional space and reveal its minimax optimal rate. Optimal transport theory defines distances within a space of probability measures,…
This paper proposes a novel method for estimating the parameters of a logistic regression model. The asymptotic properties of the resulting estimators are then rigorously investigated.
Second-order characteristics including covariance and spectral density functions are fundamentally important for both statistical applications and theoretical analysis in functional time series. In the high-dimensional setting where the…
We develop a unified geometric framework for nonparametric estimation based on the notion of Twin Kernel Spaces, defined as orbits of a reproducing kernel under a group action. This structure induces a family of transported RKHS geometries…
The designation ``Bernstein-von Mises theorem'' is apparently due to Lucien Le Cam. Roughly, the theorem asserts that the posterior distribution of a parameter, conditioned on a large sample, is approximately normal,…
We provide an epsilon-delta interpretation of Chatterjee's rank correlation by tracing its origin to a notion of local dependence between random variables. Starting from a primitive epsilon-delta construction, we show that rank-based…
The almost seventy-year-old Marshall-Olkin copulas, the wider class of Marshall copulas, and the even wider class of shock model (SM) copulas constitute a substantial part of present-day copula theory owing to their numerous applications. Recently, Christian Genest…