统计理论
We prove that finite multivariate Erlang mixture densities with a common rate parameter are dense in the class of probability densities on $\mathbb{R}_{+}^{d}$ that belong to $L^{p}$, for every dimension $d\in\mathbb{N}$ and every $1\le…
We investigate Bayesian nonparametric density estimation via orthogonal polynomial expansions in weighted Sobolev spaces. A core challenge is establishing minimax optimal posterior convergence rates, especially for densities on unbounded…
Estimation of the mean and covariance functions is a fundamental problem in functional data analysis, particularly for discretely observed functional data. In this work, we study a regularization-based framework for estimating the mean and…
The Highly Adaptive Lasso (HAL) delivers unprecedented guarantees in nonparametric minimum loss estimation under minimal smoothness assumptions, such as dimension-free minimax optimal rates. However, the practical use of HAL has been…
We study the Fr\'echet $k-$means of a metric measure space when both the measure and the distance are unknown and have to be estimated. We prove a general result that states that the $k-$means are continuous with respect to the measured…
The Bayesian and Akaike information criteria aim at finding a good balance between under- and over-fitting. They are extensively used every day by practitioners. Yet we contend they suffer from at least two afflictions: their penalty…
Expand-and-sparsify representations are a class of theoretical models that capture sparse representation phenomena observed in the sensory systems of many animals. At a high level, these representations map an input $x \in \mathbb{R}^d$ to…
Prediction is a central task of statistics and machine learning, yet many inferential settings provide only partial information, typically in the form of moment constraints or estimating equations. We develop a finite, fully Bayesian…
Cross-sectional observations from a dynamical system can be modeled via steady-state distributions of Markov processes. The major challenge is then to determine whether the process parameters can be identified and estimated from the…
Transfer learning aims to improve performance on a target task by leveraging information from related source tasks. We propose a nonparametric regression transfer learning framework that explicitly models heterogeneity in the source-target…
Recursive decision trees are widely used to estimate heterogeneous causal treatment effects in experimental and observational studies. These methods are typically implemented using CART-type recursive partitioning and are often viewed as…
We consider a classical First-order Vector AutoRegressive (VAR(1)) model, where we interpret the autoregressive interaction matrix as influence relationships among the components of the VAR(1) process that can be encoded by a weighted…
Multiway data analysis aims to uncover patterns in data structured as multi-indexed arrays, with multiway covariance playing a crucial role in many applications. However, the high dimensionality of multiway covariance presents significant…
This paper consider the LAN property for the mixed O-U process under high-frequency observation when H>3/4. As considered in mixed fractional Brownian motion, we will also use the projection step to get the non-diagonal rate matrix.
We consider computationally-efficient estimation of population parameters when observations are subject to missing data. In particular, we consider estimation under the realizable contamination model of missing data in which an $\epsilon$…
We introduce a kernel-based two-sample test for comparing probability distributions up to group actions. Our construction yields invariant kernels for locally compact $\sigma$-compact groups and extends classical Haar-based approaches…
Equivalence testing compares the hypothesis that an effect $\mu$ is large against the alternative that it is negligible. Here, `large' is classically expressed as being larger than some `equivalence margin' $\Delta$. A longstanding problem…
Two recent works, Avella-Medina and Gonz\'alez-Sanz (2026) and Passeggeri and Paindaveine (2026), studied the robustness of the optimal transport map through its breakdown point, i.e., the smallest fraction of contamination that can make…
In this paper, we introduce flexible observation-driven $\mathbb{Z}$-valued time series models constructed from mixtures of negative and non-negative components. Compared to models based on the standard Skellam distribution or on a…
Reliable inference from complex survey samples can be derailed by outliers and high-leverage observations induced by unequal inclusion probabilities and calibration. We develop a minimum Hellinger distance estimator (MHDE) for parametric…