Statistics Theory
This paper introduces a version of decoupling and randomization to establish concentration inequalities for double-indexed permutation statistics. The results yield, among other applications, a new combinatorial Hanson-Wright inequality and…
We prove a new sample complexity result for divergence regularized optimal transport. Our bound holds for probability measures on $\mathbb{R}^d$ with exponential tail decay and for radial cost functions that satisfy a local Lipschitz…
We consider the problem of estimating assortment probabilities, which is common in operations management applications, including product bundling, advertising, etc. Existing approaches typically model each assortment as a category and apply…
It is well known that there is no direct one-to-one relation between $p$-values and likelihood ratios or Bayes factors, since their relation crucially involves the sample size $n$. We investigate their (asymptotic) relation in a…
We introduce a rigorous and sensitive significance test for hyperuniformity that yields reliable results even from a single sample. Our approach is based on a detailed analysis of the empirical Fourier transform of a stationary point…
This paper presents uniform-in-time finite-sample bounds for regularized linear regression with vector-valued outputs and conditionally zero-mean subgaussian noise. By revisiting classical self-normalized martingale arguments, we obtain…
A basic issue in both the teaching and the practice of statistics is the interplay between modelling assumptions and inference performance. The general message conveyed is that stronger assumptions lead to better statistical performance of the…
We prove that finite multivariate Erlang mixture densities with a common rate parameter are dense in the class of probability densities on $\mathbb{R}_{+}^{d}$ that belong to $L^{p}$, for every dimension $d\in\mathbb{N}$ and every $1\le…
We investigate Bayesian nonparametric density estimation via orthogonal polynomial expansions in weighted Sobolev spaces. A core challenge is establishing minimax optimal posterior convergence rates, especially for densities on unbounded…
Estimation of the mean and covariance functions is a fundamental problem in functional data analysis, particularly for discretely observed functional data. In this work, we study a regularization-based framework for estimating the mean and…
The Highly Adaptive Lasso (HAL) delivers unprecedented guarantees in nonparametric minimum loss estimation under minimal smoothness assumptions, such as dimension-free minimax optimal rates. However, the practical use of HAL has been…
We study the Fréchet $k$-means of a metric measure space when both the measure and the distance are unknown and have to be estimated. We prove a general result stating that the $k$-means are continuous with respect to the measured…
The Bayesian and Akaike information criteria aim at finding a good balance between under- and over-fitting. They are extensively used every day by practitioners. Yet we contend they suffer from at least two afflictions: their penalty…
Expand-and-sparsify representations are a class of theoretical models that capture sparse representation phenomena observed in the sensory systems of many animals. At a high level, these representations map an input $x \in \mathbb{R}^d$ to…
Prediction is a central task of statistics and machine learning, yet many inferential settings provide only partial information, typically in the form of moment constraints or estimating equations. We develop a finite, fully Bayesian…
Cross-sectional observations from a dynamical system can be modeled via steady-state distributions of Markov processes. The major challenge is then to determine whether the process parameters can be identified and estimated from the…
Transfer learning aims to improve performance on a target task by leveraging information from related source tasks. We propose a nonparametric regression transfer learning framework that explicitly models heterogeneity in the source-target…
Recursive decision trees are widely used to estimate heterogeneous causal treatment effects in experimental and observational studies. These methods are typically implemented using CART-type recursive partitioning and are often viewed as…
We consider a classical First-order Vector AutoRegressive (VAR(1)) model, where we interpret the autoregressive interaction matrix as influence relationships among the components of the VAR(1) process that can be encoded by a weighted…
Multiway data analysis aims to uncover patterns in data structured as multi-indexed arrays, with multiway covariance playing a crucial role in many applications. However, the high dimensionality of multiway covariance presents significant…