Statistics Theory
We study the optimal sample complexity of variable selection in linear regression under general design covariance, and show that subset selection is optimal while, under standard complexity assumptions, efficient algorithms for this problem…
Driven by interest in how uniformity of marginal distributions propagates to properties of regression functions, in this contribution we tackle the following questions: Given a $(d-1)$-dimensional random vector $\textbf{X}$ and a…
In this work, it is shown that there is no potential function on the Weibull statistical manifold. However, from the two-parameter Weibull model we can extract a model, called the logit model, that admits a potential function. On this logit model,…
In this paper, we introduce a general theoretical framework for nonparametric hazard rate estimation using associated kernels, whose shapes depend on the point of estimation. Within this framework, we establish rigorous asymptotic results,…
We propose a contrast-based estimation method for Gaussian processes with time-inhomogeneous drifts, observed under high-frequency sampling. The process is modeled as the sum of a deterministic drift function and a stationary Gaussian…
We identify a large class of positive-semidefinite kernels for which a certain polynomial rate of convergence of maximum mean discrepancies of Farey sequences is equivalent to the Riemann hypothesis. This class includes all Mat\'ern kernels…
This paper studies optimal hypothesis testing for nonregular econometric models with parameter-dependent support. We consider both one-sided and two-sided hypothesis testing and develop asymptotically uniformly most powerful tests based on…
We introduce a new method for neighbourhood selection in linear structural equation models that improves over classical methods such as best subset selection (BSS) and the Lasso. Our method, called KL-BSS, takes advantage of the existence…
We show that contiguity relations of hypergeometric functions of several variables give a direct sampling algorithm from the conditional distribution of toric models in statistics. The algorithm is based on a Markov chain on a lattice…
Prediction is a key issue in time series analysis. Just like classical mean regression models, classical autoregressive methods, which yield $L^2$ point predictions, provide rather poor predictive summaries; a much more informative approach is…
Chaos expansions are widely used in global sensitivity analysis (GSA), as they leverage orthogonal bases of $L^2$ spaces to efficiently compute Sobol' indices, particularly in data-scarce settings. When derivatives are available, we argue that…
We study the minimization of the non-convex and non-differentiable objective function $v \mapsto \mathrm{E} ( \| X - v \| \| X + v \| - \| X \|^2 )$ in $\mathbb{R}^p$. In particular, we show that its minimizers recover the first principal…
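As a minimal sketch of the objective named in the abstract above, the population expectation can be replaced by a sample average over the rows of a data matrix. The function name `empirical_objective`, the synthetic Gaussian data, and the evaluation points are illustrative assumptions and are not taken from the paper; the sketch only evaluates the objective and makes no claim about its minimizers.

```python
import numpy as np


def empirical_objective(v, X):
    """Sample average of ||x - v|| * ||x + v|| - ||x||^2 over the rows x of X.

    Hypothetical helper: an empirical stand-in for the expectation in the abstract.
    """
    v = np.asarray(v, dtype=float)
    X = np.asarray(X, dtype=float)
    dist_minus = np.linalg.norm(X - v, axis=1)   # ||x - v|| for each row
    dist_plus = np.linalg.norm(X + v, axis=1)    # ||x + v|| for each row
    sq_norm = np.sum(X ** 2, axis=1)             # ||x||^2 for each row
    return float(np.mean(dist_minus * dist_plus - sq_norm))


# Synthetic data (an assumption for illustration): independent Gaussian
# coordinates with different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.5])

value_at_zero = empirical_objective(np.zeros(3), X)
value_at_e1 = empirical_objective(np.array([1.0, 0.0, 0.0]), X)
value_at_minus_e1 = empirical_objective(np.array([-1.0, 0.0, 0.0]), X)
```

Two properties follow directly from the formula: the objective vanishes at $v = 0$, and it is symmetric under $v \mapsto -v$, so any minimizer is determined only up to sign.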
Optimal transport has emerged as a fundamental methodology with applications spanning multiple research areas in recent years. However, the convergence rate of the empirical estimator to its population counterpart suffers from the curse of…
We establish a fundamental connection between optimal structure learning and optimal conditional independence testing by showing that the minimax optimal rate for structure learning problems is determined by the minimax rate for conditional…
Simulated tempering is a widely used strategy for sampling from multimodal distributions. In this paper, we consider simulated tempering combined with an arbitrary local Markov chain Monte Carlo sampler and present a new decomposition…
The present paper solves the problem of local linear approximation of the Fr\'echet conditional mean in an extrinsic and intrinsic way from time-correlated bivariate curve data evaluated in a manifold (see Torres et al., 2025, on global…
A statistical hypothesis test for long range dependence (LRD) is formulated in the spectral domain for functional time series in manifolds. The elements of the spectral density operator family are assumed to be invariant with respect to the…
Criteria for identifying optimal adjustment sets yielding consistent estimation with minimal asymptotic variance of average treatment effects in parametric and nonparametric models have recently been established. In a single treatment time…
The rejection threshold used for e-values and e-processes is by default set to $1/\alpha$ for a guaranteed type-I error control at $\alpha$, based on Markov's and Ville's inequalities. This threshold can be wasteful in practical…
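For reference, the standard argument behind the default $1/\alpha$ threshold mentioned in the abstract above can be written out as follows. The notation is an assumption made here for illustration: $E$ denotes an e-value (a nonnegative statistic with expectation at most one under the null), $(E_t)_{t \ge 0}$ an e-process bounded by a nonnegative supermartingale with initial value at most one, and $\mathbb{P}_0$, $\mathbb{E}_0$ probability and expectation under the null.

```latex
% Markov's inequality for a single e-value E with \mathbb{E}_0[E] \le 1:
\mathbb{P}_0\!\left( E \ge \tfrac{1}{\alpha} \right)
  \;\le\; \alpha \, \mathbb{E}_0[E]
  \;\le\; \alpha .

% Ville's inequality for an e-process (E_t)_{t \ge 0}:
\mathbb{P}_0\!\left( \exists\, t \ge 0 : E_t \ge \tfrac{1}{\alpha} \right)
  \;\le\; \alpha .
```

Both bounds hold for any e-value or e-process, whatever its distribution, so they can be loose in a specific application; this looseness is the sense in which the abstract calls the threshold potentially wasteful.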
We study nonparametric contextual bandits under batch constraints, where the expected reward for each action is modeled as a smooth function of covariates, and the policy updates are made at the end of each batch of observations. We…