Statistics Theory
In this paper we develop a new tool for the comparison of paired data based on a new criterion of stochastic dominance that takes into account the dependence structure of the random variables under comparison. This new procedure provides a…
It is well known that the asymptotic variance of sample quantiles can be reduced under heterogeneity relative to the i.i.d. setting. However, asymptotically correct confidence intervals for quantiles are not yet available. We propose a…
We propose a robust estimator for the tail index of Pareto-type distributions under random right-censoring, constructed within the minimum density power divergence (MDPD) framework and based on the Nelson--Aalen estimator of the cumulative…
In this paper we investigate the generalization error of gradient descent (GD) applied to an $\ell_2$-regularized OLS objective function in the linear model. Based on our analysis we develop new methodology for computationally tractable and…
We explore estimation and forecast accuracy for sparse linear models, focusing on scenarios where both predictors and errors carry serial correlations. We establish a clear link between predictor serial correlation and the performance of…
In recent years, there has been growing interest in jointly analyzing a foreground dataset, representing an experimental group, and a background dataset, representing a control group. The goal of such contrastive investigations is to…
Robins et al. (2008, 2017) applied the theory of higher order influence functions (HOIFs) to derive an estimator of the mean $\psi$ of an outcome Y in a missing data model with Y missing at random conditional on a vector X of continuous…
We study Bayesian model selection in colored Gaussian graphical models (CGGMs), which combine sparsity of conditional independencies with symmetry constraints encoded by vertex- and edge-colored graphs. A computational bottleneck in…
Under the assumption that data lie on a compact (unknown) manifold without boundary, we derive finite sample bounds for kernel smoothing and its (first and second) derivatives, and we establish asymptotic normality through Berry-Esseen type…
We propose a new asymptotic test for the separability of a covariance matrix. The null distribution is valid in a wide matrix elliptical model that includes, in particular, both the matrix Gaussian and matrix $t$-distributions. The test is fast to…
Stick-breaking has a long history and is one of the most popular procedures for constructing random discrete distributions in Statistics and Machine Learning. In particular, due to their intuitive construction and computational tractability…
Tail Value-at-Risk (TVaR) is a widely adopted risk measure that plays a critical role in both academic research and industry practice in insurance. In data applications, TVaR is often estimated using the empirical method, owing to…
We consider the problem of joint estimation of the parameters of $m$ linear dynamical systems, given access to single realizations of their respective trajectories, each of length $T$. The linear systems are assumed to reside on the nodes…
Markov chains are fundamental models for stochastic dynamics, with applications in a wide range of areas such as population dynamics, queueing systems, reinforcement learning, and Monte Carlo methods. Estimating the transition matrix and…
Physics-informed neural networks (PINNs) are a promising approach that combines the power of neural networks with the interpretability of physical modeling. PINNs have shown good practical performance in solving partial differential…
Change point tests for abrupt changes in the mean of functional data, i.e., random elements in infinite-dimensional Hilbert spaces, are either based on dimension reduction techniques, e.g., principal components, or directly based…
In constrained stochastic optimization, one naturally expects that imposing a stricter feasible set does not increase the statistical risk of an estimator defined by projection onto that set. In this paper, we show that this intuition can…
Hierarchical Bayesian models are increasingly used in large, inhomogeneous complex network dynamical systems by modeling parameters as draws from a hyperparameter-governed distribution. However, theoretical guarantees for these estimates as…
We study a linear observation model with an unknown permutation called \textit{permuted/shuffled linear regression}, where responses and covariates are mismatched and the permutation forms a discrete, factorial-size parameter. The…
We study the local geometry of empirical risks in high dimensions via the spectral theory of their Hessian and information matrices. We focus on settings where the data, $(Y_\ell)_{\ell =1}^n \in \mathbb{R}^d$, are i.i.d. draws of a…