Statistics Theory
Linear inverse problems are ubiquitous across science and engineering disciplines. Of particular importance in the past few decades is the incorporation of sparsity-based priors, in particular $\ell_1$ priors, into linear inverse…
In this paper, we introduce a unified framework, inspired by classical regularization theory, for designing and analyzing a broad class of linear regression approaches. Our framework encompasses traditional methods like least squares…
We introduce and study a unified Bayesian framework for extended feature allocations which flexibly captures interactions -- such as repulsion or attraction -- among features and their associated weights. We provide a complete Bayesian…
The computational cost for inference and prediction of statistical models based on Gaussian processes with Mat\'ern covariance functions scales cubically with the number of observations, limiting their applicability to large data sets. The…
Due to the broad applications of elliptical models, there is a long line of research on goodness-of-fit tests for empirically validating them. However, the existing literature on this topic is generally confined to low-dimensional settings,…
Adversarial training has been proposed to protect machine learning models against adversarial attacks. This paper focuses on adversarial training under $\ell_\infty$-perturbation, which has recently attracted much research attention. The…
Many standard estimators, when applied to adaptively collected data, fail to be asymptotically normal, thereby complicating the construction of confidence intervals. We address this challenge in a semi-parametric context: estimating the…
Recently, Chatterjee (2023) recognized the lack of a direct generalization of his rank correlation $\xi$ in Azadkia and Chatterjee (2021) to a multi-dimensional response vector. As a natural solution to this problem, we here propose an…
We study the coverage properties of full conformal regression in the proportional asymptotic regime where the ratio of the dimension and the sample size converges to a constant. In this setting, existing theory tells us only that full…
A one-shot device is a unit that operates only once, after which it is either destroyed or needs to be rebuilt. For this type of device, the operational status can only be assessed at a specific inspection time, determining whether failure…
This paper develops a general asymptotic theory of series estimators for spatial data collected at irregularly spaced locations within a sampling region $R_n \subset \mathbb{R}^d$. We employ a stochastic sampling design that can flexibly…
Fisher's likelihood is widely used for statistical inference for fixed unknowns. This paper aims to extend two important likelihood-based methods, namely the maximum likelihood procedure for point estimation and the confidence procedure for…
This paper investigates conditional specifications for multivariate count variables. Recently, the spatial count data literature has proposed several conditional models such that the conditional expectations are linear in the conditioning…
Random Forests have been extensively used in regression and classification, inspiring the development of various forest-based methods. Among these, Mondrian Forests, derived from the Mondrian process, mark a significant advancement.…
A novel approach is given to overcome the computational challenges of the full-matrix Adaptive Gradient algorithm (Full AdaGrad) in stochastic optimization. By developing a recursive method that estimates the inverse of the square root of…
Model-free time-to-event regression under confounding presents challenges due to biases introduced by causal and censoring sampling mechanisms. This phenomenology poses problems for classical non-parametric estimators like Beran's or the…
It is common in nonparametric estimation problems to impose a certain low-dimensional structure on the unknown parameter to avoid the curse of dimensionality. This paper considers a nonparametric distribution estimation problem with a…
Many scientific problems involve data exhibiting both temporal and cross-sectional dependencies. While linear dependencies have been extensively studied, the theoretical analysis of regression estimators under nonlinear dependencies remains…
In Genome-Wide Association Studies (GWAS), heritability is defined as the fraction of variance of an outcome explained by a large number of genetic predictors in a high-dimensional polygenic linear model. This work studies the asymptotic…
In this work, we introduce the No-Underrun Sampler (NURS), a locally-adaptive, gradient-free Markov chain Monte Carlo method that blends ideas from Hit-and-Run and the No-U-Turn Sampler. NURS dynamically adapts to the local scale of the…