Statistics Theory
We propose a new importance sampling framework for the estimation and analysis of Sobol' indices. We focus on the estimation of the conditional second-moment quantity underlying these indices, which is the most challenging term to estimate.…
We propose a weak-instrument-robust subvector Lagrange multiplier test for instrumental variables regression. We show that it is asymptotically size-correct under a technical condition or as the number of instruments grows to infinity. This…
An established and growing literature on generalized fiducial inference and related fiducial ideas points to the adoption of fiducial inference as a mainstream perspective among modern statisticians. Like Bayesian posteriors, generalized…
A statistic can be a function of multiple samples. There is little existing work on asymptotic theory for such statistics when group membership is random. We propose a flexible framework that can handle both deterministic and random…
Hypothesis testing in singular statistical models is often regarded as inherently problematic due to non-identifiability and degeneracy of the Fisher information. We show that the fundamental obstruction to testing in such models is not…
We consider the problem of recovering a permutation group $G \leq S_n$ from an error-prone sampling process $X$. We model $X$ as an $S_n$-valued random variable, defined as a mixture of the uniform distributions on $G$ and $S_n$. Our suite…
Copulas are the primary tool for dependence modeling in statistics, and quasi-copulas are their essential companions. The latter appear, say, as infima or suprema of sets of copulas; they form a huge class and have some unpleasant…
In the Admixture Model, the probability that an individual carries a certain allele at a specific marker depends on the allele frequencies in $K$ ancestral populations and the proportion of the individual's genome originating from these…
Asymptotic uniform confidence bands are constructed for a multivariate nonparametric regression model with heteroscedastic noise, employing histogram estimators under flexible partition conditions. The construction is especially applicable…
The martingale posterior framework is a generalization of Bayesian inference where one elicits a sequence of one-step ahead predictive densities instead of the likelihood and prior. Posterior sampling then involves the imputation of unseen…
The average treatment effect (ATE) is commonly used to quantify the main effect of a binary treatment on an outcome. Extensions to continuous treatments are usually based on the dose-response curve or shift interventions, but both require…
Risk-averse decision-making under uncertainty in partially observable domains is a central challenge in artificial intelligence and is essential for developing reliable autonomous agents. The formal framework for such problems is the…
We study the fundamental problem of clustering $n$ points into $K$ groups drawn from a mixture of isotropic Gaussians in $\mathbb{R}^d$. Specifically, we investigate the requisite minimal distance $\Delta$ between mean vectors to partially…
Consider a sequence of estimators $\hat \theta_n$ which converges almost surely to $\theta_0$ as the sample size $n$ tends to infinity. Under weak smoothness conditions, we identify the asymptotic limit of the last time $\hat \theta_n$ is…
In this work, we analyze alternative effective sample size (ESS) metrics for importance sampling algorithms, and discuss a possible extended range of applications. We show the relationship between the ESS expressions used in the literature…
Let $(X_n)_{n\in \mathbb{Z}}$ be a GARCH process with $\mathbb{E}(X_0^4)<\infty$, and let $\mu_n$ denote the distribution of $\frac{1}{\sqrt{n}}\sum_{i=1}^n [X_i^2-\mathbb{E}(X_0^2)]$. We derive a numerical approximation of $\mu_n$ when $x_1,\dots,x_n$…
This paper considers a non-standard problem of generating samples from a low-temperature Gibbs distribution with constrained support, when some of the coordinates of the mode lie on the boundary. These coordinates are referred to as…
We develop polynomial-time algorithms for near-optimal minimax mean estimation under $\ell_2$-squared loss in a Gaussian sequence model under convex constraints. The parameter space is an origin-symmetric, type-2 convex body $K \subset…
We study the problem of estimating a distribution over a finite alphabet from an i.i.d. sample, with accuracy measured in relative entropy (Kullback-Leibler divergence). While optimal bounds on the expected risk are known, high-probability…
The recent article "Satellite conjunction analysis and the false confidence theorem" (Balch, Martin, and Ferson, 2019, Proceedings of the Royal Society, Series A) points to certain difficulties with Bayesian analysis when used for models…