Statistical Theory
A very popular class of models for networks posits that each node is represented by a point in a continuous latent space, and that the probability of an edge between nodes is a decreasing function of the distance between them in this latent…
When modeling network data using a latent position model, it is typical to assume that the nodes' positions are independently and identically distributed. However, this assumption implies the average node degree grows linearly with the…
Archimedean copulas are a popular class of copulas to which a variant of the Archimedean axiom applies. We provide a topological proof of the Archimedean axiom that is applicable to non-continuous distribution functions.
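For orientation, the following sketches one standard textbook formulation of the Archimedean property for a bivariate associative copula $C$. It is given only as background and is an assumption about context: the variant of the axiom used in the paper above, and its setting of non-continuous distribution functions, may differ. The symbols $u_C^n$, $\varphi$, and $\theta$ are notation introduced here, not taken from the abstract.

$$
u_C^{1} = u, \qquad u_C^{n+1} = C\bigl(u, u_C^{n}\bigr), \qquad
\text{and } C \text{ is Archimedean if } \forall\, u, v \in (0,1)\ \exists\, n \in \mathbb{N} : u_C^{n} < v .
$$

When $C(u,v) = \varphi^{-1}\bigl(\varphi(u) + \varphi(v)\bigr)$ for a continuous, strictly decreasing generator $\varphi \colon [0,1] \to [0,\infty]$ with $\varphi(1) = 0$ and $\varphi(0) = \infty$, the powers have the closed form $u_C^{n} = \varphi^{-1}\bigl(n\,\varphi(u)\bigr)$. As a worked instance, the Clayton generator $\varphi(t) = (t^{-\theta} - 1)/\theta$ with $\theta > 0$ gives

$$
u_C^{n} = \bigl(1 + n\,(u^{-\theta} - 1)\bigr)^{-1/\theta} \longrightarrow 0 \quad (n \to \infty),
$$

so for any $v \in (0,1)$ some power of $u$ eventually falls below $v$.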
A highly cited and inspiring article by Bates et al. (2024) demonstrates that the prediction errors estimated through cross-validation, the bootstrap, or Mallows' $C_p$ can all be independent of the actual prediction errors. This essay…
We study nonparametric estimation in dynamical systems described by ordinary differential equations (ODEs). Specifically, we focus on estimating the unknown function $f \colon \mathbb{R}^d \to \mathbb{R}^d$ that governs the system dynamics…
The beneficial effects of treatments vary across individuals in most studies. Treatment heterogeneity motivates practitioners to search for the optimal policy based on personal characteristics. A long-standing common practice in policy…
Local dependence random graph models are a class of block models for network data which allow for dependence among edges under a local dependence assumption defined around the block structure of the network. Since being introduced by…
By representing documents as mixtures of topics, topic modeling has allowed the successful analysis of datasets across a wide spectrum of applications ranging from ecology to genetics. An important body of recent work has demonstrated the…
Popular regularizers with non-differentiable penalties, such as Lasso, Elastic Net, Generalized Lasso, or SLOPE, reduce the dimension of the parameter space by inducing sparsity or clustering in the estimators' coordinates. In this paper,…
This paper studies M-estimators with gradient-Lipschitz loss function regularized with convex penalty in linear models with Gaussian design matrix and arbitrary noise distribution. A practical example is the robust M-estimator constructed…
We study when low coordinate degree functions (LCDF) -- linear combinations of functions depending on small subsets of entries of a vector -- can test for the presence of categorical structure, including community structure and…
In this paper, we consider parameter estimation for stochastic differential equations driven by Wiener processes and compound Poisson processes. We assume unknown parameters corresponding to coefficients of the drift term, diffusion term,…
The main purpose of this article is to prove that, under certain assumptions in a linear prediction setting, optimal methods based upon model reduction and even an optimal predictor can be provided. The optimality is formulated in terms of…
Empirical Likelihood (EL) is a type of nonparametric likelihood that is useful in many statistical inference problems, including confidence region construction and $k$-sample problems. It enjoys some remarkable theoretical properties,…
Several new geometric quantile-based measures for multivariate dispersion, skewness, kurtosis, and spherical asymmetry are defined. These measures differ from existing measures, which use volumes, and are easy to calculate. Some theoretical…
Kendall's tau and conditional Kendall's tau matrices are multivariate (conditional) dependence measures between the components of a random vector. For large dimensions, available estimators are computationally expensive and can be improved…
When passing from the univariate to the multivariate setting, modelling extremes becomes much more intricate. In this introductory exposition, classical multivariate extreme value theory is presented from the point of view of multivariate…
There are many nonparametric objects of interest that are a function of a conditional distribution. One important example is an average treatment effect conditional on a subset of covariates. Many of these objects have a conditional…
Coupling arguments are a central tool for bounding the deviation between two stochastic processes, but traditionally have been limited to Wasserstein metrics. In this paper, we apply the shifted composition rule--an information-theoretic…
The Gibbs sampler, also known as the coordinate hit-and-run algorithm, is a Markov chain that is widely used to draw samples from probability distributions in arbitrary dimensions. At each iteration of the algorithm, a randomly selected…