Statistics Theory
The adaptive quasi-likelihood analysis is developed for a degenerate diffusion process. Asymptotic normality and moment convergence are proved for the quasi-maximum likelihood estimators and quasi-Bayesian estimators, in the adaptive…
We investigate the problem of joint statistical estimation of several parameters for a stochastic differential equation driven by an additive fractional Brownian motion. Based on discrete-time observations of the model, we construct an…
In Bayesian inference, making deductions about a parameter of interest requires one to sample from or compute an integral against a posterior distribution. A popular method to make these computations cheaper in high-dimensional settings is…
Causal spaces have recently been introduced as a measure-theoretic framework to encode the notion of causality. While they have some advantages over established frameworks, such as structural causal models, the theory is so far only developed…
This paper explores the problem of generative modeling, aiming to simulate diverse examples from an unknown distribution based on observed examples. While recent studies have focused on quantifying the statistical precision of popular…
The $\boldsymbol{\beta}$-model for random graphs is commonly used for representing pairwise interactions in a network with degree heterogeneity. Going beyond pairwise interactions, Stasi et al. (2014) introduced the hypergraph…
In Bayesian inference, a widespread technique to compute integrals against a high-dimensional posterior is to use a Gaussian proxy to the posterior known as the Laplace approximation. We address the question of accuracy of the approximation…
We show that the theorems in Hansen (2021a) (the version accepted by Econometrica), except for one, are not new as they coincide with classical theorems like the good old Gauss-Markov or Aitken Theorem, respectively; the exceptional theorem…
We consider a group synchronization problem with multiple frequencies which involves observing pairwise relative measurements of group elements on multiple frequency channels, corrupted by Gaussian noise. We study the computational phase…
Measuring the concentration of random variables is a fundamental concept in probability and statistics. Here, we explore a type of concentration measure for continuous random variables with bounded support and use it to provide a notion of…
The paper focuses on the Vasicek model driven by a tempered fractional Brownian motion. We derive the asymptotic distributions of the least-squares estimators (based on continuous-time observations) for the unknown drift parameters. This…
We develop here a novel transfer learning methodology called Profiled Transfer Learning (PTL). The method is based on the approximate-linear assumption between the source and target parameters. Compared with the commonly assumed…
In many applications, such as sports tournaments or recommendation systems, we have at our disposal data consisting of pairwise comparisons between a set of $n$ items (or players). The objective is to use this data to infer the latent…
The construction of most supervised learning datasets revolves around collecting multiple labels for each instance, then aggregating the labels to form a type of "gold-standard". We question the wisdom of this pipeline by developing a…
A conventional wisdom in statistical learning is that large models require strong regularization to prevent overfitting. Here we show that this rule can be violated by linear regression in the underdetermined $n\ll p$ situation under…
Given a matrix $A \in \mathbb{R}^{m\times d}$ with singular values $\sigma_1\geq \cdots \geq \sigma_d$, and a random matrix $G \in \mathbb{R}^{m\times d}$ with iid $N(0,T)$ entries for some $T>0$, we derive new bounds on the Frobenius…
Point estimation is a fundamental statistical task. Given the wide selection of available point estimators, it is unclear, however, what, if any, would be universally agreed theoretical reasons to generally prefer one such estimator over…
We discuss, and give examples of, methods for randomly implementing some minimax robust designs from the literature. These have the advantage, over their deterministic counterparts, of having bounded maximum loss in large and very rich…
This paper introduces a Bayesian nonparametric approach to frequency recovery from lossy-compressed discrete data, leveraging all information contained in a sketch obtained through random hashing. By modeling the data points as random…
A standing challenge in data privacy is the trade-off between the level of privacy and the efficiency of statistical inference. Here we conduct an in-depth study of this trade-off for parameter estimation in the $\beta$-model (Chatterjee,…