Density estimation from batched broken random samples

Hancheng Bi; Bernhard Schmitzer; Thilo D. Stier

Statistics Theory, 2026-02-11, v1

Abstract

The broken random sample problem was first introduced by DeGroot, Feder, and Goel (1971, Ann. Math. Statist.): in each observation (batch), a random sample of $M$ i.i.d. point pairs $((X_i,Y_i))_{i=1}^M$ is drawn from a joint distribution with density $p(x,y)$, but we can observe only the unordered multisets $(X_i)_{i=1}^M$ and $(Y_i)_{i=1}^M$ separately; that is, the pairing information is lost. For large $M$, inferring $p$ from a single observation has been shown to be essentially impossible. In this paper, we propose a parametric method based on a pseudo-log-likelihood to estimate $p$ from $N$ i.i.d. broken sample batches, and we prove a fast convergence rate in $N$ for our estimator that is uniform in $M$, under mild assumptions.
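
The observation model described above can be illustrated with a short simulation. This is only a sketch of the data-generating process, not of the paper's estimator: the choice of a standard bivariate Gaussian with correlation `rho` as the joint density $p$, and the function names `sample_broken_batch` and `sample_broken_batches`, are assumptions made for illustration.

```python
import math
import random


def sample_broken_batch(batch_size, rho, rng):
    """Draw one batch of M = batch_size pairs and break the pairing.

    Assumption (not from the paper): p is a standard bivariate
    Gaussian with correlation rho.
    """
    pairs = []
    for _ in range(batch_size):
        x = rng.gauss(0.0, 1.0)
        # Construct y so that corr(x, y) = rho and var(y) = 1.
        y = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((x, y))

    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    # Shuffle each coordinate list independently: only the unordered
    # multisets are observable, the pairing information is lost.
    rng.shuffle(xs)
    rng.shuffle(ys)
    # The true pairs are returned only so the simulation can be checked;
    # an estimator would see xs and ys alone.
    return xs, ys, pairs


def sample_broken_batches(n_batches, batch_size, rho=0.8, seed=0):
    """Draw N = n_batches i.i.d. broken sample batches."""
    rng = random.Random(seed)
    return [sample_broken_batch(batch_size, rho, rng) for _ in range(n_batches)]
```

Calling `sample_broken_batches(n_batches=1000, batch_size=50)` yields $N = 1000$ batches of size $M = 50$. Within each batch, the observed `xs` and `ys` contain the same values as the hidden pairs, but their order carries no information about which $X_i$ belongs to which $Y_i$.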

Keywords

density estimation; probability theory; statistical inference

Cite

@article{arxiv.2602.09833,
  title  = {Density estimation from batched broken random samples},
  author = {Hancheng Bi and Bernhard Schmitzer and Thilo D. Stier},
  journal= {arXiv preprint arXiv:2602.09833},
  year   = {2026}
}

Comments

18 pages, 4 figures
