Density estimation from batched broken random samples

Statistics Theory 2026-02-11 v1

Abstract

The broken random sample problem was first introduced by DeGroot, Feder, and Goel (1971, Ann. Math. Statist.): in each observation (batch), a random sample of $M$ i.i.d. point pairs $((X_i,Y_i))_{i=1}^M$ is drawn from a joint distribution with density $p(x,y)$, but we can observe only the unordered multisets $(X_i)_{i=1}^M$ and $(Y_i)_{i=1}^M$ separately; that is, the pairing information is lost. For large $M$, inferring $p$ from a single observation has been shown to be essentially impossible. In this paper, we propose a parametric method based on a pseudo-log-likelihood to estimate $p$ from $N$ i.i.d. broken sample batches, and we prove a fast convergence rate in $N$ for our estimator that is uniform in $M$, under mild assumptions.
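
To make the observation model concrete, the following is a minimal simulation sketch of batched broken samples. It is an illustration only and not the estimator proposed in the paper: the choice of a standard bivariate Gaussian for the joint density, the correlation parameter `rho`, and the function name `sample_broken_batches` are all assumptions introduced here. Sorting each batch is used as a canonical representation of an unordered multiset, which discards the pairing.

```python
import numpy as np


def sample_broken_batches(n_batches, batch_size, rho, seed=0):
    """Simulate N broken sample batches of size M (hypothetical helper).

    Assumption: p(x, y) is a standard bivariate Gaussian with
    correlation rho. The paper's model class may differ.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # pairs has shape (N, M, 2): N batches of M i.i.d. (X_i, Y_i) pairs.
    pairs = rng.multivariate_normal(np.zeros(2), cov, size=(n_batches, batch_size))
    xs = pairs[..., 0]
    ys = pairs[..., 1]
    # Sorting within each batch keeps the multiset of values but
    # destroys the information about which X_i belonged to which Y_i.
    return np.sort(xs, axis=1), np.sort(ys, axis=1)


# N = 100 batches, each with M = 5 pairs whose pairing is lost.
xs, ys = sample_broken_batches(n_batches=100, batch_size=5, rho=0.8)
```

Each row of `xs` and the corresponding row of `ys` together form one broken sample batch; an estimator of $p$ would only ever see these two arrays, never the original pairs.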

Cite

@article{arxiv.2602.09833,
  title  = {Density estimation from batched broken random samples},
  author = {Hancheng Bi and Bernhard Schmitzer and Thilo D. Stier},
  journal= {arXiv preprint arXiv:2602.09833},
  year   = {2026}
}

Comments

18 pages, 4 figures
