Binomial Mixture Model With U-shape Constraint

Methodology 2021-08-24 v2 Statistics Theory

Abstract

In this article, we study the binomial mixture model in the regime where the binomial size $m$ can be relatively large compared to the sample size $n$. This project is motivated by the GeneFishing method (Liu et al., 2019), whose output is a combination of the parameter of interest and the subsampling noise. To tackle the noise in the output, we use the observation that the density of the output has a U shape and model the output with the binomial mixture model under a U-shape constraint. We first analyze the estimation of the underlying distribution $F$ in the binomial mixture model under various conditions on $F$. Equipped with this theoretical understanding, we propose a simple method, Ucut, to identify the cutoffs of the U shape and recover the underlying distribution based on the Grenander estimator (Grenander, 1956). It is shown that when $m = \Omega(n^{2/3})$, the identified cutoffs converge at the rate $O(n^{-1/3})$. The $L_1$ distance between the recovered distribution and the true one decreases at the same rate. To demonstrate the performance, we apply our method to a variety of simulation studies, a GTEx dataset used in Liu et al. (2019), and a single-cell dataset from Tabula Muris.
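
For readers unfamiliar with the Grenander estimator named in the abstract, the following is a minimal sketch of the classical construction for a decreasing density: the estimate is the left derivative of the least concave majorant of the empirical CDF. This is not the authors' Ucut procedure; the function name `grenander_decreasing`, the helper `_cross`, and the Beta(1, 3) demo data are assumptions made purely for illustration, and the sketch assumes strictly positive samples.

```python
import random


def _cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive means a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def grenander_decreasing(samples):
    """Grenander-type estimate of a decreasing density on (0, max(samples)].

    Returns (knots, slopes): the estimate equals slopes[j] on
    (knots[j], knots[j + 1]]. Hypothetical helper, for illustration only.
    """
    xs = sorted(samples)
    if not xs or xs[0] <= 0:
        raise ValueError("this sketch assumes non-empty, strictly positive samples")
    n = len(xs)

    # Points of the empirical CDF, starting at the origin.
    # Tied observations collapse to the highest ECDF value at that location.
    pts = [(0.0, 0.0)]
    for i, x in enumerate(xs, start=1):
        if x == pts[-1][0]:
            pts[-1] = (x, i / n)
        else:
            pts.append((x, i / n))

    # Upper convex hull of the ECDF points = least concave majorant.
    hull = []
    for p in pts:
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    # The density estimate is the slope of each hull segment.
    knots = [p[0] for p in hull]
    slopes = [
        (hull[j + 1][1] - hull[j][1]) / (hull[j + 1][0] - hull[j][0])
        for j in range(len(hull) - 1)
    ]
    return knots, slopes


# Demo on assumed data: Beta(1, 3) has the decreasing density 3(1 - x)^2.
rng = random.Random(0)
data = [rng.betavariate(1, 3) for _ in range(2000)]
knots, slopes = grenander_decreasing(data)
print(len(slopes), "pieces; estimate near zero:", round(slopes[0], 2))
```

The result is a decreasing step function that integrates to one. The abstract's U-shape setting would apply a construction of this kind to the decreasing and increasing arms separately, with the cutoffs between them chosen by Ucut, which this sketch does not attempt.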

Cite

@article{arxiv.2107.13756,
  title  = {Binomial Mixture Model With U-shape Constraint},
  author = {Yuting Ye and Peter J. Bickel},
  journal= {arXiv preprint arXiv:2107.13756},
  year   = {2021}
}

Comments

45 pages, 26 figures, 2 tables
