Binomial Mixture Model With U-shape Constraint
Abstract
In this article, we study the binomial mixture model in the regime where the binomial size can be relatively large compared to the sample size. This project is motivated by the GeneFishing method (Liu et al., 2019), whose output is a combination of the parameter of interest and subsampling noise. To handle the noise in the output, we exploit the observation that the density of the output is U-shaped and model the output with a binomial mixture model under a U-shape constraint. We first analyze the estimation of the underlying distribution F in the binomial mixture model under various conditions on F. Equipped with this theoretical understanding, we propose a simple method, Ucut, to identify the cutoffs of the U shape and recover the underlying distribution based on the Grenander estimator (Grenander, 1956). It has been shown that when , the identified cutoffs converge at the rate . The distance between the recovered distribution and the true one decreases at the same rate. To demonstrate its performance, we apply our method to a variety of simulation studies, a GTEx dataset used in Liu et al. (2019), and a single-cell dataset from Tabula Muris.
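As a rough illustration of the setting described in the abstract, the sketch below simulates a binomial mixture and checks that the normalized output has a U-shaped histogram. It is not the Ucut method and does not use the Grenander estimator. The choice F = Beta(0.5, 0.5), the names `n` (sample size) and `m` (binomial size), and the function `simulate_binomial_mixture` are all assumptions made for this example only.

```python
import numpy as np

def simulate_binomial_mixture(n, m, rng):
    """Draw n observations from a binomial mixture with binomial size m.

    Assumption: the underlying distribution F is Beta(0.5, 0.5), picked
    only because its density is U-shaped on [0, 1].
    """
    p = rng.beta(0.5, 0.5, size=n)   # latent parameters drawn from F
    x = rng.binomial(m, p)           # noisy counts given each parameter
    return p, x / m                  # normalized output in [0, 1]

rng = np.random.default_rng(0)
p, out = simulate_binomial_mixture(n=5000, m=200, rng=rng)

# Ten equal-width bins on [0, 1]; the two end bins should dominate.
counts, edges = np.histogram(out, bins=10, range=(0.0, 1.0))
print(counts)
```

The end bins collect far more mass than the middle bins, which is the U shape the abstract refers to. Identifying where the two arms of the U end is the cutoff problem that the paper addresses.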
Cite
@article{arxiv.2107.13756,
title = {Binomial Mixture Model With U-shape Constraint},
author = {Yuting Ye and Peter J. Bickel},
journal = {arXiv preprint arXiv:2107.13756},
year = {2021}
}
Comments
45 pages, 26 figures, 2 tables