Kernel Density Balancing

Applications 2025-06-17 v1

Abstract

High-throughput chromatin conformation capture (Hi-C) data provide insights into the 3D structure of chromosomes, and normalization is a crucial pre-processing step. A common normalization technique is matrix balancing, which rescales the rows and columns of a Hi-C matrix to equalize their sums. Despite its popularity and convenience, matrix balancing lacks statistical justification. In this paper, we introduce a statistical model to analyze matrix balancing methods and propose a kernel-based estimator that leverages spatial structure. Under mild assumptions, we show that the kernel-based estimator is consistent, converges faster than existing approaches, and is more robust to data sparsity.
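
For readers unfamiliar with matrix balancing, the following is a minimal sketch of the classical iterative procedure the abstract refers to, not the kernel-based estimator proposed in the paper. It assumes a symmetric contact matrix with strictly positive row sums; the function name `balance_matrix`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def balance_matrix(counts, n_iter=500, tol=1e-10):
    """Symmetric iterative balancing (illustrative sketch).

    Finds a bias vector b such that counts / outer(b, b) has equal row
    sums. Assumes `counts` is symmetric and has no empty rows.
    """
    a = np.asarray(counts, dtype=float)
    bias = np.ones(a.shape[0])
    for _ in range(n_iter):
        balanced = a / np.outer(bias, bias)
        row_sums = balanced.sum(axis=1)
        # Compare each row sum with the average row sum.
        relative = row_sums / row_sums.mean()
        if np.max(np.abs(relative - 1.0)) < tol:
            break
        # Damped (square-root) update keeps the scaling symmetric.
        bias *= np.sqrt(relative)
    return a / np.outer(bias, bias), bias


# Toy example: a small symmetric matrix of positive counts.
rng = np.random.default_rng(0)
raw = rng.poisson(20, size=(6, 6)).astype(float) + 1.0
counts = (raw + raw.T) / 2.0
balanced, bias = balance_matrix(counts)
print(balanced.sum(axis=1))  # row sums should be (nearly) identical
```

The sketch treats each row independently of its neighbors and breaks down when a row is empty, which is one way to see why sparse data are difficult for plain balancing.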

Cite

@article{arxiv.2506.12626,
  title  = {Kernel Density Balancing},
  author = {John Park and Ning Hao and Yue Selena Niu and Ming Hu},
  journal= {arXiv preprint arXiv:2506.12626},
  year   = {2025}
}