Optimal Network Pairwise Comparison

Methodology 2024-08-14 v1

Abstract

We study two-sample network hypothesis testing: given two networks on the same set of nodes, we wish to test whether their underlying Bernoulli probability matrices are the same. We propose the Interlacing Balance Measure (IBM) as a new two-sample testing approach. We consider the Degree-Corrected Mixed-Membership (DCMM) model for undirected networks, allowing severe degree heterogeneity, mixed memberships, flexible sparsity levels, and weak signals. In such a broad setting, finding a test that has a tractable limiting null distribution and optimal testing performance is challenging. We show that IBM is such a test: in a broad DCMM setting with only mild regularity conditions, IBM has $N(0,1)$ as its limiting null distribution and achieves the optimal phase transition. Although these results are for undirected networks, IBM is a unified approach and is directly implementable for directed networks. For a broad directed-DCMM setting (an extension of DCMM to directed networks), we show that IBM has $N(0, 1/2)$ as its limiting null distribution and continues to achieve the optimal phase transition. We also apply IBM to the Enron email network and a gene co-expression network, with interesting results.
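
To make the testing problem concrete, the following minimal sketch simulates pairs of undirected networks on a shared node set. It assumes the standard DCMM parameterization, in which the edge probability between nodes i and j is theta_i * theta_j * (pi_i' P pi_j), with degree parameters theta, mixed-membership vectors pi, and a community affinity matrix P. The function names, the network size, and all parameter values are illustrative assumptions, not taken from the paper. The sketch only generates data under the null (same probability matrix) and under one alternative (different P); it does not implement the IBM statistic, which the abstract does not define.

```python
import random


def dcmm_probabilities(theta, pi, P):
    """Edge probabilities under an assumed DCMM form:
    omega[i][j] = theta[i] * theta[j] * (pi[i]' P pi[j]) for i != j.
    The diagonal is left at zero (no self-loops)."""
    n, K = len(theta), len(P)
    omega = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            affinity = sum(
                pi[i][k] * P[k][l] * pi[j][l]
                for k in range(K)
                for l in range(K)
            )
            omega[i][j] = omega[j][i] = theta[i] * theta[j] * affinity
    return omega


def sample_network(omega, rng):
    """Draw a symmetric 0/1 adjacency matrix with independent
    Bernoulli(omega[i][j]) edges for i < j."""
    n = len(omega)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            A[i][j] = A[j][i] = int(rng.random() < omega[i][j])
    return A


# Illustrative parameters (hypothetical, chosen so all probabilities stay below 1).
rng = random.Random(0)
n = 60
theta = [rng.uniform(0.2, 0.9) for _ in range(n)]   # heterogeneous degrees
w = [rng.random() for _ in range(n)]
pi = [[wi, 1.0 - wi] for wi in w]                   # mixed memberships, K = 2
P_null = [[1.0, 0.3], [0.3, 1.0]]
P_alt = [[1.0, 0.6], [0.6, 1.0]]                    # changed community structure

omega = dcmm_probabilities(theta, pi, P_null)
omega_alt = dcmm_probabilities(theta, pi, P_alt)

A1 = sample_network(omega, rng)            # first network
A2_null = sample_network(omega, rng)       # second network, null holds
A2_alt = sample_network(omega_alt, rng)    # second network, alternative holds
```

A two-sample test would take a pair such as `(A1, A2_null)` or `(A1, A2_alt)` as input and decide whether the two underlying probability matrices differ.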

Cite

@article{arxiv.2408.06987,
  title  = {Optimal Network Pairwise Comparison},
  author = {Jiashun Jin and Zheng Tracy Ke and Shengming Luo and Yucong Ma},
  journal= {arXiv preprint arXiv:2408.06987},
  year   = {2024}
}

Comments

92 pages
