Finding coherent node groups in directed graphs
Abstract
Grouping the nodes of a graph into clusters is a standard technique for studying networks. We study a problem where we are given a directed network and are asked to partition its nodes into a sequence of coherent groups. We assume that the nodes have features, and we measure group coherence by comparing these features. Furthermore, we incorporate the cross edges by penalizing forward and backward cross edges with different weights. If both weights are set to 0, the problem is equivalent to clustering. However, if we penalize the backward edges, the order of the discovered groups matters, and we can view our problem as a generalization of a classic segmentation problem. We consider a common iterative approach that alternates between finding the groups given the centroids and finding the centroids given the groups. We show that, unlike in clustering, the first subproblem is NP-hard. However, we show that it can be solved exactly if the underlying graph is a tree or if the number of groups is 2. For the general case, we propose an approximation algorithm based on linear programming. We also propose 3 additional heuristics: (1) optimizing each pair of groups separately while keeping the remaining groups intact, (2) computing a spanning tree and then optimizing using only the edges in that tree, and (3) a greedy search that moves nodes between groups while optimizing the overall loss. Our experiments demonstrate that the algorithms are practical and yield interpretable results.
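To make the objective concrete, the following is a minimal Python sketch of one way such a loss could be evaluated for a given assignment. It is an illustration under stated assumptions, not the authors' formulation: coherence is taken to be the squared Euclidean distance between a node's features and its group centroid, a cross edge counts as forward when it goes from an earlier group to a later one and as backward otherwise, and the names `grouping_loss`, `w_forward`, and `w_backward` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def grouping_loss(
    features: Dict[int, Sequence[float]],
    edges: List[Tuple[int, int]],
    assignment: Dict[int, int],
    centroids: List[Sequence[float]],
    w_forward: float,
    w_backward: float,
) -> float:
    """Coherence cost plus weighted penalties for cross edges.

    Assumptions (not taken from the paper): coherence is the squared
    Euclidean distance to the group centroid, and groups are ordered
    by their integer index.
    """
    cost = 0.0

    # Coherence term: distance of every node to the centroid of its group.
    for node, group in assignment.items():
        cost += sum(
            (x - c) ** 2 for x, c in zip(features[node], centroids[group])
        )

    # Cross-edge term: edges inside a group are free; an edge to a later
    # group pays w_forward, an edge to an earlier group pays w_backward.
    for source, target in edges:
        g_source, g_target = assignment[source], assignment[target]
        if g_source < g_target:
            cost += w_forward
        elif g_source > g_target:
            cost += w_backward

    return cost
```

With both weights set to 0 the edge term vanishes and only the coherence term remains, which mirrors the reduction to clustering described above.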
Cite
@article{arxiv.2310.02993,
title = {Finding coherent node groups in directed graphs},
author = {Iiro Kumpulainen and Nikolaj Tatti},
journal = {arXiv preprint arXiv:2310.02993},
year = {2025}
}