Het-node2vec: second order random walk sampling for heterogeneous multigraphs embedding

Machine Learning; Social and Information Networks; Physics and Society. 2024-10-30, v3

Abstract

Many real-world problems are naturally modeled as heterogeneous graphs, where nodes and edges represent multiple types of entities and relations. Existing learning models for heterogeneous graph representation usually depend on the computation of specific, user-defined heterogeneous paths, or on the application of large and often non-scalable deep neural network architectures. We propose Het-node2vec, an extension of the node2vec algorithm designed for embedding heterogeneous graphs. Het-node2vec addresses the challenge of capturing both the topological and structural characteristics of heterogeneous graphs and the semantic information underlying their different node and edge types, by introducing a simple stochastic node and edge type switching strategy in second order random walk processes. The proposed approach also introduces an "attention mechanism" to focus the random walks on specific node and edge types, thus allowing more accurate embeddings and more focused predictions on the node and edge types of interest. Empirical results on benchmark datasets show that Het-node2vec achieves comparable or superior performance with respect to state-of-the-art methods for heterogeneous graphs in node label and edge prediction tasks.
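To make the idea of type switching in a second order random walk more concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the standard node2vec bias (return parameter `p`, in-out parameter `q`) and adds a hypothetical switching parameter `s` that rescales moves to a node whose type differs from the current one. The names `transition_weights`, `random_walk`, `s`, and the toy graph are illustrative assumptions. Edge types and the attention mechanism mentioned in the abstract are not modeled here.

```python
import random

def transition_weights(graph, node_type, prev, curr, p=1.0, q=1.0, s=1.0):
    """Unnormalized weights for moving from `curr` to each neighbor, given `prev`.

    p, q: node2vec return and in-out parameters.
    s: hypothetical switching parameter (an assumption of this sketch);
       s > 1 discourages moving to a node of a different type, s < 1 encourages it.
    """
    weights = {}
    for nxt in sorted(graph[curr]):
        if prev is None:
            w = 1.0            # first step: no second order information yet
        elif nxt == prev:
            w = 1.0 / p        # step back to the previous node
        elif nxt in graph[prev]:
            w = 1.0            # neighbor shared with the previous node
        else:
            w = 1.0 / q        # step away from the previous node
        if node_type[nxt] != node_type[curr]:
            w *= 1.0 / s       # assumed reweighting for a node type switch
        weights[nxt] = w
    return weights

def random_walk(graph, node_type, start, length, p=1.0, q=1.0, s=1.0, seed=0):
    """Sample one walk of at most `length` nodes starting at `start`."""
    rng = random.Random(seed)
    walk, prev = [start], None
    while len(walk) < length:
        curr = walk[-1]
        weights = transition_weights(graph, node_type, prev, curr, p, q, s)
        if not weights:        # dead end: node without neighbors
            break
        nxt = rng.choices(list(weights), weights=list(weights.values()), k=1)[0]
        prev = curr
        walk.append(nxt)
    return walk

# Toy undirected graph with two node types (hypothetical data).
graph = {
    "g1": {"g2", "d1"},
    "g2": {"g1", "d1"},
    "d1": {"g1", "g2", "d2"},
    "d2": {"d1"},
}
node_type = {"g1": "gene", "g2": "gene", "d1": "disease", "d2": "disease"}

walk = random_walk(graph, node_type, start="g1", length=10, p=1.0, q=2.0, s=0.5)
print(walk)
```

Walks sampled this way could then be passed to a skip-gram model to obtain node embeddings, as in node2vec. In this sketch, setting `s = 1` recovers the plain node2vec bias.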

Cite

@article{arxiv.2101.01425,
  title  = {Het-node2vec: second order random walk sampling for heterogeneous multigraphs embedding},
  author = {Mauricio Soto-Gomez and Peter Robinson and Carlos Cano and Ali Pashaeibarough and Emanuele Cavalleri and Justin Reese and Marco Mesiti and Giorgio Valentini and Elena Casiraghi},
  journal= {arXiv preprint arXiv:2101.01425},
  year   = {2024}
}

Comments

25 pages (references excluded), 9 figures