Related papers: Distributed Subgraph Enumeration via Backtracking-…
Subgraph matching is a compute-intensive problem that asks to enumerate all the isomorphic embeddings of a query graph within a data graph. This problem is generally solved with backtracking, which recursively evolves every possible partial…
The subgraph enumeration problem asks us to find all subgraphs of a target graph that are isomorphic to a given pattern graph. Determining whether even one such isomorphic subgraph exists is NP-complete---and therefore finding all such…
Subgraph pattern detection aims to uncover complex interaction structures in graphs. However, state-of-the-art graph neural network (GNN)-based solutions assume centralized access to the entire graph. When graphs are instead distributed…
Component-centric distributed graph processing platforms that use a bulk synchronous parallel (BSP) programming model have gained traction. These address the short-comings of Big Data abstractions/platforms like MapReduce/Hadoop for…
Recently, graph mining approaches have become very popular, especially in domains such as bioinformatics, chemoinformatics and social networks. In this scope, one of the most challenging tasks is frequent subgraph discovery. This task has…
The ability to handle large scale graph data is crucial to an increasing number of applications. Much work has been dedicated to supporting basic graph operations such as subgraph matching, reachability, regular expression matching, etc. In…
Exact subgraph matching on large-scale graphs remains a challenging problem due to high computational complexity and distributed system constraints. Existing GNN-based path embedding (GNN-PE) frameworks achieve efficient exact matching on…
This paper presents a methodology for creating streaming, distributed inference algorithms for Bayesian nonparametric (BNP) models. In the proposed framework, processing nodes receive a sequence of data minibatches, compute a variational…
To understand the structure of a network, it can be useful to break it down into its constituent pieces. This is the approach taken in a multitude of successful network analysis methods, such as motif analysis. These methods require one to…
Subgraph matching has garnered increasing attention for its diverse real-world applications. Given the dynamic nature of real-world graphs, addressing evolving scenarios without incurring prohibitive overheads has been a focus of research.…
Subgraph matching is a fundamental problem in graph analysis with a wide range of applications. However, due to its inherent NP-hardness, enumerating subgraph matches efficiently on large real-world graphs remains highly challenging. Most…
As RDF becomes more widely established and the amount of linked data is rapidly increasing, the efficient querying of large amount of data becomes a significant challenge. In this paper, we propose a family of algorithms for querying large…
Mining subgraphs with interesting structural properties from networks (or graphs) is a computationally challenging task. In this paper, we propose two algorithms for enumerating all connected induced subgraphs of a given cardinality from…
Graph embedding has been proven to be efficient and effective in facilitating graph analysis. In this paper, we present a novel spectral framework called NOn-Backtracking Embedding (NOBE), which offers a new perspective that organizes graph…
This paper proposes Kudu, a distributed execution engine with a well-defined abstraction that can be integrated with existing single-machine graph pattern mining (GPM) systems to provide efficiency and scalability at the same time. The key…
We present BatchGNN, a distributed CPU system that showcases techniques that can be used to efficiently train GNNs on terabyte-sized graphs. It reduces communication overhead with macrobatching in which multiple minibatches' subgraph…
As one of the most fundamental tasks in graph theory, subgraph matching is a crucial task in many fields, ranging from information retrieval, computer vision, biology, chemistry and natural language processing. Yet subgraph matching problem…
We study the classic subgraph enumeration problem under distributed settings. Existing solutions either suffer from severe memory crisis or rely on large indexes, which makes them impractical for very large graphs. Most of them follow a…
Graph embedding maps graph nodes to low-dimensional vectors, and is widely adopted in machine learning tasks. The increasing availability of billion-edge graphs underscores the importance of learning efficient and effective embeddings on…
Graph-based computations are crucial in a wide range of applications, where graphs can scale to trillions of edges. To enable efficient training on such large graphs, mini-batch subgraph sampling is commonly used, which allows training…