Related papers: BFS based distributed algorithm for parallel local…
Higher-order graph clustering aims to partition the graph using frequently occurring subgraphs. Motif conductance is one of the most promising higher-order graph clustering models due to its strong interpretability. However, existing motif…
One fundamental problem in temporal graph analysis is to count the occurrences of small connected subgraph patterns (i.e., motifs), which benefits a broad range of real-world applications, such as anomaly detection, structure prediction,…
Triangle counting is a fundamental problem in graph mining, essential for analyzing graph streams with arbitrary edge orders. However, exact counting becomes impractical due to the massive size of real-world graph streams. To address this,…
The persistence diagram, which describes the topological features of a dataset, is a key descriptor in Topological Data Analysis. The "Discrete Morse Sandwich" (DMS) method has been reported to be the most efficient algorithm for computing…
We address the problem of computing the distribution of induced connected subgraphs, aka graphlets or motifs, in large graphs. The current state-of-the-art algorithms estimate the motif counts via uniform sampling, by…
Counting triangles in a graph and incident to each vertex is a fundamental and frequently considered task of graph analysis. We consider how to efficiently do this for huge graphs using massively parallel distributed-memory machines.…
Counting the frequencies of 3-, 4-, and 5-node undirected motifs (also known as graphlets) is widely used for understanding complex networks such as social and biological networks. However, it is a great challenge to compute these metrics for a…
Computing subgraph frequencies is a fundamental task that lies at the core of several network analysis methodologies, such as network motifs and graphlet-based metrics, which have been widely used to categorize and compare networks from…
We introduce multi-frequency vector diffusion maps (MFVDM), a new framework for organizing and analyzing high dimensional datasets. The new method is a mathematical and algorithmic generalization of vector diffusion maps (VDM) and other…
Network embeddings have become very popular in learning effective feature representations of networks. Motivated by the recent successes of embeddings in natural language processing, researchers have tried to find network embeddings in…
We present a novel distributed algorithm for counting all four-node induced subgraphs in a big graph. These counts, called the $4$-profile, describe a graph's connectivity properties and have found several uses ranging from bioinformatics…
Algorithms for mining very large graphs, such as those representing online social networks, to discover the relative frequency of small subgraphs within them are of high interest to sociologists, computer scientists and marketeers alike.…
The BFS algorithm is a basic graph data processing algorithm, and many other graph data processing algorithms share architectural features with BFS and can be built on the BFS algorithm model. We analyze the…
Big graphs (networks) arising in numerous application areas pose significant challenges for graph analysts as these graphs grow to billions of nodes and edges and are prohibitively large to fit in the main memory. Finding the number of…
Data-intensive, graph-based computations are pervasive in several scientific applications, and are known to be quite challenging to implement on distributed memory systems. In this work, we explore the design space of parallel algorithms…
Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel…
This paper considers the problem of distributed optimization over time-varying graphs. For the case of undirected graphs, we introduce a distributed algorithm, referred to as DIGing, based on a combination of a distributed inexact gradient…
Subgraph counting aims to count occurrences of a template T in a given network G(V, E). It is a powerful graph analysis tool and has found real-world applications in diverse domains. Scaling subgraph counting problems is known to be memory…
Graph analytics for large scale graphs has gained interest in recent years. Many graph algorithms have been designed for vertex-centric distributed graph processing frameworks to operate on large graphs with 100 M vertices and edges, using…
Given a labeled graph, the frequent-subgraph mining (FSM) problem asks to find all the $k$-vertex subgraphs that appear with frequency greater than a given threshold. FSM has numerous applications ranging from biology to network science, as…