Related papers: Limited Random Walk Algorithm for Big Graph Data C…
Large datasets with interactions between objects are common to numerous scientific fields (i.e. social science, internet, biology...). The interactions naturally define a graph and a common way to explore or summarize such dataset is graph…
Graph partition is a fundamental problem of parallel computing for big graph data. Many graph partition algorithms have been proposed to solve the problem in various applications, such as matrix computations and PageRank, etc., but none has…
Clustering trajectory data attracted considerable attention in the last few years. Most of prior work assumed that moving objects can move freely in an euclidean space and did not consider the eventual presence of an underlying road network…
Algorithms for node clustering typically focus on finding homophilous structure in graphs. That is, they find sets of similar nodes with many edges within, rather than across, the clusters. However, graphs often also exhibit heterophilous…
We introduce a modified model of random walk, and then develop two novel clustering algorithms based on it. In the algorithms, each data point in a dataset is considered as a particle which can move at random in space according to the…
Even though clustering trajectory data attracted considerable attention in the last few years, most of prior work assumed that moving objects can move freely in an euclidean space and did not consider the eventual presence of an underlying…
Graph clustering is a fundamental computational problem with a number of applications in algorithm design, machine learning, data mining, and analysis of social networks. Over the past decades, researchers have proposed a number of…
Graph embedding, representing local and global neighborhood information by numerical vectors, is a crucial part of the mathematical modeling of a wide range of real-world systems. Among the embedding algorithms, random walk-based algorithms…
Graph clustering has many important applications in computing, but due to the increasing sizes of graphs, even traditionally fast clustering methods can be computationally expensive for real-world graphs of interest. Scalability problems…
Graph clustering is an unsupervised machine learning method that partitions the nodes in a graph into different groups. Despite achieving significant progress in exploiting both attributed and structured data information, graph clustering…
In the last two decades we are witnessing a huge increase of valuable big data structured in the form of graphs or networks. To apply traditional machine learning and data analytic techniques to such data it is necessary to transform graphs…
Although many successful ensemble clustering approaches have been developed in recent years, there are still two limitations to most of the existing approaches. First, they mostly overlook the issue of uncertain links, which may mislead the…
Researchers have designed many algorithms to measure the distances between graph nodes, such as average hitting times of random walks, cosine distances from DeepWalk, personalized PageRank, etc. Successful although these algorithms are,…
Clustering a graph means identifying internally dense subgraphs which are only sparsely interconnected. Formalizations of this notion lead to measures that quantify the quality of a clustering and to algorithms that actually find…
We study the problem of finding the maximum of a function defined on the nodes of a connected graph. The goal is to identify a node where the function obtains its maximum. We focus on local iterative algorithms, which traverse the nodes of…
Graph-limit theory focuses on the convergence of sequences of graphs when the number of nodes becomes arbitrarily large. This framework defines a continuous version of graphs allowing for the study of dynamical systems on very large graphs,…
We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the…
Random walks can reveal communities or clusters in networks, because they are more likely to stay within a cluster than leave it. Thus, one family of community detection algorithms uses random walks to measure distance between pairs of…
Graphlets are induced subgraph patterns that are crucial to the understanding of the structure and function of a large network. A lot of efforts have been devoted to calculating graphlet statistics where random walk based approaches are…
Random walk centrality is a fundamental metric in graph mining for quantifying node importance and influence, defined as the weighted average of hitting times to a node from all other nodes. Despite its ability to capture rich graph…