Related papers: A Streaming Algorithm for Graph Clustering

Clustering-based Partitioning for Large Web Graphs

Graph partitioning plays a vital role in distributed large-scale web graph analytics, such as PageRank and label propagation. The quality and scalability of partitioning strategy have a strong impact on such communication- and…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-01-04 Deyu Kong, Xike Xie, Zhuoxu Zhang

CluStRE: Streaming Graph Clustering with Multi-Stage Refinement

We present CluStRE, a novel streaming graph clustering algorithm that balances computational efficiency with high-quality clustering using multi-stage refinement. Unlike traditional in-memory clustering approaches, CluStRE processes graphs…

Machine Learning · Computer Science 2025-02-12 Adil Chhabra, Shai Dorian Peretz, Christian Schulz

Buffered Streaming Graph Partitioning

Partitioning graphs into blocks of roughly equal size is widely used when processing large graphs. Currently there is a gap in the space of available partitioning algorithms. On the one hand, there are streaming algorithms that have been…

Data Structures and Algorithms · Computer Science 2021-12-23 Marcelo Fonseca Faraj, Christian Schulz

A linear streaming algorithm for community detection in very large networks

In this paper, we introduce a novel community detection algorithm in graphs, called SCoDA (Streaming Community Detection Algorithm), based on an edge streaming setting. This algorithm has an extremely low memory footprint and a…

Social and Information Networks · Computer Science 2017-03-09 Alexandre Hollocou, Julien Maudet, Thomas Bonald, Marc Lelarge

2PS: High-Quality Edge Partitioning with Two-Phase Streaming

Graph partitioning is an important preprocessing step to distributed graph processing. In edge partitioning, the edge set of a given graph is split into $k$ equally-sized partitions, such that the replication of vertices across partitions…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-22 Ruben Mayer, Kamil Orujzade, Hans-Arno Jacobsen

Streaming Graph Algorithms in the Massively Parallel Computation Model

We initiate the study of graph algorithms in the streaming setting on massive distributed and parallel systems inspired by practical data processing systems. The objective is to design algorithms that can efficiently process evolving graphs…

Data Structures and Algorithms · Computer Science 2025-01-20 Artur Czumaj, Gopinath Mishra, Anish Mukherjee

Clustering Categorical Data Streams

The data stream model has been defined for new classes of applications involving massive data being generated at a fast pace. Web click stream analysis and detection of network intrusions are two examples. Cluster analysis on data streams…

Databases · Computer Science 2007-05-23 Zengyou He, Xiaofei Xu, Shengchun Deng, Joshua Zhexue Huang

A fast multilevel algorithm for graph clustering and community detection

One of the most useful measures of cluster quality is the modularity of a partition, which measures the difference between the number of the edges joining vertices from the same cluster and the expected number of such edges in a random…

Data Analysis, Statistics and Probability · Physics 2009-09-29 Hristo Djidjev

Streaming Hypergraph Partitioning Algorithms on Limited Memory Environments

Many well-known, real-world problems involve dynamic data which describe the relationship among the entities. Hypergraphs are powerful combinatorial structures that are frequently used to model such data. For many of today's data-centric…

Data Structures and Algorithms · Computer Science 2021-03-10 Fatih Taşyaran, Berkay Demireller, Kamer Kaya, Bora Uçar

Window-based Streaming Graph Partitioning Algorithm

In recent years, the scale of graph datasets has increased to such a degree that a single machine is not capable of efficiently processing large graphs. Thereby, efficient graph partitioning is necessary for those large graph…

Data Structures and Algorithms · Computer Science 2019-02-06 Md Anwarul kaium Patwary, Saurabh Garg, Byeong Kang

Streaming Balanced Graph Partitioning for Random Graphs

There has been a recent explosion in the size of stored data, partially due to advances in storage technology, and partially due to the growing popularity of cloud-computing and the vast quantities of data generated. This motivates the need…

Data Structures and Algorithms · Computer Science 2012-12-06 Isabelle Stanton

Learning-Augmented Streaming Algorithms for Correlation Clustering

We study streaming algorithms for Correlation Clustering. Given a graph as an arbitrary-order stream of edges, with each edge labeled as positive or negative, the goal is to partition the vertices into disjoint clusters, such that the…

Data Structures and Algorithms · Computer Science 2025-10-14 Yinhao Dong, Shan Jiang, Shi Li, Pan Peng

Online Clustering by Penalized Weighted GMM

With the dawn of the Big Data era, data sets are growing rapidly. Data is streaming from everywhere - from cameras, mobile phones, cars, and other electronic devices. Clustering streaming data is a very challenging problem. Unlike the…

Machine Learning · Computer Science 2019-02-08 Shlomo Bugdary, Shay Maymon

Partitioning Complex Networks via Size-constrained Clustering

The most commonly used method to tackle the graph partitioning problem in practice is the multilevel approach. During a coarsening phase, a multilevel graph partitioning algorithm reduces the graph size by iteratively contracting nodes and…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-03-26 Henning Meyerhenke, Peter Sanders, Christian Schulz

Enhancing Graph Topology and Clustering Quality: A Modularity-Guided Approach

Current modularity-based community detection algorithms attempt to find cluster memberships that maximize modularity within a fixed graph topology. Diverging from this conventional approach, our work introduces a novel strategy that employs…

Data Analysis, Statistics and Probability · Physics 2024-02-27 Yongyu Wang, Shiqi Hao, Xiaoyang Wang, Xiaotian Zhuang

Fast Memory-efficient Anomaly Detection in Streaming Heterogeneous Graphs

Given a stream of heterogeneous graphs containing different types of nodes and edges, how can we spot anomalous ones in real-time while consuming bounded memory? This problem is motivated by and generalizes from its application in security…

Social and Information Networks · Computer Science 2016-02-23 Emaad A. Manzoor, Sadegh Momeni, Venkat N. Venkatakrishnan, Leman Akoglu

Distributed Graph Clustering using Modularity and Map Equation

We study large-scale, distributed graph clustering. Given an undirected graph, our objective is to partition the nodes into disjoint sets called clusters. A cluster should contain many internal edges while being sparsely connected to other…

Data Structures and Algorithms · Computer Science 2020-04-28 Michael Hamann, Ben Strasser, Dorothea Wagner, Tim Zeitz

Partitioning Trillion Edge Graphs on Edge Devices

Processing large-scale graphs, containing billions of entities, is critical across fields like bioinformatics, high-performance computing, navigation and route planning, among others. Efficient graph partitioning, which divides a graph into…

Data Structures and Algorithms · Computer Science 2024-10-11 Adil Chhabra, Florian Kurpicz, Christian Schulz, Dominik Schweisgut, Daniel Seemaier

Counting Triangles in Real-World Graph Streams: Dealing with Repeated Edges and Time Windows

Real-world graphs often manifest as a massive temporal stream of edges. The need for real-time analysis of such large graph streams has led to progress on low memory, one-pass streaming graph algorithms. These algorithms were designed for…

Data Structures and Algorithms · Computer Science 2014-10-16 Madhav Jha, C. Seshadhri, Ali Pinar

Streaming, Memory Limited Algorithms for Community Detection

In this paper, we consider sparse networks consisting of a finite number of non-overlapping communities, i.e. disjoint clusters, so that there is higher density within clusters than across clusters. Both the intra- and inter-cluster edge…

Social and Information Networks · Computer Science 2014-11-06 Se-Young Yun, Marc Lelarge, Alexandre Proutiere