Related papers: Distributed GraphLab: A Framework for Machine Lear…

GraphLab: A Distributed Framework for Machine Learning in the Cloud

Machine Learning (ML) techniques are indispensable in a wide range of fields. Unfortunately, the exponential increase of dataset sizes is rapidly extending the runtime of sequential algorithms and threatening to slow future progress in ML.…

Machine Learning · Computer Science 2011-07-06 Yucheng Low , Joseph Gonzalez , Aapo Kyrola , Danny Bickson , Carlos Guestrin

GraphLab: A New Framework for Parallel Machine Learning

Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and…

Machine Learning · Computer Science 2010-06-28 Yucheng Low , Joseph Gonzalez , Aapo Kyrola , Danny Bickson , Carlos Guestrin , Joseph M. Hellerstein

GraphLab: A New Framework For Parallel Machine Learning

Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and…

Machine Learning · Computer Science 2014-08-12 Yucheng Low , Joseph E. Gonzalez , Aapo Kyrola , Danny Bickson , Carlos E. Guestrin , Joseph Hellerstein

Graph Sampling with Distributed In-Memory Dataflow Systems

Given a large graph, a graph sample determines a subgraph with similar characteristics for certain metrics of the original graph. The samples are much smaller thereby accelerating and simplifying the analysis and visualization of large…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-11 Kevin Gomez , Matthias Täschner , M. Ali Rostami , Christopher Rost , Erhard Rahm

An Empirical Comparison of Big Graph Frameworks in the Context of Network Analysis

Complex networks are relational data sets commonly represented as graphs. The analysis of their intricate structure is relevant to many areas of science and commerce, and data sets may reach sizes that require distributed storage and…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-01-05 Jannis Koch , Christian L. Staudt , Maximilian Vogel , Henning Meyerhenke

Parallel Computation of Graph Embeddings

Graph embedding aims at learning a vector-based representation of vertices that incorporates the structure of the graph. This representation then enables inference of graph properties. Existing graph embedding techniques, however, do not…

Machine Learning · Computer Science 2019-09-09 Chi Thang Duong , Hongzhi Yin , Thanh Dat Hoang , Truong Giang Le Ba , Matthias Weidlich , Quoc Viet Hung Nguyen , Karl Aberer

High-Level Programming Abstractions for Distributed Graph Processing

Efficient processing of large-scale graphs in distributed environments has been an increasingly popular topic of research in recent years. Inter-connected data that can be modeled as graphs arise in application domains such as machine…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-25 Vasiliki Kalavri , Vladimir Vlassov , Seif Haridi

GraphX: Unifying Data-Parallel and Graph-Parallel Analytics

From social networks to language modeling, the growing scale and importance of graph data has driven the development of numerous new graph-parallel systems (e.g., Pregel, GraphLab). By restricting the computation that can be expressed and…

Databases · Computer Science 2014-02-12 Reynold S. Xin , Daniel Crankshaw , Ankur Dave , Joseph E. Gonzalez , Michael J. Franklin , Ion Stoica

Scalable and Adaptive Parallel Training of Graph Transformer on Large Graphs

Graph foundation models have demonstrated remarkable adaptability across diverse downstream tasks through large-scale pretraining on graphs. However, existing implementations of the backbone model, graph transformers, are typically limited…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-21 Jun-Liang Lin , Kamesh Madduri , Mahmut Taylan Kandemir

Attributed Graph Clustering in Collaborative Settings

Graph clustering is an unsupervised machine learning method that partitions the nodes in a graph into different groups. Despite achieving significant progress in exploiting both attributed and structured data information, graph clustering…

Machine Learning · Computer Science 2025-01-03 Rui Zhang , Xiaoyang Hou , Zhihua Tian , Yan He , Enchao Gong , Jian Liu , Qingbiao Wu , Kui Ren

Parallel aggregation is a ubiquitous operation in data analytics that is expressed as GROUP BY in SQL, reduce in Hadoop, or segment in TensorFlow. Parallel aggregation starts with an optional local pre-aggregation step and then repartitions…

Databases · Computer Science 2018-11-30 Feilong Liu , Ario Salmasi , Spyros Blanas , Anastasios Sidiropoulos

Overcoming Latency-bound Limitations of Distributed Graph Algorithms using the HPX Runtime System

Graph processing at scale presents many challenges, including the irregular structure of graphs, the latency-bound nature of graph algorithms, and the overhead associated with distributed execution. While existing frameworks such as Spark…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-06 Karame Mohammadiporshokooh , Panagiotis Syskakis , Andrew Lumsdaine , Hartmut Kaiser

Parallel Graph Algorithms in Constant Adaptive Rounds: Theory meets Practice

We study fundamental graph problems such as graph connectivity, minimum spanning forest (MSF), and approximate maximum (weight) matching in a distributed setting. In particular, we focus on the Adaptive Massively Parallel Computation (AMPC)…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-25 Soheil Behnezhad , Laxman Dhulipala , Hossein Esfandiari , Jakub Łącki , Vahab Mirrokni , Warren Schudy

Streaming Graph Algorithms in the Massively Parallel Computation Model

We initiate the study of graph algorithms in the streaming setting on massive distributed and parallel systems inspired by practical data processing systems. The objective is to design algorithms that can efficiently process evolving graphs…

Data Structures and Algorithms · Computer Science 2025-01-20 Artur Czumaj , Gopinath Mishra , Anish Mukherjee

GraphScale: A Framework to Enable Machine Learning over Billion-node Graphs

Graph Neural Networks (GNNs) have emerged as powerful tools for supervised machine learning over graph-structured data, while sampling-based node representation learning is widely utilized in unsupervised learning. However, scalability…

Machine Learning · Computer Science 2024-07-23 Vipul Gupta , Xin Chen , Ruoyun Huang , Fanlong Meng , Jianjun Chen , Yujun Yan

A Survey of Distributed Graph Algorithms on Massive Graphs

Distributed processing of large-scale graph data has many practical applications and has been widely studied. In recent years, a lot of distributed graph processing frameworks and algorithms have been proposed. While many efforts have been…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-29 Lingkai Meng , Yu Shao , Long Yuan , Longbin Lai , Peng Cheng , Xue Li , Wenyuan Yu , Wenjie Zhang , Xuemin Lin , Jingren Zhou

Parallel Hierarchical Affinity Propagation with MapReduce

The accelerated evolution and explosion of the Internet and social media is generating voluminous quantities of data (on zettabyte scales). Paramount amongst the desires to manipulate and extract actionable intelligence from vast big data…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-03-31 Dillon Mark Rose , Jean Michel Rouly , Rana Haber , Nenad Mijatovic , Adrian M. Peter

GraphGen+: Advancing Distributed Subgraph Generation and Graph Learning On Industrial Graphs

Graph-based computations are crucial in a wide range of applications, where graphs can scale to trillions of edges. To enable efficient training on such large graphs, mini-batch subgraph sampling is commonly used, which allows training…

Machine Learning · Computer Science 2025-04-04 Yue Jin , Yongchao Liu , Chuntao Hong

Graph Federated Learning Based on the Decentralized Framework

Graph learning has a wide range of applications, many of which demand stronger data privacy. Federated learning is an emerging distributed machine learning approach that leverages data from individual devices or data centers…

Machine Learning · Computer Science 2023-07-20 Peilin Liu , Yanni Tang , Mingyue Zhang , Wu Chen

Distributed Graph Learning with Smooth Data Priors

Graph learning is often a necessary step in processing or representing structured data, when the underlying graph is not given explicitly. Graph learning is generally performed centrally with a full knowledge of the graph signals, namely…

Signal Processing · Electrical Eng. & Systems 2021-12-14 Isabela Cunha Maia Nobre , Mireille El Gheche , Pascal Frossard