Related papers: Efficient Large-Scale Graph Processing on Hybrid C…

Scaling Up Large-Scale Graph Processing for GPU-Accelerated Heterogeneous Systems

Not only with the large host memory for supporting large scale graph processing, GPU-accelerated heterogeneous architecture can also provide a great potential for high-performance computing. However, few existing heterogeneous systems can…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-05 Xianliang Li

Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even…

Data Structures and Algorithms · Computer Science 2019-08-22 Laxman Dhulipala , Guy E. Blelloch , Julian Shun

Accelerating Direction-Optimized Breadth First Search on Hybrid Architectures

Large scale-free graphs are famously difficult to process efficiently: the skewed vertex degree distribution makes it difficult to obtain balanced partitioning. Our research instead aims to turn this into an advantage by partitioning the…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-10-05 Scott Sallinen , Abdullah Gharaibeh , Matei Ripeanu

ALPHA-PIM: Analysis of Linear Algebraic Processing for High-Performance Graph Applications on a Real Processing-In-Memory System

Processing large-scale graph datasets is computationally intensive and time-consuming. Processor-centric CPU and GPU architectures, commonly used for graph applications, often face bottlenecks caused by extensive data movement between the…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-11 Marzieh Barkhordar , Alireza Tabatabaeian , Mohammad Sadrosadati , Christina Giannoula , Juan Gomez Luna , Izzat El Hajj , Onur Mutlu , Alaa R. Alameldeen

Scalable Graph Embedding LearningOn A Single GPU

Graph embedding techniques have attracted growing interest since they convert the graph data into continuous and low-dimensional space. Effective graph analytic provides users a deeper understanding of what is behind the data and thus can…

Machine Learning · Computer Science 2022-01-21 Azita Nouri , Philip E. Davis , Pradeep Subedi , Manish Parashar

Hybrid Edge Partitioner: Partitioning Large Power-Law Graphs under Memory Constraints

Distributed systems that manage and process graph-structured data internally solve a graph partitioning problem to minimize their communication overhead and query run-time. Besides computational complexity -- optimal graph partitioning is…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-24 Ruben Mayer , Hans-Arno Jacobsen

A Graph-based Model for GPU Caching Problems

Modeling data sharing in GPU programs is a challenging task because of the massive parallelism and complex data sharing patterns provided by GPU architectures. Better GPU caching efficiency can be achieved through careful task scheduling…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-04 Lingda Li , Ari B. Hayes , Stephen A. Hackler , Eddy Z. Zhang , Mario Szegedy , Shuaiwen Leon Song

Efficient Processing of Very Large Graphs in a Small Cluster

Inspired by the success of Google's Pregel, many systems have been developed recently for iterative computation over big graphs. These systems provide a user-friendly vertex-centric programming interface, where a programmer only needs to…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-01-22 Da Yan , Yuzhen Huang , James Cheng , Huanhuan Wu

HyTGraph: GPU-Accelerated Graph Processing with Hybrid Transfer Management

Processing large graphs with memory-limited GPU needs to resolve issues of host-GPU data transfer, which is a key performance bottleneck. Existing GPU-accelerated graph processing frameworks reduce the data transfers by managing the active…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-01 Qiange Wang , Xin Ai , Yanfeng Zhang , Jing Chen , Ge Yu

High-Quality Shared-Memory Graph Partitioning

Partitioning graphs into blocks of roughly equal size such that few edges run between blocks is a frequently needed operation in processing graphs. Recently, size, variety, and structural complexity of these networks has grown dramatically.…

Data Structures and Algorithms · Computer Science 2018-10-16 Yaroslav Akhremtsev , Peter Sanders , Christian Schulz

GraphHP: A Hybrid Platform for Iterative Graph Processing

The Bulk Synchronous Parallel(BSP) computational model has emerged as the dominant distributed framework to build large-scale iterative graph processing systems. While its implementations(e.g., Pregel, Giraph, and Hama) achieve high…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-06-23 Qun Chen , Song Bai , Zhanhuai Li , Zhiying Gou , Bo Suo , Wei Pan

Fast Processing of Large Graph Applications Using Asynchronous Architecture

Graph algorithms and techniques are increasingly being used in scientific and commercial applications to express relations and explore large data sets. Although conventional or commodity computer architectures, like CPU or GPU, can compute…

Hardware Architecture · Computer Science 2017-07-03 Michel A. Kinsy , Rashmi S. Agrawal , Hien D. Nguyen

GraphH: High Performance Big Graph Analytics in Small Clusters

It is common for real-world applications to analyze big graphs using distributed graph processing systems. Popular in-memory systems require an enormous amount of resources to handle big graphs. While several out-of-core approaches have…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-08-08 Peng Sun , Yonggang Wen , Ta Nguyen Binh Duong , Xiaokui Xiao

Efficient On-Chip Communication for Parallel Graph-Analytics on Spatial Architectures

Large-scale graph processing has drawn great attention in recent years. Most of the modern-day datacenter workloads can be represented in the form of Graph Processing such as MapReduce etc. Consequently, a lot of designs for Domain-Specific…

Hardware Architecture · Computer Science 2022-09-07 Khushal Sethi

Deep Multilevel Graph Partitioning

Partitioning a graph into blocks of "roughly equal" weight while cutting only few edges is a fundamental problem in computer science with a wide range of applications. In particular, the problem is a building block in applications that…

Data Structures and Algorithms · Computer Science 2021-05-06 Lars Gottesbüren , Tobias Heuer , Peter Sanders , Christian Schulz , Daniel Seemaier

xDGP: A Dynamic Graph Processing System with Adaptive Partitioning

Many real-world systems, such as social networks, rely on mining efficiently large graphs, with hundreds of millions of vertices and edges. This volume of information requires partitioning the graph across multiple nodes in a distributed…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-09-11 Luis Vaquero , Felix Cuadrado , Dionysios Logothetis , Claudio Martella

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Graph embeddings map graph nodes to continuous vectors and are foundational to community detection, recommendation, and many scientific applications. At billion-scale, however, existing graph embedding systems face a trade-off: they either…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-13 Zhonggen Li , Xiangyu Ke , Yifan Zhu , Yunjun Gao , Feifei Li

HC-SpMM: Accelerating Sparse Matrix-Matrix Multiplication for Graphs with Hybrid GPU Cores

Sparse Matrix-Matrix Multiplication (SpMM) is a fundamental operation in graph computing and analytics. However, the irregularity of real-world graphs poses significant challenges to achieving efficient SpMM operation for graph data on…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-13 Zhonggen Li , Xiangyu Ke , Yifan Zhu , Yunjun Gao , Yaofeng Tu

Exploring the Limits of GPUs With Parallel Graph Algorithms

In this paper, we explore the limits of graphics processors (GPUs) for general purpose parallel computing by studying problems that require highly irregular data access patterns: parallel graph algorithms for list ranking and connected…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-02-25 Frank Dehne , Kumanan Yogaratnam

GRE: A Graph Runtime Engine for Large-Scale Distributed Graph-Parallel Applications

Large-scale distributed graph-parallel computing is challenging. On one hand, due to the irregular computation pattern and lack of locality, it is hard to express parallelism efficiently. On the other hand, due to the scale-free nature,…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-10-22 Jie Yan , Guangming Tan , Ninghui Sun