Related papers: Parallel Processing of Large Graphs

Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even…

Data Structures and Algorithms · Computer Science 2019-08-22 Laxman Dhulipala, Guy E. Blelloch, Julian Shun

A Comparison of Parallel Graph Processing Implementations

The rapidly growing number of large network analysis problems has led to the emergence of many parallel and distributed graph processing systems; one survey in 2014 identified over 80. Since then, the landscape has evolved; some packages…

Performance · Computer Science 2017-05-18 Samuel Pollard, Boyana Norris

BSP vs MapReduce

The MapReduce framework has been generating a lot of interest in a wide range of areas. It has been widely adopted in industry and has been used to solve a number of non-trivial problems in academia. Putting MapReduce on strong theoretical…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-06-19 Matthew Felice Pace

Parallel aggregation is a ubiquitous operation in data analytics that is expressed as GROUP BY in SQL, reduce in Hadoop, or segment in TensorFlow. Parallel aggregation starts with an optional local pre-aggregation step and then repartitions…

Databases · Computer Science 2018-11-30 Feilong Liu, Ario Salmasi, Spyros Blanas, Anastasios Sidiropoulos

Scheduling of Graph Queries: Controlling Intra- and Inter-query Parallelism for a High System Throughput

The vast amounts of data used in social, business or traffic networks, biology and other natural sciences are often managed in graph-based data sets, consisting of a few thousand up to billions and trillions of vertices and edges,…

Databases · Computer Science 2021-10-22 Matthias Hauck, Ismail Oukid, Holger Fröning

Streaming Graph Algorithms in the Massively Parallel Computation Model

We initiate the study of graph algorithms in the streaming setting on massive distributed and parallel systems inspired by practical data processing systems. The objective is to design algorithms that can efficiently process evolving graphs…

Data Structures and Algorithms · Computer Science 2025-01-20 Artur Czumaj, Gopinath Mishra, Anish Mukherjee

Parallel Local Graph Clustering

Graph clustering has many important applications in computing, but due to growing sizes of graphs, even traditionally fast clustering methods such as spectral partitioning can be computationally expensive for real-world graphs of interest.…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-11 Julian Shun, Farbod Roosta-Khorasani, Kimon Fountoulakis, Michael W. Mahoney

Simulating Parallel Algorithms in the MapReduce Framework with Applications to Parallel Computational Geometry

In this paper, we describe efficient MapReduce simulations of parallel algorithms specified in the BSP and PRAM models. We also provide some applications of these simulation results to problems in parallel computational geometry for the…

Data Structures and Algorithms · Computer Science 2015-03-14 Michael T. Goodrich

MapReduce Meets Fine-Grained Complexity: MapReduce Algorithms for APSP, Matrix Multiplication, 3-SUM, and Beyond

Distributed processing frameworks, such as MapReduce, Hadoop, and Spark are popular systems for processing large amounts of data. The design of efficient algorithms in these frameworks is a challenging problem, as the systems both require…

Data Structures and Algorithms · Computer Science 2019-05-07 MohammadTaghi Hajiaghayi, Silvio Lattanzi, Saeed Seddighin, Cliff Stein

The Efficiency of MapReduce in Parallel External Memory

Since its introduction in 2004, the MapReduce framework has become one of the standard approaches in massive distributed and parallel computation. In contrast to its intensive use in practice, theoretical footing is still limited and only…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-12-19 Gero Greiner, Riko Jacob

Time and Memory Efficient Parallel Algorithm for Structural Graph Summaries and two Extensions to Incremental Summarization and $k$-Bisimulation for Long $k$-Chaining

We developed a flexible parallel algorithm for graph summarization based on vertex-centric programming and parameterized message passing. The base algorithm supports infinitely many structural graph summary models defined in a formal…

Data Structures and Algorithms · Computer Science 2022-11-07 Till Blume, Jannik Rau, David Richerby, Ansgar Scherp

Massively Parallel Algorithms for Small Subgraph Counting

Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel…

Data Structures and Algorithms · Computer Science 2022-07-19 Amartya Shankha Biswas, Talya Eden, Quanquan C. Liu, Slobodan Mitrović, Ronitt Rubinfeld

A Meta-graph Approach to Analyze Subgraph-centric Distributed Programming Models

Component-centric distributed graph processing platforms that use a bulk synchronous parallel (BSP) programming model have gained traction. These address the shortcomings of Big Data abstractions/platforms like MapReduce/Hadoop for…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-13 Ravikant Dindokar, Neel Choudhury, Yogesh Simmhan

Comparing MapReduce and Pipeline Implementations for Counting Triangles

A common method to define a parallel solution for a computational problem consists in finding a way to use the Divide and Conquer paradigm so that each processor acts on its own data and is scheduled in a parallel fashion. MapReduce is…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-01-13 Edelmira Pasarella, Maria-Esther Vidal, Cristina Zoltan

BPP: Large Graph Storage for Efficient Disk Based Processing

Processing very large graphs like social networks, biological and chemical compounds is a challenging task. Distributed graph processing systems process the billion-scale graphs efficiently but incur overheads of efficient partitioning and…

Data Structures and Algorithms · Computer Science 2014-01-13 Kamran Najeebullah, Kifayat Ullah Khan, Waqas Nawaz, Young-Koo Lee

Parallel Graph Partitioning for Complex Networks

Processing large complex networks like social networks or web graphs has recently attracted considerable interest. In order to do this in parallel, we need to partition them into pieces of about equal size. Unfortunately, previous parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-01-27 Henning Meyerhenke, Peter Sanders, Christian Schulz

Efficient On-Chip Communication for Parallel Graph-Analytics on Spatial Architectures

Large-scale graph processing has drawn great attention in recent years. Most of the modern-day datacenter workloads can be represented in the form of Graph Processing such as MapReduce etc. Consequently, a lot of designs for Domain-Specific…

Hardware Architecture · Computer Science 2022-09-07 Khushal Sethi

The Family of MapReduce and Large Scale Data Processing Systems

In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data which has called for a paradigm shift in the computing architecture and large scale data processing mechanisms. MapReduce is a…

Databases · Computer Science 2013-02-14 Sherif Sakr, Anna Liu, Ayman G. Fayoumi

Dynamic Parallel and Distributed Graph Cuts

Graph-cuts are widely used in computer vision. In order to speed up the optimization process and improve the scalability for large graphs, Strandmark and Kahl introduced a splitting method to split a graph into multiple subgraphs for…

Data Structures and Algorithms · Computer Science 2016-11-03 Miao Yu, Shuhan Shen, Zhanyi Hu

GraphHP: A Hybrid Platform for Iterative Graph Processing

The Bulk Synchronous Parallel (BSP) computational model has emerged as the dominant distributed framework to build large-scale iterative graph processing systems. While its implementations (e.g., Pregel, Giraph, and Hama) achieve high…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-06-23 Qun Chen, Song Bai, Zhanhuai Li, Zhiying Gou, Bo Suo, Wei Pan