Related papers: Efficient Implementation of a Synchronous Parallel…

Parallel Flow-Based Hypergraph Partitioning

We present a shared-memory parallelization of flow-based refinement, which is considered the most powerful iterative improvement technique for hypergraph partitioning at the moment. Flow-based refinement works on bipartitions, so current…

Data Structures and Algorithms · Computer Science 2022-01-06 Lars Gottesbüren, Tobias Heuer, Peter Sanders

A Parallel Framework for Parametric Maximum Flow Problems in Image Segmentation

This paper presents a framework that supports the implementation of parallel solutions for the widespread parametric maximum flow computational routines used in image segmentation algorithms. The framework is based on supergraphs, a special…

Computer Vision and Pattern Recognition · Computer Science 2015-12-08 Vlad Olaru, Mihai Florea, Cristian Sminchisescu

Engineering A Workload-balanced Push-Relabel Algorithm for Massive Graphs on GPUs

The push-relabel algorithm is an efficient algorithm that solves the maximum flow/minimum cut problem and lends itself well to parallelization. As the size of graphs grows exponentially, researchers have used Graphics Processing Units (GPUs) to…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-02 Chou-Ying Hsieh, Po-Chieh Lin, Sy-Yen Kuo

Scalable Maxflow Processing for Dynamic Graphs

The Maximum Flow (Max-Flow) problem is a cornerstone in graph theory and combinatorial optimization, aiming to determine the largest possible flow from a designated source node to a sink node within a capacitated flow network. It has…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-04 Shruthi Kannappan, Ashwina Kumar, Rupesh Nasre

GPU Implementation and Optimization of a Flexible MAP Decoder for Synchronization Correction

In this paper we present an optimized parallel implementation of a flexible MAP decoder for synchronization error correcting codes, supporting a very wide range of code sizes and channel conditions. On mid-range GPUs we demonstrate decoding…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-26 Johann A. Briffa

Parallel Local Graph Clustering

Graph clustering has many important applications in computing, but due to growing sizes of graphs, even traditionally fast clustering methods such as spectral partitioning can be computationally expensive for real-world graphs of interest.…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-11 Julian Shun, Farbod Roosta-Khorasani, Kimon Fountoulakis, Michael W. Mahoney

Spinning Fast Iterative Data Flows

Parallel dataflow systems are a central part of most analytic pipelines for big data. The iterative nature of many analysis and machine learning algorithms, however, is still a challenge for current systems. While certain types of bulk…

Databases · Computer Science 2012-08-02 Stephan Ewen, Kostas Tzoumas, Moritz Kaufmann, Volker Markl

Accelerating a fluvial incision and landscape evolution model with parallelism

Solving inverse problems and achieving statistical rigour in landscape evolution models requires running many model realizations. Parallel computation is necessary to achieve this in a reasonable time. However, no previous algorithm is…

Computational Engineering, Finance, and Science · Computer Science 2019-01-23 Richard Barnes

Automatic Parallelization: Executing Sequential Programs on a Task-Based Parallel Runtime

Today's software contains billions of lines of sequential code that do not benefit from the parallelism available in modern multicore architectures. Automatically parallelizing sequential code, to promote an efficient use of the…

Programming Languages · Computer Science 2016-04-13 Alcides Fonseca, Bruno Cabral, João Rafael, Ivo Correia

Engineering a Scalable High Quality Graph Partitioner

We describe an approach to parallel graph partitioning that scales to hundreds of processors and produces a high solution quality. For example, for many instances from Walshaw's benchmark collection we improve the best known partitioning.…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-04-08 Manuel Holtgrewe, Peter Sanders, Christian Schulz

Warm-starting Push-Relabel

Push-Relabel is one of the most celebrated network flow algorithms. Maintaining a pre-flow that saturates a cut, it enjoys better theoretical and empirical running time than other flow algorithms, such as Ford-Fulkerson. In practice,…

Data Structures and Algorithms · Computer Science 2024-05-30 Sami Davies, Sergei Vassilvitskii, Yuyan Wang

Efficient Dynamic MaxFlow Computation on GPUs

Maxflow is a fundamental problem in graph theory and combinatorial optimisation, used to determine the maximum flow from a source node to a sink node in a flow network. It finds applications in diverse domains, including computer networks,…

Data Structures and Algorithms · Computer Science 2025-11-11 Shruthi Kannappan, Ashwina Kumar, Rupesh Nasre

An Adaptive Parallel Algorithm for Computing Connected Components

We present an efficient distributed memory parallel algorithm for computing connected components in undirected graphs based on Shiloach-Vishkin's PRAM approach. We discuss multiple optimization techniques that reduce communication volume as…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-02-15 Chirag Jain, Patrick Flick, Tony Pan, Oded Green, Srinivas Aluru

Efficient Parallel Connected Components Labeling with a Coarse-to-fine Strategy

This paper proposes a new parallel approach to solve connected components on a 2D binary image implemented with CUDA. We employ the following strategies to accelerate neighborhood exploration after dividing an input image into independent…

Computer Vision and Pattern Recognition · Computer Science 2018-01-29 Jun Chen, Keisuke Nonaka, Ryosuke Watanabe, Hiroshi Sankoh, Houari Sabirin, Sei Naito

A Multi-signal Variant for the GPU-based Parallelization of Growing Self-Organizing Networks

Among the many possible approaches for the parallelization of self-organizing networks, and in particular of growing self-organizing networks, perhaps the most common one is producing an optimized, parallel implementation of the standard…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-03-31 Giacomo Parigi, Angelo Stramieri, Danilo Pau, Marco Piastra

PASGAL: Parallel And Scalable Graph Algorithm Library

In this paper, we introduce PASGAL (Parallel And Scalable Graph Algorithm Library), a parallel graph library that scales to a variety of graph types, many processors, and large graph sizes. One special focus of PASGAL is the efficiency on…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-29 Xiaojun Dong, Yan Gu, Yihan Sun, Letong Wang

PAGANI: A Parallel Adaptive GPU Algorithm for Numerical Integration

We present a new adaptive parallel algorithm for the challenging problem of multi-dimensional numerical integration on massively parallel architectures. Adaptive algorithms have demonstrated the best performance, but efficient many-core…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-24 Ioannis Sakiotis, Kamesh Arumugam, Marc Paterno, Desh Ranjan, Balša Terzić, Mohammad Zubair

Relaxed Scheduling for Scalable Belief Propagation

The ability to leverage large-scale hardware parallelism has been one of the key enablers of the accelerated recent progress in machine learning. Consequently, there has been considerable effort invested into developing efficient parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-19 Vitaly Aksenov, Dan Alistarh, Janne H. Korhonen

An Evaluation of Massively Parallel Algorithms for DFA Minimization

We study parallel algorithms for the minimization of Deterministic Finite Automata (DFAs). In particular, we implement four different massively parallel algorithms on Graphics Processing Units (GPUs). Our results confirm the expectations…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-31 Jan Martens, Anton Wijs

Parallel implementation of flow and matching algorithms

In our work we present two parallel algorithms and their lock-free implementations using the popular GPU environment Nvidia CUDA. The first algorithm is the push-relabel method for the flow problem in grid graphs. The second is the cost…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-10-31 Agnieszka Łupińska