Related papers: Parallel Write-Efficient Algorithms and Data Struc…

Improved Parallel Cache-Oblivious Algorithms for Dynamic Programming and Linear Algebra

Emerging non-volatile main memory (NVRAM) technologies provide byte-addressability, low idle power, and improved memory-density, and are likely to be a key component in the future memory hierarchy. However, a critical challenge in achieving…

Data Structures and Algorithms · Computer Science 2019-08-22 Guy E. Blelloch , Yan Gu

Algorithmic Building Blocks for Asymmetric Memories

The future of main memory appears to lie in the direction of new non-volatile memory technologies that provide strong capacity-to-performance ratios, but have write operations that are much more expensive than reads in terms of energy,…

Data Structures and Algorithms · Computer Science 2018-06-28 Yan Gu , Yihan Sun , Guy E. Blelloch

Sorting with Asymmetric Read and Write Costs

Emerging memory technologies have a significant gap between the cost, both in time and in energy, of writing to memory versus reading from memory. In this paper we present models and algorithms that account for this difference, with a focus…

Data Structures and Algorithms · Computer Science 2016-03-15 Guy E. Blelloch , Jeremy T. Fineman , Phillip B. Gibbons , Yan Gu , Julian Shun

A parallel algorithm for Delaunay triangulation of moving points on the plane

Delaunay triangulation (DT) is an important geometric problem used in various fields such as computer vision, terrain modeling, spatial clustering, and networking. Kinetic data structures have become very…

Computational Geometry · Computer Science 2023-08-15 Nazanin Hadiniya , Mohammad Ghodsi

Efficient Algorithms with Asymmetric Read and Write Costs

In several emerging technologies for computer memory (main memory), the cost of reading is significantly cheaper than the cost of writing. Such asymmetry in memory costs poses a fundamentally different model from the RAM for algorithm…

Data Structures and Algorithms · Computer Science 2016-08-30 Guy E. Blelloch , Jeremy T. Fineman , Phillip B. Gibbons , Yan Gu , Julian Shun

Implicit Decomposition for Write-Efficient Connectivity Algorithms

The future of main memory appears to lie in the direction of new technologies that provide strong capacity-to-performance ratios, but have write operations that are much more expensive than reads in terms of latency, bandwidth, and energy.…

Data Structures and Algorithms · Computer Science 2017-10-10 Naama Ben-David , Guy E. Blelloch , Jeremy T. Fineman , Phillip B. Gibbons , Yan Gu , Charles McGuffey , Julian Shun

Efficiency Guarantees for Parallel Incremental Algorithms under Relaxed Schedulers

Several classic problems in graph processing and computational geometry are solved via incremental algorithms, which split computation into a series of small tasks acting on shared state, which gets updated progressively. While the…

Data Structures and Algorithms · Computer Science 2020-03-24 Dan Alistarh , Nikita Koval , Giorgi Nadiradze

Parallelism in Randomized Incremental Algorithms

In this paper we show that many sequential randomized incremental algorithms are in fact parallel. We consider algorithms for several problems including Delaunay triangulation, linear programming, closest pair, smallest enclosing disk,…

Data Structures and Algorithms · Computer Science 2018-10-15 Guy E. Blelloch , Yan Gu , Julian Shun , Yihan Sun

Parallel Construction of Compact Planar Embeddings

The sheer sizes of modern datasets are forcing data-structure designers to consider seriously both parallel construction and compactness. To achieve those goals we need to design a parallel algorithm with good scalability and with low…

Data Structures and Algorithms · Computer Science 2017-05-02 Leo Ferres , José Fuentes-Sepúlveda , Travis Gagie , Meng He , Gonzalo Navarro

Parallel Delta-Stepping Algorithm for Shared Memory Architectures

We present a shared memory implementation of a parallel algorithm, called delta-stepping, for solving the single source shortest path problem for directed and undirected graphs. In order to reduce synchronization costs we make some…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-02-21 M. Kranjčević , D. Palossi , S. Pintarelli

Parareal Neural Networks Emulating a Parallel-in-time Algorithm

As deep neural networks (DNNs) become deeper, the training time increases. In this perspective, multi-GPU parallel computing has become a key tool in accelerating the training of DNNs. In this paper, we introduce a novel methodology to…

Numerical Analysis · Mathematics 2024-07-08 Chang-Ock Lee , Youngkyu Lee , Jongho Park

Distributed-Memory Parallel Algorithms for Fixed-Radius Near Neighbor Graph Construction

Computing fixed-radius near-neighbor graphs is an important first step for many data analysis algorithms. Near-neighbor graphs connect points that are close under some metric, endowing point clouds with a combinatorial structure. As…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-17 Gabriel Raulet , Dmitriy Morozov , Aydin Buluc , Katherine Yelick

Algorithm-hardware co-design of neuromorphic networks with dual memory pathways

Spiking neural networks excel at event-driven sensing. Yet, maintaining task-relevant context over long timescales both algorithmically and in hardware, while respecting both tight energy and memory budgets, remains a core challenge in the…

Neural and Evolutionary Computing · Computer Science 2026-05-05 Pengfei Sun , Zhe Su , Jascha Achterberg , Giacomo Indiveri , Dan F. M. Goodman , Danyal Akarca

Distributed-Memory Parallel Algorithms for Counting and Listing Triangles in Big Graphs

Big graphs (networks) arising in numerous application areas pose significant challenges for graph analysts as these graphs grow to billions of nodes and edges and are prohibitively large to fit in the main memory. Finding the number of…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-06-19 Shaikh Arifuzzaman , Maleq Khan , Madhav Marathe

One machine, one minute, three billion tetrahedra

This paper presents a new scalable parallelization scheme to generate the 3D Delaunay triangulation of a given set of points. Our first contribution is an efficient serial implementation of the incremental Delaunay insertion algorithm. A…

Computational Geometry · Computer Science 2018-11-07 Célestin Marot , Jeanne Pellerin , Jean-François Remacle

On the Design and Analysis of Parallel and Distributed Algorithms

The arrival of multicore systems has created a new scenario in computing: parallel and distributed algorithms are fast replacing older sequential algorithms, although these techniques bring many challenges. The distributed algorithms provide…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-13 Rajendra Purohit , K R Chowdhary , S D Purohit

A 2D Parallel Triangle Counting Algorithm for Distributed-Memory Architectures

Triangle counting is a fundamental graph analytic operation that is used extensively in network science and graph mining. As the size of the graphs that need to be analyzed continues to grow, there is a need to develop scalable…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-24 Ancy Sarah Tom , George Karypis

Engineering a Distributed-Memory Triangle Counting Algorithm

Counting triangles in a graph and incident to each vertex is a fundamental and frequently considered task of graph analysis. We consider how to efficiently do this for huge graphs using massively parallel distributed-memory machines.…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-07-24 Peter Sanders , Tim Niklas Uhl

Parallel Algorithms for Tensor Train Arithmetic

We present efficient and scalable parallel algorithms for performing mathematical operations for low-rank tensors represented in the tensor train (TT) format. We consider algorithms for addition, elementwise multiplication, computing norms…

Numerical Analysis · Mathematics 2021-09-08 Hussam Al Daas , Grey Ballard , Peter Benner

Towards Work-Efficient Parallel Parameterized Algorithms

Parallel parameterized complexity theory studies how fixed-parameter tractable (fpt) problems can be solved in parallel. Previous theoretical work focused on parallel algorithms that are very fast in principle, but did not take into account…

Data Structures and Algorithms · Computer Science 2019-02-21 Max Bannach , Malte Skambath , Till Tantau