Related papers: Compiler Optimization for Irregular Memory Access …

200 papers

The Partitioned Global Address Space (PGAS) programming model strikes a balance between the locality-aware, but explicit, message-passing model and the easy-to-use, but locality-agnostic, shared memory model. However, the PGAS rich memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-09-11 Olivier Serres, Abdullah Kayi, Ahmad Anbar, Tarek El-Ghazawi

The PGAS model is well suited for executing irregular applications on cluster-based systems, due to its efficient support for short, one-sided messages. However, there are currently two major limitations faced by PGAS applications. The…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-22 Sri Raj Paul, Akihiro Hayashi, Kun Chen, Vivek Sarkar

This work presents a heterogeneous communication library for clusters of processors and FPGAs. This library, Shoal, supports the Partitioned Global Address Space (PGAS) memory model for applications. PGAS is a shared memory model for…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-27 Varun Sharma, Paul Chow

Partitioned global address space (PGAS) is a parallel programming model for the development of applications on clusters. It provides a global address space partitioned among the cluster nodes, and is supported in programming languages like…

Logic in Computer Science · Computer Science 2013-07-26 Georgel Calin, Egor Derevenetc, Rupak Majumdar, Roland Meyer

A new parallel algorithm utilizing partitioned global address space (PGAS) programming model to achieve high scalability is reported for particle tracking in direct numerical simulations of turbulent flow. The work is motivated by the…

Computational Physics · Physics 2020-05-28 Dhawal Buaria, P. K. Yeung

Using large-scale multicore systems to get the maximum performance and energy efficiency with manageable programmability is a major challenge. The partitioned global address space (PGAS) programming model enhances programmability by…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-01 Jérémie Lagravière, Johannes Langguth, Mohammed Sourouri, Phuong H. Ha, Xing Cai

Coarse-Grained Reconfigurable Arrays (CGRAs) are specialized accelerators commonly employed to boost performance in workloads with iterative structures. Existing research typically focuses on compiler or architecture optimizations aimed at…

Hardware Architecture · Computer Science 2025-08-28 Xiangfeng Liu, Zhe Jiang, Anzhen Zhu, Xiaomeng Han, Mingsong Lyu, Qingxu Deng, Nan Guan

The relaxed semantics and rich functionality of the one-sided communication primitives of MPI-3 make MPI an attractive candidate for the implementation of PGAS models. However, the performance of such an implementation suffers from the fact that…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-03-08 Huan Zhou, Kamran Idrees, José Gracia

The partitioned global address space has bridged the gap between shared and distributed memory, and with this bridge comes the ability to adapt shared memory concepts, such as non-blocking programming, to distributed systems such as…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-30 Garvit Dewan, Louis Jenkins

Embedded system performance is bounded by power consumption. The trend is to offload greedy computations to hardware accelerators such as GPUs, Xeon Phi, or FPGAs. FPGA chips combine both the flexibility of programmable chips and the energy efficiency of…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-16 Christophe Alias

We propose a set of benchmarks that specifically targets a major cause of performance degradation in high performance computing platforms: irregular access patterns. These benchmarks are meant to be used to assess the performance of…

Performance · Computer Science 2008-05-27 H. L. A. van der Spek, E. M. Bakker, H. A. G. Wijshoff

The UPC programming language offers parallelism via logically partitioned shared memory, which typically spans physically disjoint memory sub-systems. One convenient feature of UPC is its ability to automatically execute between-thread data…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-01 Jérémie Lagravière, Johannes Langguth, Martina Prugger, Lukas Einkemmer, Phuong H. Ha, Xing Cai

A Partitioned Global Address Space (PGAS) approach treats a distributed system as if the memory were shared on a global level. Given such a global view on memory, the user may program applications very much like shared memory systems. This…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-07-08 Huan Zhou, Yousri Mhedheb, Kamran Idrees, Colin W. Glass, José Gracia, Karl Fürlinger, Jie Tao

Partitioned Global Address Space (PGAS) integrates the concepts of shared memory programming and the control of data distribution and locality provided by message passing into a single parallel programming model. The purpose of allying…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-03-15 Kamran Idrees, Christoph Niethammer, Aniello Esposito, Colin W. Glass

Local search is a successful approach for solving combinatorial optimization and constraint satisfaction problems. With the progressing move toward multi- and many-core systems, GPUs and the quest for Exascale systems, parallelism has become…

Programming Languages · Computer Science 2013-05-13 Rui Machado, Salvador Abreu, Daniel Diaz

The Partitioned Global Address Space (PGAS), a memory model in which the global address space is explicitly partitioned across compute nodes in a cluster, strives to bridge the gap between shared-memory and distributed-memory programming.…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-02 Garvit Dewan, Louis Jenkins

We present DASH, a C++ template library that offers distributed data structures and parallel algorithms and implements a compiler-free PGAS (partitioned global address space) approach. DASH offers many productivity and performance features…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-06 Karl Fürlinger, Tobias Fuchs, Roger Kowalewski

Cost-based query optimization remains a critical task in relational databases even after decades of research and industrial development. Query optimizers rely on a large range of statistical synopses -- including attribute-level histograms…

Databases · Computer Science 2021-02-05 Yesdaulet Izenov, Asoke Datta, Florin Rusu, Jun Hyung Shin

Applications with irregular data structures, data-dependent control flows and fine-grained data transfers (e.g., real-world graph computations) perform poorly on cache-based systems. We propose the UpDown accelerator that supports…

The level of parallelism in applications can be maximized by minimizing overheads due to load imbalances and waiting time due to memory latencies. Compiler optimization is one of the most effective solutions to tackle this problem. The…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-03-29 Zahra Khatami, Hartmut Kaiser, J. Ramanujam