Related papers: Specializing Coherence, Consistency, and Push/Pull…

200 papers

We reduce the cost of communication and synchronization in graph processing by analyzing the fastest way to process graphs: pushing the updates to a shared state or pulling the updates to a private state. We investigate the applicability of…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-02 Maciej Besta, Michal Podstawski, Linus Groner, Edgar Solomonik, Torsten Hoefler

Throughput-oriented computing via co-running multiple applications in the same machine has been widely adopted to achieve high hardware utilization and energy saving on modern supercomputers and data centers. However, efficiently co-running…

Performance · Computer Science 2023-03-29 Hao Xu, Shuang Song, Ze Mao

Acceleration of graph applications on GPUs has found large interest due to the ubiquitous use of graph processing in various domains. The inherent irregularity in graph applications leads to several challenges for parallelization.…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-02 Ananya Raval, Rupesh Nasre, Vivek Kumar, Vasudevan R, Sathish Vadhiyar, Keshav Pingali

Dynamic graph neural network (DGNN) is becoming increasingly popular because of its widespread use in capturing dynamic features in the real world. A variety of dynamic graph neural networks designed from algorithmic perspectives have…

Hardware Architecture · Computer Science 2023-04-17 Hanqiu Chen, Yahya Alhinai, Yihan Jiang, Eunjee Na, Cong Hao

Not only with the large host memory for supporting large scale graph processing, GPU-accelerated heterogeneous architecture can also provide a great potential for high-performance computing. However, few existing heterogeneous systems can…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-05 Xianliang Li

Load-balancing among the threads of a GPU for graph analytics workloads is difficult because of the irregular nature of graph applications and the high variability in vertex degrees, particularly in power-law graphs. We describe a novel…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-28 Vishwesh Jatala, Loc Hoang, Roshan Dathathri, Gurbinder Gill, V Krishna Nandivada, Keshav Pingali

Connected components and spanning forest are fundamental graph algorithms due to their use in many important applications, such as graph clustering and image segmentation. GPUs are an ideal platform for graph algorithms due to their high…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-28 Changwan Hong, Laxman Dhulipala, Julian Shun

Dynamic parallelism on GPUs allows GPU threads to dynamically launch other GPU threads. It is useful in applications with nested parallelism, particularly where the amount of nested parallelism is irregular and cannot be predicted…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-01-11 Mhd Ghaith Olabi, Juan Gómez Luna, Onur Mutlu, Wen-mei Hwu, Izzat El Hajj

GPUs have been widely used to accelerate computations exhibiting simple patterns of parallelism - such as flat or two-level parallelism - and a degree of parallelism that can be statically determined based on the size of the input dataset.…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-18 Hancheng Wu, Da Li, Michela Becchi

In order to improve system performance efficiently, a number of systems choose to equip multi-core and many-core processors (such as GPUs). Due to their discrete memory these heterogeneous architectures comprise a distributed system within…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-02-27 Hao Wu, Daniel Lohmann, Wolfgang Schröder-Preikschat

To keep pace with the rapid advancements in design complexity within modern computing systems, directed graph representation learning (DGRL) has become crucial, particularly for encoding circuit netlists, computational graphs, and…

Machine Learning · Computer Science 2024-10-10 Haoyu Wang, Yinan Huang, Nan Wu, Pan Li

Modern GPU systems are constantly evolving to meet the needs of computing-intensive applications in scientific and machine learning domains. However, there is typically a gap between the hardware capacity and the achievable application…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-02 Gabin Schieffer, Ruimin Shi, Stefano Markidis, Andreas Herten, Jennifer Faj, Ivy Peng

Given its high integration density, high speed, byte addressability, and low standby power, non-volatile or persistent memory is expected to supplement/replace DRAM as main memory. Through persistency programming models (which define…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-04-30 Zhen Lin, Mohammad Alshboul, Yan Solihin, Huiyang Zhou

Graph structure learning is a core problem in graph-based machine learning, essential for uncovering latent relationships and ensuring model interpretability. However, most existing approaches are ill-suited for large-scale and dynamically…

Machine Learning · Computer Science 2025-05-20 Mohit Kataria, Nikita Malik, Sandeep Kumar, Jayadeva

Aligning generative diffusion models with human preferences via reinforcement learning (RL) is critical yet challenging. Most existing algorithms are often vulnerable to reward hacking, such as quality degradation, over-stylization, or…

Graph generative models are essential across diverse scientific domains by capturing complex distributions over relational data. Among them, graph diffusion models achieve superior performance but face inefficient sampling and limited…

Machine Learning · Computer Science 2025-06-17 Yiming Qin, Manuel Madeira, Dorina Thanou, Pascal Frossard

As deep learning models are increasingly deployed on mobile devices, modern mobile devices incorporate deep learning-specific accelerators to handle the growing computational demands, thus increasing their hardware heterogeneity. However,…

Machine Learning · Computer Science 2025-08-26 Duseok Kang, Yunseong Lee, Junghoon Kim

Over the past few years, there has been an increased interest in including FPGAs in data centers and high-performance computing clusters along with GPUs and other accelerators. As a result, it has become increasingly important to have a…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-14 Mostafa Eghbali Zarch, Reece Neff, Michela Becchi

In this paper, we evaluate and compare the performance of two approaches, namely self-stabilization and rollback, to handling consistency violating faults (cvf) that occur when a self-stabilizing distributed graph-based program is executed…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-29 Duong Nguyen, Sandeep S. Kulkarni

Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code that supports symbolic, graph-based Deep…

Software Engineering · Computer Science 2022-07-20 Tatiana Castro Vélez, Raffi Khatchadourian, Mehdi Bagherzadeh, Anita Raja