English
Related papers

Related papers: GTaP: A GPU-Resident Fork-Join Task-Parallel Runti…

200 papers

GPUs have been widely used to accelerate computations exhibiting simple patterns of parallelism - such as flat or two-level parallelism - and a degree of parallelism that can be statically determined based on the size of the input dataset.…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-18 Hancheng Wu , Da Li , Michela Becchi

Autonomous agents powered by large language models (LLMs) have shown impressive capabilities in tool manipulation for complex task-solving. However, existing paradigms such as ReAct rely on sequential reasoning and execution, failing to…

Artificial Intelligence · Computer Science 2025-10-30 Jiaqi Wu , Qinlao Zhao , Zefeng Chen , Kai Qin , Yifei Zhao , Xueqian Wang , Yuhang Yao

With the rapid advancement of Artificial Intelligence, the Graphics Processing Unit (GPU) has become increasingly essential across a growing number of safety-critical application domains. Applying a GPU is indispensable for parallel…

Operating Systems · Computer Science 2026-02-25 Yuanhai Zhang , Songyang He , Ruizhe Gou , Mingyue Cui , Boyang Li , Shuai Zhao , Kai Huang

Parallel data processing has become indispensable for processing applications involving huge data sets. This brings into focus the Graphics Processing Units (GPUs) which emphasize on many-core computing. With the advent of General Purpose…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-05-22 Poorna Banerjee , Amit Dave

Parallel programming models can encourage performance portability by moving the responsibility for work assignment and data distribution from the programmer to a runtime system. However, analyzing the resulting implicit memory allocations,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-14 Fabian Knorr , Philip Salzmann , Peter Thoman , Thomas Fahringer

In this work, we survey the role of GPUs in real-time systems. Originally designed for parallel graphics workloads, GPUs are now widely used in time-critical applications such as machine learning, autonomous vehicles, and robotics due to…

Many emerging cyber-physical systems, such as autonomous vehicles and robots, rely heavily on artificial intelligence and machine learning algorithms to perform important system operations. Since these highly parallel applications are…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-07 An Zou , Jing Li , Christopher D. Gill , Xuan Zhang

GPU-based HPC clusters are attracting more scientific application developers due to their extensive parallelism and energy efficiency. In order to achieve portability among a variety of multi/many core architectures, a popular choice for an…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-10 Ali TehraniJamsaz , Alok Mishra , Akash Dutta , Abid M. Malik , Barbara Chapman , Ali Jannesari

Priority queue, often implemented as a heap, is an abstract data type that has been used in many well-known applications like Dijkstra's shortest path algorithm, Prim's minimum spanning tree, Huffman encoding, and the branch-and-bound…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-18 Yanhao Chen , Fei Hua , Chaozhang Huang , Jeremy Bierema , Chi Zhang , Eddy Z. Zhang

Graphics Processing Unit, or GPUs, have been successfully adopted both for graphic computation in 3D applications, and for general purpose application (GP-GPUs), thank to their tremendous performance-per-watt. Recently, there is a big…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-03 Paolo Burgio

Planning long-horizon robot manipulation requires making discrete decisions about which objects to interact with and continuous decisions about how to interact with them. A robot planner must select grasps, placements, and motions that are…

Modeling data sharing in GPU programs is a challenging task because of the massive parallelism and complex data sharing patterns provided by GPU architectures. Better GPU caching efficiency can be achieved through careful task scheduling…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-04 Lingda Li , Ari B. Hayes , Stephen A. Hackler , Eddy Z. Zhang , Mario Szegedy , Shuaiwen Leon Song

Large industrial systems that combine services and applications, have become targets for cyber criminals and are challenging from the security, monitoring and auditing perspectives. Security log analysis is a key step for uncovering…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-04-10 Xavier Bellekens , Christos Tachtatzis , Robert Atkinson , Craig Renfrew , Tony Kirkham

Graphics Processing Units (GPUs) are high performance co-processors originally intended to improve the use and quality of computer graphics applications. Once, researchers and practitioners noticed the potential of using GPU for general…

Numerical Analysis · Computer Science 2016-07-12 K. Parand , Saeed Zafarvahedian , Sayyed A. Hossayni

The rapid development in computing technology has paved the way for directive-based programming models towards a principal role in maintaining software portability of performance-critical applications. Efforts on such models involve a least…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-04-28 Kazuaki Matsumura , Simon Garcia De Gonzalo , Antonio J. Peña

GPUs are readily available in cloud computing and personal devices, but their use for data processing acceleration has been slowed down by their limited integration with common programming languages such as Python or Java. Moreover, using…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-20 Alberto Parravicini , Arnaud Delamare , Marco Arnaboldi , Marco D. Santambrogio

General Purpose Graphics Processing Unit (GPGPU) computing plays a transformative role in deep learning and machine learning by leveraging the computational advantages of parallel processing. Through the power of Compute Unified Device…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-20 Ming Li , Ziqian Bi , Tianyang Wang , Yizhu Wen , Qian Niu , Xinyuan Song , Zekun Jiang , Junyu Liu , Benji Peng , Sen Zhang , Xuanhe Pan , Jiawei Xu , Jinlang Wang , Keyu Chen , Caitlyn Heqi Yin , Pohsun Feng , Ming Liu

In this paper, we propose TAPA, an end-to-end framework that compiles a C++ task-parallel dataflow program into a high-frequency FPGA accelerator. Compared to existing solutions, TAPA has two major advantages. First, TAPA provides a set of…

Hardware Architecture · Computer Science 2024-10-18 Licheng Guo , Yuze Chi , Jason Lau , Linghao Song , Xingyu Tian , Moazin Khatti , Weikang Qiao , Jie Wang , Ecenur Ustun , Zhenman Fang , Zhiru Zhang , Jason Cong

Stochastic simulations need multiple replications in order to build confidence intervals for their results. Even if we do not need a large amount of replications, it is a good practice to speed-up the whole simulation time using the…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-01-08 Jonathan Passerat-Palmbach , Jonathan Caux , Pridi Siregar , Claude Mazel , David Hill

As GPU availability has increased and programming support has matured, a wider variety of applications are being ported to these platforms. Many parallel applications contain fine-grained synchronization idioms; as such, their correct…

Programming Languages · Computer Science 2021-09-14 Tyler Sorensen , Lucas F. Salvador , Harmit Raval , Hugues Evrard , John Wickerson , Margaret Martonosi , Alastair F. Donaldson
‹ Prev 1 2 3 10 Next ›