Related papers: CkIO: Parallel File Input for Over-Decomposed Task…

200 papers

Overdecomposition has emerged as a powerful and sometimes essential technique in parallel programming. Many application domains and frameworks, including those based on adaptive mesh refinement or tree codes, use it. Charm++ is a parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-14 Aditya Bhosale, Anant Jain, Shourya Goel, Ritvik Rao, Peddoju Sateesh Kumar, Laxmikant Kale

Task-based programming models are excellent tools to parallelize and seamlessly load balance an application workload. However, the integration of I/O intensive applications and task-based programming models is lacking. Typically, I/O…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-30 Aleix Roca Nonell, Vicenç Beltran Querol, Sergi Mateo Bellido

The ongoing convergence of HPC and cloud computing presents a fundamental challenge: HPC applications, designed for static and homogeneous supercomputers, are ill-suited for the dynamic, heterogeneous, and volatile nature of the cloud.…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-17 Aditya Bhosale, Advait Tahilyani, Laxmikant Kale, Sara Kokkila-Schumacher

Storage systems have not kept the same technology improvement rate as computing systems. As applications produce more and more data, I/O becomes the limiting factor for increasing application performance. I/O congestion caused by concurrent…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-03 Hatem Elshazly, Jorge Ejarque, Francesc Lordan, Rosa M. Badia

The I/O access patterns of many parallel applications consist of accesses to a large number of small, noncontiguous pieces of data. If an application's I/O needs are met by making many small, distinct I/O requests, however, the I/O…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Rajeev Thakur, William Gropp, Ewing Lusk

Parallel I/O refers to the ability of scientific programs to concurrently read/write from/to a single file from multiple processes executing on distributed memory platforms like compute clusters. In the HPC world, I/O becomes a significant…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-15 Muhammad Sohaib Ayub, Muhammad Adnan, Muhammad Yasir Shafi

Driven by artificial intelligence, data science, and high-resolution simulations, I/O workloads and hardware on high-performance computing (HPC) systems have become increasingly complex. This complexity can lead to large I/O overheads and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-03 Hammad Ather, Jean Luca Bez, Chen Wang, Hank Childs, Allen D. Malony, Suren Byna

Production-quality parallel applications are often a mixture of diverse operations, such as computation- and communication-intensive, regular and irregular, tightly coupled and loosely linked operations. In conventional construction of…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-08-07 Ivy Bo Peng, Roberto Gioiosa, Gokcen Kestor, Erwin Laure, Stefano Markidis

Unstructured meshes are characterized by data points irregularly distributed in the Euclidean space. Due to the irregular nature of these data, computing connectivity information between the mesh elements requires much more time and memory…

Data Structures and Algorithms · Computer Science 2025-04-03 Guoxi Liu, Federico Iuricich

We evaluate and compare four contemporary and emerging runtimes for high-performance computing (HPC) applications: Cilk, Charm++, ParalleX and AM++. We compare along three bases: programming model, execution model and the implementation on…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-04-02 Abhishek Kulkarni, Andrew Lumsdaine

Asynchronous Many-Task (AMT) runtime systems take advantage of multi-core architectures with light-weight threads, asynchronous executions, and smart scheduling. In this paper, we present the comparison of the AMT systems Charm++ and HPX…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-04 Nanmiao Wu, Ioannis Gonidelis, Simeng Liu, Zane Fink, Nikunj Gupta, Karame Mohammadiporshokooh, Patrick Diehl, Hartmut Kaiser, Laxmikant V. Kale

In the past couple of decades, the computational abilities of supercomputers have increased tremendously. Leadership scale supercomputers now are capable of petaflops. Likewise, the problem size targeted by applications running on such…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-09-06 Robert Louis Cloud

Applications in science and engineering often require huge computational resources for solving problems within a reasonable time frame. Parallel supercomputers provide the computational infrastructure for solving such problems. A…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Rajesh Sudarsan, Calvin J. Ribbens

The dynamic load-balancing framework in Charm++/AMPI, developed at the University of Illinois, is based on using processor virtualization to allow thread migration across processors. This framework has been successfully applied to many…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-10-17 Alvaro Luiz Fazenda, Celso L. Mendes, Laxmikant V. Kale, Jairo Panetta, Eduardo Rocha Rodrigues

For an increasing number of data intensive scientific applications, parallel I/O concepts are a major performance issue. Tackling this issue, we develop an input/output system designed for highly efficient, scalable and conveniently usable…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-08-06 Erich Schikuta, Helmut Wanek, Heinz Stockinger, Kurt Stockinger, Thomas Fürle, Oliver Jorns, Christoph Löffelhardt, Peter Brezany, Minh Dang, Thomas Mück

In the semiconductor industry, integrated circuit (IC) processes play a vital role, as the rising complexity and market expectations necessitate improvements in yield. Identifying IC defects and assigning IC testing tasks to the right…

Artificial Intelligence · Computer Science 2025-06-04 Lo Pang-Yun Ting, Yu-Hao Chiang, Yi-Tung Tsai, Hsu-Chao Lai, Kun-Ta Chuang

Parallel applications can spend a significant amount of time performing I/O on large-scale supercomputers. Fast near-compute storage accelerators called burst buffers can reduce the time a processor spends performing I/O and mitigate I/O…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-15 Yiheng Xu, Pranav Sivaraman, Hariharan Devarajan, Kathryn Mohror, Abhinav Bhatele

Despite the various research initiatives and proposed programming models, efficient solutions for parallel programming in HPC clusters still rely on a complex combination of different programming models (e.g., OpenMP and MPI), languages…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-16 Hervé Yviquel, Marcio Pereira, Emílio Francesquini, Guilherme Valarini, Gustavo Leite, Pedro Rosso, Rodrigo Ceccato, Carla Cusihualpa, Vitoria Dias, Sandro Rigo, Alan Souza, Guido Araujo

HPC systems keep growing in size to meet the ever-increasing demand for performance and computational resources. Apart from increased performance, large scale systems face two challenges that hinder further growth: energy efficiency and…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-06 Ioannis Vardas, Manolis Ploumidis, Manolis Marazakis

Multi-core architectures feature an intricate hierarchy of cache memories, with multiple levels and sizes. To adequately decompose an application according to the traits of a particular memory hierarchy is a cumbersome task that may be…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-11-20 Hervé Paulino, Nuno Delgado