Related papers: Periodic I/O scheduling for super-computers

200 papers

The ever-growing processing power of supercomputers in recent decades enables us to explore increasingly complex scientific problems. Effectively scheduling these jobs is crucial for individual job performance and system efficiency. The…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-09-21 Yuping Fan

Efficient data access in High-Performance Computing (HPC) systems is essential to the performance of intensive computing tasks. Traditional optimizations of the I/O stack aim to improve peak performance but are often workload-specific and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-21 Thomas Collignon, Kouds Halitim, Raphaël Bleuse, Sophie Cerf, Bogdan Robu, Éric Rutten, Lionel Seinturier, Alexandre van Kempen

We consider a natural scheduling problem which arises in many distributed computing frameworks. Jobs with diverse resource requirements (e.g. memory requirements) arrive over time and must be served by a cluster of servers, each with a…

Networking and Internet Architecture · Computer Science 2019-01-21 Konstantinos Psychas, Javad Ghaderi

Burst buffering is a promising storage solution that introduces an intermediate high-throughput storage buffer layer to mitigate the I/O bottleneck from which current High-Performance Computing (HPC) platforms suffer. The existing…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-17 Benbo Zha, Hong Shen

We consider the problem of designing a packet-level congestion control and scheduling policy for datacenter networks. Current datacenter networks primarily inherit the principles that went into the design of the Internet, where congestion…

Networking and Internet Architecture · Computer Science 2017-10-10 Devavrat Shah, Qiaomin Xie

High performance computing (HPC) is undergoing significant changes. The emerging HPC applications comprise both compute- and data-intensive applications. To meet the intense I/O demand from emerging data-intensive applications, burst…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-12-11 Yuping Fan, Zhiling Lan, Paul Rich, William E. Allcock, Michael E. Papka, Brian Austin, David Paul

We consider the problem of scheduling arrivals to a congestion system with a finite number of users having identical deterministic demand sizes. The congestion is of the processor sharing type in the sense that all users in the system at…

Optimization and Control · Mathematics 2017-04-12 Liron Ravner, Yoni Nazarathy

A queue is required when a service provider cannot handle jobs as they arrive over time. In a highly flexible and dynamic environment, some jobs might demand faster execution at run time, especially when resources are limited…

Performance · Computer Science 2015-03-24 Yash Gupta, Kamalakar Karlapalem

High-speed computing meets ever-increasing real-time computational demands by leveraging flexibility and parallelism. Flexibility is achieved when a computing platform is designed with heterogeneous resources to support…

Operating Systems · Computer Science 2015-01-08 Mahendra Vucha, Arvind Rajawat

The primary motivations for the uptake of virtualization have been resource isolation, capacity management, and resource customization, allowing resource providers to consolidate their resources in virtual machines. Various approaches have been…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-09-27 Omer Khalid, Ivo Maljevic, Richard Anthony, Miltos Petridis, Kevin Parrot, Markus Schulz

This paper investigates co-scheduling algorithms for processing a set of parallel applications. Instead of executing each application one by one, using a maximum degree of parallelism for each of them, we aim at scheduling several…

Data Structures and Algorithms · Computer Science 2013-05-01 Guillaume Aupy, Manu Shantharam, Anne Benoit, Yves Robert, Padma Raghavan

Motivated by modern parallel computing applications, we consider the problem of scheduling parallel-task jobs with heterogeneous resource requirements in a cluster of machines. Each job consists of a set of tasks that can be processed in…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-03 Mehrnoosh Shafiee, Javad Ghaderi

This paper addresses the problem of scheduling tasks with different criticality levels in the presence of I/O requests. In mixed-criticality scheduling, higher criticality tasks are given precedence over those of lower criticality when it…

Operating Systems · Computer Science 2016-03-15 Eric Missimer, Katherine Zhao, Richard West

Although High Performance Computing (HPC) users understand basic resource requirements such as the number of CPUs and memory limits, internal infrastructural utilization data is exclusively leveraged by cluster operators, who use it to…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-19 Abel Souza, Kristiaan Pelckmans, Johan Tordsson

Computational grids are enormous environments with heterogeneous resources and stable infrastructures among other Internet-based computing systems. However, managing resources in such systems poses its own special problems. Scheduler systems…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-05-07 Asgarali Bouyer, Mohammad Javad hoseyni, Abdul Hanan Abdullah

As multimodal and AI-driven services exchange hundreds of megabytes per request, existing IPC runtimes spend a growing share of CPU cycles on memory copies. Although both hardware and software mechanisms are exploring memory offloading,…

Operating Systems · Computer Science 2026-01-13 Misun Park, Richi Dubey, Yifan Yuan, Nam Sung Kim, Ada Gavrilovska

Coflow provides a key application-layer abstraction for capturing communication patterns, enabling the efficient coordination of parallel data flows to reduce job completion times in distributed systems. Modern data center networks (DCNs)…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-10 Xin Wang, Hong Shen, Hui Tian, Dong Wang

Many HPC applications perform their I/O in bursts that follow a periodic pattern. This allows for making predictions as to when a burst occurs. System providers can take advantage of such knowledge to reduce file-system contention by…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-17 Ahmad Tarraf, Alexis Bandet, Francieli Boito, Guillaume Pallez, Felix Wolf

Traditionally, on-demand, rigid, and malleable applications have been scheduled and executed on separate systems. Ever-growing workload demands and rapidly developing HPC infrastructure have triggered interest in converging these…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-09-14 Yuping Fan, Paul Rich, William Allcock, Michael Papka, Zhiling Lan

This study presents a machine learning-assisted approach to optimize task scheduling in cluster systems, focusing on node-affinity constraints. Traditional schedulers like Kubernetes struggle with real-time adaptability, whereas the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-30 Leszek Sliwko, Jolanta Mizera-Pietraszko