Related papers: An Application-Level Dependable Technique for Farm…
Modern high performance computing (HPC) systems exhibit rapid growth in size, both "horizontally" in the number of nodes and "vertically" in the number of cores per node. As such, they offer additional levels of hardware…
This paper addresses a production scheduling problem derived from an industrial use case, focusing on unrelated parallel machine scheduling with the personnel availability constraint. The proposed model optimizes the production plan over a…
Application-level caching is a form of caching that has been increasingly adopted to satisfy performance and throughput requirements. The key idea is to store the results of a computation, to improve performance by reusing instead of…
We provide a multilevel approach for analysing the performance of parallel algorithms. The main outcome of this approach is that the algorithm is described using a set of operators which are related to each other according to the problem…
Motivated by the need for adaptive, secure and responsive scheduling in a great range of computing applications, including human-centered and time-critical applications, this paper proposes a scheduling framework that seamlessly adds…
Shared resource interference is observed by applications as dynamic performance asymmetry. Prior art has developed approaches to reduce the impact of performance asymmetry mainly at the operating system and architectural levels. In this…
We introduce the nivel2 software for multi-level modelling. Multi-level modelling is a modelling paradigm where a model element may be simultaneously a type for and an instance of other elements under some constraints. This contrasts…
In practice, standard scheduling of parallel computing jobs almost always leaves significant portions of the available hardware unused, even with many jobs still waiting in the queue. The simple reason is that the resource requests of these…
Application autotuning is a promising path investigated in the literature to improve computation efficiency. In this context, the end-users define high-level requirements and an autonomic manager is able to identify and seize optimization…
We present a type theory combining both linearity and dependency by stratifying typing rules into a level for logics and a level for programs. The distinction between logics and programs decouples their semantics, allowing the type system…
Decoupled learning is a branch of model parallelism which parallelizes the training of a network by splitting it depth-wise into multiple modules. Techniques from decoupled learning usually lead to a stale gradient effect because of their…
Profiling techniques are used extensively at different parts of the computing stack to achieve many goals. One major goal is to make a piece of software execute more efficiently on a specific hardware platform, where efficiency spans…
We consider a three-level parallelisation scheme. The second and third levels define a classical two-level parallelisation scheme, and a load balancing algorithm is used to distribute tasks among processes. It is well known that for many…
Porting applications to new hardware or programming models is a tedious and error-prone process. Any help that eases these burdens saves developer time that can then be invested into the advancement of the application itself instead…
Today's software contains billions of lines of sequential code that do not benefit from the parallelism available in modern multicore architectures. Automatically parallelizing sequential code, to promote an efficient use of the…
This paper presents a multiagent approach as a paradigm for scheduling parallel jobs in a parallel system. Scheduling parallel jobs is performed as a means to balance the load of a system in order to improve the performance of a parallel…
A parallel computer system is a collection of processing elements that communicate and cooperate to solve large computational problems efficiently. To achieve this, the large computational problem is first partitioned into several tasks…
Parallel and distributed application design is a major area of interest in the domain of high performance scientific and industrial computing. Over the years, various approaches have been proposed to aid parallel program developers to…
The main goal of parallel processing is to provide users with performance that is much better than that of single processor systems. Jobs, which require certain resources, are scheduled for execution so as to meet certain criteria.…
A new form of caching, namely application-level caching, has recently been employed in web applications to improve their performance and increase scalability. It consists of the insertion of caching logic into the application base code to…
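As a rough illustration of the application-level caching idea described in the two caching abstracts above (storing the result of a computation and reusing it instead of recomputing), the following minimal Python sketch may help. It is not taken from any of the listed papers: the `cached` decorator, the `expensive_report` function, and the `calls` counter are all hypothetical names chosen for this example, and a real web application would also need eviction and invalidation, which are omitted here.

```python
import functools


def cached(func):
    """Hypothetical decorator: keep results in a dict keyed by the call arguments."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args):
        # Compute only on a cache miss; otherwise reuse the stored result.
        if args not in store:
            store[args] = func(*args)
        return store[args]

    wrapper.store = store  # exposed so callers can inspect or clear the cache
    return wrapper


# Counter used only to show how often the underlying computation really runs.
calls = {"count": 0}


@cached
def expensive_report(user_id):
    """Stand-in for a costly computation such as a database query (assumption)."""
    calls["count"] += 1
    return {"user": user_id, "total": user_id * 10}


first = expensive_report(7)   # cache miss: the function body runs
second = expensive_report(7)  # cache hit: the stored result is returned
```

In this sketch the caching logic sits inside the application code itself, next to the function it wraps, rather than in a separate proxy or database layer. For plain memoization of functions with hashable arguments, the standard library's `functools.lru_cache` provides similar behavior with a bounded size.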