Related papers: A Peer-to-Peer Middleware Framework for Resilient …
The social and economic importance of large bodies of programs and data that are potentially long-lived has attracted much attention in the commercial and research communities. Here we concentrate on a set of methodologies and technologies…
The work presented in this thesis seeks to improve programmer productivity in the following ways: - by reducing the amount of code that has to be written to construct an application; - by increasing the reliability of the code written; and…
Designing peer-to-peer (P2P) software systems is a challenging task because their highly decentralized nature may cause unexpected emergent global behaviors. The last fifteen years have seen many P2P applications come out and…
Non-Volatile Memory devices may soon be a part of main memory, and programming models that give programmers direct access to persistent memory through loads and stores are sought to maximize the performance benefits of these new devices.…
In this document, we develop a structured approach to the management of HPC resilience based on the concept of resilience-based design patterns. A design pattern is a general repeatable solution to a commonly occurring problem. We identify…
Like other engineering disciplines, software engineering should have principles to guide the construction of sustainable computer applications. Tangible properties include a) unlimited scalability, b) maximal reproducibility, and c)…
Applications in science and engineering often require huge computational resources for solving problems within a reasonable time frame. Parallel supercomputers provide the computational infrastructure for solving such problems. A…
Reliability is a serious concern for future extreme-scale high-performance computing (HPC) systems. While the HPC community has developed various resilience solutions, the solution space remains fragmented. There are no formal methods and…
In recent years, the research community has raised serious questions about the reproducibility of scientific work. In particular, since many studies include some kind of computing work, reproducibility is also a technological challenge, not…
A flexible and performant Persistency Service is a necessary component of any HEP Software Framework. Building a modular, non-intrusive and performant persistency component has been shown to be a very difficult task. In the past, it was…
The adoption of heterogeneous computing systems based on diverse architectures to achieve exascale computing power has worsened the performance portability problem of scientific applications that were designed to run on these platforms. To…
Parallel programmers face the often irreconcilable goals of programmability and performance. HPC systems use distributed memory for scalability, thereby sacrificing the programmability advantages of shared memory programming models.…
We introduce an object-oriented framework for parallel programming, which is based on the observation that programming objects can be naturally interpreted as processes. A parallel program consists of a collection of persistent processes…
We consider a parallel computational model that consists of $P$ processors, each with a fast local ephemeral memory of limited size, and sharing a large persistent memory. The model allows for each processor to fault with bounded…
Given its high integration density, high speed, byte addressability, and low standby power, non-volatile or persistent memory is expected to supplement/replace DRAM as main memory. Through persistency programming models (which define…
The embedding of fault tolerance provisions into the application layer of a programming language is a non-trivial task that has not found a satisfactory solution yet. Such a solution is very important, and the lack of a simple, coherent and…
Hardware heterogeneity is here to stay for high-performance computing. Large-scale systems are currently equipped with multiple GPU accelerators per compute node and are expected to incorporate more specialized hardware in the future. This…
It is undeniable that most developers today are building distributed applications. However, most of these applications are developed by composing existing systems together through unspecified APIs exposed to the application developer.…
Peer-to-peer systems are the most resilient form of distributed computing, but the design of robust protocols for their coordination is difficult. This makes it hard to specify and reason about global behaviour of such systems. This paper…
A peer-to-peer application architecture is proposed that has the potential to eliminate the back-end servers for hosting services on the Internet. The proposed application architecture has been modeled as a distributed system for delivering…