Related papers: A Complexity Separation Between the Cache-Coherent…
Multi-core architectures feature an intricate hierarchy of cache memories, with multiple levels and sizes. To adequately decompose an application according to the traits of a particular memory hierarchy is a cumbersome task that may be…
Motivated by recent distributed systems technology, Aguilera et al. introduced a hybrid model of distributed computing, called message-and-memory model or m&m model for short [1]. In this model, processes can communicate by message passing…
Modern large-scale scientific applications consist of thousands to millions of individual tasks. These tasks involve not only computation but also communication with one another. Typically, the communication pattern between tasks is sparse…
Shared Memory is a mechanism that allows several processes to communicate with each other by accessing -- writing or reading -- a set of variables that they have in common. A Consistency Model defines how each process observes the state of…
Now days, manufacturers are focusing on increasing the concurrency in multiprocessor system-on-a-chip (MPSoC) architecture instead of increasing clock speed, for embedded systems. Traditionally lock-based synchronization is provided to…
We investigate the minimal number of failures that can partition a system where processes communicate both through shared memory and by message passing. We prove that this number precisely captures the resilience that can be achieved by…
Conventional cache models are not suited for real-time parallel processing because tasks may flush each other's data out of the cache in an unpredictable manner. In this way the system is not compositional so the overall performance is…
The latest trends in high-performance computing systems show an increasing demand on the use of a large scale multicore systems in a efficient way, so that high compute-intensive applications can be executed reasonably well. However, the…
A common paradigm for scientific computing is distributed message-passing systems, and a common approach to these systems is to implement them across clusters of high-performance workstations. As multi-core architectures become increasingly…
Performance modeling of parallel applications on multicore computers remains a challenge in computational co-design due to the complex design of multicore processors including private and shared memory hierarchies. We present a Scalable…
The C/C++ memory model provides an interface and execution model for programmers of concurrent (shared-variable) code. It provides a range of mechanisms that abstract from underlying hardware memory models -- that govern how multicore…
The problem of identifying intersections between two sets of d-dimensional axis-parallel rectangles appears frequently in the context of agent-based simulation studies. For this reason, the High Level Architecture (HLA) specification -- a…
To mitigate the ever worsening "Power wall" and "Memory wall" problems, multi-core architectures with multilevel cache hierarchies have been widely accepted in modern processors. However, the complexity of the architectures makes modeling…
The Adapteva Epiphany many-core architecture comprises a scalable 2D mesh Network-on-Chip (NoC) of low-power RISC cores with minimal uncore functionality. Whereas such a processor offers high computational energy efficiency and parallel…
Distributed shared memory (DSM) allows to implement and deploy applications onto distributed architectures using the convenient shared memory programming model in which a set of tasks are able to allocate and access data despite their…
Mutual exclusion (ME) is a commonly used technique to handle conflicts in concurrent systems. With recent advancements in non-volatile memory technology, there is an increased focus on the problem of recoverable mutual exclusion (RME), a…
Traditional techniques for synchronization are based on \emph{locking} that provides threads with exclusive access to shared data. \emph{Coarse-grained} locking typically forces threads to access large amounts of data sequentially and,…
We formulate a modular approach to the design and analysis of a particular class of mutual exclusion algorithms for shared memory multiprocessor systems. Specifically, we consider algorithms that organize waiting processes into a queue.…
Multi-core processors improve performance, but they can create unpredictability owing to shared resources such as caches interfering. Cache partitioning is used to alleviate the Worst-Case Execution Time (WCET) estimation by isolating the…
In an anonymous shared memory system, all inter-process communications are via shared objects; however, unlike in standard systems, there is no a priori agreement between processes on the names of shared objects [14,15]. Furthermore, the…