Related papers: Cache Optimization for Memory Intensive Workloads …
Major chip manufacturers have all introduced multicore microprocessors. Multi-socket systems built from these processors are used for running various server applications. Depending on the application, remote cache-to-cache transfers can…
Major chip manufacturers have all introduced multicore microprocessors. Multi-socket systems built from these processors are used for running various server applications. However to the best of our knowledge current commercial operating…
Multi-socket multi-core servers are used for solving some of the important problems in computing. Remote DRAM accesses can impact performance of certain applications running on such servers. This paper presents a new near linear operating…
Major chip manufacturers have all introduced multicore microprocessors. Multi-socket systems built from these processors are routinely used for running various server applications. Depending on the application that is run on the system,…
The increasing number of threads inside the cores of a multicore processor, and competitive access to the shared cache memory, become the main reasons for an increased number of competitive cache misses and performance decline. Inevitably,…
Advancements in multi-core have created interest among many research groups in finding out ways to harness the true power of processor cores. Recent research suggests that on-board component such as cache memory plays a crucial role in…
WCET (Worst-Case Execution Time) estimation on multicore architecture is particularly challenging mainly due to the complex accesses over cache shared by multiple cores. Existing analysis identifies possible contentions between parallel…
In this paper, we proposed an effective and efficient multi-core shared-cache design optimization approach based on reuse-distance analysis of the data traces of target applications. Since data traces are independent of system hardware…
Many computer systems for calculating the proper organization of memory are among the most critical issues. Using a tier cache memory (along with branching prediction) is an effective means of increasing modern multi-core processors'…
Multi-threaded applications are capable of exploiting the full potential of many-core systems. However, Network-on-Chip (NoC) based inter-core communication in many-core systems is responsible for 60-75% of the miss latency experienced by…
This dissertation develops hardware that automatically reduces the effective latency of accessing memory in both single-core and multi-core systems. To accomplish this, the dissertation shows that all last level cache misses can be…
With the advent of era of Big Data and Internet of Things, there has been an exponential increase in the availability of large data sets. These data sets require in-depth analysis that provides intelligence for improvements in methods for…
Multi-core processors improve performance, but they can create unpredictability owing to shared resources such as caches interfering. Cache partitioning is used to alleviate the Worst-Case Execution Time (WCET) estimation by isolating the…
Modern multicore processors are employing large last-level caches, for example Intel's E7-8800 processor uses 24MB L3 cache. Further, with each CMOS technology generation, leakage energy has been dramatically increasing and hence, leakage…
In this paper we investigate the feasibility of denial-of-service (DoS) attacks on shared caches in multicore platforms. With carefully engineered attacker tasks, we are able to cause more than 300X execution time increases on a victim task…
Memory disaggregation addresses memory imbalance in a cluster by decoupling CPU and memory allocations of applications while also increasing the effective memory capacity for (memory-intensive) applications beyond the local memory limit…
Multi-core architectures feature an intricate hierarchy of cache memories, with multiple levels and sizes. To adequately decompose an application according to the traits of a particular memory hierarchy is a cumbersome task that may be…
Cache partitioning techniques have been successfully adopted to mitigate interference among concurrently executing real-time tasks on multi-core processors. Considering that the execution time of a cache-sensitive task strongly depends on…
Multicore processors have proved to be the right choice for both desktop and server systems because it can support high performance with an acceptable budget expenditure. In this work, we have compared several works in cache contention and…
In this paper, we propose a practical and effective approach allowing designers to optimize multi-level cache size at the early system design phase. Our key contribution is to generalize the reuse distance analysis method and develop an…