Related papers: Efficient Distributed Data Structures for Future M…
Multi-core architectures feature an intricate hierarchy of cache memories, with multiple levels and sizes. To adequately decompose an application according to the traits of a particular memory hierarchy is a cumbersome task that may be…
Various performance characteristics of distributed file systems have been well studied. However, the performance efficiency of distributed file systems on small-file problems in scenarios with complex machine learning algorithms is not well…
Parallel and distributed processing is becoming the de facto industry standard, and a large part of current research targets how to make computing scalable and distributed, dynamically, without allocating the resources on…
In this paper, we proposed an effective and efficient multi-core shared-cache design optimization approach based on reuse-distance analysis of the data traces of target applications. Since data traces are independent of system hardware…
Traditional data centers are designed with a rigid architecture of fit-for-purpose servers that provision resources beyond the average workload in order to deal with occasional peaks of data. Heterogeneous data centers are pushing towards…
The growing scale of data requires efficient memory subsystems with large memory capacity and high memory performance. Disaggregated architecture has become a promising solution for today's cloud and edge computing for its scalability and…
In this tutorial paper, we first review some basic simulation concepts and then introduce parallel and distributed simulation techniques in view of some new challenges of today and tomorrow. More specifically, in recent years…
With the advent of the era of Big Data and the Internet of Things, there has been an exponential increase in the availability of large data sets. These data sets require in-depth analysis that provides intelligence for improvements in methods for…
We demonstrate that general-purpose memory allocation involving many threads on many cores can be done with high performance, multicore scalability, and low memory consumption. For this purpose, we have designed and implemented scalloc, a…
Next-generation wireless technologies (for immersive-massive communication, joint communication and sensing) demand highly parallel architectures for massive data processing. A common architectural template scales up by grouping tens to…
The ability to express a program as a hierarchical composition of parts is an essential tool in managing the complexity of software and a key abstraction this provides is to separate the representation of data from the computation. Many…
Memory disaggregation is being considered as a strong alternative to the traditional architecture for dealing with memory under-utilization in data centers. Disaggregated memory can adapt to dynamically changing memory requirements for the data…
Disaggregated memory is an upcoming data center technology that will allow nodes (servers) to share data efficiently. Sharing data creates a debate on the level of cache coherence the system should provide. While current proposals aim to…
To mitigate the ever-worsening "Power wall" and "Memory wall" problems, multi-core architectures with multilevel cache hierarchies have been widely accepted in modern processors. However, the complexity of the architectures makes modeling…
Performance modeling can help to improve the resource efficiency of clusters and distributed dataflow applications, yet the available modeling data is often limited. Collaborative approaches to performance modeling, characterized by the…
The in-memory cache system is an important component of a cloud for data access performance. As the tenants may have different performance goals for data access depending on the nature of their tasks, effectively managing the memory…
In modern large-scale distributed systems, analytics jobs submitted by various users often share similar work, for example scanning and processing the same subset of data. Instead of optimizing jobs independently, which may result in…
This paper addresses the problem of management and coordination of energy resources in a typical microgrid, including smart buildings as flexible loads, energy storages, and renewables. The overall goal is to provide a comprehensive and…
The arrival of multicore systems has created a new scenario in computing: parallel and distributed algorithms are fast replacing the older sequential algorithms, although these techniques bring many challenges. The distributed algorithms provide…
The next generation of many-core enabled large-scale computing systems relies on thousands of billions of heterogeneous processing cores connected to form a single computing unit. In such large-scale computing environments, resource…