Related papers: Die-Stacked DRAM: Memory, Cache, or MemCache?
Response time requirements for big data processing systems are shrinking. To meet this strict response time requirement, many big data systems store all or most of their data in main memory to reduce the access latency. Main memory…
Putting the DRAM on the same package with a processor enables several times higher memory bandwidth than conventional off-package DRAM. Yet, the latency of in-package DRAM is not appreciably lower than that of off-package DRAM. A promising…
As SRAM-based caches are hitting a scaling wall, manufacturers are integrating DRAM-based caches into system designs to continue increasing cache sizes. While DRAM caches can improve the performance of memory systems, existing DRAM cache…
Web application performance is heavily reliant on the hit rate of memory-based caches. Current DRAM-based web caches statically partition their memory across multiple applications sharing the cache. This causes under utilization of memory…
Poor DRAM technology scaling over the course of many years has caused DRAM-based main memory to increasingly become a larger system bottleneck. A major reason for the bottleneck is that data stored within DRAM must be moved across a…
With the imminent slowing down of DRAM scaling, Phase Change Memory (PCM) is emerging as a lead alternative for main memory technology. While PCM achieves low energy due to various technology-specific advantages, PCM is significantly slower…
In recent years, there is an increasing demand of big memory systems so to perform large scale data analytics. Since DRAM memories are expensive, some researchers are suggesting to use other memory systems such as non-volatile memory (NVM)…
As persistent memory (PM) technologies emerge, hybrid memory architectures combining DRAM with PM bring the potential to provide a tiered, byte-addressable main memory of unprecedented capacity. Nearly a decade after the first proposals for…
Many high end and next generation computing systems to incorporated alternative memory technologies to meet performance goals. Since these technologies present distinct advantages and tradeoffs compared to conventional DDR* SDRAM, such as…
Memory-centric computing aims to enable computation capability in and near all places where data is generated and stored. As such, it can greatly reduce the large negative performance and energy impact of data access and data movement, by…
Over the past two decades, the storage capacity and access bandwidth of main memory have improved tremendously, by 128x and 20x, respectively. These improvements are mainly due to the continuous technology scaling of DRAM (dynamic…
3D die-stacked DRAM has emerged as a key technology for delivering high bandwidth and high density for applications such as high-performance computing, graphics, and machine learning. However, different applications place diverse and…
Die-stacked DRAM caches are increasingly advocated to bridge the performance gap between on-chip Cache and main memory. It is essential to improve DRAM cache hit rate and lower cache hit latency simultaneously. Prior DRAM cache designs fall…
This paper investigates bandwidth-efficient DRAM caching for hybrid DRAM + 3D-XPoint memories. 3D-XPoint is becoming a viable alternative to DRAM as it enables high-capacity and non-volatile main memory systems; however, 3D-XPoint has 4-8x…
Today's systems are overwhelmingly designed to move data to computation. This design choice goes directly against at least three key trends in systems that cause performance, scalability and energy bottlenecks: (1) data access from memory…
The rapid development of multi-core system and increase of data-intensive application in recent years call for larger main memory. Traditional DRAM memory can increase its capacity by reducing the feature size of storage cell. Now further…
It has become increasingly difficult to understand the complex interaction between modern applications and main memory, composed of DRAM chips. Manufacturers are now selling and proposing many different types of DRAM, with each DRAM type…
We propose overcoming the memory capacity limitation of GPUs with high-capacity Storage-Class Memory (SCM) and DRAM cache. By significantly increasing the memory capacity with SCM, the GPU can capture a larger fraction of the memory…
Major chip manufacturers have all introduced multicore microprocessors. Multi-socket systems built from these processors are used for running various server applications. Depending on the application that is run on the system, remote memory…
DRAM-based main memory and its associated components increasingly account for a significant portion of application performance bottlenecks and power budget demands inside the computing ecosystem. To alleviate the problems of storage density…