Related papers: 3D Cache Hierarchy Optimization

200 papers

Power consumption, off-chip memory bandwidth, chip area and Network on Chip (NoC) capacity are among main chip resources limiting the scalability of Chip Multiprocessors (CMP). A closed form analytical solution for optimizing the CMP cache…

Hardware Architecture · Computer Science 2017-05-23 Leonid Yavits, Amir Morad, Ran Ginosar

With technology scaling, the size of cache systems in chip-multiprocessors (CMPs) has been dramatically increased to efficiently store and manipulate a large amount of data in future applications and decrease the gap between cores and…

Hardware Architecture · Computer Science 2022-01-04 Pooneh Safayenikoo, Arghavan Asad, Mahmood Fathy

In multithreaded applications with high degree of data sharing, the miss rate of private cache is shown to exhibit a compulsory miss component. It manifests because at least some of the shared data originates from other cores and can only…

Hardware Architecture · Computer Science 2016-02-04 Leonid Yavits, Amir Morad, Ran Ginosar

Over the last three decades, innovations in the memory subsystem were primarily targeted at overcoming the data movement bottleneck. In this paper, we focus on a specific market trend in memory technology: 3D-stacked memory and caches. We…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-17 Jens Domke, Emil Vatai, Balazs Gerofi, Yuetsu Kodama, Mohamed Wahib, Artur Podobas, Sparsh Mittal, Miquel Pericàs, Lingqi Zhang, Peng Chen, Aleksandr Drozd, Satoshi Matsuoka

The increasing density of transistors in Integrated Circuits (ICs) has enabled the development of highly integrated Systems-on-Chip (SoCs) and, more recently, Multiprocessor Systems-on-Chip (MPSoCs). To address scalability challenges in…

Hardware Architecture · Computer Science 2025-04-29 Rodrigo Cataldo, Cesar Marcon, Debora Matos

With emerging storage-class memory (SCM) nearing commercialization, there is evidence that it will deliver the much-anticipated high density and access latencies within only a few factors of DRAM. Nevertheless, the latency-sensitive nature…

Caching at mobile devices and leveraging cooperative device-to-device (D2D) communications are two promising approaches to support massive content delivery over wireless networks while mitigating the effects of interference. To show the…

Information Theory · Computer Science 2020-05-13 Ramy Amer, Hesham ElSawy, Jacek Kibiłda, M. Majid Butt, Nicola Marchetti

This paper investigates the multi-GPU performance of a 3D buoyancy driven cavity solver using MPI and OpenACC directives on different platforms. The paper shows that decomposing the total problem in different dimensions affects the strong…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-10 Weicheng Xue, Christopher J. Roy

The task of 3D ICs layout design involves the assembly of millions of components taking into account many different requirements and constraints such as topological, wiring or manufacturability ones. It is a NP-hard problem that requires…

Hardware Architecture · Computer Science 2019-11-28 Katarzyna Grzesiak-Kopeć, Maciej Ogorzałek

For a system-level design of Networks-on-Chip for 3D heterogeneous System-on-Chip (SoC), the locations of components, routers and vertical links are determined from an application model and technology parameters. In conventional methods,…

Hardware Architecture · Computer Science 2019-10-04 Jan Moritz Joseph, Dominik Ermel, Lennart Bamberg, Alberto García-Ortiz, Thilo Pionteck

Major chip manufacturers have all introduced multicore microprocessors. Multi-socket systems built from these processors are used for running various server applications. Depending on the application, remote cache-to-cache transfers can…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-09-23 Suryanarayana Murthy Durbhakula

The in-memory cache system is an important component in a cloud for the data access performance. As the tenants may have different performance goals for data access depending on the nature of their tasks, effectively managing the memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-05 Taejoon Kim, Yu Gu, Jinoh Kim

Major chip manufacturers have all introduced multicore microprocessors. Multi-socket systems built from these processors are used for running various server applications. Depending on the application that is run on the system, remote memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-08-13 Murthy Durbhakula

Current embedded systems are specifically designed to run multimedia applications. These applications have a big impact on both performance and energy consumption. Both metrics can be optimized selecting the best cache configuration for a…

Neural and Evolutionary Computing · Computer Science 2023-02-23 Josefa Díaz Álvarez, José L. Risco-Martín, J. Manuel Colmenar

Most previous 3D IC research focused on stacking traditional 2D silicon layers, so the interconnect reduction is limited to inter-block delays. In this paper, we propose techniques that enable efficient exploration of the 3D design space…

Hardware Architecture · Computer Science 2025-08-20 Yongxiang Liu, Yuchun Ma, Eren Kurshan, Glenn Reinman, Jason Cong

Many computer systems for calculating the proper organization of memory are among the most critical issues. Using a tier cache memory (along with branching prediction) is an effective means of increasing modern multi-core processors'…

Networking and Internet Architecture · Computer Science 2021-05-21 Mohamed A. Hamada, Abdelrahman Abdallah

Real-time and cyber-physical systems need to interact with and respond to their physical environment in a predictable time. While multicore platforms provide incredible computational power and throughput, they also introduce new sources of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-29 Ayoosh Bansal, Jayati Singh, Yifan Hao, Jen-Yang Wen, Renato Mancuso, Marco Caccamo

Three-dimensional integrated circuits promise power, performance, and footprint gains compared to their 2D counterparts, thanks to drastic reductions in the interconnects' length through their smaller form factor. We can leverage the…

The increasing number of threads inside the cores of a multicore processor, and competitive access to the shared cache memory, become the main reasons for an increased number of competitive cache misses and performance decline. Inevitably,…

Hardware Architecture · Computer Science 2017-01-09 Milcho Prisagjanec, Pece Mitrevski

Cache partitioning techniques have been successfully adopted to mitigate interference among concurrently executing real-time tasks on multi-core processors. Considering that the execution time of a cache-sensitive task strongly depends on…

Hardware Architecture · Computer Science 2023-10-05 Binqi Sun, Debayan Roy, Tomasz Kloda, Andrea Bastoni, Rodolfo Pellizzoni, Marco Caccamo