English
Related papers

Related papers: Phase-Priority based Directory Coherence for Multi…

200 papers

Phase-change memory (PCM) devices have multiple banks to serve memory requests in parallel. Unfortunately, if two requests go to the same bank, they have to be served one after another, leading to lower system performance. We observe that a…

Hardware Architecture · Computer Science 2019-08-22 Shihao Song , Anup Das , Onur Mutlu , Nagarajan Kandasamy

The use of multi-chip modules (MCM) and/or multi-socket boards is the most suitable approach to increase the computation density of servers while keep chip yield attained. This paper introduces a new coherence protocol suitable, in terms of…

Hardware Architecture · Computer Science 2024-05-06 Lucia G. Menezo , Valentin Puente , Jose A. Gregorio

The \emph{Partial Cache-Coherence (PCC)} model maintains hardware cache coherence only within subsets of cores, enabling large-scale memory sharing with emerging memory interconnect technologies like Compute Express Link (CXL). However,…

Operating Systems · Computer Science 2025-11-11 Fangnuo Wu , Mingkai Dong , Wenjun Cai , Jingsheng Yan , Haibo Chen

Reducing the average memory access time is crucial for improving the performance of applications running on multi-core architectures. With workload consolidation this becomes increasingly challenging due to shared resource contention.…

Hardware Architecture · Computer Science 2021-02-24 Nadja Ramhöj Holtryd , Madhavan Manivannan , Per Stenström , Miquel Pericàs

In this paper, we propose a concurrency control protocol, called the Prudent-Precedence Concurrency Control (PPCC) protocol, for high data contention database environments. PPCC is prudently more aggressive in permitting more serializable…

Databases · Computer Science 2016-11-18 Weidong Xiong , Feng Yu , Mohammed Hamdi , Wen-Chi Hou

Real-time and cyber-physical systems need to interact with and respond to their physical environment in a predictable time. While multicore platforms provide incredible computational power and throughput, they also introduce new sources of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-29 Ayoosh Bansal , Jayati Singh , Yifan Hao , Jen-Yang Wen , Renato Mancuso , Marco Caccamo

In modern Commercial Off-The-Shelf (COTS) multicore systems, each core can generate many parallel memory requests at a time. The processing of these parallel requests in the DRAM controller greatly affects the memory interference delay…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-07-29 Heechul Yun

Most commercial embedded devices have been deployed with a single processor architecture. The code size and complexity of applications running on embedded devices are rapidly increasing due to the emergence of application business models…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-26 Geunsik Lim , Changwoo Min , YoungIk Eom

Cache partitioning techniques have been successfully adopted to mitigate interference among concurrently executing real-time tasks on multi-core processors. Considering that the execution time of a cache-sensitive task strongly depends on…

Hardware Architecture · Computer Science 2023-10-05 Binqi Sun , Debayan Roy , Tomasz Kloda , Andrea Bastoni , Rodolfo Pellizzoni , Marco Caccamo

Processing-in-memory (PIM) architectures have seen an increase in popularity recently, as the high internal bandwidth available within 3D-stacked memory provides greater incentive to move some computation into the logic layer of the memory.…

General trends in computer architecture are shifting more towards parallelism. Multicore architectures have proven to be a major step in processor evolution. With the advancement in multicore architecture, researchers are focusing on…

Hardware Architecture · Computer Science 2019-10-22 Arsalan Shahid , Muhammad Tayyab , Muhammad Yasir Qadri , Nadia N. Qadri , Jameel Ahmed

Comprehending the performance bottlenecks at the core of the intricate hardware-software interactions exhibited by highly parallel programs on HPC clusters is crucial. This paper sheds light on the issue of automatically asynchronous MPI…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-06 Ayesha Afzal , Georg Hager , Stefano Markidis , Gerhard Wellein

Unlike traditional PCIe-based FPGA accelerators, heterogeneous SoC-FPGA devices provide tighter integrations between software running on CPUs and hardware accelerators. Modern heterogeneous SoC-FPGA platforms support multiple I/O cache…

Hardware Architecture · Computer Science 2019-08-06 Seung Won Min , Sitao Huang , Mohamed El-Hadedy , Jinjun Xiong , Deming Chen , Wen-mei Hwu

Advancements in multi-core have created interest among many research groups in finding out ways to harness the true power of processor cores. Recent research suggests that on-board component such as cache memory plays a crucial role in…

Hardware Architecture · Computer Science 2011-11-15 N. Ramasubramanian , Srinivas V. V. , N. Ammasai Gounden

Modern commercial-off-the-shelf (COTS) multicore processors have advanced memory hierarchies that enhance memory-level parallelism (MLP), which is crucial for high performance. To support high MLP, shared last-level caches (LLCs) are…

Hardware Architecture · Computer Science 2025-07-23 Connor Sullivan , Alex Manley , Mohammad Alian , Heechul Yun

Modern computer designs support composite prefetching, where multiple individual prefetcher components are used to target different memory access patterns. However, multiple prefetchers competing for resources can drastically hurt…

Hardware Architecture · Computer Science 2023-07-18 Erika S. Alcorta , Mahesh Madhav , Scott Tetrick , Neeraja J. Yadwadkar , Andreas Gerstlauer

Conventional cache models are not suited for real-time parallel processing because tasks may flush each other's data out of the cache in an unpredictable manner. In this way the system is not compositional so the overall performance is…

Hardware Architecture · Computer Science 2011-11-09 A. M. Molnos , M. J. M. Heijligers , S. D. Cotofana , J. T. J. Van Eijndhoven

Directory-based protocols have been the de facto solution for maintaining cache coherence in shared-memory parallel systems comprising multi/many cores, where each store instruction is eagerly made globally visible by invalidating the…

Hardware Architecture · Computer Science 2012-10-09 Daofu Liu , Yunji Chen , Qi Guo , Tianshi Chen , Ling Li , Qunfeng Dong , Weiwu Hu

Conventional wisdom holds that an efficient interface between an OS running on a CPU and a high-bandwidth I/O device should use Direct Memory Access (DMA) to offload data transfer, descriptor rings for buffering and queuing, and interrupts…

Hardware Architecture · Computer Science 2025-04-25 Anastasiia Ruzhanskaia , Pengcheng Xu , David Cock , Timothy Roscoe

We consider a parallel computational model that consists of $P$ processors, each with a fast local ephemeral memory of limited size, and sharing a large persistent memory. The model allows for each processor to fault with bounded…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-15 Guy E. Blelloch , Phillip B. Gibbons , Yan Gu , Charles McGuffey , Julian Shun
‹ Prev 1 2 3 10 Next ›