Related papers: Chiplets and the Codelet Model

200 papers

This paper focuses on the simulation of multi-die System-on-Chip (SoC) architectures using VisualSim, emphasizing chiplet-based system modeling and performance analysis. Chiplet technology presents a promising alternative to traditional…

Hardware Architecture · Computer Science 2025-11-04 Wajid Ali, Ayaz Akram, Deepak Shankar

Fast-evolving artificial intelligence (AI) algorithms such as large language models have been driving the ever-increasing computing demands in today's data centers. Heterogeneous computing with domain-specific architectures (DSAs) brings…

Hardware Architecture · Computer Science 2024-03-06 Zhuoping Yang, Shixin Ji, Xingzhen Chen, Jinming Zhuang, Weifeng Zhang, Dharmesh Jani, Peipei Zhou

For decades, memory capabilities have scaled up much slower than compute capabilities, leaving memory utilization as a major bottleneck. Prefetching and cache hierarchies mitigate this in applications with easily predictable memory accesses…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-02 Dawson Fox, Jose Monsalve Diaz, Xiaoming Li

Due to reduced manufacturing yields, traditional monolithic chips cannot keep up with the compute, memory, and communication demands of data-intensive applications, such as rapidly growing deep neural network (DNN) models. Chiplet-based…

Hardware Architecture · Computer Science 2025-10-31 Lukas Pfromm, Alish Kanani, Harsh Sharma, Janardhan Rao Doppa, Partha Pratim Pande, Umit Y. Ogras

Vision Transformers (ViTs) have established new performance benchmarks in vision tasks such as image recognition and object detection. However, these advancements come with significant demands for memory and computational resources,…

Hardware Architecture · Computer Science 2026-02-10 Cong Wang, Zexin Fu, Jiayi Huang, Shanshi Huang

A chiplet is an integrated circuit that encompasses a well-defined subset of an overall system's functionality. In contrast to traditional monolithic system-on-chips (SoCs), chiplet-based architecture can reduce costs and increase…

Hardware Architecture · Computer Science 2023-12-12 Shixin Chen, Shanyi Li, Zhen Zhuang, Su Zheng, Zheng Liang, Tsung-Yi Ho, Bei Yu, Alberto L. Sangiovanni-Vincentelli

To address increasing compute demand from recent multi-model workloads with heavy models like large language models, we propose to deploy heterogeneous chiplet-based multi-chip module (MCM)-based accelerators. We develop an advanced…

Hardware Architecture · Computer Science 2023-12-18 Mohanad Odema, Hyoukjun Kwon, Mohammad Abdullah Al Faruque

Conventional heterogeneous computing systems built on PCIe interconnects suffer from inefficient fine-grained host-device interactions and complex programming models. In recent years, many proprietary and open cache-coherent interconnect…

Hardware Architecture · Computer Science 2026-01-13 Yanjing Wang, Lizhou Wu, Sunfeng Gao, Yibo Tang, Junhui Luo, Zicong Wang, Yang Ou, Dezun Dong, Nong Xiao, Mingche Lai

The advent of chiplet technology introduces cutting-edge opportunities for constructing highly heterogeneous platforms with specialized accelerators. However, the HPC community currently lacks expertise in hardware development, a gap that…

Hardware Architecture · Computer Science 2024-10-31 Kazutomo Yoshii, Mohamed El-Hadedy

On the advent of the slow death of Moore's law, the silicon industry is moving towards a new era of chiplets. The automotive industry is experiencing a profound transformation towards software-defined vehicles, fueled by the surging demand…

2.5D integration is an important technique to tackle the growing cost of manufacturing chips in advanced technology nodes. This poses the challenge of providing high-performance inter-chiplet interconnects (ICIs). As the number of chiplets…

Hardware Architecture · Computer Science 2023-10-10 Patrick Iff, Maciej Besta, Matheus Cavalcante, Tim Fischer, Luca Benini, Torsten Hoefler

Heterogeneous chiplets have been proposed for accelerating high-performance computing tasks. Integrated inside one package, CPU and GPU chiplets can share a common interconnection network that can be implemented through the interposer.…

Hardware Architecture · Computer Science 2024-06-04 Siamak Biglari, Ruixiao Huang, Hui Zhao, Saraju Mohanty

Many modern workloads such as neural network inference and graph processing are fundamentally memory-bound. For such workloads, data movement between memory and CPU cores imposes a significant overhead in terms of both latency and energy. A…

Hardware Architecture · Computer Science 2023-04-04 Juan Gómez-Luna, Izzat El Hajj, Ivan Fernandez, Christina Giannoula, Geraldo F. Oliveira, Onur Mutlu

Transformers have revolutionized deep learning and generative modeling, enabling advancements in natural language processing tasks. However, the size of transformer models is increasing continuously, driven by enhanced capabilities across…

Hardware Architecture · Computer Science 2025-02-18 Harsh Sharma, Pratyush Dhingra, Janardhan Rao Doppa, Umit Ogras, Partha Pratim Pande

Large language models (LLMs) such as OpenAI's ChatGPT and Google's Gemini have demonstrated unprecedented capabilities of autoregressive AI models across multiple tasks triggering disruptive technology innovations around the world. However,…

Hardware Architecture · Computer Science 2024-05-22 Huwan Peng, Scott Davidson, Richard Shi, Shuaiwen Leon Song, Michael Taylor

The trend in industry is towards heterogeneous multicore processors (HMCs), including chips with CPUs and massively-threaded throughput-oriented processors (MTTOPs) such as GPUs. Although current homogeneous chips tightly couple the cores…

Hardware Architecture · Computer Science 2013-10-30 Blake A. Hechtman, Daniel J. Sorin

Recent works have introduced task-based parallelization schemes to accelerate graph search and sparse data-structure traversal, where some solutions scale up to thousands of processing units (PUs) on a single chip. However, parallelizing…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-14 Marcelo Orenes-Vera, Esin Tureci, David Wentzlaff, Margaret Martonosi

Developing parallel algorithms efficiently requires careful management of concurrency across diverse hardware architectures. C++ executors provide a standardized interface that simplifies the development process, allowing developers to…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-22 Karame Mohammadiporshokooh, Steven R. Brandt, Hartmut Kaiser

In large-scale distributed LLM training, communication between devices becomes the key performance bottleneck. Chiplet technology can integrate multiple dies into a package to scale-up node performance with higher bandwidth. Meanwhile,…

Hardware Architecture · Computer Science 2026-04-22 Kangbo Bai, Zhantong Zhu, Yifan Ding, Tianyu Jia

Chiplet architectures are on the rise as they promise to overcome the scaling challenges of monolithic chips. A key component of such architectures is an efficient inter-chiplet interconnect (ICI). The ICI design space is huge as there are…

Hardware Architecture · Computer Science 2025-03-19 Patrick Iff, Benigna Bruggmann, Blaise Morel, Maciej Besta, Luca Benini, Torsten Hoefler