Related papers: Chiplets and the Codelet Model

Simulation-Driven Evaluation of Chiplet-Based Architectures Using VisualSim

This paper focuses on the simulation of multi-die System-on-Chip (SoC) architectures using VisualSim, emphasizing chiplet-based system modeling and performance analysis. Chiplet technology presents a promising alternative to traditional…

Hardware Architecture · Computer Science 2025-11-04 Wajid Ali, Ayaz Akram, Deepak Shankar

Challenges and Opportunities to Enable Large-Scale Computing via Heterogeneous Chiplets

Fast-evolving artificial intelligence (AI) algorithms such as large language models have been driving the ever-increasing computing demands in today's data centers. Heterogeneous computing with domain-specific architectures (DSAs) brings…

Hardware Architecture · Computer Science 2024-03-06 Zhuoping Yang, Shixin Ji, Xingzhen Chen, Jinming Zhuang, Weifeng Zhang, Dharmesh Jani, Peipei Zhou

On Memory Codelets: Prefetching, Recoding, Moving and Streaming Data

For decades, memory capabilities have scaled up much slower than compute capabilities, leaving memory utilization as a major bottleneck. Prefetching and cache hierarchies mitigate this in applications with easily predictable memory accesses…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-02 Dawson Fox, Jose Monsalve Diaz, Xiaoming Li

CHIPSIM: A Co-Simulation Framework for Deep Learning on Chiplet-Based Systems

Due to reduced manufacturing yields, traditional monolithic chips cannot keep up with the compute, memory, and communication demands of data-intensive applications, such as rapidly growing deep neural network (DNN) models. Chiplet-based…

Hardware Architecture · Computer Science 2025-10-31 Lukas Pfromm, Alish Kanani, Harsh Sharma, Janardhan Rao Doppa, Partha Pratim Pande, Umit Y. Ogras

Hemlet: A Heterogeneous Compute-in-Memory Chiplet Architecture for Vision Transformers with Group-Level Parallelism

Vision Transformers (ViTs) have established new performance benchmarks in vision tasks such as image recognition and object detection. However, these advancements come with significant demands for memory and computational resources,…

Hardware Architecture · Computer Science 2026-02-10 Cong Wang, Zexin Fu, Jiayi Huang, Shanshi Huang

Floorplet: Performance-aware Floorplan Framework for Chiplet Integration

A chiplet is an integrated circuit that encompasses a well-defined subset of an overall system's functionality. In contrast to traditional monolithic system-on-chips (SoCs), chiplet-based architecture can reduce costs and increase…

Hardware Architecture · Computer Science 2023-12-12 Shixin Chen, Shanyi Li, Zhen Zhuang, Su Zheng, Zheng Liang, Tsung-Yi Ho, Bei Yu, Alberto L. Sangiovanni-Vincentelli

Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets

To address increasing compute demand from recent multi-model workloads with heavy models like large language models, we propose to deploy heterogeneous chiplet-based multi-chip module (MCM)-based accelerators. We develop an advanced…

Hardware Architecture · Computer Science 2023-12-18 Mohanad Odema, Hyoukjun Kwon, Mohammad Abdullah Al Faruque

Cohet: A CXL-Driven Coherent Heterogeneous Computing Framework with Hardware-Calibrated Full-System Simulation

Conventional heterogeneous computing systems built on PCIe interconnects suffer from inefficient fine-grained host-device interactions and complex programming models. In recent years, many proprietary and open cache-coherent interconnect…

Hardware Architecture · Computer Science 2026-01-13 Yanjing Wang, Lizhou Wu, Sunfeng Gao, Yibo Tang, Junhui Luo, Zicong Wang, Yang Ou, Dezun Dong, Nong Xiao, Mingche Lai

Educating for Hardware Specialization in the Chiplet Era: A Path for the HPC Community

The advent of chiplet technology introduces cutting-edge opportunities for constructing highly heterogeneous platforms with specialized accelerators. However, the HPC community currently lacks expertise in hardware development, a gap that…

Hardware Architecture · Computer Science 2024-10-31 Kazutomo Yoshii, Mohamed El-Hadedy

Chiplets on Wheels: Review Paper on Holistic Chiplet Solutions for Autonomous Vehicles

On the advent of the slow death of Moore's law, the silicon industry is moving towards a new era of chiplets. The automotive industry is experiencing a profound transformation towards software-defined vehicles, fueled by the surging demand…

Hardware Architecture · Computer Science 2024-06-04 Swathi Narashiman, Venkat A, Divyaratna Joshi, Deepak Sridhar, Harish Rajesh, Sanjay Sattva, Aniruddha S, Jayanth B, Varun Manjunath, Ragavendiran N

HexaMesh: Scaling to Hundreds of Chiplets with an Optimized Chiplet Arrangement

2.5D integration is an important technique to tackle the growing cost of manufacturing chips in advanced technology nodes. This poses the challenge of providing high-performance inter-chiplet interconnects (ICIs). As the number of chiplets…

Hardware Architecture · Computer Science 2023-10-10 Patrick Iff, Maciej Besta, Matheus Cavalcante, Tim Fischer, Luca Benini, Torsten Hoefler

Designing Reconfigurable Interconnection Network of Heterogeneous Chiplets Using Kalman Filter

Heterogeneous chiplets have been proposed for accelerating high-performance computing tasks. Integrated inside one package, CPU and GPU chiplets can share a common interconnection network that can be implemented through the interposer.…

Hardware Architecture · Computer Science 2024-06-04 Siamak Biglari, Ruixiao Huang, Hui Zhao, Saraju Mohanty

Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-in-Memory Hardware

Many modern workloads such as neural network inference and graph processing are fundamentally memory-bound. For such workloads, data movement between memory and CPU cores imposes a significant overhead in terms of both latency and energy. A…

Hardware Architecture · Computer Science 2023-04-04 Juan Gómez-Luna, Izzat El Hajj, Ivan Fernandez, Christina Giannoula, Geraldo F. Oliveira, Onur Mutlu

A Heterogeneous Chiplet Architecture for Accelerating End-to-End Transformer Models

Transformers have revolutionized deep learning and generative modeling, enabling advancements in natural language processing tasks. However, the size of transformer models is increasing continuously, driven by enhanced capabilities across…

Hardware Architecture · Computer Science 2025-02-18 Harsh Sharma, Pratyush Dhingra, Janardhan Rao Doppa, Umit Ogras, Partha Pratim Pande

Chiplet Cloud: Building AI Supercomputers for Serving Large Generative Language Models

Large language models (LLMs) such as OpenAI's ChatGPT and Google's Gemini have demonstrated unprecedented capabilities of autoregressive AI models across multiple tasks triggering disruptive technology innovations around the world. However,…

Hardware Architecture · Computer Science 2024-05-22 Huwan Peng, Scott Davidson, Richard Shi, Shuaiwen Leon Song, Michael Taylor

Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips

The trend in industry is towards heterogeneous multicore processors (HMCs), including chips with CPUs and massively-threaded throughput-oriented processors (MTTOPs) such as GPUs. Although current homogeneous chips tightly couple the cores…

Hardware Architecture · Computer Science 2013-10-30 Blake A. Hechtman, Daniel J. Sorin

Massive Data-Centric Parallelism in the Chiplet Era

Recent works have introduced task-based parallelization schemes to accelerate graph search and sparse data-structure traversal, where some solutions scale up to thousands of processing units (PUs) on a single chip. However, parallelizing…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-14 Marcelo Orenes-Vera, Esin Tureci, David Wentzlaff, Margaret Martonosi

A New Execution Model and Executor for Adaptively Optimizing the Performance of Parallel Algorithms Using HPX Runtime System

Developing parallel algorithms efficiently requires careful management of concurrency across diverse hardware architectures. C++ executors provide a standardized interface that simplifies the development process, allowing developers to…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-22 Karame Mohammadiporshokooh, Steven R. Brandt, Hartmut Kaiser

ChipLight: Cross-Layer Optimization of Chiplet Design with Optical Interconnects for LLM Training

In large-scale distributed LLM training, communication between devices becomes the key performance bottleneck. Chiplet technology can integrate multiple dies into a package to scale up node performance with higher bandwidth. Meanwhile,…

Hardware Architecture · Computer Science 2026-04-22 Kangbo Bai, Zhantong Zhu, Yifan Ding, Tianyu Jia

RapidChiplet: A Toolchain for Rapid Design Space Exploration of Chiplet Architectures

Chiplet architectures are on the rise as they promise to overcome the scaling challenges of monolithic chips. A key component of such architectures is an efficient inter-chiplet interconnect (ICI). The ICI design space is huge as there are…

Hardware Architecture · Computer Science 2025-03-19 Patrick Iff, Benigna Bruggmann, Blaise Morel, Maciej Besta, Luca Benini, Torsten Hoefler