Related papers: Enabling Efficient Transaction Processing on CXL-B…

Next-Gen Computing Systems with Compute Express Link: a Comprehensive Survey

Interconnection is crucial for computing systems. However, the current interconnection performance between processors and devices, such as memory devices and accelerators, significantly lags behind their computing performance, severely…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-21 Chen Chen, Xinkui Zhao, Guanjie Cheng, Yuesheng Xu, Shuiguang Deng, Jianwei Yin

An Introduction to the Compute Express Link (CXL) Interconnect

The Compute Express Link (CXL) is an open industry-standard interconnect between processors and devices such as accelerators, memory buffers, smart network interfaces, persistent memory, and solid-state drives. CXL offers coherency and…

Hardware Architecture · Computer Science 2024-05-09 Debendra Das Sharma, Robert Blankenship, Daniel S. Berger

A Programming Model for Disaggregated Memory over CXL

CXL (Compute Express Link) is an emerging open industry-standard interconnect between processing and memory devices that is expected to revolutionize the way systems are designed. It enables cache-coherent, shared memory pools in a…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-27 Gal Assa, Moritz Lumme, Lucas Bürgi, Michal Friedman, Ori Lahav

Memory Sharing with CXL: Hardware and Software Design Approaches

Compute Express Link (CXL) is a rapidly emerging coherent interconnect standard that provides opportunities for memory pooling and sharing. Memory sharing is a well-established software feature that improves memory utilization by avoiding…

Emerging Technologies · Computer Science 2024-04-05 Sunita Jain, Nagaradhesh Yeleswarapu, Hasan Al Maruf, Rita Gupta

CXL-Interference: Analysis and Characterization in Modern Computer Systems

Compute Express Link (CXL) is a promising technology that addresses memory and storage challenges. Despite its advantages, CXL faces performance threats from external interference when co-existing with current memory and storage systems.…

Hardware Architecture · Computer Science 2024-11-28 Shunyu Mao, Jiajun Luo, Yixin Li, Jiapeng Zhou, Weidong Zhang, Zheng Liu, Teng Ma, Shuwen Deng

CXL and the Return of Scale-Up Database Engines

The trend toward specialized processing devices such as TPUs, DPUs, GPUs, and FPGAs has exposed the weaknesses of PCIe in interconnecting these devices and their hosts. Several attempts have been proposed to improve, augment, or downright…

Databases · Computer Science 2024-09-04 Alberto Lerner, Gustavo Alonso

Compute Can't Handle the Truth: Why Communication Tax Prioritizes Memory and Interconnects in Modern AI Infrastructure

Modern AI workloads such as large language models (LLMs) and retrieval-augmented generation (RAG) impose severe demands on memory, communication bandwidth, and resource flexibility. Traditional GPU-centric architectures struggle to scale…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-15 Myoungsoo Jung

Towards CXL Resilience to CPU Failures

Compute Express Link (CXL) 3.0 and beyond allows the compute nodes of a cluster to share data with hardware cache coherence and at the granularity of a cache line. This enables shared-memory semantics for distributed computing, but…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-10 Antonis Psistakis, Burak Ocalan, Chloe Alverti, Fabien Chaix, Ramnatthan Alagappan, Josep Torrellas

The Case for Persistent CXL switches

Compute Express Link (CXL) switch allows memory extension via PCIe physical layer to address increasing demand for larger memory capacities in data centers. However, CXL attached memory introduces 170ns to 400ns memory latency. This becomes…

Hardware Architecture · Computer Science 2025-03-14 Khan Shaikhul Hadi, Naveed Ul Mustafa, Mark Heinrich, Yan Solihin

HybridTier: an Adaptive and Lightweight CXL-Memory Tiering System

Modern workloads are demanding increasingly larger memory capacity. Compute Express Link (CXL)-based memory tiering has emerged as a promising solution for addressing this problem by utilizing traditional DRAM alongside slow-tier CXL memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-04 Kevin Song, Jiacheng Yang, Zixuan Wang, Jishen Zhao, Sihang Liu, Gennady Pekhimenko

A Case for CXL-Centric Server Processors

The memory system is a major performance determinant for server processors. Ever-growing core counts and datasets demand higher bandwidth and capacity as well as lower latency from the memory system. To keep up with growing demands,…

Hardware Architecture · Computer Science 2023-05-10 Albert Cho, Anish Saxena, Moinuddin Qureshi, Alexandros Daglis

Equilibria: Fair Multi-Tenant CXL Memory Tiering At Scale

Memory dominates datacenter system cost and power. Memory expansion via Compute Express Link (CXL) is an effective way to provide additional memory at lower cost and power, but its effective use requires software-level tiering for…

Operating Systems · Computer Science 2026-04-21 Kaiyang Zhao, Neha Gholkar, Hasan Maruf, Abhishek Dhanotia, Johannes Weiner, Gregory Price, Ning Sun, Bhavya Dwivedi, Stuart Clark, Dimitrios Skarlatos

Cohet: A CXL-Driven Coherent Heterogeneous Computing Framework with Hardware-Calibrated Full-System Simulation

Conventional heterogeneous computing systems built on PCIe interconnects suffer from inefficient fine-grained host-device interactions and complex programming models. In recent years, many proprietary and open cache-coherent interconnect…

Hardware Architecture · Computer Science 2026-01-13 Yanjing Wang, Lizhou Wu, Sunfeng Gao, Yibo Tang, Junhui Luo, Zicong Wang, Yang Ou, Dezun Dong, Nong Xiao, Mingche Lai

CCCL: Node-Spanning GPU Collectives with CXL Memory Pooling

Large language model (LLM) training or inference across multiple nodes introduces significant pressure on GPU memory and interconnect bandwidth. The Compute Express Link (CXL) shared memory pool offers a scalable solution by enabling…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-08 Dong Xu, Han Meng, Xinyu Chen, Dengcheng Zhu, Wei Tang, Fei Liu, Liguang Xie, Wu Xiang, Rui Shi, Yue Li, Henry Hu, Hui Zhang, Jianping Jiang, Dong Li

Understanding and Optimizing Serverless Workloads in CXL-Enabled Tiered Memory

Recent serverless workloads tend to be large-scale and CPU/memory-intensive, such as DL and graph applications, which require dynamic memory-to-compute resource provisioning. Meanwhile, recent solutions seek to design page management strategies…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-26 Yuze Li, Shunyu Yao

CXL Shared Memory Programming: Barely Distributed and Almost Persistent

While Compute Express Link (CXL) enables support for cache-coherent shared memory among multiple nodes, it also introduces new types of failures--processes can fail before data does, or data might fail before a process does. The lack of a…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-18 Yi Xu, Suyash Mahar, Ziheng Liu, Mingyao Shen, Steven Swanson

DAXFS: A Lock-Free Shared Filesystem for CXL Disaggregated Memory

CXL (Compute Express Link) enables multiple hosts to share byte-addressable memory with hardware cache coherence, but no existing filesystem exploits this for lock-free multi-host coordination. We present DaxFS, a Linux filesystem for CXL…

Operating Systems · Computer Science 2026-04-03 Cong Wang, Yiwei Yang, Yusheng Zheng

Low-overhead General-purpose Near-Data Processing in CXL Memory Expanders

Emerging Compute Express Link (CXL) enables cost-efficient memory expansion beyond the local DRAM of processors. While its CXL.mem protocol provides minimal latency overhead through an optimized protocol stack, frequent CXL memory…

Hardware Architecture · Computer Science 2024-10-07 Hyungkyu Ham, Jeongmin Hong, Geonwoo Park, Yunseon Shin, Okkyun Woo, Wonhyuk Yang, Jinhoon Bae, Eunhyeok Park, Hyojin Sung, Euicheol Lim, Gwangsun Kim

Telepathic Datacenters: Fast RPCs using Shared CXL Memory

Datacenter applications often rely on remote procedure calls (RPCs) for fast, efficient, and secure communication. However, RPCs are slow, inefficient, and hard to use as they require expensive serialization and compression to communicate…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-22 Suyash Mahar, Ehsan Hajyjasini, Seungjin Lee, Zifeng Zhang, Mingyao Shen, Steven Swanson

Exploring and Evaluating Real-world CXL: Use Cases and System Adoption

Compute eXpress Link (CXL) is emerging as a promising memory interface technology. However, its performance characteristics remain largely unclear due to the limited availability of production hardware. Key questions include: What are the…

Performance · Computer Science 2025-10-14 Xi Wang, Jie Liu, Jianbo Wu, Shuangyan Yang, Jie Ren, Bhanu Shankar, Dong Li