Related papers: CXL Shared Memory Programming: Barely Distributed …

A Programming Model for Disaggregated Memory over CXL

CXL (Compute Express Link) is an emerging open industry-standard interconnect between processing and memory devices that is expected to revolutionize the way systems are designed. It enables cache-coherent, shared memory pools in a…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-27 Gal Assa, Moritz Lumme, Lucas Bürgi, Michal Friedman, Ori Lahav

CXL Memory as Persistent Memory for Disaggregated HPC: A Practical Approach

In the landscape of High-Performance Computing (HPC), the quest for efficient and scalable memory solutions remains paramount. The advent of Compute Express Link (CXL) introduces a promising avenue with its potential to function as a…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-22 Yehonatan Fridman, Suprasad Mutalik Desai, Navneet Singh, Thomas Willhalm, Gal Oren

Memory Sharing with CXL: Hardware and Software Design Approaches

Compute Express Link (CXL) is a rapidly emerging coherent interconnect standard that provides opportunities for memory pooling and sharing. Memory sharing is a well-established software feature that improves memory utilization by avoiding…

Emerging Technologies · Computer Science 2024-04-05 Sunita Jain, Nagaradhesh Yeleswarapu, Hasan Al Maruf, Rita Gupta

Next-Gen Computing Systems with Compute Express Link: a Comprehensive Survey

Interconnection is crucial for computing systems. However, the current interconnection performance between processors and devices, such as memory devices and accelerators, significantly lags behind their computing performance, severely…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-21 Chen Chen, Xinkui Zhao, Guanjie Cheng, Yuesheng Xu, Shuiguang Deng, Jianwei Yin

Towards CXL Resilience to CPU Failures

Compute Express Link (CXL) 3.0 and beyond allows the compute nodes of a cluster to share data with hardware cache coherence and at the granularity of a cache line. This enables shared-memory semantics for distributed computing, but…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-10 Antonis Psistakis, Burak Ocalan, Chloe Alverti, Fabien Chaix, Ramnatthan Alagappan, Josep Torrellas

CXLMemSim: A pure software simulated CXL.mem for performance characterization

CXLMemSim is a fast, lightweight simulation framework that enables performance characterization of memory systems based on Compute Express Link (CXL) .mem technology. CXL.mem allows disaggregation and pooling of memory to mitigate memory…

Performance · Computer Science 2025-06-18 Yiwei Yang, Brian Zhao, Yusheng Zheng, Pooneh Safayenikoo, Tanvir Ahmed Khan, Andi Quinn

Failure Tolerant Training with Persistent Memory Disaggregation over CXL

This paper proposes TRAININGCXL that can efficiently process large-scale recommendation datasets in the pool of disaggregated memory while making training fault tolerant with low overhead. To this end, i) we integrate persistent memory…

Hardware Architecture · Computer Science 2023-01-23 Miryeong Kwon, Junhyeok Jang, Hanjin Choi, Sangwon Lee, Myoungsoo Jung

Rethinking PM Crash Consistency in the CXL Era

Persistent Memory (PM) introduces new opportunities for designing crash-consistent applications without the traditional storage overheads. However, ensuring crash consistency in PM demands intricate knowledge of CPU, cache, and memory…

Emerging Technologies · Computer Science 2025-04-25 João Oliveira, João Gonçalves, Miguel Matos

Understanding and Optimizing Serverless Workloads in CXL-Enabled Tiered Memory

Recent serverless workloads, such as DL and graph applications, tend to be large-scale and CPU/memory-intensive, requiring dynamic memory-to-compute resource provisioning. Meanwhile, recent solutions seek to design page management strategies…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-26 Yuze Li, Shunyu Yao

CXL-Interference: Analysis and Characterization in Modern Computer Systems

Compute Express Link (CXL) is a promising technology that addresses memory and storage challenges. Despite its advantages, CXL faces performance threats from external interference when co-existing with current memory and storage systems.…

Hardware Architecture · Computer Science 2024-11-28 Shunyu Mao, Jiajun Luo, Yixin Li, Jiapeng Zhou, Weidong Zhang, Zheng Liu, Teng Ma, Shuwen Deng

Exploring and Evaluating Real-world CXL: Use Cases and System Adoption

Compute eXpress Link (CXL) is emerging as a promising memory interface technology. However, its performance characteristics remain largely unclear due to the limited availability of production hardware. Key questions include: What are the…

Performance · Computer Science 2025-10-14 Xi Wang, Jie Liu, Jianbo Wu, Shuangyan Yang, Jie Ren, Bhanu Shankar, Dong Li

CXL over Ethernet: A Novel FPGA-based Memory Disaggregation Design in Data Centers

Memory resources in data centers generally suffer from low utilization and lack of dynamics. Memory disaggregation solves these problems by decoupling CPU and memory, which currently includes approaches based on RDMA or interconnection…

Hardware Architecture · Computer Science 2023-02-23 Chenjiu Wang, Ke He, Ruiqi Fan, Xiaonan Wang, Yang Kong, Wei Wang, Qinfen Hao

CCCL: Node-Spanning GPU Collectives with CXL Memory Pooling

Large language models (LLMs) training or inference across multiple nodes introduces significant pressure on GPU memory and interconnect bandwidth. The Compute Express Link (CXL) shared memory pool offers a scalable solution by enabling…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-08 Dong Xu, Han Meng, Xinyu Chen, Dengcheng Zhu, Wei Tang, Fei Liu, Liguang Xie, Wu Xiang, Rui Shi, Yue Li, Henry Hu, Hui Zhang, Jianping Jiang, Dong Li

emucxl: an emulation framework for CXL-based disaggregated memory applications

The emergence of CXL (Compute Express Link) promises to transform the status of interconnects between host and devices and in turn impact the design of all software layers. With its low overhead, low latency, and memory coherency…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-15 Raja Gond, Purushottam Kulkarni

CXL and the Return of Scale-Up Database Engines

The trend toward specialized processing devices such as TPUs, DPUs, GPUs, and FPGAs has exposed the weaknesses of PCIe in interconnecting these devices and their hosts. Several attempts have been proposed to improve, augment, or downright…

Databases · Computer Science 2024-09-04 Alberto Lerner, Gustavo Alonso

Modeling the Potential of Message-Free Communication via CXL.mem

Heterogeneous memory technologies are increasingly important instruments in addressing the memory wall in HPC systems. While most are deployed in single node setups, CXL.mem is a technology that implements memories that can be attached to…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-10 Stepan Vanecek, Matthew Turner, Manisha Gajbe, Matthew Wolf, Martin Schulz

An Introduction to the Compute Express Link (CXL) Interconnect

The Compute Express Link (CXL) is an open industry-standard interconnect between processors and devices such as accelerators, memory buffers, smart network interfaces, persistent memory, and solid-state drives. CXL offers coherency and…

Hardware Architecture · Computer Science 2024-05-09 Debendra Das Sharma, Robert Blankenship, Daniel S. Berger

Pooling Engram Conditional Memory in Large Language Models using CXL

Engram conditional memory has emerged as a promising component for LLMs by decoupling static knowledge lookup from dynamic computation. Since Engram exhibits sparse access patterns and supports prefetching, its massive embedding tables are…

Hardware Architecture · Computer Science 2026-03-12 Ruiyang Ma, Teng Ma, Zhiyuan Su, Hantian Zha, Xinpeng Zhao, Xuchun Shang, Xingrui Yi, Zheng Liu, Zhu Cao, An Wu, Zhichong Dou, Ziqian Liu, Daikang Kuang, Guojie Luo

Demystifying CXL Memory with Genuine CXL-Ready Systems and Devices

The ever-growing demands for memory with larger capacity and higher bandwidth have driven recent innovations on memory expansion and disaggregation technologies based on Compute eXpress Link (CXL). Especially, CXL-based memory expansion…

Performance · Computer Science 2023-10-06 Yan Sun, Yifan Yuan, Zeduo Yu, Reese Kuper, Chihun Song, Jinghan Huang, Houxiang Ji, Siddharth Agarwal, Jiaqi Lou, Ipoom Jeong, Ren Wang, Jung Ho Ahn, Tianyin Xu, Nam Sung Kim

Telepathic Datacenters: Fast RPCs using Shared CXL Memory

Datacenter applications often rely on remote procedure calls (RPCs) for fast, efficient, and secure communication. However, RPCs are slow, inefficient, and hard to use as they require expensive serialization and compression to communicate…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-22 Suyash Mahar, Ehsan Hajyjasini, Seungjin Lee, Zifeng Zhang, Mingyao Shen, Steven Swanson