Computer Science

RTP-LLM: High-Performance Alibaba LLM Inference Engine

Large Language Models (LLMs) have revolutionized AI applications, but deploying them at scale presents significant challenges. We present RTP-LLM, a high-performance inference engine for industrial-scale LLM deployment, successfully…

Operating Systems · Computer Science 2026-05-29 Boyu Tan, Jiarui Guo, Zongwei Lv, Hanbo Sun, Tong Yang, Kan Liu, Xinfei Shi, Zetao Hu, Yaxin Yu, Chi Zhang, Jianning Zhang, Xi Yang, Wei Zhang, Bo Cai, Silu Zhou, Xiyu Wang, Na He, Yinghao Yu, Wending Bao, Guiyang Huang, Yuxing Yuan, Juncheng Yin, Nan Wang, Lin Yang, Zechao Zhang, Lu Chen, Guoding Li, Tao Lan, Lin Qu

Bounded Priority-Aware Locking for Real-Time Kernels

A real-time multicore system requires delay bounds on access to shared resources. These resources include the kernel, which has potentially many non-preemptible critical sections guarded by one or more different synchronization primitives.…

Operating Systems · Computer Science 2026-05-28 Shriram Raja, Richard West

LearnedCache: An eBPF-Integrated Perceptron-Based Eviction Policy for the Linux Page Cache

Linux is the foundation of the digital age, accounting for the majority of the cloud and mobile OS markets. Any device that runs Linux uses the Linux page cache, a central pillar in OS and application performance, serving to reduce…

Operating Systems · Computer Science 2026-05-27 Zejia Qi

Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live

KV cache management is essential for efficient LLM inference. To maximize utilization, existing inference engines evict finished requests' KV cache if new requests are waiting. This policy breaks for agentic workloads, which interleave LLM…

Operating Systems · Computer Science 2026-05-27 Hanchen Li, Runyuan He, Qiuyang Mang, Qizheng Zhang, Huanzhi Mao, Xiaokun Chen, Hangrui Zhou, Alvin Cheung, Joseph Gonzalez, Ion Stoica

DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback

LLM-powered AI agents require high-frequency state exploration (e.g., test-time tree search and reinforcement learning), relying on rapid checkpoint and rollback (C/R) of the complete sandbox state, including files and process state (e.g.,…

Operating Systems · Computer Science 2026-05-22 Yunpeng Dong, Jingkai He, Yuze Hou, Dong Du, Zhonghu Xu, Si Yu, Yubin Xia, Haibo Chen

ParamRF: A JAX-native Framework for Declarative Circuit Modelling

This work introduces ParamRF: a Python library for efficient, parametric modelling of radio frequency (RF) circuits. Built on top of the next-generation computational library JAX, as well as the object-oriented wrapper Equinox, the…

Other Computer Science · Computer Science 2026-05-22 Gary V. C. Allen, Dirk I. L. de Villiers

ParaCell: Paravirtualized Secure Containers with Lightweight Intra-Container Isolation and Intent-Driven Memory Management

Secure containers isolate each container with its own kernel, mitigating shared-kernel attacks prevalent in traditional container systems. However, existing designs still face a fundamental isolation–performance trade-off. Nested-cloud…

Operating Systems · Computer Science 2026-05-21 Yiyang Wu, Xunjie Wang, Jinyu Gu, Haibo Chen

Clove: Object-Level CXL Memory Management in Managed Runtimes

Object-level management of tiered memory has been studied to address the inefficiencies in page-based systems. However, object-level management for CXL-tiered memory remains underexplored due to CXL's tight performance budget and load/store…

Operating Systems · Computer Science 2026-05-21 Sam Son, Zhihong Luo, Wen Zhang, Sylvia Ratnasamy, Scott Shenker

SSV: Sparse Speculative Verification for Efficient LLM Inference

Speculative decoding and dynamic sparse attention are two complementary approaches for accelerating long-context LLM inference: the former amortizes target-model execution across multiple verifier queries, while the latter reduces each…

Operating Systems · Computer Science 2026-05-21 Zhibin Wang, Ziyu Zhong, Nuo Shen, Yuhang Zhou, Rong Gu, Sheng Zhong

Experimental Analysis of FreeRTOS Dependability through Targeted Fault Injection Campaigns

Real-Time Operating Systems (RTOSes) play a crucial role in safety-critical domains, where deterministic and predictable task execution is essential. Yet they are increasingly exposed to ionizing radiation, which can compromise system…

Operating Systems · Computer Science 2026-05-21 Luca Mannella, Stefano Di Carlo, Alessandro Savino

Where Linux Breaks Under Radiation: A Cross-Architecture Kernel-Level Characterization of Proton-Induced Failures in COTS SoCs

Linux is increasingly deployed in Low Earth Orbit on commercial off-the-shelf systems-on-chip that were not designed for space radiation. Ionizing particles can trigger single-event functional interrupts that crash the kernel without…

Operating Systems · Computer Science 2026-05-21 Saad Memon, Rafal Graczyk, Tomasz Rajkowski, Jan Swakon, Damian Wrobel, Sebastian Kusyk, Seth Roffe, Mike Papadakis

THEMIS: Time, Heterogeneity, and Energy Minded Scheduling for Fair Multi-Tenant Use in FPGAs

Using correct design metrics and understanding the limitations of the underlying technology is critical to developing effective scheduling algorithms. Unfortunately, existing scheduling techniques used incorrect metrics and had…

Operating Systems · Computer Science 2026-05-21 Emre Karabulut, Arsalan Ali Malik, Amro Awad, Aydin Aysu

C2CServe: Leveraging NVLink-C2C for Elastic Serverless LLM Serving on MIG

Modern LLM serving is increasingly serverless in shape: large model catalogs, long-tail invocations, and multi-tenant demand. Existing GPU serving systems face a tradeoff: dedicated-GPU allocation wastes scarce HBM under sparse traffic,…

Operating Systems · Computer Science 2026-05-20 Shutian Luo, Ali Zafar Sadiq, Rui Yang, Mingye Zhang, Haiying Shen, Wei Wang, Yue Cheng

Embedded Rust or C Firmware? Lessons from an Industrial Microcontroller Use Case with Ariel OS

As Rust gains traction for developing safer systems software, a reality check for the microcontroller hardware segment becomes necessary. How ready is the Rust ecosystem for this segment? Can Rust compete with C in practice? This paper…

Operating Systems · Computer Science 2026-05-20 Bipin Thapa, Daniele Alfonso, Lorenzo Bini, Licio Mapelli, Kaspar Schleiser, Romain Fouquet, Emmanuel Baccelli

TIDAL: Recovering Temporal Phase for Cloud Block Storage Placement from LLM-Derived Semantics

Cloud Virtual Disk (CVD) placement in Cloud Block Storage (CBS) is critical for resource efficiency and performance isolation. Existing schemes prioritize spatial load balancing by dispersing disks across pods based on configuration-derived…

Operating Systems · Computer Science 2026-05-19 Difan Tan, Changlin Wan, Jiawen Liu, Hua Wang, Ke Zhou

PipeANN-Filter: An Efficient Filtered Vector Search System on SSD

We propose PipeANN-Filter, an efficient filtered vector search system on SSD. Unlike existing systems that explore only valid vectors (i.e., those satisfying the attribute constraints) during search, PipeANN-Filter explores a superset of…

Operating Systems · Computer Science 2026-05-19 Hao Guo, Jiwu Shu, Youyou Lu

TClone: Low-Latency Forking of Live GUI Environments for Computer-Use Agents

Computer-use agents increasingly operate inside live personal workspaces, where their actions can modify files, applications, GUI state, credentials, and authenticated sessions. This creates a tension between safety and quality: agents need…

Operating Systems · Computer Science 2026-05-19 Yutong Huang, Vikranth Srivatsa, Alex Asch, Hansin Tushar Patwa, Yiying Zhang

SemaTune: Semantic-Aware Online OS Tuning with Large Language Models

Online OS tuning can improve long-running services, but existing controllers are poorly matched to live hosts. They treat scheduler, power, memory, and I/O controls as black-box variables and optimize a scalar reward. This view ignores…

Operating Systems · Computer Science 2026-05-15 Georgios Liargkovas, Mihir Nitin Joshi, Hubertus Franke, Kostis Kaffes

The Compute ICE-AGE: Invariant Compute Envelope under Addressable Graph Evolution

This paper presents empirical results from a production-grade C++ implementation of a deterministic semantic state substrate operating under bounded local state evolution. The system was realized as a CPU-resident persistent semantic graph…

Operating Systems · Computer Science 2026-05-13 R. Jay Martin

Bounded Local Generator Classes for Deterministic State Evolution

We define a bounded local generator class (BLGC) for deterministic state evolution on graph-indexed systems. The construction consists of finite-range generators operating on bounded local state under deterministic composition. Each update…

Operating Systems · Computer Science 2026-05-11 R. Jay Martin