Operating Systems - Scifaro

Zerrow: True Zero-Copy Arrow Pipelines in Bauplan

Bauplan is a FaaS-based lakehouse specifically built for data pipelines: its execution engine uses Apache Arrow for data passing between the nodes in the DAG. While Arrow is known as the "zero copy format", in practice, limited Linux kernel…

Operating Systems · Computer Science 2025-05-15 Yifan Dai, Jacopo Tagliabue, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau, Tyler R. Caraza-Harter

Work-in-Progress: Multi-Deadline DAG Scheduling Model for Autonomous Driving Systems

Autoware is an autonomous driving system implemented on Robot Operating System (ROS) 2, where an end-to-end timing guarantee is crucial to ensure safety. However, existing ROS 2 cause-effect chain models for analyzing end-to-end latency…

Operating Systems · Computer Science 2025-05-14 Atsushi Yano, Takuya Azumi

RTOS Architectures that Solve the Diminishing Bandwidth Problem

The Diminishing Bandwidth Problem is a long-standing, previously unidentified extensibility problem of current real-time operating systems, characterized by a superficial dependency between the number of tasks in a system and the maximum…

Operating Systems · Computer Science 2025-05-13 Mazen Arakji

Work in Progress: Middleware-Transparent Callback Enforcement in Commoditized Component-Oriented Real-time Systems

Real-time scheduling in commoditized component-oriented real-time systems, such as ROS 2 systems on Linux, has been studied under nested scheduling: OS thread scheduling and middleware layer scheduling (e.g., ROS 2 Executor). However, by…

Operating Systems · Computer Science 2025-05-13 Takahiro Ishikawa-Aso, Atsushi Yano, Takuya Azumi, Shinpei Kato

Concurrency Testing in the Linux Kernel via eBPF

Concurrency is vital for our critical software to meet modern performance requirements, yet concurrency bugs are notoriously difficult to detect and reproduce. Controlled Concurrency Testing (CCT) can make bugs easier to expose by enabling…

Operating Systems · Computer Science 2025-05-01 Jiacheng Xu, Dylan Wolff, Xing Yi Han, Jialin Li, Abhik Roychoudhury

The First Principle of Big Memory Systems

Persistence is the first principle of big memory systems. We comprehensively analyze the vertical and horizontal extensions of the existing memory hierarchy. Networks are flattening traditional storage hierarchies. We present the…

Operating Systems · Computer Science 2025-05-01 Yu Hua

From Good to Great: Improving Memory Tiering Performance Through Parameter Tuning

Memory tiering systems achieve memory scaling by adding multiple tiers of memory wherein different tiers have different access latencies and bandwidth. For maximum performance, frequently accessed (hot) data must be placed close to the host…

Operating Systems · Computer Science 2025-04-29 Konstantinos Kanellis, Sujay Yadalam, Fanchao Chen, Michael Swift, Shivaram Venkataraman

Safe and usable kernel extensions with Rex

Safe kernel extensions have gained significant traction, evolving from simple packet filters to large, complex programs that customize storage, networking, and scheduling. Existing kernel extension mechanisms like eBPF rely on in-kernel…

Operating Systems · Computer Science 2025-04-29 Jinghao Jia, Ruowen Qin, Milo Craun, Egor Lukiyanov, Ayush Bansal, Michael V. Le, Hubertus Franke, Hani Jamjoom, Tianyin Xu, Dan Williams

The NIC should be part of the OS

The network interface adapter (NIC) is a critical component of a cloud server occupying a unique position. Not only is network performance vital to efficient operation of the machine, but unlike compute accelerators like GPUs, the network…

Operating Systems · Computer Science 2025-04-24 Pengcheng Xu, Timothy Roscoe

LithOS: An Operating System for Efficient Machine Learning on GPUs

The surging demand for GPUs in datacenters for machine learning (ML) has made efficient GPU utilization crucial. However, meeting the diverse needs of ML models while optimizing resource usage is challenging. To enable transparent,…

Operating Systems · Computer Science 2025-04-23 Patrick H. Coppock, Brian Zhang, Eliot H. Solomon, Vasilis Kypriotis, Leon Yang, Bikash Sharma, Dan Schatzberg, Todd C. Mowry, Dimitrios Skarlatos

My CXL Pool Obviates Your PCIe Switch

Pooling PCIe devices across multiple hosts offers a promising solution to mitigate stranded I/O resources, enhance device utilization, address device failures, and reduce total cost of ownership. The only viable option today is PCIe…

Operating Systems · Computer Science 2025-04-22 Yuhong Zhong, Daniel S. Berger, Pantea Zardoshti, Enrique Saurez, Jacob Nelson, Antonis Psistakis, Joshua Fried, Asaf Cidon

Futureproof Static Memory Planning

The NP-complete combinatorial optimization task of assigning offsets to a set of buffers with known sizes and lifetimes so as to minimize total memory usage is called dynamic storage allocation (DSA). Existing DSA implementations bypass the…

Operating Systems · Computer Science 2025-04-08 Christos Lamprakos, Panagiotis Xanthopoulos, Manolis Katsaragakis, Sotirios Xydis, Dimitrios Soudris, Francky Catthoor

HeteroPod: XPU-Accelerated Infrastructure Offloading for Commodity Cloud-Native Applications

Cloud-native systems increasingly rely on infrastructure services (e.g., service meshes, monitoring agents), which compete for resources with user applications, degrading performance and scalability. We propose HeteroPod, a new abstraction…

Operating Systems · Computer Science 2025-04-01 Bicheng Yang, Jingkai He, Dong Du, Yubin Xia, Haibo Chen

Linux for Everyone: Can Standardization Drive Mainstream Adoption?

Despite its technical superiority and flexibility, Linux remains a niche OS in consumer markets. Fragmentation across its many distributions leaves it without a standardized experience, which discourages mainstream adoption. This…

Operating Systems · Computer Science 2025-04-01 Rohit J Nandha, Ronak D Patel

Saving Storage Space Using Files on the Web

As conventional storage density reaches its physical limits, the cost of a gigabyte of storage is no longer plummeting, but rather has remained mostly flat for the past decade. Meanwhile, file sizes continue to grow, leading to ever fuller…

Operating Systems · Computer Science 2025-03-31 Kevin Saric, Gowri Sankar Ramachandran, Raja Jurdak, Surya Nepal

Empowering WebAssembly with Thin Kernel Interfaces

Wasm is gaining popularity outside the Web as a well-specified low-level binary format with ISA portability, low memory footprint and polyglot targetability, enabling efficient in-process sandboxing of untrusted code. Despite these…

Operating Systems · Computer Science 2025-03-28 Arjun Ramesh, Tianshu Huang, Ben L. Titzer, Anthony Rowe

Coach: Exploiting Temporal Patterns for All-Resource Oversubscription in Cloud Platforms

Cloud platforms remain underutilized despite multiple proposals to improve their utilization (e.g., disaggregation, harvesting, and oversubscription). Our characterization of the resource utilization of virtual machines (VMs) in Azure…

Operating Systems · Computer Science 2025-03-21 Benjamin Reidys, Pantea Zardoshti, Íñigo Goiri, Celine Irvene, Daniel S. Berger, Haoran Ma, Kapil Arya, Eli Cortez, Taylor Stark, Eugene Bak, Mehmet Iyigun, Stanko Novaković, Lisa Hsu, Karel Trueba, Abhisek Pan, Chetan Bansal, Saravan Rajmohan, Jian Huang, Ricardo Bianchini

Efficient Function-as-a-Service for Large Language Models with TIDAL

Large Language Model (LLM) applications have emerged as a prominent use case for Function-as-a-Service (FaaS) due to their high computational demands and sporadic invocation patterns. However, serving LLM functions within FaaS frameworks…

Operating Systems · Computer Science 2025-03-11 Weihao Cui, Ziyi Xu, Han Zhao, Quan Chen, Zijun Li, Bingsheng He, Minyi Guo

FlexInfer: Breaking Memory Constraint via Flexible and Efficient Offloading for On-Device LLM Inference

Large Language Models (LLMs) face challenges for on-device inference due to high memory demands. Traditional methods to reduce memory usage often compromise performance and lack adaptability. We propose FlexInfer, an optimized offloading…

Operating Systems · Computer Science 2025-03-07 Hongchao Du, Shangyu Wu, Arina Kharlamova, Nan Guan, Chun Jason Xue

TUNA: Tuning Unstable and Noisy Cloud Applications

Autotuning plays a pivotal role in optimizing the performance of systems, particularly in large-scale cloud deployments. One of the main challenges in performing autotuning in the cloud arises from performance variability. We first…

Operating Systems · Computer Science 2025-03-07 Johannes Freischuetz, Konstantinos Kanellis, Brian Kroth, Shivaram Venkataraman