Computer Science
The transition toward Software-Defined Vehicles (SDVs) represents a major paradigm shift in vehicle design, transforming traditional hardware-centric systems into software-centric platforms capable of dynamic adaptation and continuous…
Large Language Models (LLMs) have revolutionized AI applications, but deploying them at scale presents significant challenges. We present RTP-LLM, a high-performance inference engine for industrial-scale LLM deployment, successfully…
Memristor computing offers a route to low-energy edge AI, but device variability, sensitivity to operating conditions, and system-integration challenges can hinder deployment. Here we show that these limitations can be mitigated by using…
A real-time multicore system requires delay bounds on access to shared resources. These resources include the kernel, which has potentially many non-preemptible critical sections guarded by one or more different synchronization primitives.…
Enterprise data platforms face an enduring tension between domain self-service and holistic governance. The data mesh paradigm proposed decentralized domain ownership as a remedy, but pure implementations frequently underdeliver: teams…
Linux is the foundation of the digital age, accounting for the majority of the cloud and mobile OS markets. Any device that runs Linux uses the Linux page cache, a central pillar in OS and application performance, serving to reduce…
KV cache management is essential for efficient LLM inference. To maximize utilization, existing inference engines evict finished requests' KV cache if new requests are waiting. This policy breaks for agentic workloads, which interleave LLM…
Living mycelial filaments integrate chemical, optical, mechanical, thermal, and biological information via electrophysiological cellular trans-membrane potential. The challenge is to create a mycology interface that sustains metabolism,…
Kolmogorov-Arnold Networks (KANs) shift neural computation from linear layers to learnable nonlinear edge functions, but implementing these nonlinearities efficiently in hardware remains an open challenge. Here we introduce a physical…
Pulse-level simulators are the lowest-level, most widely used abstraction layer for studying how quantum hardware responds to control signals, but they can be built on Hamiltonian models with very different fidelity and cost. This raises…
LLM-powered AI agents require high-frequency state exploration (e.g., test-time tree search and reinforcement learning), relying on rapid checkpoint and rollback (C/R) of the complete sandbox state, including files and process state (e.g.,…
Liquid biopsy can detect tumor-derived biomarkers such as circulating tumor DNA (ctDNA), but ultra-low-fraction assays remain costly, slow, and difficult to scale. This motivates interest in intravascular in vivo sensing in the context of…
Secure containers isolate each container with its own kernel, mitigating shared-kernel attacks prevalent in traditional container systems. However, existing designs still face a fundamental isolation--performance trade-off. Nested-cloud…
Object-level management of tiered memory has been studied to address the inefficiencies in page-based systems. However, object-level management for CXL-tiered memory remains underexplored due to CXL's tight performance budget and load/store…
Speculative decoding and dynamic sparse attention are two complementary approaches for accelerating long-context LLM inference: the former amortizes target-model execution across multiple verifier queries, while the latter reduces each…
Real-Time Operating Systems (RTOSes) play a crucial role in safety-critical domains, where deterministic and predictable task execution is essential. Yet they are increasingly exposed to ionizing radiation, which can compromise system…
Linux is increasingly deployed in Low Earth Orbit on commercial off the shelf systems on chip that were not designed for space radiation. Ionizing particles can trigger single event functional interrupts that crash the kernel without…
Vertical Take-Off and Landing (VTOL) vehicles are gaining traction in both the delivery drone market and passenger transportation, driving the development of Urban Air Mobility (UAM) systems. UAM seeks to alleviate road congestion in dense…
Using correct design metrics and understanding the limitations of the underlying technology is critical to developing effective scheduling algorithms. Unfortunately, existing scheduling techniques used \emph{incorrect} metrics and had…
Modern LLM serving is increasingly serverless in shape: large model catalogs, long-tail invocations, and multi-tenant demand. Existing GPU serving systems face a tradeoff: dedicated-GPU allocation wastes scarce HBM under sparse traffic,…