Computer Science
The transition toward Software-Defined Vehicles (SDVs) represents a major paradigm shift in vehicle design, transforming traditional hardware-centric systems into software-centric platforms capable of dynamic adaptation and continuous…
Large Language Models (LLMs) have revolutionized AI applications, but deploying them at scale presents significant challenges. We present RTP-LLM, a high-performance inference engine for industrial-scale LLM deployment, successfully…
Memristor computing offers a route to low-energy edge AI, but device variability, sensitivity to operating conditions, and system-integration challenges can hinder deployment. Here we show that these limitations can be mitigated by using…
A real-time multicore system requires delay bounds on access to shared resources. These resources include the kernel, which has potentially many non-preemptible critical sections guarded by one or more different synchronization primitives.…
Enterprise data platforms face an enduring tension between domain self-service and holistic governance. The data mesh paradigm proposed decentralized domain ownership as a remedy, but pure implementations frequently underdeliver: teams…
Linux is the foundation of the digital age, accounting for the majority of the cloud and mobile OS markets. Any device that runs Linux uses the Linux page cache, a central pillar in OS and application performance, serving to reduce…
We consider the problem of computing sample points in each connected component of a semi-algebraic set defined by the non-vanishing or the positivity of an n-variate polynomial of degree d, with rational coefficients of bit size bounded by…
KV cache management is essential for efficient LLM inference. To maximize utilization, existing inference engines evict finished requests' KV cache if new requests are waiting. This policy breaks for agentic workloads, which interleave LLM…
Large Language Models (LLMs) have demonstrated impressive progress in complex reasoning tasks, largely driven by the Chain-of-Thought (CoT) paradigm, which decomposes difficult problems into intermediate steps. However, CoT reasoning…
Living mycelial filaments integrate chemical, optical, mechanical, thermal, and biological information via electrophysiological cellular trans-membrane potential. The challenge is to create a mycology interface that sustains metabolism,…
Kolmogorov-Arnold Networks (KANs) shift neural computation from linear layers to learnable nonlinear edge functions, but implementing these nonlinearities efficiently in hardware remains an open challenge. Here we introduce a physical…
Pulse-level simulators are the lowest-level, most widely used abstraction layer for studying how quantum hardware responds to control signals, but they can be built on Hamiltonian models with very different fidelity and cost. This raises…
LLM-powered AI agents require high-frequency state exploration (e.g., test-time tree search and reinforcement learning), relying on rapid checkpoint and rollback (C/R) of the complete sandbox state, including files and process state (e.g.,…
Liquid biopsy can detect tumor-derived biomarkers such as circulating tumor DNA (ctDNA), but ultra-low-fraction assays remain costly, slow, and difficult to scale. This motivates interest in intravascular in vivo sensing in the context of…
We study the problem of computing the isolated regular solutions of a system \((f_1,\ldots,f_n)\) of \(n\) polynomial equations in \(n\) variables \((X_1, \dots, X_n)\) over a field of characteristic zero \(k\). We focus on systems with a…
We present a new algorithm for fast matrix multiplication using tensor decompositions which have special features. Thanks to these features we obtain exponents lower than what the rank of the tensor decomposition suggests. In particular for…
This paper presents a generalised symbolic algorithm for solving systems of linear algebraic equations with multi-diagonal coefficient matrices. The algorithm is given in a pseudocode. A theorem which gives the condition for correctness of…
Secure containers isolate each container with its own kernel, mitigating shared-kernel attacks prevalent in traditional container systems. However, existing designs still face a fundamental isolation--performance trade-off. Nested-cloud…
Object-level management of tiered memory has been studied to address the inefficiencies in page-based systems. However, object-level management for CXL-tiered memory remains underexplored due to CXL's tight performance budget and load/store…
Speculative decoding and dynamic sparse attention are two complementary approaches for accelerating long-context LLM inference: the former amortizes target-model execution across multiple verifier queries, while the latter reduces each…