Computer Science
Large Language Models (LLMs) have revolutionized AI applications, but deploying them at scale presents significant challenges. We present RTP-LLM, a high-performance inference engine for industrial-scale LLM deployment, successfully…
Online change point detection in dynamic graphs requires comparing graphs as they arrive, in time linear in the number of edges, without parametric assumptions. Recent spectral methods address scale via the Kernel Polynomial Method (KPM):…
We describe libhmm, a C++20 library for Hidden Markov Model parameter estimation, sequence decoding, and model selection. libhmm addresses two gaps in existing software: the absence of a well-maintained, zero-dependency C++ HMM library…
A real-time multicore system requires delay bounds on access to shared resources. These resources include the kernel, which has potentially many non-preemptible critical sections guarded by one or more different synchronization primitives.…
In abstract models of algorithmic self-assembly, synchronization between attachments has emerged as a crucial distinction between the classical asynchronous model (aTAM) and a new synchronous model, the syncTAM. This paper presents recent…
We introduce a rotation-invariant representation of planar shapes. In particular, this representation encodes shapes as vectors such that the Euclidean distance between them serves as a valid shape distance. For standardized, star-shaped…
This paper proves a conjecture by Solomon about Steiner shallow-light trees (SLT) in Euclidean $d$-space: It is shown that for any finite point set $\mathbb{R}^d$, any root, and any $\epsilon>0$, there is a Euclidean Steiner…
Linux is the foundation of the digital age, accounting for the majority of the cloud and mobile OS markets. Any device that runs Linux uses the Linux page cache, a central pillar in OS and application performance, serving to reduce…
KV cache management is essential for efficient LLM inference. To maximize utilization, existing inference engines evict finished requests' KV cache if new requests are waiting. This policy breaks for agentic workloads, which interleave LLM…
Neural networks encode inputs as high-dimensional vectors, known as representations, that capture how models process data by encoding task-relevant structure and semantics. Representation alignment refers to the degree to which different…
Neural networks are increasingly deployed in scientific, safety critical, and mission critical pipelines, yet verification and analysis are often performed outside the programming environment that defines and runs the model. This creates a…
Recent developments in shape reconstruction and comparison call for the use of many different (topological) descriptor types, such as persistence diagrams and Euler characteristic functions. We establish a framework to quantitatively…
The persistent homology transform (PHT) represents a shape with a multiset of persistence diagrams parameterized by the sphere of directions in the ambient space. In this work, we describe a finite set of diagrams that discretize the PHT…
Our context of interest is how binary locality sensitive hash (LSH) functions can be used to solve the approximate near neighbour (ANN) problem, which seeks to find the k closest elements of some dataset X to some further point q presented…
In this paper, we show that it is NP-hard to determine whether a given graph admits a min-1-planar drawing. A drawing of a graph is min-$k$-planar if, for every crossing in the drawing, at least one of the two crossing edges involves at…
Efficient solutions of large-scale, ill-conditioned and indefinite algebraic equations are ubiquitously needed in numerous computational fields, including multiphysics simulations, machine learning, and data science. Because of their…
LLM-powered AI agents require high-frequency state exploration (e.g., test-time tree search and reinforcement learning), relying on rapid checkpoint and rollback (C/R) of the complete sandbox state, including files and process state (e.g.,…
Let $P$ be a set of $n$ points in the plane, where each element of $P$ is assigned a weight $\omega(p)$, positive or negative. In this paper, we present an algorithm that runs in $O(n^4\log n)$ time and $O(n)$ space to find two possibly…
Geometric modelling has been a crucial component of the design process ever since the introduction of the first Computer-Aided Design (CAD) systems. Additive Manufacturing (AM) pushes design freedom to previously unachievable limits. AM…
A key property of the Delaunay filtration is that it is topologically (i.e., weakly) equivalent to the offset (union-of-balls) filtration. Recently, this filtration has been extended to point clouds equipped with an $\mathbb{R}$-valued…