Computer Science

RTP-LLM: High-Performance Alibaba LLM Inference Engine

Large Language Models (LLMs) have revolutionized AI applications, but deploying them at scale presents significant challenges. We present RTP-LLM, a high-performance inference engine for industrial-scale LLM deployment, successfully…

Operating Systems · Computer Science 2026-05-29 Boyu Tan, Jiarui Guo, Zongwei Lv, Hanbo Sun, Tong Yang, Kan Liu, Xinfei Shi, Zetao Hu, Yaxin Yu, Chi Zhang, Jianning Zhang, Xi Yang, Wei Zhang, Bo Cai, Silu Zhou, Xiyu Wang, Na He, Yinghao Yu, Wending Bao, Guiyang Huang, Yuxing Yuan, Juncheng Yin, Nan Wang, Lin Yang, Zechao Zhang, Lu Chen, Guoding Li, Tao Lan, Lin Qu

libhmm: A Modern C++20 Library for Hidden Markov Models with Correct MLE Emission M-Steps

We describe libhmm, a C++20 library for Hidden Markov Model parameter estimation, sequence decoding, and model selection. libhmm addresses two gaps in existing software: the absence of a well-maintained, zero-dependency C++ HMM library…

Mathematical Software · Computer Science 2026-05-29 Gary Wolfman

The Biosecurity Blind Spot: Systematic Dual-use Detection in Open Science Infrastructure

AI is transforming life sciences research at unprecedented speed, accelerating discovery across protein structure prediction, genome modeling, and drug development (Jumper et al., 2021; Mak et al., 2024). Yet this rapid advancement, coupled…

Digital Libraries · Computer Science 2026-05-29 Vasudha Sharma, Chakresh Kumar Singh, Jayesh Choudhari, Dharmit Nakrani

Co-creation of AI technology, empowering curators of cultural heritage information and guarding research commons

This paper describes the use of Retrieval-Augmented Generation (RAG) for specific digital collections of cultural assets. The collections are provided by institutions operating in the cultural sector. The…

Digital Libraries · Computer Science 2026-05-29 Andrea Scharnhorst, Han Yang, Jetze Touber, Kim Ferguson, Philipp Mayr, Vyacheslav Tykhonov

Verified Misguidance: Measuring Structural Citation Failures in Search-Augmented LLMs

Users of search-augmented LLMs rely on citations as evidence that responses are grounded in real sources, and rarely verify the cited pages themselves. Millions of queries per day now pass through these systems, making citation quality a…

Digital Libraries · Computer Science 2026-05-28 Yongsik Seo, Wooseok Jeong, Eunyoung Kim, Hyeonseo Jang, Dongha Lee

CiteCheck: Retrieval-Grounded Detection of LLM Citation Hallucinations in Scientific Text

Large language models (LLMs) are increasingly used to generate scientific reports, but they can produce references that appear plausible while containing corrupted metadata or pointing to papers that do not exist. We introduce CiteCheck, a…

Digital Libraries · Computer Science 2026-05-28 Khashayar Khajavi, Shaghayegh Sadeghi, Rise Adhikari, Alexander Tessier

Bounded Priority-Aware Locking for Real-Time Kernels

A real-time multicore system requires delay bounds on access to shared resources. These resources include the kernel, which has potentially many non-preemptible critical sections guarded by one or more different synchronization primitives.…

Operating Systems · Computer Science 2026-05-28 Shriram Raja, Richard West

LearnedCache: An eBPF-Integrated Perceptron-Based Eviction Policy for the Linux Page Cache

Linux is the foundation of the digital age, accounting for the majority of the cloud and mobile OS markets. Any device that runs Linux uses the Linux page cache, a central pillar in OS and application performance, serving to reduce…

Operating Systems · Computer Science 2026-05-27 Zejia Qi

Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live

KV cache management is essential for efficient LLM inference. To maximize utilization, existing inference engines evict finished requests' KV cache if new requests are waiting. This policy breaks for agentic workloads, which interleave LLM…

Operating Systems · Computer Science 2026-05-27 Hanchen Li, Runyuan He, Qiuyang Mang, Qizheng Zhang, Huanzhi Mao, Xiaokun Chen, Hangrui Zhou, Alvin Cheung, Joseph Gonzalez, Ion Stoica

sciwrite-lint: Verification Infrastructure for the Age of Science Vibe-Writing

Scientific papers make claims about prior work backed by citations. Verifying those citations at scale (that each cited paper exists, says what the citation claims, and is itself reliable) is structurally beyond what human review can…

Digital Libraries · Computer Science 2026-05-26 Sergey V Samsonau

TorchLean: Formalizing Neural Networks in Lean

Neural networks are increasingly deployed in scientific, safety-critical, and mission-critical pipelines, yet verification and analysis are often performed outside the programming environment that defines and runs the model. This creates a…

Mathematical Software · Computer Science 2026-05-26 Robert Joseph George, Jennifer Cruden, Will Adkisson, Xiangru Zhong, Huan Zhang, Anima Anandkumar

BookReconciler: An Open-Source Tool for Metadata Enrichment and Work-Level Clustering

We present BookReconciler, an open-source tool for enhancing and clustering book data. BookReconciler allows users to take spreadsheets with minimal metadata, such as book title and author, and automatically 1) add authoritative, persistent…

Digital Libraries · Computer Science 2026-05-26 Matt Miller, Dan Sinykin, Melanie Walsh

HALvest-Contrastive: Retrieval-Like Authorship Attribution with Patch-Level Late Interaction

Authorship attribution asks whether two pieces of text share a writer, but topical confound makes the task deceptively easy: two authors covering the same topic may look more alike than one author covering two topics. Scholarly prose offers…

Digital Libraries · Computer Science 2026-05-26 Francis Kulumba, Wissam Antoun, Guillaume Vimont, Laurent Romary, Florian Cafiero

Tracking a Decade of Research at the University of Nigeria, Nsukka: A Scientometric Analysis (2014-2023)

This study employs scientometric methods to assess the research output and performance of the University of Nigeria from 2014 to 2023. By analyzing publication trends, citation patterns, and collaboration networks, the research aims to…

Digital Libraries · Computer Science 2026-05-25 Muneer Ahmad, Joseph U Igligli

Thinking like a business: Reconfiguring relationships to sustain open data infrastructures

Sustaining open data infrastructures over time is a complex puzzle, involving dynamic funding models and relationships with customers, collaborators, and competitors. Despite their importance, these mechanisms are often hidden from view,…

Digital Libraries · Computer Science 2026-05-25 Kathleen Gregory, Dorothea Strecker

Parallel Sparse and Data-Sparse Factorization-based Linear Solvers

Efficient solutions of large-scale, ill-conditioned and indefinite algebraic equations are ubiquitously needed in numerous computational fields, including multiphysics simulations, machine learning, and data science. Because of their…

Mathematical Software · Computer Science 2026-05-25 Xiaoye Sherry Li, Yang Liu

DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback

LLM-powered AI agents require high-frequency state exploration (e.g., test-time tree search and reinforcement learning), relying on rapid checkpoint and rollback (C/R) of the complete sandbox state, including files and process state (e.g.,…

Operating Systems · Computer Science 2026-05-22 Yunpeng Dong, Jingkai He, Yuze Hou, Dong Du, Zhonghu Xu, Si Yu, Yubin Xia, Haibo Chen

The Ephemeral Web and the Case for Proactive Archiving

The web is often treated as a durable record of institutional and social life, yet in practice it is fragile, revisable, and frequently ephemeral. Domains change, redesigns erase earlier material, institutions relocate, maintainers…

Digital Libraries · Computer Science 2026-05-22 Meliksah Yorulmazlar

The Curious Case of Max Planck retracted papers. When past scientific practices meet contemporary publishing norms

This article examines the case of two papers published in Naturwissenschaften by the physicist Max Planck that were retrospectively marked as retracted on Springer's digital platform. Rather than originating in scientific fraud, these…

Digital Libraries · Computer Science 2026-05-22 Yves Gingras, Mahdi Khelfaoui

Parallelized Discrete Exterior Calculus for Three-Dimensional Elliptic Problems

A formulation of elliptic boundary value problems is used to develop the first discrete exterior calculus (DEC) library for massively parallel computations with 3D domains. This can be used for steady-state analysis of any physical process…

Mathematical Software · Computer Science 2026-05-22 Pieter D. Boom, Ashley Seepujak, Odysseas Kosmas, Lee Margetts, Andrey Jivkov