Computer Science

On Language Generation in the Limit with Bounded Memory

We study language generation in the limit under bounded memory. In this task, a learner observes examples from an unknown target language one at a time and must eventually output only new valid examples. Prior work assumes access to the…

Data Structures and Algorithms · Computer Science 2026-05-29 Jon Kleinberg, Anay Mehrotra, Amin Saberi, Grigoris Velegkas

A Radius-Sensitive Approximation Algorithm for Connected Submodular Maximization

Connected Submodular Maximization (CSM) is a graph problem with important applications to wireless network deployment, path planning, epidemic outbreaks, and cancer genome studies. In CSM, we are given a graph $G$, a non-negative monotone…

Data Structures and Algorithms · Computer Science 2026-05-29 Philip Cervenjak, Junhao Gan, Naonori Kakimura, Seeun William Umboh, Anthony Wirth

Sampling Directed Eulerian Tours in $\widetilde O(m^{3/2})$ Time

We give a randomized algorithm that samples a nearly uniform Eulerian tour of a directed Eulerian multigraph with $m$ arcs in $\widetilde O(m^{3/2})$ time. The guarantee is worst-case, applies to arbitrary directed Eulerian multigraphs, and…

Data Structures and Algorithms · Computer Science 2026-05-29 Nima Anari

Explaining Rankings with Hidden Group Bonuses

Determining a linear utility function that correlates with observed candidate rankings is a foundational problem with applications in domains such as admissions, hiring, and recommendation systems, e.g., [Storandt and Funke, AAAI'19, Zhang…

Data Structures and Algorithms · Computer Science 2026-05-29 Alvin Hong Yao Yan, Suraj Shetiya, Sujoy Bhore, Priyanka Golia, Diptarka Chakraborty

Distributed Gaussian Mean Testing under Communication Constraints: messages, samples, and coins

We revisit the problem of Gaussian mean testing in a distributed, communication-constrained setting, where each of $n$ users independently observes samples from an unknown $d$-dimensional spherical Gaussian distribution…

Data Structures and Algorithms · Computer Science 2026-05-29 Clément L. Canonne, Nimitt

An Improved Greedy Approximation for (Metric) $k$-Means

Clustering is a basic task in data analysis and machine learning, and clustering objectives give rise to well-studied optimization problems; amongst these objectives, the $k$-Means objective is arguably the best known. Given a…

Data Structures and Algorithms · Computer Science 2026-05-29 Moses Charikar, Vincent Cohen-Addad, Ruiquan Gao, Fabrizio Grandoni, Euiwoong Lee, Ernest van Wijland

Residual-Entropy Accounting for Routed Atom-Budgeted Learned Indexes

We study exact predecessor and rank search in a routed, atom-budgeted, certified-repair learned-index architecture. An ordered directory routes each query to a contiguous interval, a counted local predictor returns a certified rank window,…

Data Structures and Algorithms · Computer Science 2026-05-29 Faruk Alpay, Levent Sarioglu

Algorithms with Polynomially-Improved Approximation Factors for the $2 \rightarrow q$ Norm, and Applications

The $2 \rightarrow q$ norm of a matrix $X \in \mathbb{R}^{n \times d}$ is defined as $\lVert X \rVert_{2 \rightarrow q} = \sup_{\lVert v \rVert_2 = 1} \lVert Xv \rVert_q$. We give polynomial-time multiplicative approximation algorithms for…

Data Structures and Algorithms · Computer Science 2026-05-29 Samuel B. Hopkins, Stefan Tiegel

Parse indexing for discarding short pseudo-MEMs safely

Brown et al. (2025) described a pre-processing step, called $k$-mer based breaking (KeBaB), that speeds up searching for long maximal exact matches (MEMs) between a pattern $P$ and an indexed repetitive text $T$. KeBaB produces a set of…

Data Structures and Algorithms · Computer Science 2026-05-29 Travis Gagie

Min-Sum Set Cover on Parallel Machines

Consider the classical Min-Sum Set Cover problem: We are given a universe $\mathcal{U}$ of $n$ elements and a collection $\mathcal{S}$ of $k$ subsets of $\mathcal{U}$. Moreover, a cost function is associated with each set. The goal is to…

Data Structures and Algorithms · Computer Science 2026-05-29 Michał Szyfelbein

Grammar-Aware Literate Generative Mathematical Programming with Compiler-in-the-Loop

Mathematical programming is widely employed across various sectors (such as logistics, energy, and workforce planning) to model and solve industrial optimisation problems, but its use requires substantial domain expertise. Large language…

Programming Languages · Computer Science 2026-05-29 Roberto Rossi, Steven D. Prestwich

On the sensitivity of CDAWG-grammars

The compact directed acyclic word graph (CDAWG) [Blumer et al. 1987] of a string is the minimal compact automaton that recognizes all the suffixes of the string. CDAWGs can be used for various string tasks including text pattern searching,…

Data Structures and Algorithms · Computer Science 2026-05-29 Hiroto Fujimaru, Shunsuke Inenaga

CompilerDream: Learning a Compiler World Model for General Code Optimization

Effective code optimization in compilers is crucial for computer and software engineering. The success of these optimizations primarily depends on the selection and ordering of the optimization passes applied to the code. While most…

Programming Languages · Computer Science 2026-05-29 Chaoyi Deng, Jialong Wu, Ningya Feng, Jianmin Wang, Mingsheng Long

E-Path: Equality Saturation for Control-Flow Graphs

Modern equality saturation systems excel at expression-level rewrites by exploring large spaces of equivalent programs without suffering from the phase-ordering problem. However, these systems struggle to represent equivalence directly…

Programming Languages · Computer Science 2026-05-28 Guillermo Garcia

High-Quality Multi-Constraint Hypergraph Partitioning via Greedy Rebalancing

Multi-constraint hypergraph partitioning is a generalization of balanced partitioning, where the vertex set of a hypergraph is partitioned such that the inter-block connectivity of hyperedges is minimized while balancing the vertices with…

Data Structures and Algorithms · Computer Science 2026-05-28 Nikolai Maas

A Deterministic Separation Lemma

The Separation Lemma is a simple yet powerful tool, akin to the well-known Isolation Lemma, that guarantees the uniqueness of certain set sums. Bandopadhyay et al. introduced this lemma to establish lower bounds for the \ALP…

Data Structures and Algorithms · Computer Science 2026-05-28 Abhishek Sahu

Efficient Algorithms for Interdicting Facilities in Trees and Bounded Treewidth Graphs

Given a graph $G$ of $n$ nodes partitioned into facilities and customers, the $r$-edge interdiction covering problem (REIC) is to remove up to $r$ edges so as to maximize the total weight of customers disconnected from all facilities, which…

Data Structures and Algorithms · Computer Science 2026-05-28 Ali Abbasi, Eli Friedman, Leana Golubchik, Samir Khuller, Marco Paolieri

Skill-as-Pseudocode: Refactoring Skill Libraries to Pseudocode for LLM Agents

Markdown skill libraries for LLM agents ship as free-form prose, forcing the agent to re-derive both the input schema and the concrete invocation syntax on every retrieval. We observe that this often produces a "confused -> re-retrieve ->…

Programming Languages · Computer Science 2026-05-28 Xinze Li, Yuhang Zang, Yixin Cao, Aixin Sun

FPMoE: A Sparse Mixture-of-Experts Approach to Functional Code Generation

Despite rapid progress in LLM-based code generation, existing models are predominantly trained on imperative languages, leaving functional programming languages (FPLs) such as Haskell, OCaml, and Scala chronically underexplored, with even…

Programming Languages · Computer Science 2026-05-28 Loc Pham, Lang Hong Nguyet Anh, Thanh Le-Cong

Smoothed Score Queries and the Complexity of Sampling

We study the query complexity of sampling from high-dimensional Gaussian distributions using gradient information. In the standard oracle model, exact gradients expose only matrix-vector products with the precision matrix, leading to…

Data Structures and Algorithms · Computer Science 2026-05-28 Jingbo Liu