$\texttt{SEM-CTRL}$: Semantically Controlled Decoding

Computation and Language 2026-04-10 v4 Artificial Intelligence Machine Learning

Abstract

Ensuring both syntactic and semantic correctness in Large Language Model (LLM) outputs remains a significant challenge, despite being critical for real-world deployment. In this paper, we introduce $\texttt{SEM-CTRL}$, a unified approach that enforces rich context-sensitive constraints, together with task- and instance-specific semantics, directly on the LLM decoder. Our approach integrates token-level Monte Carlo Tree Search (MCTS) guided by specific syntactic and semantic constraints. The constraints over desired outputs are expressed using Answer Set Grammars, a logic-based formalism that generalizes context-sensitive grammars while incorporating background knowledge to represent task-specific semantics. We show that our approach helps guarantee valid completions for any off-the-shelf LLM without the need for fine-tuning. We evaluate $\texttt{SEM-CTRL}$ on a range of tasks, including synthetic grammar synthesis, combinatorial reasoning, JSON parsing, and planning. Our experimental results demonstrate that $\texttt{SEM-CTRL}$ allows even small pre-trained LLMs to efficiently outperform larger variants and state-of-the-art reasoning models (e.g., $\textit{o4-mini}$) while simultaneously guaranteeing semantic validity.
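To make the idea of enforcing constraints at the token level concrete, the following is a minimal sketch and not the method of the paper. It makes several simplifying assumptions: it uses greedy decoding rather than MCTS, a hand-written prefix checker rather than an Answer Set Grammar, and a hypothetical `toy_scores` function in place of a real LLM. The target language $a^n b^n c^n$ is chosen here only because it is a standard example of a context-sensitive language.

```python
# Illustrative sketch only: greedy decoding with a constraint mask.
# All names here (VOCAB, toy_scores, ...) are hypothetical, not from the paper.
EOS = "<eos>"
VOCAB = ["a", "b", "c", EOS]


def _counts(tokens):
    """Return (na, nb, nc, ordered) for a token list over {a, b, c}."""
    s = "".join(tokens)
    na = len(s) - len(s.lstrip("a"))
    rest = s[na:]
    nb = len(rest) - len(rest.lstrip("b"))
    rest = rest[nb:]
    nc = len(rest) - len(rest.lstrip("c"))
    ordered = rest[nc:] == ""  # nothing may follow the run of c's
    return na, nb, nc, ordered


def is_valid_prefix(tokens):
    """True if tokens can still be extended to some a^n b^n c^n with n >= 1."""
    na, nb, nc, ordered = _counts(tokens)
    if not ordered or nb > na or nc > nb:
        return False
    # Once a 'c' appears, the number of b's must already match the a's.
    return nc == 0 or nb == na


def is_complete(tokens):
    """True if tokens spell exactly a^n b^n c^n with n >= 1."""
    na, nb, nc, ordered = _counts(tokens)
    return ordered and na == nb == nc and na >= 1


def allowed_next(prefix, token):
    """Constraint mask: may `token` follow `prefix`?"""
    if token == EOS:
        return is_complete(prefix)
    return is_valid_prefix(prefix + [token])


def toy_scores(prefix):
    """Stand-in for LLM logits: likes two a's, then prefers b forever."""
    return {"a": 3.0 if len(prefix) < 2 else 0.0, "b": 2.0, "c": 1.0, EOS: 0.5}


def constrained_greedy_decode(score_fn, max_len=20):
    """Pick the highest-scoring token among those the constraint allows."""
    prefix = []
    for _ in range(max_len):
        scores = score_fn(prefix)
        allowed = [t for t in VOCAB if allowed_next(prefix, t)]
        if not allowed:
            break
        best = max(allowed, key=lambda t: scores[t])
        if best == EOS:
            return prefix
        prefix.append(best)
    return prefix


if __name__ == "__main__":
    print("".join(constrained_greedy_decode(toy_scores)))  # prints "aabbcc"
```

Without the mask, the toy scorer would emit `b` indefinitely after the two `a` tokens. With the mask, every emitted token keeps the prefix extendable to a valid string, and the end-of-sequence token is permitted only once the string is complete.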

Cite

@article{arxiv.2503.01804,
  title  = {$\texttt{SEM-CTRL}$: Semantically Controlled Decoding},
  author = {Mohammad Albinhassan and Pranava Madhyastha and Alessandra Russo},
  journal= {arXiv preprint arXiv:2503.01804},
  year   = {2026}
}

Comments

Published in Transactions on Machine Learning Research (TMLR), 03/2026
