AnCoder: Anchored Code Generation via Discrete Diffusion Models

Machine Learning 2026-02-23 v1 Programming Languages

Abstract

Diffusion language models offer a compelling alternative to autoregressive code generation, enabling global planning and iterative refinement of complex program logic. However, existing approaches fail to respect the rigid structure of programming languages and, as a result, often produce broken programs that fail to execute. To address this, we introduce AnchorTree, a framework that explicitly anchors the diffusion process using structured, hierarchical priors native to code. Specifically, AnchorTree uses the abstract syntax tree to prioritize resolving syntactically and semantically salient tokens, such as keywords (e.g., if, while) and identifiers (e.g., variable names), thereby establishing a structural scaffold that guides the remaining generation. We validate this framework via AnCoder, a family of models showing that structurally anchored diffusion offers a parameter-efficient path to high-quality code generation.
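The anchoring idea can be illustrated with a small sketch. This is not the authors' implementation; it only assumes that "anchor" tokens are Python keywords plus identifiers found in the abstract syntax tree, and that anchors should be resolved before all other tokens. The function name `anchor_first_order` is hypothetical, and a real diffusion model would use learned, per-step unmasking scores rather than a fixed two-tier ordering.

```python
import ast
import io
import keyword
import tokenize


def anchor_first_order(source: str):
    """Return (tokens, order): token strings and an index order with anchors first.

    Hypothetical illustration only: anchors are keywords and AST identifiers.
    """
    # Collect identifiers the AST knows about: variables, definitions, arguments.
    identifiers = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            identifiers.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            identifiers.add(node.name)
        elif isinstance(node, ast.arg):
            identifiers.add(node.arg)

    # Lexical tokens, dropping layout-only tokens and comments.
    skip = {
        tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
        tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT,
    }
    toks = [
        t for t in tokenize.generate_tokens(io.StringIO(source).readline)
        if t.type not in skip
    ]

    def is_anchor(tok) -> bool:
        # Only NAME tokens can be keywords or identifiers.
        return tok.type == tokenize.NAME and (
            keyword.iskeyword(tok.string) or tok.string in identifiers
        )

    anchors = [i for i, t in enumerate(toks) if is_anchor(t)]
    rest = [i for i, t in enumerate(toks) if not is_anchor(t)]
    return [t.string for t in toks], anchors + rest


if __name__ == "__main__":
    src = "def f(x):\n    if x:\n        return x\n    return 0\n"
    tokens, order = anchor_first_order(src)
    # Print tokens in the order a scaffold-first schedule would reveal them.
    print([tokens[i] for i in order])
```

In this toy schedule the keywords and names (`def`, `f`, `x`, `if`, `return`) come first and form the scaffold, while punctuation and literals are filled in afterwards.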

Cite

@article{arxiv.2602.17688,
  title  = {AnCoder: Anchored Code Generation via Discrete Diffusion Models},
  author = {Anton Xue and Litu Rout and Constantine Caramanis and Sanjay Shakkottai},
  journal = {arXiv preprint arXiv:2602.17688},
  year   = {2026}
}