Dependency-Guided Parallel Decoding in Discrete Diffusion Language Models

Liran Ringel; Ameen Ali; Yaniv Romano

Computation and Language 2026-04-06 v1

Abstract

Discrete diffusion language models (dLLMs) accelerate text generation by unmasking multiple tokens in parallel. However, parallel decoding introduces a distributional mismatch: it approximates the joint conditional distribution with a fully factorized product of per-token marginals, which degrades output quality when the selected tokens are strongly dependent. We propose DEMASK (DEpendency-guided unMASKing), a lightweight dependency predictor that attaches to the final hidden states of a dLLM. In a single forward pass, it estimates pairwise conditional influences between masked positions. Using these predictions, a greedy selection algorithm identifies a set of positions with bounded cumulative dependency for simultaneous unmasking. Under a sub-additivity assumption, we prove that this selection bounds the total variation distance between our parallel sampling distribution and the model's joint distribution. Empirically, DEMASK achieves a 1.7-2.2$\times$ speedup on Dream-7B while matching or improving accuracy compared to confidence-based and KL-based baselines.
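
The greedy selection step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the candidate ordering, the form of the dependency scores, or the budget rule, so the names `greedy_select`, `dep`, `confidence`, and `budget` are hypothetical, and the choices to visit candidates in order of decreasing confidence and to sum pairwise scores in both directions are assumptions made for illustration.

```python
from typing import List, Sequence


def greedy_select(
    dep: Sequence[Sequence[float]],
    confidence: Sequence[float],
    budget: float,
) -> List[int]:
    """Choose masked positions whose summed pairwise dependency stays within budget.

    dep[i][j] is an assumed non-negative score for how strongly position i
    depends on position j (hypothetical output of a dependency predictor).
    """
    # Assumption: visit candidates from most to least confident.
    order = sorted(range(len(confidence)), key=lambda i: -confidence[i])
    selected: List[int] = []
    total = 0.0
    for i in order:
        # Extra dependency incurred by adding i to the current set,
        # counted in both directions for every already selected position.
        added = sum(dep[i][j] + dep[j][i] for j in selected)
        if total + added <= budget:
            selected.append(i)
            total += added
    return selected


# Toy example with three masked positions: 0 and 1 are strongly coupled.
dep = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.05],
    [0.1, 0.05, 0.0],
]
confidence = [0.9, 0.8, 0.7]
print(greedy_select(dep, confidence, budget=0.5))  # [0, 2]
```

In this toy case, position 1 is skipped because unmasking it together with position 0 would exceed the budget, while position 2 is only weakly coupled to position 0 and is unmasked in the same step. With a non-negative budget, the first candidate is always accepted, so each step unmasks at least one token.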

Keywords

diffusion model; tokenization; language modeling

Cite

@article{arxiv.2604.02560,
  title  = {Dependency-Guided Parallel Decoding in Discrete Diffusion Language Models},
  author = {Liran Ringel and Ameen Ali and Yaniv Romano},
  journal= {arXiv preprint arXiv:2604.02560},
  year   = {2026}
}
