dLLM: Simple Diffusion Language Modeling

Zhanhui Zhou; Lingjie Chen; Hanghang Tong; Dawn Song

Subjects: Computation and Language; Artificial Intelligence; Machine Learning. Date: 2026-02-27 (v1)

Abstract

Although diffusion language models (DLMs) are evolving quickly, many recent models converge on a set of shared components. These components, however, are distributed across ad-hoc research codebases or lack transparent implementations, making them difficult to reproduce or extend. As the field accelerates, there is a clear need for a unified framework that standardizes these common components while remaining flexible enough to support new methods and architectures. To address this gap, we introduce dLLM, an open-source framework that unifies the core components of diffusion language modeling -- training, inference, and evaluation -- and makes them easy to customize for new designs. With dLLM, users can reproduce, finetune, deploy, and evaluate open-source large DLMs such as LLaDA and Dream through a standardized pipeline. The framework also provides minimal, reproducible recipes for building small DLMs from scratch with accessible compute, including converting any BERT-style encoder or autoregressive LM into a DLM. We also release the checkpoints of these small DLMs to make DLMs more accessible and accelerate future research.

Keywords

large language model; programming languages; diffusion model

Cite

@article{arxiv.2602.22661,
  title  = {dLLM: Simple Diffusion Language Modeling},
  author = {Zhanhui Zhou and Lingjie Chen and Hanghang Tong and Dawn Song},
  journal= {arXiv preprint arXiv:2602.22661},
  year   = {2026}
}

Comments

Code available at: https://github.com/ZHZisZZ/dllm