Related papers: Diffusion On Syntax Trees For Program Synthesis

Rethinking Token Prediction: Tree-Structured Diffusion Language Model

Discrete diffusion language models have emerged as a competitive alternative to auto-regressive language models, but training them efficiently under limited parameter and memory budgets remains challenging. Modern architectures are…

Computation and Language · Computer Science 2026-04-07 Zihao Wu, Haoming Yang, Juncheng Dong, Vahid Tarokh

TreeDiff: AST-Guided Code Generation with Diffusion LLMs

Code generation is increasingly critical for real-world applications. Still, diffusion-based large language models continue to struggle with this demand. Unlike free-form text, code requires syntactic precision; even minor structural…

Computation and Language · Computer Science 2026-01-07 Yiming Zeng, Jinghan Cao, Zexin Li, Yiming Chen, Tao Ren, Zhuochun Li, Dawei Xiang, Xidong Wu, Shangqian Gao, Tingting Yu

Self-conditioned Embedding Diffusion for Text Generation

Can continuous diffusion models bring the same performance breakthrough on natural language they did for image generation? To circumvent the discrete nature of text data, we can simply project tokens in a continuous space of embeddings, as…

Computation and Language · Computer Science 2022-11-09 Robin Strudel, Corentin Tallec, Florent Altché, Yilun Du, Yaroslav Ganin, Arthur Mensch, Will Grathwohl, Nikolay Savinov, Sander Dieleman, Laurent Sifre, Rémi Leblond

Schrödinger's Tree -- On Syntax and Neural Language Models

In the last half-decade, the field of natural language processing (NLP) has undergone two major transitions: the switch to neural networks as the primary modeling paradigm and the homogenization of the training regime (pre-train, then…

Computation and Language · Computer Science 2021-10-19 Artur Kulmizev, Joakim Nivre

AnCoder: Anchored Code Generation via Discrete Diffusion Models

Diffusion language models offer a compelling alternative to autoregressive code generation, enabling global planning and iterative refinement of complex program logic. However, existing approaches fail to respect the rigid structure of…

Machine Learning · Computer Science 2026-02-23 Anton Xue, Litu Rout, Constantine Caramanis, Sanjay Shakkottai

Learning Structural Edits via Incremental Tree Transformations

While most neural generative models generate outputs in a single pass, the human creative process is usually one of iterative building and refinement. Recent work has proposed models of editing processes, but these mostly focus on editing…

Machine Learning · Computer Science 2021-03-08 Ziyu Yao, Frank F. Xu, Pengcheng Yin, Huan Sun, Graham Neubig

On Tree-Based Neural Sentence Modeling

Neural networks with tree-based sentence encoders have shown better results on many downstream tasks. Most existing tree-based encoders adopt syntactic parsing trees as the explicit structure prior. To study the effectiveness of…

Computation and Language · Computer Science 2018-08-30 Haoyue Shi, Hao Zhou, Jiaze Chen, Lei Li

Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model

Recently, diffusion-based image generation methods are credited for their remarkable text-to-image generation capabilities, while still facing challenges in accurately generating multilingual scene text images. To tackle this problem, we…

Computer Vision and Pattern Recognition · Computer Science 2023-12-20 Lingjun Zhang, Xinyuan Chen, Yaohui Wang, Yue Lu, Yu Qiao

Retrieval-Based Neural Code Generation

In models to generate program source code from natural language, representing this code in a tree structure has been a common approach. However, existing methods often fail to generate complex code correctly due to a lack of ability to…

Computation and Language · Computer Science 2018-08-31 Shirley Anugrah Hayati, Raphael Olivier, Pravalika Avvaru, Pengcheng Yin, Anthony Tomasic, Graham Neubig

Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens

Recent endeavors in Multimodal Large Language Models (MLLMs) aim to unify visual comprehension and generation by combining LLM and diffusion models, the state-of-the-art in each task, respectively. Existing approaches rely on spatial visual…

Computer Vision and Pattern Recognition · Computer Science 2025-04-22 Kaihang Pan, Wang Lin, Zhongqi Yue, Tenglong Ao, Liyu Jia, Wei Zhao, Juncheng Li, Siliang Tang, Hanwang Zhang

Beyond Autoregression: An Empirical Study of Diffusion Large Language Models for Code Generation

LLMs have become the mainstream approaches to code generation. Existing LLMs mainly employ autoregressive generation, i.e. generating code token-by-token from left to right. However, the underlying autoregressive generation has two…

Software Engineering · Computer Science 2025-11-04 Chengze Li, Yitong Zhang, Jia Li, Liyi Cai, Ge Li

Incorporating Syntactic Uncertainty in Neural Machine Translation with Forest-to-Sequence Model

Incorporating syntactic information in Neural Machine Translation models is a method to compensate their requirement for a large amount of parallel training text, especially for low-resource language pairs. Previous works on using syntactic…

Computation and Language · Computer Science 2017-11-27 Poorya Zaremoodi, Gholamreza Haffari

Syntax-driven Iterative Expansion Language Models for Controllable Text Generation

The dominant language modeling paradigm handles text as a sequence of discrete tokens. While that approach can capture the latent structure of the text, it is inherently constrained to sequential dynamics for text generation. We propose a…

Computation and Language · Computer Science 2020-11-02 Noe Casas, José A. R. Fonollosa, Marta R. Costa-jussà

Diffuse Thinking: Exploring Diffusion Language Models as Efficient Thought Proposers for Reasoning

In recent years, large language models (LLMs) have witnessed remarkable advancements, with the test-time scaling law consistently enhancing the reasoning capabilities. Through systematic evaluation and exploration of a diverse spectrum of…

Computation and Language · Computer Science 2025-11-03 Chenyang Shao, Sijian Ren, Fengli Xu, Yong Li

In Tree Structure Should Sentence Be Generated

Generative models reliant on sequential autoregression have been at the forefront of language generation for an extensive period, particularly following the introduction of widely acclaimed transformers. Despite its excellent performance,…

Computation and Language · Computer Science 2024-06-21 Yaguang Li, Xin Chen

Finding Syntax in Human Encephalography with Beam Search

Recurrent neural network grammars (RNNGs) are generative models of (tree,string) pairs that rely on neural networks to evaluate derivational choices. Parsing with them using beam search yields a variety of incremental complexity metrics…

Computation and Language · Computer Science 2018-06-12 John Hale, Chris Dyer, Adhiguna Kuncoro, Jonathan R. Brennan

Diffusion-LM Improves Controllable Text Generation

Controlling the behavior of language models (LMs) without re-training is a major open problem in natural language generation. While recent works have demonstrated successes on controlling simple sentence attributes (e.g., sentiment), there…

Computation and Language · Computer Science 2022-05-31 Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto

Seek for Incantations: Towards Accurate Text-to-Image Diffusion Synthesis through Prompt Engineering

Text-to-image synthesis by diffusion models has recently shown remarkable performance in generating high-quality images. Although they perform well on simple texts, the models may get confused when faced with complex texts that contain…

Computer Vision and Pattern Recognition · Computer Science 2024-01-15 Chang Yu, Junran Peng, Xiangyu Zhu, Zhaoxiang Zhang, Qi Tian, Zhen Lei

TSLM: Tree-Structured Language Modeling for Divergent Thinking

Language models generate reasoning sequentially, preventing them from decoupling irrelevant exploration paths during search. We introduce Tree-Structured Language Modeling (TSLM), which uses special tokens to encode branching structure,…

Computation and Language · Computer Science 2026-02-02 Doyoung Kim, Jaehyeok Doo, Minjoon Seo

SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers

The diffusion model, a new generative modelling paradigm, has achieved great success in image, audio, and video generation. However, considering the discrete categorical nature of text, it is not trivial to extend continuous diffusion models to…

Computation and Language · Computer Science 2023-05-23 Hongyi Yuan, Zheng Yuan, Chuanqi Tan, Fei Huang, Songfang Huang