Related papers: Insertion-based Decoding with automatically Inferr…

Insertion Transformer: Flexible Sequence Generation via Insertion Operations

We present the Insertion Transformer, an iterative, partially autoregressive model for sequence generation based on insertion operations. Unlike typical autoregressive models which rely on a fixed, often left-to-right ordering of the…

Computation and Language · Computer Science · 2019-02-12 · Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit

Discovering Non-monotonic Autoregressive Orderings with Variational Inference

The predominant approach for language modeling is to process sequences from left to right, but this eliminates a source of information: the order by which the sequence was generated. One strategy to recover this information is to decode…

Computation and Language · Computer Science · 2021-11-01 · Xuanlin Li, Brandon Trabucco, Dong Huk Park, Michael Luo, Sheng Shen, Trevor Darrell, Yang Gao

Sequence Modeling with Unconstrained Generation Order

The dominant approach to sequence generation is to produce a sequence in some predefined order, e.g. left to right. In contrast, we propose a more general model that can generate the output sequence by inserting tokens in any arbitrary…

Computation and Language · Computer Science · 2019-11-04 · Dmitrii Emelianenko, Elena Voita, Pavel Serdyukov

Insertion Based Sequence Generation with Learnable Order Dynamics

In many domains generating variable length sequences through insertions provides greater flexibility over autoregressive models. However, the action space of insertion models is much larger than that of autoregressive models (ARMs) making…

Machine Learning · Computer Science · 2026-02-24 · Dhruvesh Patel, Benjamin Rozonoyer, Gaurav Pandey, Tahira Naseem, Ramón Fernandez Astudillo, Andrew McCallum

Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement

We propose a conditional non-autoregressive neural sequence model based on iterative refinement. The proposed model is designed based on the principles of latent variable models and denoising autoencoders, and is generally applicable to any…

Machine Learning · Computer Science · 2018-08-29 · Jason Lee, Elman Mansimov, Kyunghyun Cho

Learning and Analyzing Generation Order for Undirected Sequence Models

Undirected neural sequence models have achieved performance competitive with the state-of-the-art directed sequence models that generate monotonically from left to right in machine translation tasks. In this work, we train a policy that…

Computation and Language · Computer Science · 2021-12-17 · Yichen Jiang, Mohit Bansal

Decoding Order Matters in Autoregressive Speech Synthesis

Autoregressive speech synthesis often adopts a left-to-right order, yet generation order is a modelling choice. We investigate decoding order through a masked diffusion framework, which progressively unmasks positions and allows arbitrary…

Sound · Computer Science · 2026-01-14 · Minghui Zhao, Anton Ragni

Fast Autoregressive Video Generation with Diagonal Decoding

Autoregressive Transformer models have demonstrated impressive performance in video generation, but their sequential token-by-token decoding process poses a major bottleneck, particularly for long videos represented by tens of thousands of…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-19 · Yang Ye, Junliang Guo, Haoyu Wu, Tianyu He, Tim Pearce, Tabish Rashid, Katja Hofmann, Jiang Bian

INDigo: An INN-Guided Probabilistic Diffusion Algorithm for Inverse Problems

Recently it has been shown that using diffusion models for inverse problems can lead to remarkable results. However, these approaches require a closed-form expression of the degradation model and cannot support complex degradations. To…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-06 · Di You, Andreas Floros, Pier Luigi Dragotti

Using Intermediate Forward Iterates for Intermediate Generator Optimization

Score-based models have recently been introduced as a richer framework to model distributions in high dimensions and are generally more suitable for generative tasks. In score-based models, a generative task is formulated using a parametric…

Machine Learning · Computer Science · 2023-02-07 · Harsh Mishra, Jurijs Nazarovs, Manmohan Dogra, Sathya N. Ravi

Towards More Efficient Insertion Transformer with Fractional Positional Encoding

Auto-regressive neural sequence models have been shown to be effective across text generation tasks. However, their left-to-right decoding order prevents generation from being parallelized. Insertion Transformer (Stern et al., 2019) is an…

Computation and Language · Computer Science · 2023-02-01 · Zhisong Zhang, Yizhe Zhang, Bill Dolan

INDIGO+: A Unified INN-Guided Probabilistic Diffusion Algorithm for Blind and Non-Blind Image Restoration

Generative diffusion models are becoming one of the most popular priors in image restoration (IR) tasks due to their remarkable ability to generate realistic natural images. Despite achieving satisfactory results, IR methods based on…

Computer Vision and Pattern Recognition · Computer Science · 2025-01-27 · Di You, Pier Luigi Dragotti

Sequence Generation: From Both Sides to the Middle

The encoder-decoder framework has achieved promising progress for many sequence generation tasks, such as neural machine translation and text summarization. Such a framework usually generates a sequence token by token from left to right,…

Computation and Language · Computer Science · 2019-06-25 · Long Zhou, Jiajun Zhang, Chengqing Zong, Heng Yu

InsNet: An Efficient, Flexible, and Performant Insertion-based Text Generation Model

We propose InsNet, an expressive insertion-based text generator with efficient training and flexible decoding (parallel or sequential). Unlike most existing insertion-based text generation works that require re-encoding of the context after…

Computation and Language · Computer Science · 2022-10-18 · Sidi Lu, Tao Meng, Nanyun Peng

Non-Autoregressive Machine Translation with Disentangled Context Transformer

State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens. The sequential nature of this generation process causes fundamental latency in…

Computation and Language · Computer Science · 2020-07-01 · Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu

Self-Infilling Code Generation

This work introduces self-infilling code generation, a general framework that incorporates infilling operations into auto-regressive decoding. Our approach capitalizes on the observation that recent infilling-capable code language models…

Programming Languages · Computer Science · 2024-05-28 · Lin Zheng, Jianbo Yuan, Zhi Zhang, Hongxia Yang, Lingpeng Kong

σ-GPTs: A New Approach to Autoregressive Models

Autoregressive models, such as the GPT family, use a fixed order, usually left-to-right, to generate sequences. However, this is not a necessity. In this paper, we challenge this assumption and show that by simply adding a positional…

Machine Learning · Computer Science · 2024-07-02 · Arnaud Pannatier, Evann Courdier, François Fleuret

Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation

To ensure that text generated by large language models (LLMs) is in an expected format, constrained decoding proposes to enforce strict formal language constraints during generation. However, as we show in this work, not only do such…

Machine Learning · Computer Science · 2024-03-13 · Luca Beurer-Kellner, Marc Fischer, Martin Vechev

Synchronous Bidirectional Inference for Neural Sequence Generation

In sequence to sequence generation tasks (e.g. machine translation and abstractive summarization), inference is generally performed in a left-to-right manner to produce the result token by token. The neural approaches, such as LSTM and…

Computation and Language · Computer Science · 2019-02-26 · Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong

Fast Interleaved Bidirectional Sequence Generation

Independence assumptions during sequence generation can speed up inference, but parallel generation of highly inter-dependent tokens comes at a cost in quality. Instead of assuming independence between neighbouring tokens…

Computation and Language · Computer Science · 2020-10-28 · Biao Zhang, Ivan Titov, Rico Sennrich