Related papers: Pseudo-Bidirectional Decoding for Local Sequence T…

Transformer with Bidirectional Decoder for Speech Recognition

Attention-based models have made tremendous progress on end-to-end automatic speech recognition (ASR) recently. However, the conventional transformer-based approaches usually generate the sequence results token by token from left to right,…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-12 Xi Chen, Songyang Zhang, Dandan Song, Peng Ouyang, Shouyi Yin

Self-Speculative Biased Decoding for Faster Re-Translation

Large language models achieve strong machine translation quality but incur high inference cost and latency, posing challenges for simultaneous translation. Re-translation provides a practical solution for off-the-shelf LLMs by repeatedly…

Computation and Language · Computer Science 2026-01-06 Linxiao Zeng, Haoyun Deng, Kangyuan Shu, Shizhen Wang

SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations

Recent years have seen the successful application of large pre-trained models to code representation learning, resulting in substantial improvements on many code-related downstream tasks. But there are issues surrounding their application…

Software Engineering · Computer Science 2022-05-26 Changan Niu, Chuanyi Li, Vincent Ng, Jidong Ge, Liguo Huang, Bin Luo

Synchronous Bidirectional Neural Machine Translation

Existing approaches to neural machine translation (NMT) generate the target language sequence token by token from left to right. However, this kind of unidirectional decoding framework cannot make full use of the target-side future contexts…

Computation and Language · Computer Science 2019-05-14 Long Zhou, Jiajun Zhang, Chengqing Zong

Synchronous Bidirectional Inference for Neural Sequence Generation

In sequence to sequence generation tasks (e.g. machine translation and abstractive summarization), inference is generally performed in a left-to-right manner to produce the result token by token. The neural approaches, such as LSTM and…

Computation and Language · Computer Science 2019-02-26 Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong

Distribution-Aligned Decoding for Efficient LLM Task Adaptation

Adapting billion-parameter language models to a downstream task is still costly, even with parameter-efficient fine-tuning (PEFT). We re-cast task adaptation as output-distribution alignment: the objective is to steer the output…

Computation and Language · Computer Science 2026-03-03 Senkang Hu, Xudong Han, Jinqi Jiang, Yihang Tao, Zihan Fang, Yong Dai, Sam Tak Wu Kwong, Yuguang Fang

A Framework for Bidirectional Decoding: Case Study in Morphological Inflection

Transformer-based encoder-decoder models that generate outputs in a left-to-right fashion have become standard for sequence-to-sequence tasks. In this paper, we propose a framework for decoding that produces sequences from the "outside-in":…

Computation and Language · Computer Science 2023-10-31 Marc E. Canby, Julia Hockenmaier

Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation

A conventional approach to improving the performance of end-to-end speech translation (E2E-ST) models is to leverage the source transcription via pre-training and joint training with automatic speech recognition (ASR) and neural machine…

Computation and Language · Computer Science 2021-04-15 Hirofumi Inaguma, Tatsuya Kawahara, Shinji Watanabe

Asynchronous Bidirectional Decoding for Neural Machine Translation

The dominant neural machine translation (NMT) models apply unified attentional encoder-decoder neural networks for translation. Traditionally, the NMT decoders adopt recurrent neural networks (RNNs) to perform translation in a left-to-right…

Computation and Language · Computer Science 2018-02-06 Xiangwen Zhang, Jinsong Su, Yue Qin, Yang Liu, Rongrong Ji, Hongji Wang

Bidirectional Scene Text Recognition with a Single Decoder

Scene Text Recognition (STR) is the problem of recognizing the correct word or character sequence in a cropped word image. To obtain more robust output sequences, the notion of bidirectional STR has been introduced. So far, bidirectional…

Computer Vision and Pattern Recognition · Computer Science 2020-03-03 Maurits Bleeker, Maarten de Rijke

LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning

Fine-tuning large pre-trained models on downstream tasks has been adopted in a variety of domains recently. However, it is costly to update the entire parameter set of large pre-trained models. Although recently proposed parameter-efficient…

Computation and Language · Computer Science 2022-11-01 Yi-Lin Sung, Jaemin Cho, Mohit Bansal

Consecutive Decoding for Speech-to-text Translation

Speech-to-text translation (ST), which directly translates the source language speech to the target language text, has attracted intensive attention recently. However, the combination of speech recognition and machine translation in a…

Computation and Language · Computer Science 2022-04-18 Qianqian Dong, Mingxuan Wang, Hao Zhou, Shuang Xu, Bo Xu, Lei Li

R-BI: Regularized Batched Inputs enhance Incremental Decoding Framework for Low-Latency Simultaneous Speech Translation

Incremental Decoding is an effective framework that enables the use of an offline model in a simultaneous setting without modifying the original model, making it suitable for Low-Latency Simultaneous Speech Translation. However, this…

Computation and Language · Computer Science 2024-01-12 Jiaxin Guo, Zhanglin Wu, Zongyao Li, Hengchao Shang, Daimeng Wei, Xiaoyu Chen, Zhiqiang Rao, Shaojun Li, Hao Yang

Reflective Decoding: Beyond Unidirectional Generation with Off-the-Shelf Language Models

Publicly available, large pretrained Language Models (LMs) generate text with remarkable quality, but only sequentially from left to right. As a result, they are not immediately applicable to generation tasks that break the unidirectional…

Computation and Language · Computer Science 2021-12-28 Peter West, Ximing Lu, Ari Holtzman, Chandra Bhagavatula, Jena Hwang, Yejin Choi

Boosting Template-based SSVEP Decoding by Cross-domain Transfer Learning

Objective: This study aims to establish a generalized transfer-learning framework for boosting the performance of steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) by leveraging cross-domain data…

Machine Learning · Computer Science 2021-02-11 Kuan-Jung Chiang, Chun-Shu Wei, Masaki Nakanishi, Tzyy-Ping Jung

Set Block Decoding is a Language Model Inference Accelerator

Autoregressive next token prediction language models offer powerful capabilities but face significant challenges in practical deployment due to the high computational and memory costs of inference, particularly during the decoding stage. We…

Machine Learning · Computer Science 2025-09-05 Itai Gat, Heli Ben-Hamu, Marton Havasi, Daniel Haziza, Jeremy Reizenstein, Gabriel Synnaeve, David Lopez-Paz, Brian Karrer, Yaron Lipman

Latent Speech-Text Transformer

Auto-regressive speech-text models pre-trained on interleaved text tokens and discretized speech tokens demonstrate strong speech understanding and generation, yet remain substantially less compute-efficient than text LLMs, partly due to…

Computation and Language · Computer Science 2026-03-11 Yen-Ju Lu, Yashesh Gaur, Wei Zhou, Benjamin Muller, Jesus Villalba, Najim Dehak, Luke Zettlemoyer, Gargi Ghosh, Mike Lewis, Srinivasan Iyer, Duc Le

Sequence Repetition Enhances Token Embeddings and Improves Sequence Labeling with Decoder-only Language Models

Modern language models (LMs) are trained in an autoregressive manner, conditioned only on the prefix. In contrast, sequence labeling (SL) tasks assign labels to each individual input token, naturally benefiting from bidirectional context.…

Computation and Language · Computer Science 2026-01-27 Matija Luka Kukić, Marko Čuljak, David Dukić, Martin Tutek, Jan Šnajder

Can Emulating Semantic Translation Help LLMs with Code Translation? A Study Based on Pseudocode

Although large language models (LLMs) show promising potential in code translation, they still struggle to generate accurate translations using the commonly adopted direct code-to-code translation approach, which converts an original…

Software Engineering · Computer Science 2026-02-24 Songqiang Chen, Congying Xu, Jingyi Chen, Jialun Cao, Jiarong Wu, Shing-Chi Cheung

Parallel Decoder Transformer: Planner-Seeded Latent Coordination for Synchronized Parallel Decoding

Autoregressive language models can often identify parallel subproblems, but standard decoding exposes only a single left-to-right output interface. External orchestration methods can launch multiple prompts concurrently, yet they provide no…

Artificial Intelligence · Computer Science 2026-03-10 Logan Robbins