Related papers: Positional Encoding to Control Output Sequence Len…

Controlling Output Length in Neural Encoder-Decoders

Neural encoder-decoder models have shown great success in many sequence generation tasks. However, previous work has not investigated situations in which we would like to control the length of encoder-decoder outputs. This capability is…

Computation and Language · Computer Science 2016-10-03 Yuta Kikuchi, Graham Neubig, Ryohei Sasano, Hiroya Takamura, Manabu Okumura

Length-controllable Abstractive Summarization by Guiding with Summary Prototype

We propose a new length-controllable abstractive summarization model. Recent state-of-the-art abstractive summarization models based on encoder-decoder models generate only one summary per source text. However, controllable summarization,…

Computation and Language · Computer Science 2020-01-22 Itsumi Saito, Kyosuke Nishida, Kosuke Nishida, Atsushi Otsuka, Hisako Asano, Junji Tomita, Hiroyuki Shindo, Yuji Matsumoto

Randomized Positional Encodings Boost Length Generalization of Transformers

Transformers have impressive generalization capabilities on tasks with a fixed context length. However, they fail to generalize to sequences of arbitrary length, even for seemingly simple tasks such as duplicating a string. Moreover, simply…

Machine Learning · Computer Science 2023-05-29 Anian Ruoss, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Róbert Csordás, Mehdi Bennani, Shane Legg, Joel Veness

Learning to Encode Position for Transformer with Continuous Dynamical Model

We introduce a new way of learning to encode position information for non-recurrent models, such as Transformer models. Unlike RNN and LSTM, which contain inductive bias by loading the input tokens sequentially, non-recurrent models are…

Machine Learning · Computer Science 2020-03-23 Xuanqing Liu, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh

The Impact of Positional Encodings on Multilingual Compression

In order to preserve word-order information in a non-autoregressive setting, transformer architectures tend to include positional knowledge, by (for instance) adding positional encodings to token embeddings. Several modifications have been…

Computation and Language · Computer Science 2021-09-14 Vinit Ravishankar, Anders Søgaard

Exploring Length Generalization For Transformer-based Speech Enhancement

Transformer network architecture has proven effective in speech enhancement. However, as its core module, self-attention suffers from quadratic complexity, making it infeasible for training on long speech utterances. In practical scenarios,…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-10 Qiquan Zhang, Hongxu Zhu, Xinyuan Qian, Eliathamby Ambikairajah, Haizhou Li

Positional Encoding Helps Recurrent Neural Networks Handle a Large Vocabulary

This study reports an unintuitive finding that positional encoding enhances learning of recurrent neural networks (RNNs). Positional encoding is a high-dimensional representation of time indices on input data. Most famously, positional…

Machine Learning · Computer Science 2024-11-28 Takashi Morita

Alternative positional encoding functions for neural transformers

A key module in neural transformer-based deep architectures is positional encoding. This module enables a suitable way to encode positional information as input for transformer neural layers. This success has been rooted in the use of…

Machine Learning · Computer Science 2025-12-23 Ezequiel Lopez-Rubio, Macoris Decena-Gimenez, Rafael Marcos Luque-Baena

Improved Positional Encoding for Implicit Neural Representation based Compact Data Representation

Positional encodings are employed to capture the high frequency information of the encoded signals in implicit neural representation (INR). In this paper, we propose a novel positional encoding method which improves the reconstruction…

Computer Vision and Pattern Recognition · Computer Science 2023-11-13 Bharath Bhushan Damodaran, Francois Schnitzler, Anne Lambert, Pierre Hellier

Improving Transformers using Faithful Positional Encoding

We propose a new positional encoding method for a neural network architecture called the Transformer. Unlike the standard sinusoidal positional encoding, our approach is based on solid mathematical grounds and has a guarantee of not losing…

Machine Learning · Computer Science 2024-05-17 Tsuyoshi Idé, Jokin Labaien, Pin-Yu Chen

Learning Generic Sentence Representations Using Convolutional Neural Networks

We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes. The model is learned by using a convolutional neural network as an encoder to map an input sentence into a…

Computation and Language · Computer Science 2017-07-28 Zhe Gan, Yunchen Pu, Ricardo Henao, Chunyuan Li, Xiaodong He, Lawrence Carin

Neural Syntactic Preordering for Controlled Paraphrase Generation

Paraphrasing natural language sentences is a multifaceted process: it might involve replacing individual words or short phrases, local rearrangement of content, or high-level restructuring like topicalization or passivization. Past…

Computation and Language · Computer Science 2020-05-06 Tanya Goyal, Greg Durrett

An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement

Transformer architecture has enabled recent progress in speech enhancement. Since Transformers are position-agnostic, positional encoding is the de facto standard component used to enable Transformers to distinguish the order of elements in…

Audio and Speech Processing · Electrical Eng. & Systems 2024-02-15 Qiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li

A Character-Level Length-Control Algorithm for Non-Autoregressive Sentence Summarization

Sentence summarization aims at compressing a long sentence into a short one that keeps the main gist, and has extensive real-world applications such as headline generation. In previous work, researchers have developed various approaches to…

Computation and Language · Computer Science 2022-10-18 Puyuan Liu, Xiang Zhang, Lili Mou

Theoretical Analysis of Positional Encodings in Transformer Models: Impact on Expressiveness and Generalization

Positional encodings are a core part of transformer-based models, enabling processing of sequential data without recurrence. This paper presents a theoretical framework to analyze how various positional encoding methods, including…

Machine Learning · Computer Science 2025-06-10 Yin Li

Controllable Length Control Neural Encoder-Decoder via Reinforcement Learning

Controlling output length in neural language generation is valuable in many scenarios, especially for the tasks that have length constraints. A model with stronger length control capacity can produce sentences with more specific length,…

Computation and Language · Computer Science 2019-09-23 Junyi Bian, Baojun Lin, Ke Zhang, Zhaohui Yan, Hong Tang, Yonghe Zhang

An Augmented Transformer Architecture for Natural Language Generation Tasks

The Transformer based neural networks have been showing significant advantages on most evaluations of various natural language processing and other sequence-to-sequence tasks due to its inherent architecture based superiorities. Although…

Computation and Language · Computer Science 2019-10-31 Hailiang Li, Adele Y. C. Wang, Yang Liu, Du Tang, Zhibin Lei, Wenye Li

PEPS: Positional Encoding Projected Sampling -- Extended

Implicit neural representations (INRs) are increasingly being used as tools to map coordinates to signals, encompassing applications from neural fields to texture compression, shape representations, and beyond. Most INR methods are based on…

Computer Vision and Pattern Recognition · Computer Science 2026-04-28 Guillaume Perez, Janarbek Matai, Takahiro Harada

Positional Encoding in Transformer-Based Time Series Models: A Survey

Recent advancements in transformer-based models have greatly improved time series analysis, providing robust solutions for tasks such as forecasting, anomaly detection, and classification. A crucial element of these models is positional…

Machine Learning · Computer Science 2026-05-07 Habib Irani, Vangelis Metsis

Abstractive Summary Generation for the Urdu Language

Abstractive summary generation is a challenging task that requires the model to comprehend the source text and generate a concise and coherent summary that captures the essential information. In this paper, we explore the use of an…

Computation and Language · Computer Science 2023-05-26 Ali Raza, Hadia Sultan Raja, Usman Maratib