Related papers: Spherical Position Encoding for Transformers

Geotokens and Geotransformers

In transformer architectures, position encoding primarily provides a sense of sequence for input tokens. While the original transformer paper's method has shown satisfactory results in general language processing tasks, there have been new…

Computation and Language · Computer Science 2024-03-26 Eren Unlu

Context-aware Rotary Position Embedding

Positional encoding is a vital component of Transformer architectures, enabling models to incorporate sequence order into self-attention mechanisms. Rotary Positional Embeddings (RoPE) have become a widely adopted solution due to their…

Computation and Language · Computer Science 2025-08-01 Ali Veisi, Delaram Fartoot, Hamidreza Amirzadeh

RoFormer: Enhanced Transformer with Rotary Position Embedding

Position encoding has recently been shown to be effective in the transformer architecture. It enables valuable supervision for dependency modeling between elements at different positions of the sequence. In this paper, we first investigate various…

Computation and Language · Computer Science 2023-11-09 Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu

CoPE: A Lightweight Complex Positional Encoding

Recent studies have demonstrated the effectiveness of position encoding in transformer architectures. By incorporating positional information, this approach provides essential guidance for modeling dependencies between elements across…

Machine Learning · Computer Science 2025-08-27 Avinash Amballa

LieRE: Lie Rotational Positional Encodings

Transformer architectures rely on position encodings to model the spatial structure of input data. Rotary Position Encoding (RoPE) is a widely used method in language models that encodes relative positions through fixed, block-diagonal,…

Computer Vision and Pattern Recognition · Computer Science 2025-08-19 Sophie Ostmeier, Brian Axelrod, Maya Varma, Michael E. Moseley, Akshay Chaudhari, Curtis Langlotz

Learning to Rotate: Temporal and Semantic Rotary Encoding for Sequential Modeling

Every Transformer architecture dedicates enormous capacity to learning rich representations in semantic embedding space -- yet the rotation manifold acted upon by Rotary Positional Embeddings (RoPE) has been treated as a fixed, hand-crafted…

Artificial Intelligence · Computer Science 2026-04-28 Hailing Cheng, Daqi Sun, Xinyu Lu

An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement

Transformer architecture has enabled recent progress in speech enhancement. Since Transformers are position-agnostic, positional encoding is the de facto standard component used to enable Transformers to distinguish the order of elements in…

Audio and Speech Processing · Electrical Eng. & Systems 2024-02-15 Qiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li

Spiral RoPE: Rotate Your Rotary Positional Embeddings in the 2D Plane

Rotary Position Embedding (RoPE) is the de facto positional encoding in large language models due to its ability to encode relative positions and support length extrapolation. When adapted to vision transformers, the standard axial…

Computer Vision and Pattern Recognition · Computer Science 2026-02-04 Haoyu Liu, Sucheng Ren, Tingyu Zhu, Peng Wang, Cihang Xie, Alan Yuille, Zeyu Zheng, Feng Wang

Dynamic Position Encoding for Transformers

Recurrent models have been dominating the field of neural machine translation (NMT) for the past few years. Transformers (Vaswani et al., 2017) have radically changed it by proposing a novel architecture that relies on a feed-forward…

Computation and Language · Computer Science 2022-10-25 Joyce Zheng, Mehdi Rezagholizadeh, Peyman Passban

The Locality and Symmetry of Positional Encodings

Positional Encodings (PEs) are used to inject word-order information into transformer-based language models. While they can significantly enhance the quality of sentence representations, their specific contribution to language models is not…

Computation and Language · Computer Science 2023-10-20 Lihu Chen, Gaël Varoquaux, Fabian M. Suchanek

Selective Rotary Position Embedding

Position information is essential for language modeling. In softmax transformers, Rotary Position Embeddings (RoPE) encode positions through fixed-angle rotations, while in linear transformers, order is handled via…

Computation and Language · Computer Science 2026-04-27 Sajad Movahedi, Timur Carstensen, Arshia Afzal, Frank Hutter, Antonio Orvieto, Volkan Cevher

Round and Round We Go! What makes Rotary Positional Encodings useful?

Positional Encodings (PEs) are a critical component of Transformer-based Large Language Models (LLMs), providing the attention mechanism with important sequence-position information. One of the most popular types of encoding used today in…

Computation and Language · Computer Science 2025-05-14 Federico Barbero, Alex Vitvitskyi, Christos Perivolaropoulos, Razvan Pascanu, Petar Veličković

Improving Transformers using Faithful Positional Encoding

We propose a new positional encoding method for a neural network architecture called the Transformer. Unlike the standard sinusoidal positional encoding, our approach is based on solid mathematical grounds and has a guarantee of not losing…

Machine Learning · Computer Science 2024-05-17 Tsuyoshi Idé, Jokin Labaien, Pin-Yu Chen

Do traveling waves make good positional encodings?

Transformers rely on positional encoding to compensate for the inherent permutation invariance of self-attention. Traditional approaches use absolute sinusoidal embeddings or learned positional vectors, while more recent methods emphasize…

Machine Learning · Computer Science 2025-11-18 Chase van de Geijn, Ayush Paliwal, Timo Lüddecke, Alexander S. Ecker

GeoPE: A Unified Geometric Positional Embedding for Structured Tensors

Standard Vision Transformers flatten 2D images into 1D sequences, disrupting the natural spatial topology. While Rotary Positional Embedding (RoPE) excels in 1D, it inherits this limitation, often treating spatially distant patches (e.g.,…

Computer Vision and Pattern Recognition · Computer Science 2025-12-05 Yupu Yao, Bowen Yang

SeqPE: Transformer with Sequential Position Encoding

Since self-attention layers in Transformers are permutation invariant by design, positional encodings must be explicitly incorporated to enable spatial understanding. However, fixed-size lookup tables used in traditional learnable position…

Machine Learning · Computer Science 2025-06-18 Huayang Li, Yahui Liu, Hongyu Sun, Deng Cai, Leyang Cui, Wei Bi, Peilin Zhao, Taro Watanabe

Position Information in Transformers: An Overview

Transformers are arguably the main workhorse in recent Natural Language Processing research. By definition a Transformer is invariant with respect to reordering of the input. However, language is inherently sequential and word order is…

Computation and Language · Computer Science 2021-09-10 Philipp Dufter, Martin Schmitt, Hinrich Schütze

ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices

The Transformer architecture has revolutionized various regions since it was proposed, and its effectiveness largely depends on the ability to encode positional information. Traditional position encoding methods exhibit significant…

Computer Vision and Pattern Recognition · Computer Science 2025-06-05 Hao Yu, Tangyu Jiang, Shuning Jia, Shannan Yan, Shunning Liu, Haolong Qian, Guanghao Li, Shuting Dong, Huaisong Zhang, Chun Yuan

Geographic Location Encoding with Spherical Harmonics and Sinusoidal Representation Networks

Learning representations of geographical space is vital for any machine learning model that integrates geolocated data, spanning application domains such as remote sensing, ecology, or epidemiology. Recent work embeds coordinates using sine…

Machine Learning · Computer Science 2024-04-16 Marc Rußwurm, Konstantin Klemmer, Esther Rolf, Robin Zbinden, Devis Tuia

GRPE: Relative Positional Encoding for Graph Transformer

We propose a novel positional encoding for learning graph on Transformer architecture. Existing approaches either linearize a graph to encode absolute position in the sequence of nodes, or encode relative position with another node using…

Machine Learning · Computer Science 2022-10-17 Wonpyo Park, Woonggi Chang, Donggeon Lee, Juntae Kim, Seung-won Hwang