Pointer: Linear-Complexity Long-Range Modeling without Pre-training

Zixi Li

Computation and Language 2025-08-05 v1

Authors: Zixi Li

Abstract

We introduce Pointer, a novel architecture that achieves linear $O(NK)$ complexity for long-range sequence modeling while maintaining superior performance without requiring pre-training. Unlike standard attention mechanisms, which compute $O(N^2)$ pairwise interactions, our approach uses layer-wise pointer chaining, in which each layer's pointer selection depends on the previous layer's pointer positions, creating explicit long-distance connections through pointer chains. We demonstrate that this architecture achieves a $2$--$10\times$ speedup on long sequences compared to standard transformers, maintains $>95\%$ accuracy on copy tasks at distances up to 2048 tokens, and learns interpretable pointer patterns that reveal structured dependency modeling. Our experiments on efficiency benchmarks, long-range dependency tasks, and interpretability analysis show that Pointer offers a compelling alternative to attention mechanisms for scenarios requiring efficient long-range modeling without pre-training dependencies.
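
The pointer-chaining idea and the complexity comparison in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not define $K$ or the selection rule, so the sketch assumes $K$ is the number of layers and that each token holds exactly one hard pointer per layer. The names `follow_pointer_chain`, `pairwise_interactions`, and `pointer_lookups`, and the toy pointer table, are all hypothetical.

```python
def follow_pointer_chain(pointers, start):
    """Follow one pointer per layer, starting from position `start`.

    Assumption (not from the paper): pointers[l][i] is the single position
    that token i points to at layer l. The position reached at layer l is
    the one whose pointer is followed at layer l + 1, which is one simple
    way to make each layer depend on the previous layer's pointer position.
    """
    path = [start]
    pos = start
    for layer in pointers:
        pos = layer[pos]      # hop to the position selected at this layer
        path.append(pos)
    return path


def pairwise_interactions(n):
    """Number of token pairs scored by full attention: O(N^2)."""
    return n * n


def pointer_lookups(n, k):
    """One pointer lookup per token per layer: O(NK), assuming K layers."""
    return n * k


# Toy pointer table for 8 tokens and 3 layers (made-up values).
pointers = [
    [0, 0, 1, 2, 3, 4, 5, 6],  # layer 0: point one step back
    [0, 0, 0, 1, 2, 3, 4, 5],  # layer 1: point two steps back
    [0, 0, 0, 0, 0, 1, 2, 3],  # layer 2: point four steps back
]

print(follow_pointer_chain(pointers, 7))   # [7, 6, 4, 0]
print(pairwise_interactions(2048))         # 4194304
print(pointer_lookups(2048, 3))            # 6144
```

In this toy setup, three short hops compose into a connection from position 7 back to position 0, which is the sense in which a chain of pointers can span a long distance while each token performs only one lookup per layer.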

Keywords

attention mechanism, long short-term memory, neural operator

Cite

@article{arxiv.2508.02631,
  title  = {Pointer: Linear-Complexity Long-Range Modeling without Pre-training},
  author = {Zixi Li},
  journal= {arXiv preprint arXiv:2508.02631},
  year   = {2025}
}

Comments

Submitted to Nordic AI Meet 2025
