Loong: A Human-Like Long Document Translation Agent with Observe-and-Act Adaptive Context Selection

Abstract

Document-level translation remains one of the most challenging tasks for large language models: limited context windows impede global cohesion, while redundant contextual information degrades translation quality. To address this, we propose Loong, a human-like long document translation agent that uses a 3E memory module (Essence-Exemplar-Entity) to store summaries, sentence pairs, and entity records as historical context. Instead of passively attending to all history, Loong performs deep reasoning to adaptively identify the optimal context for translation guidance. Its context policy is optimized through reinforcement learning on preference data derived from its own sampled observe-and-act reasoning trajectories. Empirical evaluations show that Loong substantially improves translation quality in the English $\Leftrightarrow$ Chinese, German, and French directions, with average gains of up to 13.0 points across the three evaluation metrics. Loong also generalizes well across domains, is robust to contextual noise, and remains stable in ultra-long document translation. Our code is released at https://github.com/YutongWang1216/LoongDocMT.
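As a rough illustration of the memory layout described above, the following minimal Python sketch stores the three kinds of historical context and selects a subset for the sentence being translated. It is not the released implementation: the names `ThreeEMemory`, `update`, and `select_context` are hypothetical, the sample sentences are made up, and the word-overlap selection rule is only a stand-in for the learned observe-and-act reasoning policy.

```python
from dataclasses import dataclass, field


@dataclass
class ThreeEMemory:
    """Hypothetical container for the three kinds of historical context."""
    essence: list[str] = field(default_factory=list)                # summaries
    exemplars: list[tuple[str, str]] = field(default_factory=list)  # (source, translation) pairs
    entities: dict[str, str] = field(default_factory=dict)          # source entity -> translation

    def update(self, source, translation, summary=None, entities=None):
        """Record one translated sentence plus optional summary and entity records."""
        self.exemplars.append((source, translation))
        if summary:
            self.essence.append(summary)
        if entities:
            self.entities.update(entities)


def _tokens(text):
    """Lowercased word set with surrounding punctuation stripped."""
    return {w.strip(".,;:!?\"'()").lower() for w in text.split()} - {""}


def select_context(memory, source, max_exemplars=2):
    """Placeholder policy (an assumption, not Loong's): keep only history
    that shares words with the current source sentence."""
    words = _tokens(source)
    # Keep an entity record only if every word of the entity occurs in the sentence.
    entities = {s: t for s, t in memory.entities.items()
                if _tokens(s) and _tokens(s) <= words}
    # Rank past sentence pairs by word overlap, breaking ties by recency.
    scored = [(len(words & _tokens(src)), i)
              for i, (src, _) in enumerate(memory.exemplars)]
    ranked = sorted((p for p in scored if p[0] > 0),
                    key=lambda p: (-p[0], -p[1]))[:max_exemplars]
    exemplars = [memory.exemplars[i] for _, i in ranked]
    # Use only the most recent summary as the "essence" of the document so far.
    return {"essence": memory.essence[-1:], "exemplars": exemplars, "entities": entities}


memory = ThreeEMemory()
memory.update("Loong met Dr. Chen in Berlin.", "Loong traf Dr. Chen in Berlin.",
              summary="Loong meets Chen.",
              entities={"Dr. Chen": "Dr. Chen", "Berlin": "Berlin"})
memory.update("It rained all day.", "Es regnete den ganzen Tag.")
context = select_context(memory, "Berlin was quiet that night.")
print(context)
```

The sketch makes the contrast with passive attention concrete: only the history judged relevant to the current sentence is returned, and the rest of the memory is left out of the translation prompt.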

Cite

@article{arxiv.2605.30274,
  title  = {Loong: A Human-Like Long Document Translation Agent with Observe-and-Act Adaptive Context Selection},
  author = {Yutong Wang and Xuebo Liu and Derek F. Wong and Zhilin Li and Rongqing Jiang and Min Zhang and Shimin Tao and Daimeng Wei and Min Zhang},
  journal= {arXiv preprint arXiv:2605.30274},
  year   = {2026}
}