Large Language Models for Multilingual Code Intelligence: A Survey

Chao Jiang; Dugang Liu; Cheng Wen; Zhiwu Xu; Hua Zheng; Muhammad Sadiq; Jawwad Ahmed Shamsi; Shengchao Qin; Zhong Ming

Software Engineering; Machine Learning; Programming Languages (2026-04-30, v1)

Abstract

Large language models have transformed AI-assisted software engineering, but current research remains biased toward high-resource languages such as Python, with weaker performance in languages like Rust and OCaml. Since real-world systems are inherently polyglot, robust multilingual code intelligence is crucial. This survey focuses on two key tasks: multilingual code generation from shared natural-language requirements, and multilingual code translation that preserves semantics across languages. It reviews representative methods, benchmarks, and evaluation metrics, and highlights challenges and opportunities for trustworthy cross-language generalization.

Keywords

code generation; cross-lingual transfer; large language model

Cite

@article{arxiv.2604.25960,
  title  = {Large Language Models for Multilingual Code Intelligence: A Survey},
  author = {Chao Jiang and Dugang Liu and Cheng Wen and Zhiwu Xu and Hua Zheng and Muhammad Sadiq and Jawwad Ahmed Shamsi and Shengchao Qin and Zhong Ming},
  journal= {arXiv preprint arXiv:2604.25960},
  year   = {2026}
}
