LLMs have shown immense potential for code translation, yet they often struggle to ensure both syntactic correctness and semantic consistency. While preference-based learning offers a promising alignment strategy, it is hindered by unreliable semantic rewards derived from sparse test cases or restrictive reference translations. We argue that a robust semantic reward for code translation must be derived directly from the source code. In this paper, we propose CTO to improve code translation with syntax-guided and semantic-aware preference optimization. Through contrastive learning, we train a cross-lingual semantic model to directly assess functional equivalence between source and translated code. By formulating code translation as a multi-objective optimization problem, this robust semantic signal is seamlessly unified with compiler-based syntactic feedback within the direct preference optimization framework. Extensive experiments on C++, Java, and Python translations demonstrate that CTO significantly outperforms existing baselines and alternative preference optimization strategies.
@article{arxiv.2605.13229,
title = {Improving Code Translation with Syntax-Guided and Semantic-aware Preference Optimization},
author = {Yuhan Wu and Huan Zhang and Wei Cheng and Chen Shen and Jingyue Yang and Wei Hu},
journal = {arXiv preprint arXiv:2605.13229},
year = {2026}
}
Comments
Accepted at the 35th International Joint Conference on Artificial Intelligence (IJCAI 2026)