Related papers: Knowledge Transfer for Pseudo-code Generation from…

Can Emulating Semantic Translation Help LLMs with Code Translation? A Study Based on Pseudocode

Although large language models (LLMs) show promising potential in code translation, they still struggle to generate accurate translations using the commonly adopted direct code-to-code translation approach, which converts an original…

Software Engineering · Computer Science 2026-02-24 Songqiang Chen, Congying Xu, Jingyi Chen, Jialun Cao, Jiarong Wu, Shing-Chi Cheung

Neural Code Translation of Legacy Code: APL to C#

Automatic translation between programming languages remains a challenging problem, particularly when the source language is highly concise and specialized. This paper investigates the translation of APL into C# using large language models.…

Software Engineering · Computer Science 2026-05-15 Abdulrahman Ramadan, Hanen Borchani, Iben Lilholm, Mikkel Almind, Allan Peter Engsig-Karup

Transfer Learning with Pre-trained Conditional Generative Models

Transfer learning is crucial in training deep neural networks on new target tasks. Current transfer learning methods always assume at least one of (i) source and target task label spaces overlap, (ii) source datasets are available, and…

Machine Learning · Computer Science 2025-02-21 Shin'ya Yamaguchi, Sekitoshi Kanai, Atsutoshi Kumagai, Daiki Chijiwa, Hisashi Kashima

Fine-grained Pseudo-code Generation Method via Code Feature Extraction and Transformer

Pseudo-code written in natural language is helpful for novice developers' program comprehension. However, writing such pseudo-code is time-consuming and laborious. Motivated by the research advancements of sequence-to-sequence learning and…

Software Engineering · Computer Science 2021-09-22 Guang Yang, Yanlin Zhou, Xiang Chen, Chi Yu

Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration

Large language models (LLMs) have recently shown remarkable performance across a wide range of tasks. However, the substantial number of parameters in LLMs contributes to significant latency during model inference. This is particularly…

Computation and Language · Computer Science 2024-04-19 Pengfei Wu, Jiahao Liu, Zhuocheng Gong, Qifan Wang, Jinpeng Li, Jingang Wang, Xunliang Cai, Dongyan Zhao

Summarize and Generate to Back-translate: Unsupervised Translation of Programming Languages

Back-translation is widely known for its effectiveness in neural machine translation when there is little to no parallel data. In this approach, a source-to-target model is coupled with a target-to-source model trained in parallel. The…

Computation and Language · Computer Science 2023-02-14 Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs

Over the past few years, Large Language Models of Code (Code LLMs) have started to have a significant impact on programming practice. Code LLMs are also emerging as building blocks for research in programming languages and software…

Programming Languages · Computer Science 2024-09-24 Federico Cassano, John Gouwar, Francesca Lucchetti, Claire Schlesinger, Anders Freeman, Carolyn Jane Anderson, Molly Q Feldman, Michael Greenberg, Abhinav Jangda, Arjun Guha

Utilization of Pre-trained Language Model for Adapter-based Knowledge Transfer in Software Engineering

Software Engineering (SE) Pre-trained Language Models (PLMs), such as CodeBERT, are pre-trained on large code corpora, and their learned knowledge has shown success in transferring into downstream tasks (e.g., code clone detection) through…

Software Engineering · Computer Science 2024-02-07 Iman Saberi, Fatemeh Fard, Fuxiang Chen

Low-Resource Speech-to-Text Translation

Speech-to-text translation has many potential applications for low-resource languages, but the typical approach of cascading speech recognition with machine translation is often impossible, since the transcripts needed to train a speech…

Computation and Language · Computer Science 2018-06-19 Sameer Bansal, Herman Kamper, Karen Livescu, Adam Lopez, Sharon Goldwater

Meta Back-translation

Back-translation is an effective strategy to improve the performance of Neural Machine Translation (NMT) by generating pseudo-parallel data. However, several recent works have found that better translation quality of the pseudo-parallel…

Computation and Language · Computer Science 2021-02-17 Hieu Pham, Xinyi Wang, Yiming Yang, Graham Neubig

Parallel-SFT: Improving Zero-Shot Cross-Programming-Language Transfer for Code RL

Modern language models demonstrate impressive coding capabilities in common programming languages (PLs), such as C++ and Python, but their performance in lower-resource PLs is often limited by training data availability. In principle,…

Computation and Language · Computer Science 2026-04-24 Zhaofeng Wu, Shiqi Wang, Boya Peng, Anuj Goyal, Melanie Kambadur, Sebastian Ruder, Yoon Kim, Chloe Bi

CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation

Pre-trained models for Natural Languages (NL) like BERT and GPT have been recently shown to transfer well to Programming Languages (PL) and largely benefit a broad set of code-related tasks. Despite their success, most current methods…

Computation and Language · Computer Science 2021-09-03 Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi

LLM-Assisted Translation of Legacy FORTRAN Codes to C++: A Cross-Platform Study

Large Language Models (LLMs) are increasingly being leveraged for generating and translating scientific computer codes by both domain experts and non-domain experts. Fortran has served as one of the go-to programming languages in legacy…

Software Engineering · Computer Science 2025-04-23 Nishath Rajiv Ranasinghe, Shawn M. Jones, Michal Kucer, Ayan Biswas, Daniel O'Malley, Alexander Buschmann Most, Selma Liliane Wanna, Ajay Sreekumar

Transfer Learning across Low-Resource, Related Languages for Neural Machine Translation

We present a simple method to improve neural translation of a low-resource language pair using parallel data from a related, also low-resource, language pair. The method is based on the transfer method of Zoph et al., but whereas their…

Computation and Language · Computer Science 2017-09-22 Toan Q. Nguyen, David Chiang

Exploiting Curriculum Learning in Unsupervised Neural Machine Translation

Back-translation (BT) has become one of the de facto components in unsupervised neural machine translation (UNMT), and it explicitly makes UNMT have translation ability. However, all the pseudo bi-texts generated by BT are treated equally…

Computation and Language · Computer Science 2021-09-24 Jinliang Lu, Jiajun Zhang

SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations

Recent years have seen the successful application of large pre-trained models to code representation learning, resulting in substantial improvements on many code-related downstream tasks. But there are issues surrounding their application…

Software Engineering · Computer Science 2022-05-26 Changan Niu, Chuanyi Li, Vincent Ng, Jidong Ge, Liguo Huang, Bin Luo

Relevance Transformer: Generating Concise Code Snippets with Relevance Feedback

Tools capable of automatic code generation have the potential to augment programmer's capabilities. While straightforward code retrieval is incorporated into many IDEs, an emerging area is explicit code generation. Code generation is…

Computation and Language · Computer Science 2020-12-09 Carlos Gemmell, Federico Rossetto, Jeffrey Dalton

Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary

Cross-lingual model transfer is a compelling and popular method for predicting annotations in a low-resource language, whereby parallel corpora provide a bridge to a high-resource language and its associated annotated corpora. However,…

Computation and Language · Computer Science 2017-05-02 Meng Fang, Trevor Cohn

IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators

Code understanding and generation have fast become some of the most popular applications of language models (LMs). Nonetheless, research on multilingual aspects of Code-LMs (i.e., LMs for code generation) such as cross-lingual transfer…

Artificial Intelligence · Computer Science 2024-04-16 Indraneil Paul, Goran Glavaš, Iryna Gurevych

InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation

Code translation aims to convert a program from one programming language (PL) to another. This long-standing software engineering task is crucial for modernizing legacy systems, ensuring cross-platform compatibility, enhancing performance,…

Software Engineering · Computer Science 2024-11-06 Marcos Macedo, Yuan Tian, Pengyu Nie, Filipe R. Cogo, Bram Adams