English
Related papers

Related papers: Beyond Code Pairs: Dialogue-Based Data Generation …

200 papers

Translating legacy Fortran code into C++ is a crucial step in modernizing high-performance computing (HPC) applications. However, the scarcity of high-quality, parallel Fortran-to-C++ datasets and the limited domain-specific expertise in…

Machine Learning · Computer Science 2025-02-04 Le Chen , Bin Lei , Dunzhi Zhou , Pei-Hung Lin , Chunhua Liao , Caiwen Ding , Ali Jannesari

In this study, we present a novel dataset for training machine learning models translating between OpenMP Fortran and C++ code. To ensure reliability and applicability, the dataset is created from a range of representative open-source…

Software Engineering · Computer Science 2023-09-20 Bin Lei , Caiwen Ding , Le Chen , Pei-Hung Lin , Chunhua Liao

Large Language Models (LLMs) have achieved remarkable success in automated code translation. While prior work has focused on improving translation accuracy through advanced prompting and iterative repair, the reliability of the underlying…

Software Engineering · Computer Science 2026-05-11 Fazle Rabbi , Soumit Kanti Saha , Jinqiu Yang

Large language models (LLMs) have shown impressive promise in code generation, yet their progress remains limited by the shortage of large-scale datasets that are both diverse and well-aligned with human reasoning. Most existing resources…

Machine Learning · Computer Science 2025-10-28 Amal Abed , Ivan Lukic , Jörg K. H. Franke , Frank Hutter

Large Language Models (LLMs) have revolutionized code generation but require significant resources and often over-generalize, limiting their task-specific efficiency. Fine-tuning smaller, open-source LLMs provides a cost-effective…

Computation and Language · Computer Science 2025-06-27 Leitian Tao , Xiang Chen , Tong Yu , Tung Mai , Ryan Rossi , Yixuan Li , Saayan Mitra

Large language models (LLMs) have shown promise for automated source-code translation, a capability critical to software migration, maintenance, and interoperability. Yet comparative evidence on how model choice, prompt design, and prompt…

Software Engineering · Computer Science 2025-09-17 Aamer Aljagthami , Mohammed Banabila , Musab Alshehri , Mohammed Kabini , Mohammad D. Alahmadi

Recent advancements in Large Language Models (LLMs) have renewed interest in automatic programming language translation. Encoder-decoder transformer models, in particular, have shown promise in translating between different programming…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-29 Ali TehraniJamsaz , Arijit Bhattacharjee , Le Chen , Nesreen K. Ahmed , Amir Yazdanbakhsh , Ali Jannesari

Code translation aims to convert source code from one programming language (PL) to another. Given the promising abilities of large language models (LLMs) in code synthesis, researchers are exploring their potential to automate code…

Natural language to code generation is an important application area of LLMs and has received wide attention from the community. The majority of relevant studies have exclusively concentrated on increasing the quantity and functional…

Machine Learning · Computer Science 2023-11-28 Naman Jain , Tianjun Zhang , Wei-Lin Chiang , Joseph E. Gonzalez , Koushik Sen , Ion Stoica

Large Language Models (LLMs) are becoming integral to modern software development workflows, assisting developers with code generation, API explanation, and iterative problem-solving through natural language conversations. Despite…

Software Engineering · Computer Science 2025-09-15 Suzhen Zhong , Ying Zou , Bram Adams

Large Language Models (LLMs) are increasingly being leveraged for generating and translating scientific computer codes by both domain-experts and non-domain experts. Fortran has served as one of the go to programming languages in legacy…

The adoption of Large Language Models (LLMs) for code generation in data science offers substantial potential for enhancing tasks such as data manipulation, statistical analysis, and visualization. However, the effectiveness of these models…

Software Engineering · Computer Science 2024-11-20 Nathalia Nascimento , Everton Guimaraes , Sai Sanjna Chintakunta , Santhosh Anitha Boominathan

We introduce a novel method to enhance cross-language code translation from Fortran to C++ by integrating task-specific embedding alignment into a Retrieval-Augmented Generation (RAG) framework. Unlike conventional retrieval approaches that…

Artificial Intelligence · Computer Science 2024-12-09 Manish Bhattarai , Minh Vu , Javier E. Santos , Ismael Boureima , Daniel O' Malley

Existing large language models (LLMs) for machine translation are typically fine-tuned on sentence-level translation instructions and achieve satisfactory performance at the sentence level. However, when applied to document-level…

Computation and Language · Computer Science 2024-01-17 Yachao Li , Junhui Li , Jing Jiang , Min Zhang

Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks. Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning. This study focuses…

Computation and Language · Computer Science 2024-10-14 Minghao Wu , Thuy-Trang Vu , Lizhen Qu , George Foster , Gholamreza Haffari

This paper introduces a novel research direction for model-to-text/code transformations by leveraging Large Language Models (LLMs) that can be enhanced with Retrieval-Augmented Generation (RAG) pipelines. The focus is on quantum and hybrid…

Software Engineering · Computer Science 2025-12-03 Nazanin Siavash , Armin Moin

Recent leaps in large language models (LLMs) caused a revolution in programming tools (like GitHub Copilot) that can help with code generation, debugging, and even performance optimization. In this paper, we focus on the capabilities of the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-21 Matyáš Brabec , Jiří Klepl , Michal Töpfer , Martin Kruliš

Decoder-only LLMs have shown impressive performance in MT due to their ability to learn from extensive datasets and generate high-quality translations. However, LLMs often struggle with the nuances and style required for…

Computation and Language · Computer Science 2024-09-11 Inacio Vieira , Will Allred , Séamus Lankford , Sheila Castilho , Andy Way

The rapid growth of deep learning has driven exponential increases in model parameters and computational demands. NVIDIA GPUs and their CUDA-based software ecosystem provide robust support for parallel computing, significantly alleviating…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-08 Jiaqi Lv , Xufeng He , Yanchen Liu , Xu Dai , Aocheng Shen , Yinghao Li , Jiachen Hao , Jianrong Ding , Yang Hu , Shouyi Yin

Large language models (LLMs) have demonstrated strong performance in sentence-level machine translation, but scaling to document-level translation remains challenging, particularly in modeling long-range dependencies and discourse phenomena…

Computation and Language · Computer Science 2025-08-29 Miguel Moura Ramos , Patrick Fernandes , Sweta Agrawal , André F. T. Martins
‹ Prev 1 2 3 10 Next ›