Related papers: Code Translation with Compiler Representations

NL in the Middle: Code Translation with LLMs and Intermediate Representations

Studies show that large language models (LLMs) produce buggy code translations. One promising avenue to improve translation accuracy is through intermediate representations, which provide structured guidance for the translation process. We…

Software Engineering · Computer Science 2025-09-18 Chi-en Amy Tai, Pengyu Nie, Lukasz Golab, Alexander Wong

Unleashing the Power of Compiler Intermediate Representation to Enhance Neural Program Embeddings

Neural program embeddings have demonstrated considerable promise in a range of program analysis tasks, including clone identification, program repair, code completion, and program synthesis. However, most existing methods generate neural…

Software Engineering · Computer Science 2022-04-21 Zongjie Li, Pingchuan Ma, Huaijin Wang, Shuai Wang, Qiyi Tang, Sen Nie, Shi Wu

ComPile: A Large IR Dataset from Production Sources

Code is increasingly becoming a core data modality of modern machine learning research impacting not only the way we write code with conversational agents like OpenAI's ChatGPT, Google's Bard, or Anthropic's Claude, the way we translate…

Programming Languages · Computer Science 2024-05-01 Aiden Grossman, Ludger Paehler, Konstantinos Parasyris, Tal Ben-Nun, Jacob Hegna, William Moses, Jose M Monsalve Diaz, Mircea Trofin, Johannes Doerfert

IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators

Code understanding and generation have fast become some of the most popular applications of language models (LMs). Nonetheless, research on multilingual aspects of Code-LMs (i.e., LMs for code generation) such as cross-lingual transfer…

Artificial Intelligence · Computer Science 2024-04-16 Indraneil Paul, Goran Glavaš, Iryna Gurevych

Quality Estimation & Interpretability for Code Translation

Recently, the automated translation of source code from one programming language to another by using automatic approaches inspired by Neural Machine Translation (NMT) methods for natural languages has come under study. However, such…

Software Engineering · Computer Science 2021-04-28 Mayank Agarwal, Kartik Talamadupula, Stephanie Houde, Fernando Martinez, Michael Muller, John Richards, Steven Ross, Justin D. Weisz

Program Translation via Code Distillation

Software version migration and program translation are an important and costly part of the lifecycle of large codebases. Traditional machine translation relies on parallel corpora for supervised translation, which is not feasible for…

Software Engineering · Computer Science 2023-10-19 Yufan Huang, Mengnan Qi, Yongqiang Yao, Maoquan Wang, Bin Gu, Colin Clement, Neel Sundaresan

Can Large Language Models Understand Intermediate Representations in Compilers?

Intermediate Representations (IRs) play a critical role in compiler design and program analysis, yet their comprehension by Large Language Models (LLMs) remains underexplored. In this paper, we present an explorative empirical study…

Machine Learning · Computer Science 2025-06-06 Hailong Jiang, Jianfeng Zhu, Yao Wan, Bo Fang, Hongyu Zhang, Ruoming Jin, Qiang Guan

Compiler Optimization: A Case for the Transformation Tool Contest

An optimizing compiler consists of a front end parsing a textual programming language into an intermediate representation (IR), a middle end performing optimizations on the IR, and a back end lowering the IR to a target representation (TR)…

Programming Languages · Computer Science 2011-11-22 Sebastian Buchwald, Edgar Jakumeit

LLM Translation of Compiler Intermediate Representation

GCC and LLVM underpin much of modern software infrastructure, relying on distinct Intermediate Representations (IRs) to drive optimizations and code generation. However, the semantic and structural differences between these IRs create…

Programming Languages · Computer Science 2026-05-12 Andrea Valenzuela Ramirez, Cristian Gutierrez-Gomez, Marta Barroso, Dario Garcia-Gasulla, Sara Royuela

Neural Machine Translation for Code Generation

Neural machine translation (NMT) methods developed for natural language processing have been shown to be highly successful in automating translation from one natural language to another. Recently, these NMT methods have been adapted to the…

Computation and Language · Computer Science 2023-05-24 Dharma KC, Clayton T. Morrison

Enabling Retargetable Optimizing Compilers for Quantum Accelerators via a Multi-Level Intermediate Representation

We present a multi-level quantum-classical intermediate representation (IR) that enables an optimizing, retargetable, ahead-of-time compiler for available quantum programming languages. To demonstrate our architecture, we leverage our…

Quantum Physics · Physics 2021-09-02 Thien Nguyen, Alexander McCaskey

Towards Neural Decompilation

We address the problem of automatic decompilation, converting a program in low-level representation back to a higher-level human-readable programming language. The problem of decompilation is extremely important for security researchers.…

Programming Languages · Computer Science 2019-05-22 Omer Katz, Yuval Olshaker, Yoav Goldberg, Eran Yahav

Specification-Driven Code Translation Powered by Large Language Models: How Far Are We?

Large Language Models (LLMs) are increasingly being applied across various domains, including code-related tasks such as code translation. Previous studies have explored using LLMs for translating code between different programming…

Software Engineering · Computer Science 2026-05-05 Soumit Kanti Saha, Fazle Rabbi, Song Wang, Jinqiu Yang

Unsupervised Translation of Programming Languages

A transcompiler, also known as source-to-source translator, is a system that converts source code from a high-level programming language (such as C++ or Python) to another. Transcompilers are primarily used for interoperability, and to port…

Computation and Language · Computer Science 2020-09-23 Marie-Anne Lachaux, Baptiste Roziere, Lowik Chanussot, Guillaume Lample

InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation

Code translation aims to convert a program from one programming language (PL) to another. This long-standing software engineering task is crucial for modernizing legacy systems, ensuring cross-platform compatibility, enhancing performance,…

Software Engineering · Computer Science 2024-11-06 Marcos Macedo, Yuan Tian, Pengyu Nie, Filipe R. Cogo, Bram Adams

Enhancing LLMs in Long Code Translation through Instrumentation and Program State Alignment

Code translation aims to transform code between programming languages while preserving functionality, with applications in cross-platform development and software migration. Recent advances in Large Language Models (LLMs) have improved code…

Software Engineering · Computer Science 2025-04-04 Li Xin-Ye, Du Ya-Li, Li Ming

High Performance GPU Code Generation for Matrix-Matrix Multiplication using MLIR: Some Early Results

This report presents some early results on code generation targeting tensor cores on NVIDIA GPUs using the MLIR compiler infrastructure. The state-of-the-art in high-performance deep learning today is primarily driven by manually optimized…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-08-31 Navdeep Katel, Vivek Khandelwal, Uday Bondhugula

Rectifier: Code Translation with Corrector via LLMs

Software migration is garnering increasing attention with the evolution of software and society. Early studies mainly relied on handcrafted translation rules to translate between two languages, the translation process is error-prone and…

Software Engineering · Computer Science 2024-07-11 Xin Yin, Chao Ni, Tien N. Nguyen, Shaohua Wang, Xiaohu Yang

Representing LLVM-IR in a Code Property Graph

In the past years, a number of static application security testing tools have been proposed which make use of so-called code property graphs, a graph model which keeps rich information about the source code while enabling its user to write…

Software Engineering · Computer Science 2022-12-12 Alexander Küchler, Christian Banse

Enhancing R with Advanced Compilation Tools and Methods

I describe an approach to compiling common idioms in R code directly to native machine code and illustrate it with several examples. Not only can this yield significant performance gains, but it allows us to use new approaches to computing…

Computation · Statistics 2014-09-12 Duncan Temple Lang