Related papers: The test set for the TransCoder system

Unsupervised Translation of Programming Languages

A transcompiler, also known as a source-to-source translator, is a system that converts source code from a high-level programming language (such as C++ or Python) to another. Transcompilers are primarily used for interoperability, and to port…

Computation and Language · Computer Science 2020-09-23 Marie-Anne Lachaux, Baptiste Roziere, Lowik Chanussot, Guillaume Lample

Leveraging Automated Unit Tests for Unsupervised Code Translation

With little to no parallel data available for programming languages, unsupervised methods are well-suited to source code translation. However, the majority of unsupervised machine translation approaches rely on back-translation, a method…

Software Engineering · Computer Science 2022-02-17 Baptiste Roziere, Jie M. Zhang, Francois Charton, Mark Harman, Gabriel Synnaeve, Guillaume Lample

Quality Estimation & Interpretability for Code Translation

Recently, the automated translation of source code from one programming language to another by using automatic approaches inspired by Neural Machine Translation (NMT) methods for natural languages has come under study. However, such…

Software Engineering · Computer Science 2021-04-28 Mayank Agarwal, Kartik Talamadupula, Stephanie Houde, Fernando Martinez, Michael Muller, John Richards, Steven Ross, Justin D. Weisz

The Strengths and Behavioral Quirks of Java Bytecode Decompilers

During compilation from Java source code to bytecode, some information is irreversibly lost. In other words, compilation and decompilation of Java code is not symmetric. Consequently, the decompilation process, which aims at producing…

Software Engineering · Computer Science 2019-12-19 Nicolas Harrand, César Soto-Valero, Martin Monperrus, Benoit Baudry

On the Impact of Language Selection for Training and Evaluating Programming Language Models

The recent advancements in Transformer-based Language Models have demonstrated significant potential in enhancing the multilingual capabilities of these models. The remarkable progress made in this domain not only applies to natural…

Software Engineering · Computer Science 2023-08-28 Jonathan Katzy, Maliheh Izadi, Arie van Deursen

TransCoder: A Neural-Enhancement Framework for Channel Codes

Reliable communication over noisy channels requires the design of specialized error-correcting codes (ECCs) tailored to specific system requirements. Recently, neural network-based decoders have emerged as promising tools for enhancing ECC…

Information Theory · Computer Science 2025-12-01 Anastasiia Kurmukova, Selim F. Yilmaz, Emre Ozfatura, Deniz Gunduz

Code quality assessment using transformers

Automatically evaluating the correctness of programming assignments is rather straightforward using unit and integration tests. However, programming tasks can be solved in multiple ways, many of which, although correct, are inelegant. For…

Computation and Language · Computer Science 2023-09-19 Mosleh Mahamud, Isak Samsten

Software Vulnerability Prediction Knowledge Transferring Between Programming Languages

Developing automated and smart software vulnerability detection models has been receiving great attention from both research and development communities. One of the biggest challenges in this area is the lack of code samples for all…

Software Engineering · Computer Science 2023-03-14 Khadija Hanifi, Ramin F Fouladi, Basak Gencer Unsalver, Goksu Karadag

Enhancing the Transformer Decoder with Transition-based Syntax

Notwithstanding recent advances, syntactic generalization remains a challenge for text decoders. While some studies showed gains from incorporating source-side symbolic syntactic and semantic structure into text generation Transformers,…

Computation and Language · Computer Science 2022-11-02 Leshem Choshen, Omri Abend

TransCoder: Towards Unified Transferable Code Representation Learning Inspired by Human Skills

Code pre-trained models (CodePTMs) have recently demonstrated a solid capacity to process various software intelligence tasks, e.g., code clone detection, code translation, and code summarization. The current mainstream method that deploys…

Software Engineering · Computer Science 2024-05-10 Qiushi Sun, Nuo Chen, Jianing Wang, Xiang Li, Ming Gao

A Comprehensive Review of State-of-The-Art Methods for Java Code Generation from Natural Language Text

Java code generation consists of automatically generating Java code from natural-language text. This NLP task helps in increasing programmers' productivity by providing them with immediate solutions to the simplest and most repetitive…

Computation and Language · Computer Science 2023-06-13 Jessica López Espejel, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane, El Hassane Ettifouri

Verifying Functional Correctness Properties At the Level of Java Bytecode

The breakneck evolution of modern programming languages aggravates the development of deductive verification tools, which struggle to timely and fully support all new language features. To address this challenge, we present ByteBack: a…

Programming Languages · Computer Science 2024-10-03 Marco Paganoni, Carlo A. Furia

REMODEL-LLM: Transforming C code to Java using LLMs

The automated translation of C code to Java code is a notoriously difficult task, fraught with challenges stemming from fundamental paradigm shifts (procedural vs. Object Oriented), memory models (manual pointers vs. Garbage Collection),…

Software Engineering · Computer Science 2025-12-15 Aryan Gupta, Y. Raghu Reddy

Do Transformer Modifications Transfer Across Implementations and Applications?

The research community has proposed copious modifications to the Transformer architecture since it was introduced over three years ago, relatively few of which have seen widespread adoption. In this paper, we comprehensively evaluate many…

Machine Learning · Computer Science 2021-09-14 Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, Thibault Fevry, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, Colin Raffel

Comprehending Test Code: An Empirical Study

Developers spend a large portion of their time and effort on comprehending source code. While many studies have investigated how developers approach these comprehension tasks and what factors influence their success, less is known about how…

Software Engineering · Computer Science 2019-08-01 Chak Shun Yu, Christoph Treude, Maurício Aniche

Obfuscating Java Programs by Translating Selected Portions of Bytecode to Native Libraries

Code obfuscation is a popular approach to make program comprehension and analysis harder, with the aim of mitigating threats related to malicious reverse engineering and code tampering. However, programming languages that compile to high…

Software Engineering · Computer Science 2019-01-16 Davide Pizzolotto, Mariano Ceccato

JavaBERT: Training a transformer-based model for the Java programming language

Code quality is and will be a crucial factor while developing new software code, requiring appropriate tools to ensure functional and reliable code. Machine learning techniques are still rarely used for software engineering tools, missing…

Software Engineering · Computer Science 2021-10-22 Nelson Tavares de Sousa, Wilhelm Hasselbring

MMT: Mutation Testing of Java Bytecode with Model Transformation -- An Illustrative Demonstration

Mutation testing is an approach to check the robustness of test suites. The program code is slightly changed by mutations to inject errors. A test suite is robust enough if it finds such errors. Tools for mutation testing usually integrate…

Software Engineering · Computer Science 2024-04-23 Christoph Bockisch, Gabriele Taentzer, Daniel Neufeld

Transfer Learning Toolkit: Primers and Benchmarks

The transfer learning toolkit wraps the codes of 17 transfer learning models and provides integrated interfaces, allowing users to use those models by calling a simple function. It is easy for primary researchers to use this toolkit and to…

Machine Learning · Computer Science 2019-11-21 Fuzhen Zhuang, Keyu Duan, Tongjia Guo, Yongchun Zhu, Dongbo Xi, Zhiyuan Qi, Qing He

Test Code Refactoring Unveiled: Where and How Does It Affect Test Code Quality and Effectiveness?

Context. Refactoring has been widely investigated in the past in relation to production code quality, yet still little is known on how developers apply refactoring on test code. Specifically, there is still a lack of investigation into how…

Software Engineering · Computer Science 2023-08-21 Luana Martins, Valeria Pontillo, Heitor Costa, Filomena Ferrucci, Fabio Palomba, Ivan Machado