Don't Transform the Code, Code the Transforms: Towards Precise Code Rewriting using LLMs

Machine Learning 2024-10-14 v1

Abstract

Tools for rewriting, refactoring and optimizing code should be fast and correct. Large language models (LLMs), by their nature, possess neither of these qualities. Yet, there remains tremendous opportunity in using LLMs to improve code. We explore the use of LLMs not to transform code, but to code transforms. We propose a chain-of-thought approach to synthesizing code transformations from a small number of input/output code examples that incorporates execution and feedback. Unlike the direct rewrite approach, LLM-generated transformations are easy to inspect, debug, and validate. The logic of the rewrite is explicitly coded and easy to adapt. The compute required to run code transformations is minute compared to that of LLM rewriting. We test our approach on 16 Python code transformations and find that LLM-generated transforms are perfectly precise for 7 of them and less imprecise than direct LLM rewriting on the others. We hope to encourage further research into improving the precision of LLM code rewriting.
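
As a rough illustration of what "coding the transform" can look like, the sketch below pairs an explicit AST rewrite with a small harness that executes it on input/output examples and collects mismatches. Everything in it is an assumption made for illustration and is not taken from the paper: the transform (`== None` to `is None`), the names `EqNoneToIsNone`, `apply_transform`, and `failing_examples`, and the example pairs are all hypothetical. The transform is hand-written here as a stand-in for one an LLM might produce, and the list of failures is the kind of feedback that could be handed back to the model for another attempt.

```python
import ast


class EqNoneToIsNone(ast.NodeTransformer):
    """Hypothetical transform: rewrite `x == None` / `x != None` as `is` / `is not`."""

    def visit_Compare(self, node):
        self.generic_visit(node)  # rewrite nested comparisons first
        new_ops = []
        for op, comparator in zip(node.ops, node.comparators):
            is_none = isinstance(comparator, ast.Constant) and comparator.value is None
            if is_none and isinstance(op, ast.Eq):
                new_ops.append(ast.Is())
            elif is_none and isinstance(op, ast.NotEq):
                new_ops.append(ast.IsNot())
            else:
                new_ops.append(op)
        node.ops = new_ops
        return node


def apply_transform(source):
    """Parse the source, apply the rewrite, and return the new source text."""
    tree = EqNoneToIsNone().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


def normalize(source):
    """Round-trip through the AST so formatting differences are ignored."""
    return ast.unparse(ast.parse(source))


def failing_examples(transform, examples):
    """Run the transform on each (before, after) pair and collect mismatches."""
    failures = []
    for before, expected in examples:
        try:
            actual = transform(before)
        except Exception as exc:  # a crash is also useful feedback
            failures.append((before, expected, repr(exc)))
            continue
        if normalize(actual) != normalize(expected):
            failures.append((before, expected, actual))
    return failures


# Hypothetical input/output examples, including one that must stay unchanged.
EXAMPLES = [
    ("if x == None:\n    pass\n", "if x is None:\n    pass\n"),
    ("y = a != None\n", "y = a is not None\n"),
    ("z = a == 0\n", "z = a == 0\n"),
]

if __name__ == "__main__":
    print(failing_examples(apply_transform, EXAMPLES))
```

Because the rewrite logic is ordinary code, it can be read, unit-tested, and rerun on a large codebase without further model calls, which is the property the abstract contrasts with direct LLM rewriting.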

Cite

@article{arxiv.2410.08806,
  title  = {Don't Transform the Code, Code the Transforms: Towards Precise Code Rewriting using LLMs},
  author = {Chris Cummins and Volker Seeker and Jordi Armengol-Estapé and Aram H. Markosyan and Gabriel Synnaeve and Hugh Leather},
  journal = {arXiv preprint arXiv:2410.08806},
  year   = {2024}
}