
Learning to Superoptimize Real-world Programs

Machine Learning | 2022-04-06 | v2 | Artificial Intelligence | Programming Languages | Software Engineering

Abstract

Program optimization is the process of modifying software to execute more efficiently. Superoptimizers attempt to find the optimal program by employing significantly more expensive search and constraint-solving techniques. Generally, these methods do not scale well to programs in real development scenarios; as a result, superoptimization has largely been confined to small-scale, domain-specific, and/or synthetic program benchmarks. In this paper, we propose a framework to learn to superoptimize real-world programs using neural sequence-to-sequence models. We create a dataset of over 25K real-world x86-64 assembly functions mined from open-source projects and propose an approach, Self Imitation Learning for Optimization (SILO), that is easy to implement and outperforms a standard policy gradient learning approach on our dataset. SILO superoptimizes 5.9% of our test set when compared with the aggressive -O3 optimization level of the gcc 10.3 compiler. We also report that SILO's rate of superoptimization on our test set is over five times that of both a standard policy gradient approach and a model pre-trained on compiler optimization demonstrations.

Cite

@article{arxiv.2109.13498,
  title  = {Learning to Superoptimize Real-world Programs},
  author = {Alex Shypula and Pengcheng Yin and Jeremy Lacomis and Claire Le Goues and Edward Schwartz and Graham Neubig},
  journal= {arXiv preprint arXiv:2109.13498},
  year   = {2022}
}

Comments

Best Paper, ICLR 2022 Deep Learning for Code workshop