English

Neural Sketch Learning for Conditional Program Generation

Programming Languages 2018-04-16 v5 Machine Learning

Abstract

We study the problem of generating source code in a strongly typed, Java-like programming language, given a label (for example a set of API calls or types) carrying a small amount of information about the code that is desired. The generated programs are expected to respect a "realistic" relationship between programs and labels, as exemplified by a corpus of labeled programs available during training. Two challenges in such conditional program generation are that the generated programs must satisfy a rich set of syntactic and semantic constraints, and that source code contains many low-level features that impede learning. We address these problems by training a neural generator not on code but on program sketches, or models of program syntax that abstract out names and operations that do not generalize across programs. During generation, we infer a posterior distribution over sketches, then concretize samples from this distribution into type-safe programs using combinatorial techniques. We implement our ideas in a system for generating API-heavy Java code, and show that it can often predict the entire body of a method given just a few API calls or data types that appear in the method.

Keywords

Cite

@article{arxiv.1703.05698,
  title  = {Neural Sketch Learning for Conditional Program Generation},
  author = {Vijayaraghavan Murali and Letao Qi and Swarat Chaudhuri and Chris Jermaine},
  journal= {arXiv preprint arXiv:1703.05698},
  year   = {2018}
}