Automatic Code Documentation Generation Using GPT-3

Software Engineering 2022-09-07 v1

Abstract

Source code documentation is an important artifact for efficient software development. Code documentation could greatly benefit from automation, since writing it manually is often laborious and both resource- and time-intensive. In this paper, we employ Codex for automatic code documentation generation. Codex is a GPT-3-based model pre-trained on both natural and programming languages. We find that Codex outperforms existing techniques even with basic settings like one-shot learning (i.e., providing only one example for training). Codex achieves an overall BLEU score of 20.6 across six programming languages (an 11.2% improvement over earlier state-of-the-art techniques). Thus, Codex shows promise and warrants in-depth future studies of automatic code documentation generation to support diverse development tasks.
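
To make the one-shot setting concrete, the following is a minimal sketch of how a single worked example might be placed ahead of the target code in a prompt. The `Code:` / `Documentation:` layout, the function name `build_one_shot_prompt`, and the sample snippets are all assumptions made for illustration; the abstract does not specify the prompt format used in the paper, and no model call is made here.

```python
def build_one_shot_prompt(example_code: str, example_doc: str, target_code: str) -> str:
    """Assemble a one-shot prompt: one (code, documentation) pair, then the target code.

    The "Code:" / "Documentation:" labels are a hypothetical layout, not the
    format used in the paper.
    """
    return (
        "Code:\n"
        f"{example_code.strip()}\n"
        "Documentation:\n"
        f"{example_doc.strip()}\n"
        "\n"
        "Code:\n"
        f"{target_code.strip()}\n"
        "Documentation:"  # left open for the model to complete
    )


# Hypothetical example pair and target function, for illustration only.
prompt = build_one_shot_prompt(
    example_code="def add(a, b):\n    return a + b",
    example_doc="Return the sum of two numbers.",
    target_code="def is_even(n):\n    return n % 2 == 0",
)
print(prompt)
```

The resulting string would be sent to a code-aware language model, whose completion after the final `Documentation:` label is taken as the generated documentation for the target function.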

Cite

@article{arxiv.2209.02235,
  title  = {Automatic Code Documentation Generation Using GPT-3},
  author = {Junaed Younus Khan and Gias Uddin},
  journal= {arXiv preprint arXiv:2209.02235},
  year   = {2022}
}

Comments

Accepted at the IEEE/ACM International Conference on Automated Software Engineering (ASE 2022), NIER track