An Empirical Study on Code Comment Completion

Antonio Mastropaolo; Emad Aghajani; Luca Pascarella; Gabriele Bavota

Software Engineering 2021-07-23 v1

Abstract

Code comments play a prominent role in program comprehension activities. However, source code is not always documented, and code and comments do not always co-evolve. To deal with these issues, researchers have proposed techniques to automatically generate comments documenting a given piece of code. The most recent works in the area applied deep learning (DL) techniques to support this task. Despite the advances achieved, the empirical evaluations of these approaches show that they are still far from a performance level that would make them valuable for developers. We tackle a simpler, related problem: code comment completion. Instead of generating a comment for a given piece of code from scratch, we investigate the extent to which state-of-the-art techniques can help developers write comments faster. We present a large-scale study in which we empirically assess how a simple n-gram model and the recently proposed Text-To-Text Transfer Transformer (T5) architecture perform in autocompleting a code comment the developer is typing. The results show the superiority of the T5 model, although the n-gram model remains a competitive solution.
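To make the comment completion task concrete, the following is a minimal sketch of how an n-gram model might suggest the next words of a comment being typed. It is an illustration only, not the configuration evaluated in the study: the whitespace tokenization, the greedy decoding, the absence of smoothing, and the names `train_ngram`, `complete`, and `corpus` are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_ngram(comments, n=3):
    """Count which token follows each (n-1)-token context (assumes n >= 2)."""
    counts = defaultdict(Counter)
    for comment in comments:
        # Pad with start markers so the first words also have a context.
        tokens = ["<s>"] * (n - 1) + comment.lower().split() + ["</s>"]
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            counts[context][tokens[i + n - 1]] += 1
    return counts


def complete(counts, prefix, n=3, max_tokens=5):
    """Greedily extend the typed prefix with the most frequent next token."""
    tokens = ["<s>"] * (n - 1) + prefix.lower().split()
    completion = []
    for _ in range(max_tokens):
        context = tuple(tokens[-(n - 1):])
        if context not in counts:
            break  # unseen context: no suggestion (no smoothing or back-off here)
        next_token = counts[context].most_common(1)[0][0]
        if next_token == "</s>":
            break  # the model predicts that the comment ends here
        completion.append(next_token)
        tokens.append(next_token)
    return completion


# Hypothetical toy corpus of comments, used only for illustration.
corpus = [
    "returns the name of the user",
    "returns the name of the file",
    "returns the name of the user",
    "sets the name of the user",
]
counts = train_ngram(corpus, n=3)
print(complete(counts, "returns the"))  # ['name', 'of', 'the', 'user']
```

A trigram model of this kind only sees the two preceding words, so it cannot take the documented code into account. That limitation is one plausible reason a model such as T5 can do better on this task.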

Keywords

code generation, software refactoring, program analysis

Cite

@article{arxiv.2107.10544,
  title  = {An Empirical Study on Code Comment Completion},
  author = {Antonio Mastropaolo and Emad Aghajani and Luca Pascarella and Gabriele Bavota},
  journal= {arXiv preprint arXiv:2107.10544},
  year   = {2021}
}

Comments

Accepted for publication at the 37th International Conference on Software Maintenance and Evolution (ICSME 2021)
