Performance Comparison of Binary Machine Learning Classifiers in Identifying Code Comment Types: An Exploratory Study

Software Engineering 2023-03-06 v2

Abstract

Code comments are vital to source code because they help developers with program comprehension tasks. Written in natural language (usually English), code comments convey a variety of information, which is grouped into specific categories. In this study, we construct 19 binary machine learning classifiers for code comment categories spanning three programming languages. We compare the performance scores of different types of machine learning classifiers and show that the Linear SVC classifier achieves the highest average F1 score, 0.5474.
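As a rough illustration of the metric reported above, the following sketch computes the F1 score of each binary classifier from its true and predicted labels and then averages the scores across categories. The unweighted mean is an assumption, since the abstract does not say how the average is taken. The category names and label vectors are hypothetical toy data, not results from the study.

```python
from statistics import mean


def f1_score_binary(y_true, y_pred):
    """F1 score for the positive class (label 1) of one binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so F1 is reported as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(results):
    """Unweighted mean F1 over categories (assumed averaging scheme).

    `results` maps a category name to a (y_true, y_pred) pair.
    """
    return mean(f1_score_binary(t, p) for t, p in results.values())


# Hypothetical toy data: one binary classifier per comment category.
toy_results = {
    "java_summary": ([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]),
    "python_usage": ([0, 0, 1, 1, 0], [0, 0, 1, 1, 0]),
}

if __name__ == "__main__":
    for name, (y_true, y_pred) in toy_results.items():
        print(name, round(f1_score_binary(y_true, y_pred), 4))
    print("average", round(average_f1(toy_results), 4))
```

Under this scheme every category contributes equally to the average, regardless of how many comments it contains. A weighted average would give a different figure when categories are imbalanced.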

Cite

@article{arxiv.2303.01035,
  title  = {Performance Comparison of Binary Machine Learning Classifiers in Identifying Code Comment Types: An Exploratory Study},
  author = {Amila Indika and Peter Y. Washington and Anthony Peruma},
  journal= {arXiv preprint arXiv:2303.01035},
  year   = {2023}
}

Comments

This study has been accepted at the 2nd International Workshop on Natural Language-based Software Engineering (NLBSE 2023), Tool Competition Track.