Related papers: Semantic Similarity Metrics for Evaluating Source …

Evaluating Code Summarization Techniques: A New Metric and an Empirical Characterization

Several code summarization techniques have been proposed in the literature to automatically document a code snippet or a function. Ideally, software developers should be involved in assessing the quality of the generated summaries. However,…

Software Engineering · Computer Science 2023-12-27 Antonio Mastropaolo, Matteo Ciniselli, Massimiliano Di Penta, Gabriele Bavota

This paper presents a procedure for and evaluation of using a semantic similarity metric as a loss function for neural source code summarization. Code summarization is the task of writing natural language descriptions of source code. Neural…

Software Engineering · Computer Science 2024-06-13 Chia-Yi Su, Collin McMillan

Action Word Prediction for Neural Source Code Summarization

Source code summarization is the task of creating short, natural language descriptions of source code. Code summarization is the backbone of much software documentation such as JavaDocs, in which very brief comments such as "adds the…

Software Engineering · Computer Science 2021-01-11 Sakib Haque, Aakash Bansal, Lingfei Wu, Collin McMillan

Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors

Automated source code summarization is a popular software engineering research topic wherein machine translation models are employed to "translate" code snippets into relevant natural language descriptions. Most evaluations of such models…

Software Engineering · Computer Science 2021-06-17 Junayed Mahmud, Fahim Faisal, Raihan Islam Arnob, Antonios Anastasopoulos, Kevin Moran

Recommendations for Datasets for Source Code Summarization

Source Code Summarization is the task of writing short, natural language descriptions of source code. The main use for these descriptions is in software documentation e.g. the one-sentence Java method descriptions in JavaDocs. Code…

Computation and Language · Computer Science 2019-04-05 Alexander LeClair, Collin McMillan

On the Evaluation of Neural Code Summarization

Source code summaries are important for program comprehension and maintenance. However, there are plenty of programs with missing, outdated, or mismatched summaries. Recently, deep learning techniques have been exploited to automatically…

Software Engineering · Computer Science 2022-02-14 Ensheng Shi, Yanlin Wang, Lun Du, Junjie Chen, Shi Han, Hongyu Zhang, Dongmei Zhang, Hongbin Sun

Supporting software documentation with source code summarization

Source code summarization is the process of generating summaries that describe software code. Most source code summaries are produced manually, written by software developers. Recently, new…

Software Engineering · Computer Science 2019-01-07 Ra'Fat Al-Msie'deen, Anas H. Blasi

Meta Learning for Code Summarization

Source code summarization is the task of generating a high-level natural language description for a segment of programming language code. Current neural models for the task differ in their architecture and the aspects of code they consider.…

Machine Learning · Computer Science 2022-01-21 Moiz Rauf, Sebastian Padó, Michael Pradel

Calibration of Large Language Models on Code Summarization

A brief, fluent, and relevant summary can be helpful during program comprehension; however, such a summary does require significant human effort to produce. Often, good summaries are unavailable in software projects, which makes maintenance…

Software Engineering · Computer Science 2025-06-03 Yuvraj Virk, Premkumar Devanbu, Toufique Ahmed

Do Machines and Humans Focus on Similar Code? Exploring Explainability of Large Language Models in Code Summarization

Recent language models have demonstrated proficiency in summarizing source code. However, as in many other domains of machine learning, language models of code lack sufficient explainability. Informally, we lack a formulaic or intuitive…

Software Engineering · Computer Science 2024-02-23 Jiliang Li, Yifan Zhang, Zachary Karas, Collin McMillan, Kevin Leach, Yu Huang

Statement-based Memory for Neural Source Code Summarization

Source code summarization is the task of writing natural language descriptions of source code behavior. Code summarization underpins software documentation for programmers. Short descriptions of code help programmers understand the program…

Artificial Intelligence · Computer Science 2023-07-24 Aakash Bansal, Siyuan Jiang, Sakib Haque, Collin McMillan

Project-Level Encoding for Neural Source Code Summarization of Subroutines

Source code summarization of a subroutine is the task of writing a short, natural language description of that subroutine. The description usually serves in documentation aimed at programmers, where even a brief phrase (e.g. "compresses data…

Software Engineering · Computer Science 2021-03-23 Aakash Bansal, Sakib Haque, Collin McMillan

Source Code Summarization in the Era of Large Language Models

To support software developers in understanding and maintaining programs, various automatic (source) code summarization techniques have been proposed to generate a concise natural language summary (i.e., comment) for a given code snippet.…

Software Engineering · Computer Science 2025-08-26 Weisong Sun, Yun Miao, Yuekang Li, Hongyu Zhang, Chunrong Fang, Yi Liu, Gelei Deng, Yang Liu, Zhenyu Chen

Understanding the Extent to which Summarization Evaluation Metrics Measure the Information Quality of Summaries

Reference-based metrics such as ROUGE or BERTScore evaluate the content quality of a summary by comparing the summary to a reference. Ideally, this comparison should measure the summary's information quality by calculating how much…

Computation and Language · Computer Science 2020-10-26 Daniel Deutsch, Dan Roth

Ensemble Models for Neural Source Code Summarization of Subroutines

A source code summary of a subroutine is a brief description of that subroutine. Summaries underpin a majority of documentation consumed by programmers, such as the method summaries in JavaDocs. Source code summarization is the task of…

Software Engineering · Computer Science 2021-07-27 Alexander LeClair, Aakash Bansal, Collin McMillan

Analyzing the Performance of Large Language Models on Code Summarization

Large language models (LLMs) such as Llama 2 perform very well on tasks that involve both natural language and source code, particularly code summarization and code generation. We show that for the task of code summarization, the…

Software Engineering · Computer Science 2024-04-15 Rajarshi Haldar, Julia Hockenmaier

Deep Assessment of Code Review Generation Approaches: Beyond Lexical Similarity

Code review is a standard practice for ensuring the quality of software projects, and recent research has focused extensively on automated code review. While significant advancements have been made in generating code reviews, the automated…

Software Engineering · Computer Science 2025-01-10 Yanjie Jiang, Hui Liu, Tianyi Chen, Fu Fan, Chunhao Dong, Kui Liu, Lu Zhang

Towards Modeling Human Attention from Eye Movements for Neural Source Code Summarization

Neural source code summarization is the task of generating natural language descriptions of source code behavior using neural networks. A fundamental component of most neural models is an attention mechanism. The attention mechanism learns…

Software Engineering · Computer Science 2023-05-18 Aakash Bansal, Bonita Sharif, Collin McMillan

A Neural Model for Generating Natural Language Summaries of Program Subroutines

Source code summarization -- creating natural language descriptions of source code behavior -- is a rapidly-growing research topic with applications to automatic documentation generation, program comprehension, and software maintenance.…

Software Engineering · Computer Science 2019-02-07 Alexander LeClair, Siyuan Jiang, Collin McMillan

Better Summarization Evaluation with Word Embeddings for ROUGE

ROUGE is a widely adopted, automatic evaluation measure for text summarization. While it has been shown to correlate well with human judgements, it is biased towards surface lexical similarities. This makes it unsuitable for the evaluation…

Computation and Language · Computer Science 2015-08-26 Jun-Ping Ng, Viktoria Abrecht