Related papers: Code to Comment "Translation": Data, Metrics, Base…

Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors

Automated source code summarization is a popular software engineering research topic wherein machine translation models are employed to "translate" code snippets into relevant natural language descriptions. Most evaluations of such models…

Software Engineering · Computer Science 2021-06-17 Junayed Mahmud, Fahim Faisal, Raihan Islam Arnob, Antonios Anastasopoulos, Kevin Moran

Too Noisy To Learn: Enhancing Data Quality for Code Review Comment Generation

Code review is an important practice in software development, yet it is time-consuming and requires substantial effort. While open-source datasets have been used to train neural models for automating code review tasks, including review…

Software Engineering · Computer Science 2025-02-07 Chunhua Liu, Hong Yi Lin, Patanamon Thongtanunam

Revisiting the Role of Natural Language Code Comments in Code Translation

The advent of large language models (LLMs) has ushered in a new era in automated code translation across programming languages. Since most code-specific LLMs are pretrained on well-commented code from large repositories like GitHub, it is…

Software Engineering · Computer Science 2026-01-26 Monika Gupta, Ajay Meena, Anamitra Roy Choudhury, Vijay Arya, Srikanta Bedathur

Large Language Models are Qualified Benchmark Builders: Rebuilding Pre-Training Datasets for Advancing Code Intelligence Tasks

Pre-trained code models rely heavily on high-quality pre-training data, particularly human-written reference comments that bridge code and natural language. However, these comments often become outdated as software evolves, degrading model…

Software Engineering · Computer Science 2025-04-29 Kang Yang, Xinjun Mao, Shangwen Wang, Yanlin Wang, Tanghaoran Zhang, Bo Lin, Yihao Qin, Zhang Zhang, Yao Lu, Kamal Al-Sabahi

A Qualitative Investigation into LLM-Generated Multilingual Code Comments and Automatic Evaluation Metrics

Large Language Models are essential coding assistants, yet their training is predominantly English-centric. In this study, we evaluate the performance of code language models in non-English contexts, identifying challenges in their adoption…

Software Engineering · Computer Science 2025-05-22 Jonathan Katzy, Yongcheng Huang, Gopal-Raj Panchu, Maksym Ziemlewski, Paris Loizides, Sander Vermeulen, Arie van Deursen, Maliheh Izadi

Code Needs Comments: Enhancing Code LLMs with Comment Augmentation

The programming skill is one crucial ability for Large Language Models (LLMs), necessitating a deep understanding of programming languages (PLs) and their correlation with natural languages (NLs). We examine the impact of pre-training data…

Computation and Language · Computer Science 2024-02-21 Demin Song, Honglin Guo, Yunhua Zhou, Shuhao Xing, Yudong Wang, Zifan Song, Wenwei Zhang, Qipeng Guo, Hang Yan, Xipeng Qiu, Dahua Lin

Large Language Models are Few-Shot Summarizers: Multi-Intent Comment Generation via In-Context Learning

Code comment generation aims at generating natural language descriptions for a code snippet to facilitate developers' program comprehension activities. Despite being studied for a long time, a bottleneck for existing approaches is that…

Software Engineering · Computer Science 2023-06-16 Mingyang Geng, Shangwen Wang, Dezun Dong, Haotian Wang, Ge Li, Zhi Jin, Xiaoguang Mao, Xiangke Liao

On the Quality of AI-Generated Source Code Comments: A Comprehensive Evaluation

This paper investigates the quality of source code comments automatically generated by Large Language Models (LLMs). While AI-based comment generation has emerged as a promising solution to reduce developers' documentation effort, prior…

Software Engineering · Computer Science 2025-12-02 Ian Guelman, Arthur Gregório Leal, Laerte Xavier, Marco Tulio Valente

Harnessing Large Language Models for Curated Code Reviews

In code review, generating structured and relevant comments is crucial for identifying code issues and facilitating accurate code changes that ensure an efficient code review process. Well-crafted comments not only streamline the code…

Software Engineering · Computer Science 2025-02-06 Oussama Ben Sghaier, Martin Weyssow, Houari Sahraoui

Yet Another Combination of IR- and Neural-based Comment Generation

Code comment generation techniques aim to generate natural language descriptions for source code. There are two orthogonal approaches for this task, i.e., information retrieval (IR) based and neural-based methods. Recent studies have…

Software Engineering · Computer Science 2021-07-28 Huang Yuchao, Wei Moshi, Wang Song, Wang Junjie, Wang Qing

Speculative Analysis for Quality Assessment of Code Comments

Previous studies have shown that high-quality code comments assist developers in program comprehension and maintenance tasks. However, the semi-structured nature of comments, unclear conventions for writing good comments, and the lack of…

Software Engineering · Computer Science 2021-07-27 Pooja Rani

ICE-Score: Instructing Large Language Models to Evaluate Code

Recent advancements in the field of natural language generation have facilitated the use of large language models to assess the quality of generated text. Although these models have shown promising results in tasks such as machine…

Artificial Intelligence · Computer Science 2024-01-23 Terry Yue Zhuo

DeepCRCEval: Revisiting the Evaluation of Code Review Comment Generation

Code review is a vital but demanding aspect of software development, generating significant interest in automating review comments. Traditional evaluation methods for these comments, primarily based on text similarity, face two major…

Software Engineering · Computer Science 2025-01-28 Junyi Lu, Xiaojia Li, Zihan Hua, Lei Yu, Shiqi Cheng, Li Yang, Fengjun Zhang, Chun Zuo

Deep Assessment of Code Review Generation Approaches: Beyond Lexical Similarity

Code review is a standard practice for ensuring the quality of software projects, and recent research has focused extensively on automated code review. While significant advancements have been made in generating code reviews, the automated…

Software Engineering · Computer Science 2025-01-10 Yanjie Jiang, Hui Liu, Tianyi Chen, Fu Fan, Chunhao Dong, Kui Liu, Lu Zhang

Bridging Music and Text with Crowdsourced Music Comments: A Sequence-to-Sequence Framework for Thematic Music Comments Generation

We consider a novel task of automatically generating text descriptions of music. Compared with other well-established text generation tasks such as image caption, the scarcity of well-paired music and text datasets makes it a much more…

Sound · Computer Science 2022-09-07 Peining Zhang, Junliang Guo, Linli Xu, Mu You, Junming Yin

Unity is Strength: Cross-Task Knowledge Distillation to Improve Code Review Generation

Code review is a fundamental process in software development that plays a critical role in ensuring code quality and reducing the likelihood of errors and bugs. However, code review might be complex, subjective, and time-consuming. Comment…

Software Engineering · Computer Science 2023-09-08 Oussama Ben Sghaier, Lucas Maes, Houari Sahraoui

Prompting and Fine-tuning Large Language Models for Automated Code Review Comment Generation

Generating accurate code review comments remains a significant challenge due to the inherently diverse and non-unique nature of the task output. Large language models pretrained on both programming and natural language data tend to perform…

Software Engineering · Computer Science 2024-11-18 Md. Asif Haider, Ayesha Binte Mostofa, Sk. Sabit Bin Mosaddek, Anindya Iqbal, Toufique Ahmed

Code Attention: Translating Code to Comments by Exploiting Domain Features

Appropriate comments of code snippets provide insight for code functionality, which are helpful for program comprehension. However, due to the great cost of authoring with the comments, many code projects do not contain adequate comments.…

Artificial Intelligence · Computer Science 2017-11-28 Wenhao Zheng, Hong-Yu Zhou, Ming Li, Jianxin Wu

Deep Just-In-Time Inconsistency Detection Between Comments and Source Code

Natural language comments convey key aspects of source code such as implementation, usage, and pre- and post-conditions. Failure to update comments accordingly when the corresponding code is modified introduces inconsistencies, which is…

Software Engineering · Computer Science 2020-12-29 Sheena Panthaplackel, Junyi Jessy Li, Milos Gligoric, Raymond J. Mooney

A Survey of Automatic Generation of Source Code Comments: Algorithms and Techniques

As an integral part of source code files, code comments help improve program readability and comprehension. However, developers sometimes do not comment on their program code adequately due to the incurred extra efforts, lack of relevant…

Software Engineering · Computer Science 2019-07-31 Xiaotao Song, Hailong Sun, Xu Wang, Jiafei Yan