Related papers: STACC: Code Comment Classification using SentenceT…

Dopamin: Transformer-based Comment Classifiers through Domain Post-Training and Multi-level Layer Aggregation

Code comments provide important information for understanding the source code. They can help developers understand the overall purpose of a function or class, as well as identify bugs and technical debt. However, an overabundance of…

Computation and Language · Computer Science 2024-08-12 Nam Le Hai, Nghi D. Q. Bui

Performance Comparison of Binary Machine Learning Classifiers in Identifying Code Comment Types: An Exploratory Study

Code comments are vital to source code as they help developers with program comprehension tasks. Written in natural language (usually English), code comments convey a variety of information, which is grouped into specific…

Software Engineering · Computer Science 2023-03-06 Amila Indika, Peter Y. Washington, Anthony Peruma

Optimizing Deep Learning Models to Address Class Imbalance in Code Comment Classification

Developers rely on code comments to document their work, track issues, and understand the source code. As such, comments provide valuable insights into developers' understanding of their code and describe their various intentions in writing…

Software Engineering · Computer Science 2025-07-03 Moritz Mock, Thomas Borsani, Giuseppe Di Fatta, Barbara Russo

Towards Better Static Code Analysis Reports: Sentence Transformer-based Filtering of Non-Actionable Alerts

Static code analysis (SCA) tools are widely used as effective ways to detect bugs and vulnerabilities in software systems. However, the reports generated by these tools often contain a large number of non-actionable findings, which can…

Software Engineering · Computer Science 2026-04-21 Tamás Aladics, Norbert Vándor, Rudolf Ferenc, Péter Hegedűs

Identification of the Relevance of Comments in Codes Using Bag of Words and Transformer Based Models

The Forum for Information Retrieval (FIRE) started a shared task this year for classification of comments of different code segments. This is a binary text classification task where the objective is to identify whether comments given for…

Information Retrieval · Computer Science 2023-08-14 Sruthi S, Tanmay Basu

STF: Sentence Transformer Fine-Tuning For Topic Categorization With Limited Data

Nowadays, topic classification from tweets attracts considerable research attention. Different classification systems have been suggested thanks to these research efforts. Nevertheless, they face major challenges owing to low performance…

Computation and Language · Computer Science 2024-07-04 Kheir Eddine Daouadi, Yaakoub Boualleg, Oussama Guehairia

Lex2Sent: A bagging approach to unsupervised sentiment analysis

Unsupervised text classification, with its most common form being sentiment analysis, used to be performed by counting words in a text that were stored in a lexicon, which assigns each word to one class or as a neutral word. In recent…

Computation and Language · Computer Science 2025-06-26 Kai-Robin Lange, Jonas Rieger, Carsten Jentsch

Retrospective: Data Mining Static Code Attributes to Learn Defect Predictors

Industry can get any research it wants, just by publishing a baseline result along with the data and scripts needed to reproduce that work. For instance, the paper "Data Mining Static Code Attributes to Learn Defect Predictors" presented…

Software Engineering · Computer Science 2025-01-28 Tim Menzies

Evaluating the Performance and Efficiency of Sentence-BERT for Code Comment Classification

This work evaluates Sentence-BERT for a multi-label code comment classification task seeking to maximize the classification performance while controlling efficiency constraints during inference. Using a dataset of 13,216 labeled comment…

Software Engineering · Computer Science 2025-06-16 Fabian C. Peña, Steffen Herbold

ComFormer: Code Comment Generation via Transformer and Fusion Method-based Hybrid Code Representation

Developers often write low-quality code comments due to a lack of programming experience, which can reduce the efficiency of developers' program comprehension. Therefore, developers hope that code comment generation tools can be developed…

Software Engineering · Computer Science 2021-07-09 Guang Yang, Xiang Chen, Jinxin Cao, Shuyuan Xu, Zhanqi Cui, Chi Yu, Ke Liu

Enhancing Binary Code Comment Quality Classification: Integrating Generative AI for Improved Accuracy

This report focuses on enhancing a binary code comment quality classification model by integrating generated code and comment pairs, to improve model accuracy. The dataset comprises 9048 pairs of code and comments written in the C…

Software Engineering · Computer Science 2023-10-19 Rohith Arumugam S, Angel Deborah S

A Convolutional Neural Network for Language-Agnostic Source Code Summarization

Descriptive comments play a crucial role in the software engineering process. They decrease development time, enable better bug detection, and facilitate the reuse of previously written code. However, comments are commonly the last of a…

Computation and Language · Computer Science 2019-04-02 Jessica Moore, Ben Gelman, David Slater

Enhancing Code Annotation Reliability: Generative AI's Role in Comment Quality Assessment Models

This paper explores a novel method for enhancing binary classification models that assess code comment quality, leveraging Generative Artificial Intelligence to elevate model performance. By integrating 1,437 newly generated code-comment…

Software Engineering · Computer Science 2024-10-30 Seetharam Killivalavan, Durairaj Thenmozhi

Speculative Analysis for Quality Assessment of Code Comments

Previous studies have shown that high-quality code comments assist developers in program comprehension and maintenance tasks. However, the semi-structured nature of comments, unclear conventions for writing good comments, and the lack of…

Software Engineering · Computer Science 2021-07-27 Pooja Rani

Supervised Sentiment Classification with CNNs for Diverse SE Datasets

Sentiment analysis, a popular technique for opinion mining, has been used by the software engineering research community for tasks such as assessing app reviews, developer emotions in issue trackers and developer opinions on APIs. Past…

Computation and Language · Computer Science 2018-12-27 Achyudh Ram, Meiyappan Nagappan

Automated Classification of Human Code Review Comments with Large Language Models

Context: Code reviews are essential for maintaining software quality, yet many human review comments suffer from issues such as redundancy, vagueness, or lack of constructiveness. These types of comments may slow down feedback and obscure…

Software Engineering · Computer Science 2026-04-28 Semih Çağlar, Şükrü Eren Gökırmak, Eray Tüzün

STAB: Speech Tokenizer Assessment Benchmark

Representing speech as discrete tokens provides a framework for transforming speech into a format that closely resembles text, thus enabling the use of speech as an input to the widely successful large language models (LLMs). Currently,…

Computation and Language · Computer Science 2024-09-05 Shikhar Vashishth, Harman Singh, Shikhar Bharadwaj, Sriram Ganapathy, Chulayuth Asawaroengchai, Kartik Audhkhasi, Andrew Rosenberg, Ankur Bapna, Bhuvana Ramabhadran

Adapting Neural Text Classification for Improved Software Categorization

Software Categorization is the task of organizing software into groups that broadly describe the behavior of the software, such as "editors" or "science." Categorization plays an important role in several maintenance tasks, such as…

Software Engineering · Computer Science 2018-06-18 Alexander LeClair, Zachary Eberhart, Collin McMillan

Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study

Transformer-based pre-trained models have recently achieved great results in solving many software engineering tasks including automatic code completion which is a staple in a developer's toolkit. While many have striven to improve the…

Computation and Language · Computer Science 2023-04-25 Tim van Dam, Maliheh Izadi, Arie van Deursen

Boosting Commit Classification with Contrastive Learning

Commit Classification (CC) is an important task in software maintenance, which helps software developers classify code changes into different types according to their nature and purpose. It allows developers to understand better how their…

Software Engineering · Computer Science 2023-08-17 Jiajun Tong, Zhixiao Wang, Xiaobin Rui