Related papers: Codetations: Intelligent, Persistent Notes and UIs…

Leveraging Design-Aware Context in Large Language Models for Code Comment Generation

Comments are very useful to the flow of code development. With the increasing commonality of code, novice coders have been creating a significant amount of codebases. Due to lack of commenting standards, their comments are often useless,…

Software Engineering · Computer Science 2026-05-05 Aritra Mitra, Srijoni Majumdar, Anamitra Mukhopadhyay, Partha Pratim Das, Paul D Clough, Partha Pratim Chakrabarti

Text Annotation via Inductive Coding: Comparing Human Experts to LLMs in Qualitative Data Analysis

This paper investigates the automation of qualitative data analysis, focusing on inductive coding using large language models (LLMs). Unlike traditional approaches that rely on deductive methods with predefined labels, this research…

Computation and Language · Computer Science 2025-12-02 Angelina Parfenova, Andreas Marfurt, Alexander Denzler, Juergen Pfeffer

Magic Markup: Maintaining Document-External Markup with an LLM

Text documents, including programs, typically have human-readable semantic structure. Historically, programmatic access to these semantics has required explicit in-document tagging. Especially in systems where the text has an execution…

Computation and Language · Computer Science 2024-03-07 Edward Misback, Zachary Tatlock, Steven L. Tanimoto

Teaching Code Refactoring Using LLMs

This Innovative Practice full paper explores how Large Language Models (LLMs) can enhance the teaching of code refactoring in software engineering courses through real-time, context-aware feedback. Refactoring improves code quality but is…

Software Engineering · Computer Science 2025-08-14 Anshul Khairnar, Aarya Rajoju, Edward F. Gehringer

Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators

Event annotation is important for identifying market changes, monitoring breaking news, and understanding sociological trends. Although expert annotators set the gold standards, human coding is expensive and inefficient. Unlike information…

Computation and Language · Computer Science 2026-04-29 Feng Gu, Zongxia Li, Carlos Rafael Colon, Benjamin Evans, Ishani Mondal, Jordan Lee Boyd-Graber

The Hidden Cost of Readability: How Code Formatting Silently Consumes Your LLM Budget

Source code is usually formatted with elements like indentation and newlines to improve readability for human developers. However, these visual aids do not seem to be beneficial for large language models (LLMs) in the same way since the…

Software Engineering · Computer Science 2025-08-21 Dangfeng Pan, Zhensu Sun, Cenyuan Zhang, David Lo, Xiaoning Du

Exploring Large Language Models for Code Explanation

Automating code documentation through explanatory text can prove highly beneficial in code understanding. Large Language Models (LLMs) have made remarkable strides in Natural Language Processing, especially within software engineering tasks…

Software Engineering · Computer Science 2023-10-26 Paheli Bhattacharya, Manojit Chakraborty, Kartheek N S N Palepu, Vikas Pandey, Ishan Dindorkar, Rakesh Rajpurohit, Rishabh Gupta

ContextModule: Improving Code Completion via Repository-level Contextual Information

Large Language Models (LLMs) have demonstrated impressive capabilities in code completion tasks, where they assist developers by predicting and generating new code in real-time. However, existing LLM-based code completion systems primarily…

Software Engineering · Computer Science 2024-12-12 Zhanming Guan, Junlin Liu, Jierui Liu, Chao Peng, Dexin Liu, Ningyuan Sun, Bo Jiang, Wenchao Li, Jie Liu, Hang Zhu

CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation

Annotated data plays a critical role in Natural Language Processing (NLP) in training models and evaluating their performance. Given recent developments in Large Language Models (LLMs), models such as ChatGPT demonstrate zero-shot…

Computation and Language · Computer Science 2024-03-18 Minzhi Li, Taiwei Shi, Caleb Ziems, Min-Yen Kan, Nancy F. Chen, Zhengyuan Liu, Diyi Yang

Enhancing Text Annotation through Rationale-Driven Collaborative Few-Shot Prompting

The traditional data annotation process is often labor-intensive, time-consuming, and susceptible to human bias, which complicates the management of increasingly complex datasets. This study explores the potential of large language models…

Computation and Language · Computer Science 2024-09-17 Jianfei Wu, Xubin Wang, Weijia Jia

LAMeD: LLM-generated Annotations for Memory Leak Detection

Static analysis tools are widely used to detect software bugs and vulnerabilities but often struggle with scalability and efficiency in complex codebases. Traditional approaches rely on manually crafted annotations -- labeling functions as…

Software Engineering · Computer Science 2025-05-06 Ekaterina Shemetova, Ilya Shenbin, Ivan Smirnov, Anton Alekseev, Alexey Rukhovich, Sergey Nikolenko, Vadim Lomshakov, Irina Piontkovskaya

LLMs Accelerate Annotation for Medical Information Extraction

The unstructured nature of clinical notes within electronic health records often conceals vital patient-related information, making it challenging to access or interpret. To uncover this hidden information, specialized Natural Language…

Computation and Language · Computer Science 2023-12-06 Akshay Goel, Almog Gueta, Omry Gilon, Chang Liu, Sofia Erell, Lan Huong Nguyen, Xiaohong Hao, Bolous Jaber, Shashir Reddy, Rupesh Kartha, Jean Steiner, Itay Laish, Amir Feder

Evaluating Large Language Models as Expert Annotators

Textual data annotation, the process of labeling or tagging text with relevant information, is typically costly, time-consuming, and labor-intensive. While large language models (LLMs) have demonstrated their potential as direct…

Computation and Language · Computer Science 2025-08-12 Yu-Min Tseng, Wei-Lin Chen, Chung-Chi Chen, Hsin-Hsi Chen

Recording Concerns in Source Code Using Annotations

A concern can be characterized as a developer's intent behind a piece of code, often not explicitly captured in it. We discuss a technique of recording concerns using source code annotations (concern annotations). Using two studies and two…

Software Engineering · Computer Science 2018-08-13 Matúš Sulír, Milan Nosáľ, Jaroslav Porubän

Open-Source LLMs for Text Annotation: A Practical Guide for Model Setting and Fine-Tuning

This paper studies the performance of open-source Large Language Models (LLMs) in text classification tasks typical for political science research. By examining tasks like stance, topic, and relevance classification, we aim to guide…

Computation and Language · Computer Science 2024-05-30 Meysam Alizadeh, Maël Kubli, Zeynab Samei, Shirin Dehghani, Mohammadmasiha Zahedivafa, Juan Diego Bermeo, Maria Korobeynikova, Fabrizio Gilardi

Exploring the Capabilities of LLMs for Code Change Related Tasks

Developers deal with code-change-related tasks daily, e.g., reviewing code. Pre-trained code and code-change-oriented models have been adapted to help developers with such tasks. Recently, large language models (LLMs) have shown their…

Software Engineering · Computer Science 2024-07-04 Lishui Fan, Jiakun Liu, Zhongxin Liu, David Lo, Xin Xia, Shanping Li

Exploring Code Analysis: Zero-Shot Insights on Syntax and Semantics with LLMs

Code analysis is fundamental in Software Engineering, supporting debugging, optimization, and security assessment. Human developers approach it through syntax parsing, static semantics inference, and dynamic reasoning. Traditional tools are…

Software Engineering · Computer Science 2026-05-22 Wei Ma, Zhihao Lin, Shangqing Liu, Qiang Hu, Ye Liu, Wenhan Wang, Cen Zhang, Liming Nie, Li Li, Yang Liu, Lingxiao Jiang

Prompt-Driven Code Summarization: A Systematic Literature Review

Software documentation is essential for program comprehension, developer onboarding, code review, and long-term maintenance. Yet producing quality documentation manually is time-consuming and frequently yields incomplete or inconsistent…

Software Engineering · Computer Science 2026-04-20 Afia Farjana, Zaiyu Cheng, Antonio Mastropaolo

Free and Customizable Code Documentation with LLMs: A Fine-Tuning Approach

Automated documentation of programming source code is a challenging task with significant practical and scientific implications for the developer community. We present a large language model (LLM)-based application that developers can use…

Software Engineering · Computer Science 2025-12-17 Sayak Chakrabarty, Souradip Pal

Evaluating the Use of LLMs for Documentation to Code Traceability

Large Language Models (LLMs) offer new potential for automating documentation-to-code traceability, yet their capabilities remain underexplored. We present a comprehensive evaluation of LLMs (Claude 3.5 Sonnet, GPT-4o, and o3-mini) in…

Software Engineering · Computer Science 2025-08-08 Ebube Alor, SayedHassan Khatoonabadi, Emad Shihab