Related papers: Relative Code Comprehensibility Prediction

Beyond Accuracy: Characterizing Code Comprehension Capabilities in (Large) Language Models

Large Language Models (LLMs) are increasingly integrated into software engineering workflows, yet current benchmarks provide only coarse performance summaries that obscure the diverse capabilities and limitations of these models. This paper…

Software Engineering · Computer Science 2026-01-21 Felix Mächtle, Jan-Niclas Serr, Nils Loose, Thomas Eisenbarth

On the Reliability of Code Comprehension Proxies

Prior work on code comprehension uses different comprehension proxies (for example, Likert-scale ratings or answers to input-output questions about program snippets, usually collected from students) to approximate whether code is…

Software Engineering · Computer Science 2026-05-25 Erfan Arvan, Nadeeshan De Silva, Oscar Chaparro, Martin Kellogg

Evaluating Code Readability and Legibility: An Examination of Human-centric Studies

Reading code is an essential activity in software maintenance and evolution. Several studies with human subjects have investigated how different factors, such as the employed programming constructs and naming conventions, can impact code…

Software Engineering · Computer Science 2021-10-05 Delano Oliveira, Reydne Bruno, Fernanda Madeiral, Fernando Castor

Personalized Code Readability Assessment: Are We There Yet?

Unreadable code could be a breeding ground for errors. Thus, previous work defined approaches based on machine learning to automatically assess code readability that can warn developers when some code artifacts (e.g., classes) become…

Software Engineering · Computer Science 2025-03-12 Antonio Vitale, Emanuela Guglielmi, Rocco Oliveto, Simone Scalabrino

Applications of Multi-view Learning Approaches for Software Comprehension

Program comprehension concerns the ability of an individual to form an understanding of an existing software system to extend or transform it. Software systems comprise data that are noisy and missing, which makes program understanding…

Software Engineering · Computer Science 2019-02-05 Amir Saeidi, Jurriaan Hage, Ravi Khadka, Slinger Jansen

Learning to predict test effectiveness

The high cost of the test can be dramatically reduced, provided that the coverability as an inherent feature of the code under test is predictable. This article offers a machine learning model to predict the extent to which the test could…

Software Engineering · Computer Science 2022-08-23 Morteza Zakeri-Nasrabadi, Saeed Parsa

Investigating the Impact of Vocabulary Difficulty and Code Naturalness on Program Comprehension

Context: Developers spend most of their time comprehending source code during software development. Automatically assessing how readable and understandable source code is can provide various benefits in different tasks, such as task…

Software Engineering · Computer Science 2023-08-28 Bin Lin, Gregorio Robles

Teaching Machine Comprehension with Compositional Explanations

Advances in machine reading comprehension (MRC) rely heavily on the collection of large scale human-annotated examples in the form of (question, paragraph, answer) triples. In contrast, humans are typically able to generalize with only a…

Computation and Language · Computer Science 2020-10-15 Qinyuan Ye, Xiao Huang, Elizabeth Boschee, Xiang Ren

Readability and Understandability Scores for Snippet Assessment: an Exploratory Study

Code search engines usually use a readability feature to rank code snippets. There are several metrics to calculate this feature, but developers may have different perceptions about readability. Correlation between readability and…

Software Engineering · Computer Science 2025-10-14 Carlos Eduardo C. Dantas, Marcelo A. Maia

Demystifying and Assessing Code Understandability in Java Decompilation

Decompilation, the process of converting machine-level code into readable source code, plays a critical role in reverse engineering. Given that the main purpose of decompilation is to facilitate code comprehension in scenarios where the…

Software Engineering · Computer Science 2024-10-01 Ruixin Qin, Yifan Xiong, Yifei Lu, Minxue Pan

Procedural Pretraining: Warming Up Language Models with Abstract Data

Pretraining language models directly on web-scale corpora is the de facto paradigm. We study an alternative where the model is initially exposed to abstract structured data to ease the subsequent acquisition of rich semantic knowledge, much…

Computation and Language · Computer Science 2026-05-29 Liangze Jiang, Zachary Shinnick, Anton van den Hengel, Hemanth Saratchandran, Damien Teney

Assessing the Local Interpretability of Machine Learning Models

The increasing adoption of machine learning tools has led to calls for accountability via model interpretability. But what does it mean for a machine learning model to be interpretable by humans, and how can this be assessed? We focus on…

Machine Learning · Computer Science 2019-08-06 Dylan Slack, Sorelle A. Friedler, Carlos Scheidegger, Chitradeep Dutta Roy

Semantic Similarity Metrics for Evaluating Source Code Summarization

Source code summarization involves creating brief descriptions of source code in natural language. These descriptions are a key component of software documentation such as JavaDocs. Automatic code summarization is a prized target of…

Software Engineering · Computer Science 2022-04-05 Sakib Haque, Zachary Eberhart, Aakash Bansal, Collin McMillan

On the Relationship between Code Verifiability and Understandability

Proponents of software verification have argued that simpler code is easier to verify: that is, that verification tools issue fewer false positives and require less human intervention when analyzing simpler code. We empirically validate…

Software Engineering · Computer Science 2023-11-01 Kobi Feldman, Martin Kellogg, Oscar Chaparro

Do Machines and Humans Focus on Similar Code? Exploring Explainability of Large Language Models in Code Summarization

Recent language models have demonstrated proficiency in summarizing source code. However, as in many other domains of machine learning, language models of code lack sufficient explainability. Informally, we lack a formulaic or intuitive…

Software Engineering · Computer Science 2024-02-23 Jiliang Li, Yifan Zhang, Zachary Karas, Collin McMillan, Kevin Leach, Yu Huang

Learning Autocompletion from Real-World Datasets

Code completion is a popular software development tool integrated into all major IDEs. Many neural language models have achieved promising results in completion suggestion prediction on synthetic benchmarks. However, a recent study When…

Software Engineering · Computer Science 2020-11-10 Gareth Ari Aye, Seohyun Kim, Hongyu Li

The Effect of Code Obfuscation on Human Program Comprehension

We investigate how code obfuscation influences human understanding of programs through an output-prediction task. To study this effect, we construct multiple levels of obfuscation, ranging from unobfuscated code to transformations involving…

Software Engineering · Computer Science 2026-03-10 Anh H. N. Nguyen, Jack Le, Ilse Lahnstein Coronado, Tien N. Nguyen

Why Machine Reading Comprehension Models Learn Shortcuts?

Recent studies report that many machine reading comprehension (MRC) models can perform closely to or even better than humans on benchmark datasets. However, existing works indicate that many MRC models may learn shortcuts to outwit these…

Computation and Language · Computer Science 2021-06-03 Yuxuan Lai, Chen Zhang, Yansong Feng, Quzhe Huang, Dongyan Zhao

Verifier Warnings Do Not Improve Comprehensibility Prediction

Proponents of software verification suggest that code simplicity is linked to the effort to verify code, hypothesizing that formal verifiers produce fewer false positive warnings and require less manual intervention when analyzing simpler…

Software Engineering · Computer Science 2026-04-27 Nadeeshan De Silva, Martin Kellogg, Oscar Chaparro

Analysis of Predictive Coding Models for Phonemic Representation Learning in Small Datasets

Neural network models using predictive coding are interesting from the viewpoint of computational modelling of human language acquisition, where the objective is to understand how linguistic units could be learned from speech without any…

Computation and Language · Computer Science 2020-07-09 María Andrea Cruz Blandón, Okko Räsänen