Related papers: Dependency-Aware Code Naturalness

Investigating the Impact of Vocabulary Difficulty and Code Naturalness on Program Comprehension

Context: Developers spend most of their time comprehending source code during software development. Automatically assessing how readable and understandable source code is can provide various benefits in different tasks, such as task…

Software Engineering · Computer Science 2023-08-28 Bin Lin, Gregorio Robles

Exploring Software Naturalness through Neural Language Models

The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing. We explore this hypothesis through the use of a pre-trained transformer-based language…

Computation and Language · Computer Science 2020-06-25 Luca Buratti, Saurabh Pujar, Mihaela Bornea, Scott McCarley, Yunhui Zheng, Gaetano Rossiello, Alessandro Morari, Jim Laredo, Veronika Thost, Yufan Zhuang, Giacomo Domeniconi

An Empirical Validation of Cognitive Complexity as a Measure of Source Code Understandability

Background: Developers spend a lot of their time on understanding source code. Static code analysis tools can draw attention to code that is difficult for developers to understand. However, most of the findings are based on non-validated…

Software Engineering · Computer Science 2020-07-27 Marvin Muñoz Barón, Marvin Wyrich, Stefan Wagner

Learning Natural Coding Conventions

Every programmer has a characteristic style, ranging from preferences about identifier naming to preferences about object relationships and design patterns. Coding conventions define a consistent syntactic style, fostering readability and…

Software Engineering · Computer Science 2016-11-09 Miltiadis Allamanis, Earl T. Barr, Christian Bird, Charles Sutton

CodeBERT-nt: code naturalness via CodeBERT

Much of software-engineering research relies on the naturalness of code, the fact that code, in small code snippets, is repetitive and can be predicted using statistical language models like n-gram. Although powerful, training such models…

Software Engineering · Computer Science 2022-08-15 Ahmed Khanfir, Matthieu Jimenez, Mike Papadakis, Yves Le Traon

Bringing Structure to Naturalness: On the Naturalness of ASTs

Source code comes in different shapes and forms. Previous research has already shown code to be more predictable than natural language as well as highlighted its statistical predictability at the token level: source code can be natural.…

Software Engineering · Computer Science 2025-04-14 Profir-Petru Pârţachi, Mahito Sugiyama

CodeFlow: Program Behavior Prediction with Dynamic Dependencies Learning

Predicting program behavior without execution is a critical task in software engineering. Existing models often fall short in capturing the dynamic dependencies among program elements. To address this, we present CodeFlow, a novel machine…

Software Engineering · Computer Science 2025-02-11 Cuong Chi Le, Hoang Nhat Phan, Huy Nhat Phan, Tien N. Nguyen, Nghi D. Q. Bui

Evaluating Code Readability and Legibility: An Examination of Human-centric Studies

Reading code is an essential activity in software maintenance and evolution. Several studies with human subjects have investigated how different factors, such as the employed programming constructs and naming conventions, can impact code…

Software Engineering · Computer Science 2021-10-05 Delano Oliveira, Reydne Bruno, Fernanda Madeiral, Fernando Castor

A Survey of Machine Learning for Big Code and Naturalness

Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit code's abundance of patterns. In…

Software Engineering · Computer Science 2018-05-08 Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu, Charles Sutton

Mutant Density: A Measure of Fault-Sensitive Complexity

Software code complexity is a well-studied property to determine software component health. However, the existing code complexity metrics do not directly take into account the fault-proneness aspect of the code. We propose a metric called…

Software Engineering · Computer Science 2021-04-27 Ali Parsai, Serge Demeyer

Understanding Code Patterns - Analysis, Interpretation & Measurement

This research paper aims to find, analyze and understand code patterns in any software system and measure its quality by defining standards and proposing a formula for the same. Every code that is written can be divided into different code…

Software Engineering · Computer Science 2011-07-01 Jitesh Dundas

Use of Source Code Similarity Metrics in Software Defect Prediction

In recent years, defect prediction has received a great deal of attention in the empirical software engineering world. Predicting software defects before the maintenance phase is very important not only to decrease the maintenance costs but…

Software Engineering · Computer Science 2018-08-31 Ahmet Okutan

Do Code Clones Matter?

Code cloning is not only assumed to inflate maintenance costs but also considered defect-prone as inconsistent changes to code duplicates can lead to unexpected behavior. Consequently, the identification of duplicated code, clone detection,…

Software Engineering · Computer Science 2017-11-15 Elmar Juergens, Florian Deissenboeck, Benjamin Hummel, Stefan Wagner

Natural Language-Guided Programming

In today's software world with its cornucopia of reusable software libraries, when a programmer is faced with a programming task that they suspect can be completed through the use of a library, they often look for code examples using a…

Software Engineering · Computer Science 2021-10-08 Geert Heyman, Rafael Huysegems, Pascal Justen, Tom Van Cutsem

In-IDE Code Generation from Natural Language: Promise and Challenges

A great part of software development involves conceptualizing or communicating the underlying procedures and logic that needs to be expressed in programs. One major difficulty of programming is turning concept into code, especially when…

Software Engineering · Computer Science 2021-09-23 Frank F. Xu, Bogdan Vasilescu, Graham Neubig

TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing

Program comprehension is a fundamental task in software development and maintenance processes. Software developers often need to understand a large amount of existing code before they can develop new features or fix bugs in existing…

Machine Learning · Computer Science 2019-10-29 Vinoj Jayasundara, Nghi Duy Quoc Bui, Lingxiao Jiang, David Lo

Representing and Reasoning about Dynamic Code

Dynamic code, i.e., code that is created or modified at runtime, is ubiquitous in today's world. The behavior of dynamic code can depend on the logic of the dynamic code generator in subtle and non-obvious ways, with significant security…

Cryptography and Security · Computer Science 2019-10-31 Jesse Bartels, Jon Stephens, Saumya Debray

A systematic literature review on source code similarity measurement and clone detection: techniques, applications, and challenges

Measuring and evaluating source code similarity is a fundamental software engineering activity that embraces a broad range of applications, including but not limited to code recommendation, duplicate code, plagiarism, malware, and smell…

Software Engineering · Computer Science 2023-06-29 Morteza Zakeri-Nasrabadi, Saeed Parsa, Mohammad Ramezani, Chanchal Roy, Masoud Ekhtiarzadeh

Detecting Code Clones: A review

Code clone detection is involved with detecting duplicated fragments of code within a code base. Detecting these clones is useful for maintenance operations which require editing the clones. The tools developed are expected to be robust…

Software Engineering · Computer Science 2016-05-10 Ogechi Onuoha

Source Code Retrieval Using Sequence Based Similarity

Duplicated code has a negative impact on the quality of software systems and should be detected at least. In this paper, we discuss an approach that improves source code retrieval using the structural information about the programs. We…

Software Engineering · Computer Science 2013-08-19 Yoshihisa Udagawa