Related papers: Source Code Hotspots: A Diagnostic Method for Qual…
Context: Software systems are in continuous evolution through source code changes to fixing bugs, adding new functionalities and improving the internal architecture. All these practices are recorded in the version history, which can be…
Modern version control systems such as Git or SVN include bug tracking mechanisms, through which developers can highlight the presence of bugs through bug reports, i.e., textual descriptions reporting the problem and what are the steps that…
This paper presents a large-scale study that investigates the bug resolution characteristics among popular Github projects written in different programming languages. We explore correlations but, of course, we cannot infer causation.…
Two key contributions presented in this paper are: i) A method for building a dataset containing source code features extracted from source files taken from Open Source Software (OSS) and associated bug reports, ii) A predictive model for…
The cost of software maintenance often surpasses the initial development expenses, making it a significant concern for the software industry. A key strategy for alleviating future maintenance burdens is the early prediction and…
Source code repositories allow developers to manage multiple versions (or branches) of a software system. Pull-requests are used to modify a branch, and backporting is a regular activity used to port changes from a current development…
Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of…
We present HaPy-Bug, a curated dataset of 793 Python source code commits associated with bug fixes, with each line of code annotated by three domain experts. The annotations offer insights into the purpose of modified files, changes at the…
Continuous technology scaling and the introduction of advanced technology nodes in Integrated Circuit (IC) fabrication is constantly exposing new manufacturability issues. One such issue, stemming from complex interaction between design and…
Bugs are inescapable during software development due to frequent code changes, tight deadlines, etc.; therefore, it is important to have tools to find these errors. One way of performing bug identification is to analyze the characteristics…
GitHub is the world's largest host of source code, with more than 150M repositories. However, most of these repositories are not labeled or inadequately so, making it harder for users to find relevant projects. There have been various…
Source code is changed for a reason, e.g., to adapt, correct, or adapt it. This reason can provide valuable insight into the development process but is rarely explicitly documented when the change is committed to a source code repository.…
In open-source projects, anyone can contribute, so it is important to have an active continuous integration and continuous delivery (CI/CD) pipeline in addition to a protocol for reporting security concerns, especially in projects that are…
There are often multiple ways to implement the same requirement in source code. Different implementation choices can result in code snippets that are similar, and have been defined in multiple ways: code clones, examples, simions and…
In software development, fixing bugs is an important task that is time consuming and cost-sensitive. While many approaches have been proposed to automatically detect and patch software code, the strategies are limited to a set of identified…
Code comments are essential for clarifying code functionality, improving readability, and facilitating collaboration among developers. Despite their importance, comments often become outdated, leading to inconsistencies with the…
Context: Addressing user requests in the form of bug reports and Github issues represents a crucial task of any successful software project. However, user-submitted issue reports tend to widely differ in their quality, and developers spend…
Code clones are code snippets that are identical or similar to other snippets within the same or different files. They are often created through copy-and-paste practices and modified during development and maintenance activities. Since a…
AI-based code review tools automatically review and comment on pull requests to improve code quality. Despite their growing presence, little is known about their actual impact. We present a large-scale empirical study of 16 popular AI-based…
A critical part of creating code suggestion systems is the pre-training of Large Language Models on vast amounts of source code and natural language text, often of questionable origin or quality. This may contribute to the presence of bugs…