Related papers: Generalizability of Code Clone Detection on CodeBE…
Detecting code clones is relevant to software maintenance and code refactoring. This challenge still presents unresolved cases, mainly when structural similarity does not reflect functional equivalence, though recent code models show…
Code clone detection is involved with detecting duplicated fragments of code within a code base. Detecting these clones is useful for maintenance operations which require editing the clones. The tools developed are expected to be robust…
Large language models (LLMs) have demonstrated remarkable capabilities in various software engineering tasks, such as code generation and debugging, because of their ability to translate between programming languages and natural languages.…
Code clone is a serious problem in software and has the potential to software defects, maintenance overhead, and licensing violations. Therefore, clone detection is important for reducing maintenance effort and improving code quality during…
Code cloning, the duplication of code fragments, is common in software development. While some reuse aids productivity, excessive cloning hurts maintainability and introduces bugs. Hence, automatic code clone detection is vital. Meanwhile,…
This study aims to assess the performance of two advanced Large Language Models (LLMs), GPT-3.5 and GPT-4, in the task of code clone detection. The evaluation involves testing the models on a variety of code pairs of different clone types…
Code clones are pairs of code snippets that implement similar functionality. Clone detection is a fundamental branch of automatic source code comprehension, having many applications in refactoring recommendation, plagiarism detection, and…
Detecting and tracking code clones can ease various software development and maintenance tasks when changes in a code fragment should be propagated over all its copies. Several deep learning-based clone detection models have appeared in the…
This paper investigates source code similarity detection using a transformer model augmented with an execution-derived signal. We extend GraphCodeBERT with an explicit, low-dimensional behavioral feature that captures observable agreement…
Source code clones pose risks ranging from intellectual property violations to unintended vulnerabilities. Effective and efficient scalable clone detection, especially for diverged clones, remains challenging. Large language models (LLMs)…
The capability of accurately determining code similarity is crucial in many tasks related to software development. For example, it might be essential to identify code duplicates for performing software maintenance. This research introduces…
Source code clone detection is the task of finding code fragments that have the same or similar functionality, but may differ in syntax or structure. This task is important for software maintenance, reuse, and quality assurance (Roy et al.…
Code clones are identical or similar code segments. The wide existence of code clones can increase the cost of maintenance and jeopardise the quality of software. The research community has developed many techniques to detect code clones,…
Assessing the degree of similarity of code fragments is crucial for ensuring software quality, but it remains challenging due to the need to capture the deeper semantic aspects of code. Traditional syntactic methods often fail to identify…
Code cloning is not only assumed to inflate maintenance costs but also considered defect-prone as inconsistent changes to code duplicates can lead to unexpected behavior. Consequently, the identification of duplicated code, clone detection,…
Clone detection plays an important role in software engineering. Finding clones within a single project introduces possible refactoring opportunities, and between different projects it could be used for detecting code reuse or possible…
With the involvement of multiple programming languages in modern software development, cross-lingual code clone detection has gained traction within the software engineering community. Numerous studies have explored this topic, proposing…
Software clones have been an active area of research for the past two decades. However, although numerous clone detection tools are now available, only a small fraction of the literature has focused on tool evaluation, and this is in fact…
Finding the same or similar code snippets in source code is one of fundamental activities in software maintenance. Text-based pattern matching tools such as grep is frequently used for such purpose, but making proper queries for the…
Copy & paste is a widespread practice when developing software and, thus, duplicated and subsequently modified code occurs frequently in software projects. Since such code clones, i.e., identical or similar fragments of code, can bloat…