Related papers: ZC3: Zero-Shot Cross-Language Code Clone Detection
We consider the well-known and important tasks of clone detection and information retrieval for source code. The most standard setup is to search clones inside the same language code snippets. But it is also useful to find code snippets…
Code Clone Detection, which aims to retrieve functionally similar programs from large code bases, has been attracting increasing attention. Modern software often involves a diverse range of programming languages. However, current code clone…
With the involvement of multiple programming languages in modern software development, cross-lingual code clone detection has gained traction within the software engineering community. Numerous studies have explored this topic, proposing…
The diversity of programming languages is growing, making the language extensibility of code clone detectors crucial. However, this is challenging for most existing clone detection detectors because the source code handler needs…
Code clones are pairs of code snippets that implement similar functionality. Clone detection is a fundamental branch of automatic source code comprehension, having many applications in refactoring recommendation, plagiarism detection, and…
Software clones are beneficial to detect security gaps and software maintenance in one programming language or across multiple languages. The existing work on source clone detection performs well but in a single programming language.…
Successful cross-language clone detection could enable researchers and developers to create robust language migration tools, facilitate learning additional programming languages once one is mastered, and promote reuse of code snippets over…
Code clone detection is involved with detecting duplicated fragments of code within a code base. Detecting these clones is useful for maintenance operations which require editing the clones. The tools developed are expected to be robust…
The lexical and syntactic disparities among different programming languages (e.g., Java and Python) pose significant challenges for multi-language software engineering tasks such as cross-language code clone detection and code retrieval,…
Recent work in cross-lingual semantic parsing has successfully applied machine translation to localize parsers to new languages. However, these advances assume access to high-quality machine translation systems and word alignment tools. We…
Code cloning, the duplication of code fragments, is common in software development. While some reuse aids productivity, excessive cloning hurts maintainability and introduces bugs. Hence, automatic code clone detection is vital. Meanwhile,…
Inconsistent modifications to code clones can lead to software defects. Many approaches exist to support consistent modifications based on clone detection and/or change pattern extraction. However, no tool currently supports synchronization…
Context: Code Clone Detection (CCD) is a software engineering task that is used for plagiarism detection, code search, and code comprehension. Recently, deep learning-based models have achieved an F1 score (a metric used to assess…
Finding the same or similar code snippets in source code is one of fundamental activities in software maintenance. Text-based pattern matching tools such as grep is frequently used for such purpose, but making proper queries for the…
Large Language Models (LLMs) have demonstrated remarkable success in various natural language processing and software engineering tasks, such as code generation. The LLMs are mainly utilized in the prompt-based zero/few-shot paradigm to…
Source code clone detection is the task of finding code fragments that have the same or similar functionality, but may differ in syntax or structure. This task is important for software maintenance, reuse, and quality assurance (Roy et al.…
Modern software relies on a multitude of automated testing and quality assurance tools to prevent errors, bugs and potential vulnerabilities. This study sets out to provide a head-to-head, quantitative and qualitative evaluation of six…
Detecting and tracking code clones can ease various software development and maintenance tasks when changes in a code fragment should be propagated over all its copies. Several deep learning-based clone detection models have appeared in the…
Large language models (LLMs) have demonstrated remarkable capabilities in various software engineering tasks, such as code generation and debugging, because of their ability to translate between programming languages and natural languages.…
Deep Learning (DL) models to analyze source code have shown immense promise during the past few years. More recently, self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE…