Related papers: Detecting Code Clones with Graph Neural Network and…

200 papers

Classic clone detection approaches are hardly capable of finding redundant code that has been developed independently, i.e., is not the result of copy&paste. To automatically detect such functionally similar code of independent origin, we…

Software Engineering · Computer Science 2018-01-19 Florian Deissenboeck, Lars Heinemann, Benjamin Hummel, Stefan Wagner

Semantic clones are program components with similar behavior, but different textual representation. Semantic similarity is hard to detect, and semantic clone detection is still an open issue. We present semantic clone detection via…

Software Engineering · Computer Science 2020-01-22 Hannes Thaller, Lukas Linsbauer, Alexander Egyed

Code retrieval techniques and tools have been playing a key role in facilitating software developers to retrieve existing code fragments from available open-source repositories given a user query. Despite the existing efforts in improving…

Software Engineering · Computer Science 2019-10-01 Yao Wan, Jingdong Shu, Yulei Sui, Guandong Xu, Zhou Zhao, Jian Wu, Philip S. Yu

Recently program learning techniques have been proposed to process source code based on syntactical structures (e.g., Abstract Syntax Trees) and/or semantic information (e.g., Dependency Graphs). Although graphs may be better at capturing…

Software Engineering · Computer Science 2020-12-15 Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang

Semantic clone detection is the process of finding program elements with similar or equal runtime behavior. For example, detecting the semantic equality between the recursive and iterative implementation of the factorial computation.…

Software Engineering · Computer Science 2022-05-24 Hannes Thaller, Lukas Linsbauer, Alexander Egyed
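
The factorial example mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the method of the listed paper: the function names and the sampled input range are assumptions chosen for illustration, and agreement on a handful of inputs is only evidence of similar behavior, not a proof of equivalence.

```python
def factorial_recursive(n: int) -> int:
    """Factorial computed by recursion."""
    return 1 if n <= 1 else n * factorial_recursive(n - 1)


def factorial_iterative(n: int) -> int:
    """Factorial computed with an explicit loop."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def agree_on(inputs, f, g) -> bool:
    """Return True if f and g give equal outputs on every sampled input."""
    return all(f(x) == g(x) for x in inputs)


# Hypothetical sample of inputs; a real detector would need a principled
# way to choose or generate them.
SAMPLE_INPUTS = range(0, 10)
print(agree_on(SAMPLE_INPUTS, factorial_recursive, factorial_iterative))
```

The two functions share almost no text or syntactic structure, which is why token-based or tree-based comparison tends to miss such pairs while a behavioral comparison does not.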

Detecting code clones is crucial in various software engineering tasks. In particular, code clone detection can have significant uses in the context of analyzing and fixing bugs in large scale applications. However, prior works, such as…

Software Engineering · Computer Science 2020-09-23 Hongfa Xue, Yongsheng Mei, Kailash Gogineni, Guru Venkataramani, Tian Lan

Code search aims to retrieve accurate code snippets based on a natural language query to improve software productivity and quality. With the massive amount of available programs (such as on GitHub or Stack Overflow), identifying and…

Software Engineering · Computer Science 2023-02-14 Shangqing Liu, Xiaofei Xie, Jingkai Siow, Lei Ma, Guozhu Meng, Yang Liu

Copy & paste is a widespread practice when developing software and, thus, duplicated and subsequently modified code occurs frequently in software projects. Since such code clones, i.e., identical or similar fragments of code, can bloat…

Software Engineering · Computer Science 2026-02-03 Thomas S. Heinze, André Schäfer, Wolfram Amme

Deep learning is being used extensively in a variety of software engineering tasks, e.g., program classification and defect prediction. Although the technique eliminates the required process of feature engineering, the construction of…

Software Engineering · Computer Science 2021-11-24 Zhehao Zhao, Bo Yang, Ge Li, Huai Liu, Zhi Jin

Code Clone Detection, which aims to retrieve functionally similar programs from large code bases, has been attracting increasing attention. Modern software often involves a diverse range of programming languages. However, current code clone…

Software Engineering · Computer Science 2024-03-07 Yangkai Du, Tengfei Ma, Lingfei Wu, Xuhong Zhang, Shouling Ji

Detecting and tracking code clones can ease various software development and maintenance tasks when changes in a code fragment should be propagated over all its copies. Several deep learning-based clone detection models have appeared in the…

Software Engineering · Computer Science 2024-12-20 Subroto Nag Pinku, Debajyoti Mondal, Chanchal K. Roy

Code completion has become an essential component of integrated development environments. Contemporary code completion methods rely on the abstract syntax tree (AST) to generate syntactically correct code. However, they cannot fully capture…

Software Engineering · Computer Science 2021-03-18 Yanlin Wang, Hui Li

The rapid evolution of programming languages and software systems has necessitated the implementation of multilingual and scalable clone detection tools. However, it is difficult to achieve the above requirements at the same time. Most…

Software Engineering · Computer Science 2024-12-06 Yuhang Ye, Yuekun Wang, Yinxing Xue, Yueming Wu, Yang Liu

Transformer networks such as CodeBERT already achieve outstanding results for code clone detection in benchmark datasets, so one could assume that this task has already been solved. However, code clone detection is not a trivial task.…

Software Engineering · Computer Science 2022-09-02 Tim Sonnekalb, Bernd Gruner, Clemens-Alexander Brust, Patrick Mäder

Code cloning, a widespread practice in software development, involves replicating code fragments to save time but often at the expense of software maintainability and quality. In this paper, we address the specific challenge of detecting…

Software Engineering · Computer Science 2025-02-27 Lida Zhao, Shihan Dou, Yutao Hu, Yueming Wu, Jiahui Wu, Chengwei Liu, Lyuye Zhang, Yi Liu, Jun Sun, Xuanjing Huang, Yang Liu

Performance analysis has always been an afterthought during the application development process, focusing on application correctness first. The learning curve of the existing static and dynamic analysis tools is steep, which requires…

Machine Learning · Computer Science 2021-04-23 Nathan Pinnow, Tarek Ramadan, Tanzima Z. Islam, Chase Phelps, Jayaraman J. Thiagarajan

This study aims to assess the performance of two advanced Large Language Models (LLMs), GPT-3.5 and GPT-4, in the task of code clone detection. The evaluation involves testing the models on a variety of code pairs of different clone types…

Software Engineering · Computer Science 2024-07-03 Zixian Zhang, Takfarinas Saber

Detecting code clones is relevant to software maintenance and code refactoring. This challenge still presents unresolved cases, mainly when structural similarity does not reflect functional equivalence, though recent code models show…

Software Engineering · Computer Science 2025-06-16 Jorge Martinez-Gil

This paper investigates source code similarity detection using a transformer model augmented with an execution-derived signal. We extend GraphCodeBERT with an explicit, low-dimensional behavioral feature that captures observable agreement…

Software Engineering · Computer Science 2026-02-11 Jorge Martinez-Gil

Tasks like code generation and semantic parsing require mapping unstructured (or partially structured) inputs to well-formed, executable outputs. We introduce abstract syntax networks, a modeling framework for these problems. The outputs…

Computation and Language · Computer Science 2017-04-26 Maxim Rabinovich, Mitchell Stern, Dan Klein