English
Related papers

Related papers: Improving Source Code Similarity Detection Through…

200 papers

Assessing the degree of similarity of code fragments is crucial for ensuring software quality, but it remains challenging due to the need to capture the deeper semantic aspects of code. Traditional syntactic methods often fail to identify…

Information Retrieval · Computer Science 2025-04-14 Jorge Martinez-Gil

The capability of accurately determining code similarity is crucial in many tasks related to software development. For example, it might be essential to identify code duplicates for performing software maintenance. This research introduces…

Software Engineering · Computer Science 2025-04-25 Jorge Martinez-Gil

Pre-trained models for programming language have achieved dramatic empirical improvements on a variety of code-related tasks such as code search, code completion, code summarization, etc. However, existing pre-trained models regard a code…

Transformer networks such as CodeBERT already achieve outstanding results for code clone detection in benchmark datasets, so one could assume that this task has already been solved. However, code clone detection is not a trivial task.…

Software Engineering · Computer Science 2022-09-02 Tim Sonnekalb , Bernd Gruner , Clemens-Alexander Brust , Patrick Mäder

Source code clone detection is the task of finding code fragments that have the same or similar functionality, but may differ in syntax or structure. This task is important for software maintenance, reuse, and quality assurance (Roy et al.…

Computation and Language · Computer Science 2023-12-29 Mohammed Ataaur Rahaman , Julia Ive

As opposed to natural languages, source code understanding is influenced by grammatical relationships between tokens regardless of their identifier name. Graph representations of source code such as Abstract Syntax Tree (AST) can capture…

Machine Learning · Computer Science 2021-11-18 Junyan Cheng , Iordanis Fostiropoulos , Barry Boehm

Source code similarity are increasingly used in application development to identify clones, isolate bugs, and find copy-rights violations. Similar code fragments can be very problematic due to the fact that errors in the original code must…

Software Engineering · Computer Science 2019-07-30 F Alomari , M Harbi

Code clone detection is about finding out similar code fragments, which has drawn much attention in software engineering since it is important for software maintenance and evolution. Researchers have proposed many techniques and tools for…

Software Engineering · Computer Science 2023-11-21 Junjie Shan , Shihan Dou , Yueming Wu , Hairu Wu , Yang Liu

Pre-trained models of code built on the transformer architecture have performed well on software engineering (SE) tasks such as predictive code generation, code summarization, among others. However, whether the vector representations from…

Software Engineering · Computer Science 2021-08-26 Anjan Karmakar , Romain Robbes

Pre-trained models of source code have recently been successfully applied to a wide variety of Software Engineering tasks; they have also seen some practical adoption in practice, e.g. for code completion. Yet, we still know very little…

Software Engineering · Computer Science 2023-12-11 Anjan Karmakar , Romain Robbes

Source code vulnerability detection aims to identify inherent vulnerabilities to safeguard software systems from potential attacks. Many prior studies overlook diverse vulnerability characteristics, simplifying the problem into a binary…

Cryptography and Security · Computer Science 2024-04-16 Shangqing Liu , Wei Ma , Jian Wang , Xiaofei Xie , Ruitao Feng , Yang Liu

Recently, many pre-trained language models for source code have been proposed to model the context of code and serve as a basis for downstream code intelligence tasks such as code completion, code search, and code summarization. These…

Software Engineering · Computer Science 2022-02-15 Yao Wan , Wei Zhao , Hongyu Zhang , Yulei Sui , Guandong Xu , Hai Jin

Representational Similarity Analysis is a method from cognitive neuroscience, which helps in comparing representations from two different sources of data. In this paper, we propose using Representational Similarity Analysis to probe the…

Computation and Language · Computer Science 2022-07-19 Shounak Naik , Rajaswa Patil , Swati Agarwal , Veeky Baths

In software, a vulnerability is a defect in a program that attackers might utilize to acquire unauthorized access, alter system functions, and acquire information. These vulnerabilities arise from programming faults, design flaws, incorrect…

Software Engineering · Computer Science 2024-11-28 Md. Fahim Sultan , Tasmin Karim , Md. Shazzad Hossain Shaon , Mohammad Wardat , Mst Shapna Akter

Researchers have investigated the potential of leveraging pre-trained language models, such as CodeBERT, to enhance source code-related tasks. Previous methodologies have relied on CodeBERT's '[CLS]' token as the embedding representation of…

Computation and Language · Computer Science 2024-09-04 Yong Ma , Senlin Luo , Yu-Ming Shang , Yifei Zhang , Zhengjun Li

Widespread reuse of open-source code in smart contract development boosts programming efficiency but significantly amplifies bug propagation across contracts, while dedicated methods for detecting similar smart contract functions remain…

Software Engineering · Computer Science 2025-09-12 Zhenguang Liu , Lixun Ma , Zhongzheng Mu , Chengkun Wei , Xiaojun Xu , Yingying Jiao , Kui Ren

Foundation models (e.g., CodeBERT, GraphCodeBERT, CodeT5) work well for many software engineering tasks. These models are pre-trained (using self-supervision) with billions of code tokens, and then fine-tuned with hundreds of thousands of…

Software Engineering · Computer Science 2022-06-03 Toufique Ahmed , Premkumar Devanbu

Understanding the functional (dis)-similarity of source code is significant for code modeling tasks such as software vulnerability and code clone detection. We present DISCO(DIS-similarity of COde), a novel self-supervised model focusing on…

Programming Languages · Computer Science 2022-03-22 Yangruibo Ding , Luca Buratti , Saurabh Pujar , Alessandro Morari , Baishakhi Ray , Saikat Chakraborty

Duplicated code has a negative impact on the quality of software systems and should be detected at least. In this paper, we discuss an approach that improves source code retrieval using the structural information about the programs. We…

Software Engineering · Computer Science 2013-08-19 Yoshihisa Udagawa

Detecting code clones is relevant to software maintenance and code refactoring. This challenge still presents unresolved cases, mainly when structural similarity does not reflect functional equivalence, though recent code models show…

Software Engineering · Computer Science 2025-06-16 Jorge Martinez-Gil
‹ Prev 1 2 3 10 Next ›