Related papers: Towards Learning (Dis)-Similarity of Source Code f…

200 papers

Deep Learning (DL) models to analyze source code have shown immense promise during the past few years. More recently, self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE…

Software Engineering · Computer Science 2023-06-07 Yangruibo Ding, Saikat Chakraborty, Luca Buratti, Saurabh Pujar, Alessandro Morari, Gail Kaiser, Baishakhi Ray

Assessing similarity in source code has gained significant attention in recent years due to its importance in software engineering tasks such as clone detection and code search and recommendation. This work presents a comparative analysis…

Software Engineering · Computer Science 2024-08-13 Jorge Martinez-Gil

Code clones are pairs of code snippets that implement similar functionality. Clone detection is a fundamental branch of automatic source code comprehension, having many applications in refactoring recommendation, plagiarism detection, and…

Software Engineering · Computer Science 2022-06-20 Maksim Zubkov, Egor Spirin, Egor Bogomolov, Timofey Bryksin

We propose Corder, a self-supervised contrastive learning framework for source code model. Corder is designed to alleviate the need of labeled data for code retrieval and code summarization tasks. The pre-trained model of Corder can be used…

Software Engineering · Computer Science 2021-05-25 Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang

We consider the problem of program clone search, i.e. given a target program and a repository of known programs (all in executable format), the goal is to find the program in the repository most similar to the target program - with…

Cryptography and Security · Computer Science 2023-09-04 Tristan Benoit, Jean-Yves Marion, Sébastien Bardin

Code clones are duplicate code fragments that share (nearly) similar syntax or semantics. Code clone detection plays an important role in software maintenance, code refactoring, and reuse. A substantial amount of research has been conducted…

Software Engineering · Computer Science 2020-11-26 Nikita Mehrotra, Navdha Agarwal, Piyush Gupta, Saket Anand, David Lo, Rahul Purandare

Software clones are beneficial to detect security gaps and software maintenance in one programming language or across multiple languages. The existing work on source clone detection performs well but in a single programming language.…

Software Engineering · Computer Science 2022-05-11 Mohammad A. Yahya, Dae-Kyoo Kim

Code clone detection is involved with detecting duplicated fragments of code within a code base. Detecting these clones is useful for maintenance operations which require editing the clones. The tools developed are expected to be robust…

Software Engineering · Computer Science 2016-05-10 Ogechi Onuoha

We address contextualized code retrieval, the search for code snippets helpful to fill gaps in a partial input program. Our approach facilitates a large-scale self-supervised contrastive training by splitting source code randomly into…

Software Engineering · Computer Science 2022-04-26 Johannes Villmow, Viola Campos, Adrian Ulges, Ulrich Schwanecke

Data is often impractical to share for a range of well considered reasons, such as concerns over privacy, intellectual property, and legal constraints. This not only fragments the statistical power of predictive models, but creates an…

In recent years, defect prediction has received a great deal of attention in the empirical software engineering world. Predicting software defects before the maintenance phase is very important not only to decrease the maintenance costs but…

Software Engineering · Computer Science 2018-08-31 Ahmet Okutan

Distribution shift has been a longstanding challenge for the reliable deployment of deep learning (DL) models due to unexpected accuracy degradation. Although DL has been becoming a driving force for large-scale source code analysis in the…

Software Engineering · Computer Science 2023-02-07 Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Lei Ma, Mike Papadakis, Yves Le Traon

Semantic code clone detection is the task of detecting whether two snippets of code implement the same functionality (e.g., Sort Array). Recently, many neural models achieved near-perfect performance on this task. These models seek to make…

Software Engineering · Computer Science 2025-12-02 Konstantinos Kitsios, Francesco Sovrano, Earl T. Barr, Alberto Bacchelli

Discerning between authentic content and that generated by advanced AI methods has become increasingly challenging. While previous research primarily addresses the detection of fake faces, the identification of generated natural images has…

Computer Vision and Pattern Recognition · Computer Science 2024-07-31 Lorenzo Baraldi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Alessandro Nicolosi, Rita Cucchiara

Recent work learns contextual representations of source code by reconstructing tokens from their context. For downstream semantic understanding tasks like summarizing code in English, these representations should ideally capture program…

Machine Learning · Computer Science 2022-01-10 Paras Jain, Ajay Jain, Tianjun Zhang, Pieter Abbeel, Joseph E. Gonzalez, Ion Stoica

Programmers often reuse code from source code repositories to reduce the development effort. Code clones are candidates for reuse in exploratory or rapid development, as they represent often repeated functionality in software systems. To…

Software Engineering · Computer Science 2020-12-08 Muhammad Hammad, Önder Babur, Hamid Abdul Basit, Mark van den Brand

Duplicated code has a negative impact on the quality of software systems and should be detected at least. In this paper, we discuss an approach that improves source code retrieval using the structural information about the programs. We…

Software Engineering · Computer Science 2013-08-19 Yoshihisa Udagawa

Source code similarity is increasingly used in application development to identify clones, isolate bugs, and find copyright violations. Similar code fragments can be very problematic due to the fact that errors in the original code must…

Software Engineering · Computer Science 2019-07-30 F Alomari, M Harbi

Function-level binary code similarity detection is a crucial aspect of cybersecurity. It enables the detection of bugs and patent infringements in released software and plays a pivotal role in preventing supply chain attacks. A practical…

Cryptography and Security · Computer Science 2023-12-27 Sun RuiJin, Guo Shize, Guo Jinhong, Li Wei, Zhan Dazhi, Sun Meng, Pan Zhisong

This paper investigates source code similarity detection using a transformer model augmented with an execution-derived signal. We extend GraphCodeBERT with an explicit, low-dimensional behavioral feature that captures observable agreement…

Software Engineering · Computer Science 2026-02-11 Jorge Martinez-Gil