Related papers: Detecting Code Clones with Graph Neural Network and…

Enhancing Software Vulnerability Detection Using Code Property Graphs and Convolutional Neural Networks

The increasing complexity of modern software systems has led to a rise in vulnerabilities that malicious actors can exploit. Traditional methods of vulnerability detection, such as static and dynamic analysis, have limitations in…

Software Engineering · Computer Science 2025-04-01 Amanpreet Singh Saimbhi

Integrated Reasoning Engine for Pointer-related Code Clone Detection

Detecting similar code fragments, usually referred to as code clones, is an important task. In particular, code clone detection can have significant uses in the context of vulnerability discovery, refactoring and plagiarism detection.…

Software Engineering · Computer Science 2021-05-26 Hongfa Xue, Yongsheng Mei, Kailash Gogineni, Guru Venkataramani, Tian Lan

Semantic Code Graph -- an information model to facilitate software comprehension

Software comprehension can be extremely time-consuming due to the ever-growing size of codebases. Consequently, there is an increasing need to accelerate the code comprehension process to facilitate maintenance and reduce associated costs.…

Software Engineering · Computer Science 2024-01-15 Krzysztof Borowski, Bartosz Baliś, Tomasz Orzechowski

Vul-LMGNNs: Fusing language models and online-distilled graph neural networks for code vulnerability detection

Code Language Models (codeLMs) and Graph Neural Networks (GNNs) are widely used in code vulnerability detection. However, GNNs often rely on aggregating information from adjacent nodes, limiting structural information propagation across…

Cryptography and Security · Computer Science 2025-03-24 Ruitong Liu, Yanbin Wang, Haitao Xu, Jianguo Sun, Fan Zhang, Peiyue Li, Zhenhao Guo

Graph-based Clustering for Detecting Semantic Change Across Time and Languages

Despite the predominance of contextualized embeddings in NLP, approaches to detect semantic change relying on these embeddings and clustering methods underperform simpler counterparts based on static word embeddings. This stems from the…

Computation and Language · Computer Science 2024-02-05 Xianghe Ma, Michael Strube, Wei Zhao

Deep Code Search with Naming-Agnostic Contrastive Multi-View Learning

Software development is a repetitive task, as developers usually reuse or get inspiration from existing implementations. Code search, which refers to the retrieval of relevant code snippets from a codebase according to the developer's…

Information Retrieval · Computer Science 2025-08-12 Jiadong Feng, Wei Li, Suhuang Wu, Zhao Wei, Yong Xu, Juhong Wang, Hui Li

Feature Engineering-Based Detection of Buffer Overflow Vulnerability in Source Code Using Neural Networks

One of the most significant challenges in the field of software code auditing is the presence of vulnerabilities in software source code. Every year, more and more software flaws are discovered, either internally in proprietary code or…

Cryptography and Security · Computer Science 2023-06-16 Mst Shapna Akter, Hossain Shahriar, Juan Rodriguez Cardenas, Sheikh Iqbal Ahamed, Alfredo Cuzzocrea

Code Search based on Context-aware Code Translation

Code search is a widely used technique by developers during software development. It provides semantically similar implementations from a large code corpus to developers based on their queries. Existing techniques leverage deep learning…

Software Engineering · Computer Science 2022-02-17 Weisong Sun, Chunrong Fang, Yuchen Chen, Guanhong Tao, Tingxu Han, Quanjun Zhang

A Code Smell Refactoring Approach using GNNs

Code smell is a great challenge in software refactoring, which indicates latent design or implementation flaws that may degrade the software maintainability and evolution. Over the past decades, a variety of refactoring approaches have been…

Software Engineering · Computer Science 2026-04-21 HanYu Zhang, Tomoji Kishi

Code vs Serialized AST Inputs for LLM-Based Code Summarization: An Empirical Study

Summarizing source code into natural language descriptions (code summarization) helps developers better understand program functionality and reduce the burden of software maintenance. Abstract Syntax Trees (ASTs), as opposed to source code,…

Software Engineering · Computer Science 2026-02-09 Shijia Dong, Haoruo Zhao, Paul Harvey

Code Similarity on High Level Programs

This paper presents a new approach for code similarity on High Level programs. Our technique is based on Fast Dynamic Time Warping, that builds a warp path or points relation with local restrictions. The source code is represented into Time…

Computer Vision and Pattern Recognition · Computer Science 2007-10-31 M. Miron Bernal, H. Coyote Estrada, J. Figueroa Nazuno

Cross-Language Code Search using Static and Dynamic Analyses

As code search permeates most activities in software development, code-to-code search has emerged to support using code as a query and retrieving similar code in the search results. Applications include duplicate code detection for…

Software Engineering · Computer Science 2021-06-18 George Mathew, Kathryn T. Stolee

HELoC: Hierarchical Contrastive Learning of Source Code Representation

Abstract syntax trees (ASTs) play a crucial role in source code representation. However, due to the large number of nodes in an AST and the typically deep AST hierarchy, it is challenging to learn the hierarchical structure of an AST…

Software Engineering · Computer Science 2022-03-29 Xiao Wang, Qiong Wu, Hongyu Zhang, Chen Lyu, Xue Jiang, Zhuoran Zheng, Lei Lyu, Songlin Hu

Identifying Obfuscated Code through Graph-Based Semantic Analysis of Binary Code

Protecting sensitive program content is a critical issue in various situations, ranging from legitimate use cases to unethical contexts. Obfuscation is one of the most used techniques to ensure such protection. Consequently, attackers must…

Cryptography and Security · Computer Science 2025-04-03 Roxane Cohen, Robin David, Florian Yger, Fabrice Rossi

A Unified Active Learning Framework for Annotating Graph Data with Application to Software Source Code Performance Prediction

Most machine learning and data analytics applications, including performance engineering in software systems, require a large number of annotations and labelled data, which might not be available in advance. Acquiring annotations often…

Software Engineering · Computer Science 2023-09-21 Peter Samoaa, Linus Aronsson, Antonio Longa, Philipp Leitner, Morteza Haghir Chehreghani

Smelling out Code Clones: Clone Detection Tool Evaluation and Corresponding Challenges

Software clones have been an active area of research for the past two decades. However, although numerous clone detection tools are now available, only a small fraction of the literature has focused on tool evaluation, and this is in fact…

Software Engineering · Computer Science 2015-03-03 Rachel Gauci

SFNet: Faster and Accurate Semantic Segmentation via Semantic Flow

In this paper, we focus on exploring effective methods for faster and accurate semantic segmentation. A common practice to improve the performance is to attain high-resolution feature maps with strong semantic representation. Two strategies…

Computer Vision and Pattern Recognition · Computer Science 2023-08-09 Xiangtai Li, Jiangning Zhang, Yibo Yang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Dacheng Tao

Towards Learning (Dis)-Similarity of Source Code from Program Contrasts

Understanding the functional (dis)-similarity of source code is significant for code modeling tasks such as software vulnerability and code clone detection. We present DISCO (DIS-similarity of COde), a novel self-supervised model focusing on…

Programming Languages · Computer Science 2022-03-22 Yangruibo Ding, Luca Buratti, Saurabh Pujar, Alessandro Morari, Baishakhi Ray, Saikat Chakraborty

High-Frequency-aware Hierarchical Contrastive Selective Coding for Representation Learning on Text-attributed Graphs

We investigate node representation learning on text-attributed graphs (TAGs), where nodes are associated with text information. Although recent studies on graph neural networks (GNNs) and pretrained language models (PLMs) have exhibited…

Information Retrieval · Computer Science 2024-04-22 Peiyan Zhang, Chaozhuo Li, Liying Kang, Feiran Huang, Senzhang Wang, Xing Xie, Sunghun Kim

Co-Neighbor Encoding Schema: A Light-cost Structure Encoding Method for Dynamic Link Prediction

Structure encoding has proven to be the key feature for distinguishing links in a graph. However, structure encoding in a temporal graph keeps changing as the graph evolves, and repeatedly computing such features can be time-consuming due to…

Machine Learning · Computer Science 2024-07-31 Ke Cheng, Linzhi Peng, Junchen Ye, Leilei Sun, Bowen Du