Related papers: Detecting Code Clones with Graph Neural Networkand…
Software undergoes constant changes to support new requirements, address bugs, enhance performance, and ensure maintainability. Thus, developers spend a great portion of their workday trying to understand and review the code changes of…
The application of deep learning techniques in software engineering becomes increasingly popular. One key problem is developing high-quality and easy-to-use source code representations for code-related tasks. The research community has…
The increasing adoption of AI-generated code has reshaped modern software development, introducing syntactic and semantic variations in cloned code. Unlike traditional human-written clones, AI-generated clones exhibit systematic syntactic…
Duplicated code or code clones are a kind of code smell that have both positive and negative impacts on the development and maintenance of software systems. Software clone research in the past mostly focused on the detection and analysis of…
Recently, there has been a growing interest in automatic software vulnerability detection. Pre-trained model-based approaches have demonstrated superior performance than other Deep Learning (DL)-based approaches in detecting…
Capturing semantic consistency among nodes is crucial for effective graph representation learning. Existing approaches typically rely on $k$-nearest neighbors ($k$NN) or other node-level full search algorithms (FSA) to mine semantic…
Automatic code review (ACR), which can relieve the costs of manual inspection, is an indispensable and essential task in software engineering. To deal with ACR, existing work is to serialize the abstract syntax tree (AST). However, making…
The reuse of code fragments by copying and pasting is widely practiced in software development and results in code clones. Cloning is considered an anti-pattern as it negatively affects program correctness and increases maintenance efforts.…
Narrowing the performance gap between optimal and feasible detection in inter-symbol interference (ISI) channels, this paper proposes to use graph neural networks (GNNs) for detection that can also be used to perform joint detection and…
Traditional Graph Neural Network (GNN) approaches for fake news detection (FND) often depend on auxiliary, non-textual data such as user interaction histories or content dissemination patterns. However, these data sources are not always…
Code cloning is not only assumed to inflate maintenance costs but also considered defect-prone as inconsistent changes to code duplicates can lead to unexpected behavior. Consequently, the identification of duplicated code, clone detection,…
Machine learning (ML) models for code clone detection determine whether two pieces of code are semantically equivalent, which in turn is a key building block for software-engineering tasks like refactoring and security tasks like…
An effective and efficient encoding of the source code of a computer program is critical to the success of sequence-to-sequence deep neural network models for tasks in computer program comprehension, such as automated code summarization and…
Existing defects in software components is unavoidable and leads to not only a waste of time and money but also many serious consequences. To build predictive models, previous studies focus on manually extracting features or using tree…
Automated program analysis is a pivotal research domain in many areas of Computer Science -- Formal Methods and Artificial Intelligence, in particular. Due to the undecidability of the problem of program equivalence, comparing two programs…
This paper presents Tree Notation, a new simple, universal syntax. Language designers can invent new programming languages, called Tree Languages, on top of Tree Notation. Tree Languages have a number of advantages over traditional…
Code analysis is fundamental in Software Engineering, supporting debugging, optimization, and security assessment. Human developers approach it through syntax parsing, static semantics inference, and dynamic reasoning. Traditional tools are…
Software systems are getting more complex as the system grows where maintaining such system is a primary concern for the industry. Code clone is one of the factors making software maintenance more difficult. It is a process of replicating…
Learning representation for source code is a foundation of many program analysis tasks. In recent years, neural networks have already shown success in this area, but most existing models did not make full use of the unique structural…
Large language models (LLMs) have demonstrated remarkable capabilities in various software engineering tasks, such as code generation and debugging, because of their ability to translate between programming languages and natural languages.…