Related papers: Detecting Code Clones with Graph Neural Networkand…
Code smells and software vulnerabilities both increase maintenance cost, yet they are often handled by separate tools that miss structural context and produce noisy warnings. This paper presents The Code Whisperer, a hybrid framework that…
Machine learning models that take computer program source code as input typically use Natural Language Processing (NLP) techniques. However, a major challenge is that code is written using an open, rapidly changing vocabulary due to, e.g.,…
This paper introduces a novel code-to-code search technique that enhances the performance of Large Language Models (LLMs) by including both static and dynamic features as well as utilizing both similar and dissimilar examples during…
Software developers routinely search for code using general-purpose search engines. However, these search engines cannot find code semantically unless it has an accompanying description. We propose a technique for semantic code search: A…
Large Language Models (LLMs) are transforming software engineering tasks, including code vulnerability detection-a critical area of software security. However, existing methods often rely on resource-intensive models or graph-based…
Software clones are often introduced when developers reuse code fragments to implement similar functionalities in the same or different software systems. Many high-performing clone detection tools today are based on deep learning techniques…
We consider the problem of program clone search, i.e. given a target program and a repository of known programs (all in executable format), the goal is to find the program in the repository most similar to the target program - with…
Measuring and evaluating source code similarity is a fundamental software engineering activity that embraces a broad range of applications, including but not limited to code recommendation, duplicate code, plagiarism, malware, and smell…
Clustering points in a vector space or nodes in a graph is a ubiquitous primitive in statistical data analysis, and it is commonly used for exploratory data analysis. In practice, it is often of interest to "refine" or "improve" a given…
The paper addresses challenges in storing and retrieving sequences in contexts like anomaly detection, behavior prediction, and genetic information analysis. Associative Knowledge Graphs (AKGs) offer a promising approach by leveraging…
Understanding large software systems is a challenging task, especially when code is distributed across multiple repositories and microservices. Developers often need to reason not only about the structure of the code, but also about its…
Detecting vulnerabilities in source code is a critical task for software security assurance. Graph Neural Network (GNN) machine learning can be a promising approach by modeling source code as graphs. Early approaches treated code elements…
Utilizing large language models to generate codes has shown promising meaning in software development revolution. Despite the intelligence shown by the large language models, their specificity in code generation can still be improved due to…
Clone-and-own approach is a natural way of source code reuse for software developers. To assess how known bugs and security vulnerabilities of a cloned component affect an application, developers and security analysts need to identify an…
In recent years, powered by the learned discriminative representation via graph neural network (GNN) models, deep graph matching methods have made great progresses in the task of matching semantic features. However, these methods usually…
Graph Neural Networks (GNNs) have attracted increasing attention in recent years and have achieved excellent performance in semi-supervised node classification tasks. The success of most GNNs relies on one fundamental assumption, i.e., the…
Usage of multiprocessor and multicore computers implies parallel programming. Tools for preparing parallel programs include parallel languages and libraries as well as parallelizing compilers and convertors that can perform automatic…
Software developers embed logging statements inside the source code as an imperative duty in modern software development as log files are necessary for tracking down runtime system issues and troubleshooting system management tasks.…
During software maintenance, programmers spend a lot of time on code comprehension. Reading comments is an effective way for programmers to reduce the reading and navigating time when comprehending source code. Therefore, as a critical task…
Deep Learning applications are becoming increasingly popular. Developers of deep learning systems strive to write more efficient code. Deep learning systems are constantly evolving, imposing tighter development timelines and increasing…