Related papers: A Language-Independent Analysis Platform for Sourc…
In the past years, a number of static application security testing tools have been proposed which make use of so-called code property graphs, a graph model which keeps rich information about the source code while enabling its user to write…
Python is one of the most popular programming languages; as such, projects written in Python involve an increasing number of diverse security vulnerabilities. However, existing state-of-the-art analysis tools for Python only support a few…
Large language models (LLMs) have been proposed as powerful tools for detecting software vulnerabilities, where task-specific fine-tuning is typically employed to provide vulnerability-specific knowledge to the LLMs. However, existing…
Detecting defects and vulnerabilities in the early stage has long been a challenge in software engineering. Static analysis, a technique that inspects code without execution, has emerged as a key strategy to address this challenge. Among…
In this paper, we present the Cloud Property Graph (CloudPG), which bridges the gap between static code analysis and runtime security assessment of cloud services. The CloudPG is able to resolve data flows between cloud applications…
Large Language Models (LLMs) face critical challenges when analyzing security vulnerabilities in real world codebases: token limits prevent loading entire repositories, code embeddings fail to capture inter procedural data flows, and LLMs…
Software vulnerabilities present a persistent security challenge, with over 25,000 new vulnerabilities reported in the Common Vulnerabilities and Exposures (CVE) database in 2024 alone. While deep learning based approaches show promise for…
In our times, when the world is increasingly becoming more dependent on software programs, writing bug-free, correct programs is crucial. Program verification based on formal methods can guarantee this by detecting run-time errors in…
Software comprehension can be extremely time-consuming due to the ever-growing size of codebases. Consequently, there is an increasing need to accelerate the code comprehension process to facilitate maintenance and reduce associated costs.…
Modern software systems are developed in diverse programming languages and often harbor critical vulnerabilities that attackers can exploit to compromise security. These vulnerabilities have been actively targeted in real-world attacks,…
In recent years, artificial intelligence has had a conspicuous growth in almost every aspect of life. One of the most applicable areas is security code review, in which a lot of AI-based tools and approaches have been proposed. Recently,…
Existing code similarity metrics, such as BLEU, CodeBLEU, and TSED, largely rely on surface-level string overlap or abstract syntax tree structures, and often fail to capture deeper semantic relationships between programs.We propose CSSG…
The large transformer-based language models demonstrate excellent performance in natural language processing. By considering the transferability of the knowledge gained by these models in one domain to other related domains, and the…
Program representation, which aims at converting program source code into vectors with automatically extracted features, is a fundamental problem in programming language processing (PLP). Recent work tries to represent programs with neural…
Security vulnerabilities present in a code that has been written in diverse programming languages are among the most critical yet complicated aspects of source code to detect. Static analysis tools based on rule-based patterns usually do…
Deep learning is being used extensively in a variety of software engineering tasks, e.g., program classification and defect prediction. Although the technique eliminates the required process of feature engineering, the construction of…
Source code plagiarism is a long-standing issue in tertiary computer science education. Many source code plagiarism detection tools have been proposed to aid in the detection of source code plagiarism. However, existing detection tools are…
As of today the programming language of the vast majority of the published source code is manually specified or programmatically assigned based on the sole file extension. In this paper we show that the source code programming language…
In recent years, utilization of heterogeneous hardware other than small core CPU such as GPU, FPGA or many core CPU is increasing. However, when using heterogeneous hardware, barriers of technical skills such as CUDA are high. Based on…
The area of software development and secure coding can benefit significantly from advancements in virtual assistants. Research has shown that many coders neglect security in favor of meeting deadlines. This shortcoming leaves systems…