Related papers: Coding-PTMs: How to Find Optimal Code Pre-trained …
Pre-trained language models have demonstrated powerful capabilities in the field of natural language processing (NLP). Recently, code pre-trained model (PTM), which draw from the experiences of the NLP field, have also achieved…
The rapid advancement of pre-trained language models (PLMs) has demonstrated promising results for various code-related tasks. However, their effectiveness in detecting real-world vulnerabilities remains a critical challenge. While existing…
Pre-training a language model and then fine-tuning it has shown to be an efficient and effective technique for a wide range of code intelligence tasks, such as code generation, code summarization, and vulnerability detection. However,…
Vulnerability identification is crucial for cyber security in the software-related industry. Early identification methods require significant manual efforts in crafting features or annotating vulnerable code. Although the recent pre-trained…
Pre-trained models (PTMs) have achieved great success in various Software Engineering (SE) downstream tasks following the ``pre-train then fine-tune'' paradigm. As fully fine-tuning all parameters of PTMs can be computationally expensive, a…
With the involvement of multiple programming languages in modern software development, cross-lingual code clone detection has gained traction within the software engineering community. Numerous studies have explored this topic, proposing…
Bug localization refers to the identification of source code files which is in a programming language and also responsible for the unexpected behavior of software using the bug report, which is a natural language. As bug localization is…
Large language models (LLMs) excel in many tasks of software engineering, yet progress in leveraging them for vulnerability discovery has stalled in recent years. To understand this phenomenon, we investigate LLMs through the lens of…
Code pre-trained models (CodePTMs) have recently demonstrated significant success in code intelligence. To interpret these models, some probing methods have been applied. However, these methods fail to consider the inherent characteristics…
In recent years, the growing complexity and scale of source code have rendered manual software vulnerability detection increasingly impractical. To address this challenge, automated approaches leveraging machine learning and code embeddings…
Software fault prediction model are employed to optimize testing resource allocation by identifying fault-prone classes before testing phases. Several researchers' have validated the use of different classification techniques to develop…
As large language models (LLMs) are increasingly adopted for code vulnerability detection, their reliability and robustness across diverse vulnerability types have become a pressing concern. In traditional adversarial settings, code…
The availability of large-scale datasets, advanced architectures, and powerful computational resources have led to effective code models that automate diverse software engineering activities. The datasets usually consist of billions of…
In the age of big data and machine learning, at a time when the techniques and methods of software development are evolving rapidly, a problem has arisen: programmers can no longer detect all the security flaws and vulnerabilities in their…
Despite various approaches being employed to detect vulnerabilities, the number of reported vulnerabilities shows an upward trend over the years. This suggests the problems are not caught before the code is released, which could be caused…
Code Pre-trained Models (CodePTMs) based vulnerability detection have achieved promising results over recent years. However, these models struggle to generalize as they typically learn superficial mapping from source code to labels instead…
Pre-trained models of code built on the transformer architecture have performed well on software engineering (SE) tasks such as predictive code generation, code summarization, among others. However, whether the vector representations from…
While automated vulnerability detection techniques have made promising progress in detecting security vulnerabilities, their scalability and applicability remain challenging. The remarkable performance of Large Language Models (LLMs), such…
The pre-training paradigm plays a key role in the success of Large Language Models (LLMs), which have been recognized as one of the most significant advancements of AI recently. Building on these breakthroughs, code LLMs with advanced…
The significant increase in software production driven by automation and faster development lifecycles has resulted in a corresponding surge in software vulnerabilities. In parallel, the evolving landscape of software vulnerability…