Related papers: ActiveClean: Generating Line-Level Vulnerability D…

200 papers

Data cleaning is often an important step to ensure that predictive models, such as regression and classification, are not affected by systematic errors such as inconsistent, out-of-date, or outlier data. Identifying dirty data is often a…

Databases · Computer Science 2016-01-18 Sanjay Krishnan, Jiannan Wang, Eugene Wu, Michael J. Franklin, Ken Goldberg

Vulnerability detection is crucial for identifying security weaknesses in software systems. However, training effective machine learning models for this task is often constrained by the high cost and expertise required for data annotation.…

Cryptography and Security · Computer Science 2025-08-19 Xiang Lan, Tim Menzies, Bowen Xu

It is increasingly suggested to identify Software Vulnerabilities (SVs) in code commits to give early warnings about potential security risks. However, there is a lack of effort to assess vulnerability-contributing commits right after they…

Software Engineering · Computer Science 2021-08-19 Triet H. M. Le, David Hin, Roland Croft, M. Ali Babar

We propose and release a new vulnerable source code dataset. We curate the dataset by crawling security issue websites, extracting vulnerability-fixing commits and source codes from the corresponding projects. Our new dataset contains…

Cryptography and Security · Computer Science 2023-08-10 Yizheng Chen, Zhoujie Ding, Lamya Alowain, Xinyun Chen, David Wagner

Accurate identification of software vulnerabilities is crucial for system integrity. Vulnerability datasets, often derived from the National Vulnerability Database (NVD) or directly from GitHub, are essential for training machine learning…

Current machine-learning based software vulnerability detection methods are primarily conducted at the function-level. However, a key limitation of these methods is that they do not indicate the specific lines of code contributing to…

Cryptography and Security · Computer Science 2022-03-28 David Hin, Andrey Kan, Huaming Chen, M. Ali Babar

Deep Learning (DL) has emerged as a powerful tool for vulnerability detection, often outperforming traditional solutions. However, developing effective DL models requires large amounts of real-world data, which can be difficult to obtain in…

The impact of software vulnerabilities on everyday software systems is significant. Despite deep learning models being proposed for vulnerability detection, their reliability is questionable. Prior evaluations show high recall/F1 scores of…

Software Engineering · Computer Science 2024-07-04 Partha Chakraborty, Krishna Kanth Arumugam, Mahmoud Alfadel, Meiyappan Nagappan, Shane McIntosh

Machine learning-based software vulnerability detection requires high-quality datasets, which are essential for training effective models. To address challenges related to data label quality, diversity, and comprehensiveness, we constructed…

Software Engineering · Computer Science 2025-05-14 Chaomeng Lu, Tianyu Li, Toon Dehaene, Bert Lagaisse

Software supply chain vulnerabilities arise when attackers exploit weaknesses by injecting vulnerable code into widely used packages or libraries within software repositories. While most existing approaches focus on identifying vulnerable…

Cryptography and Security · Computer Science 2025-06-25 Sajal Halder, Muhammad Ejaz Ahmed, Seyit Camtepe

Vulnerability detection methods based on deep learning (DL) have shown strong performance on benchmark datasets, yet their real-world effectiveness remains underexplored. Recent work suggests that both graph neural network (GNN)-based and…

Cryptography and Security · Computer Science 2025-12-12 Chaomeng Lu, Bert Lagaisse

Active learning (AL), which serves as the representative label-efficient learning paradigm, has been widely applied in resource-constrained scenarios. The achievement of AL is attributed to acquisition functions, which are designed for…

Cryptography and Security · Computer Science 2025-08-11 Yuhan Zhi, Longtian Wang, Xiaofei Xie, Chao Shen, Qiang Hu, Xiaohong Guan

Automatically locating vulnerable statements in source code is crucial to assure software security and alleviate developers' debugging efforts. This becomes even more important in today's software ecosystem, where vulnerable code can flow…

Software Engineering · Computer Science 2022-01-14 Yangruibo Ding, Sahil Suneja, Yunhui Zheng, Jim Laredo, Alessandro Morari, Gail Kaiser, Baishakhi Ray

The identification of vulnerabilities is an important element in the software development life cycle to ensure the security of software. While vulnerability identification based on the source code is a well studied field, the identification…

Cryptography and Security · Computer Science 2022-12-05 Andreas Schaad, Dominik Binder

Deep learning (DL) models of code have recently reported great progress for vulnerability detection. In some cases, DL-based models have outperformed static analysis tools. Although many great models have been proposed, we do not yet have a…

Software Engineering · Computer Science 2023-02-14 Benjamin Steenhoek, Md Mahbubur Rahman, Richard Jiles, Wei Le

Vulnerability identification is crucial to protect the software systems from attacks for cyber security. It is especially important to localize the vulnerable functions among the source code to facilitate the fix. However, it is a…

Software Engineering · Computer Science 2019-09-10 Yaqin Zhou, Shangqing Liu, Jingkai Siow, Xiaoning Du, Yang Liu

The availability of labelled data is one of the main limitations in machine learning. We can alleviate this using weak supervision: a framework that uses expert-defined rules $\boldsymbol{\lambda}$ to estimate probabilistic labels…

Machine Learning · Computer Science 2021-05-03 Samantha Biegel, Rafah El-Khatib, Luiz Otavio Vilas Boas Oliveira, Max Baak, Nanne Aben

In the context of the rising interest in code language models (code LMs) and vulnerability detection, we study the effectiveness of code LMs for detecting vulnerabilities. Our analysis reveals significant shortcomings in existing…

Software Engineering · Computer Science 2024-07-11 Yangruibo Ding, Yanjun Fu, Omniyyah Ibrahim, Chawin Sitawarin, Xinyun Chen, Basel Alomair, David Wagner, Baishakhi Ray, Yizheng Chen

Software vulnerabilities are a serious and crucial concern. Typically, in a program or function consisting of hundreds or thousands of source code statements, there are only a few statements causing the corresponding vulnerabilities. Most…

Cryptography and Security · Computer Science 2024-06-13 Van Nguyen, Trung Le, Chakkrit Tantithamthavorn, Michael Fu, John Grundy, Hung Nguyen, Seyit Camtepe, Paul Quirk, Dinh Phung

Deep learning models are the state-of-the-art methods for semantic point cloud segmentation, the success of which relies on the availability of large-scale annotated datasets. However, it can be extremely time-consuming and prohibitively…

Computer Vision and Pattern Recognition · Computer Science 2021-04-13 Xian Shi, Xun Xu, Ke Chen, Lile Cai, Chuan Sheng Foo, Kui Jia