English
Related papers

Related papers: Bug Prediction Using Source Code Embedding Based o…

200 papers

Recent advances in neural modeling for bug detection have been very promising. More specifically, using snippets of code to create continuous vectors or \textit{embeddings} has been shown to be very good at method name prediction and…

Software Engineering · Computer Science 2020-05-14 Jón Arnar Briem , Jordi Smit , Hendrig Sellik , Pavel Rapoport

Source code representations are key in applying machine learning techniques for processing and analyzing programs. A popular approach in representing source code is neural source code embeddings that represents programs with…

Machine Learning · Computer Science 2022-06-17 Md Rafiqul Islam Rabin , Arjun Mukherjee , Omprakash Gnawali , Mohammad Amin Alipour

Natural language processing has improved tremendously after the success of word embedding techniques such as word2vec. Recently, the same idea has been applied on source code with encouraging results. In this survey, we aim to collect and…

Machine Learning · Computer Science 2019-04-08 Zimin Chen , Martin Monperrus

We present a neural model for representing snippets of code as continuous distributed vectors ("code embeddings"). The main idea is to represent a code snippet as a single fixed-length $\textit{code vector}$, which can be used to predict…

Machine Learning · Computer Science 2018-10-31 Uri Alon , Meital Zilberstein , Omer Levy , Eran Yahav

Bug localization refers to the identification of source code files which is in a programming language and also responsible for the unexpected behavior of software using the bug report, which is a natural language. As bug localization is…

Software Engineering · Computer Science 2024-06-26 Partha Chakraborty , Venkatraman Arumugam , Meiyappan Nagappan

One of the most significant challenges in the field of software code auditing is the presence of vulnerabilities in software source code. Every year, more and more software flaws are discovered, either internally in proprietary code or…

Cryptography and Security · Computer Science 2023-06-16 Mst Shapna Akter , Hossain Shahriar , Juan Rodriguez Cardenas , Sheikh Iqbal Ahamed , Alfredo Cuzzocrea

Training a deep learning model on source code has gained significant traction recently. Since such models reason about vectors of numbers, source code needs to be converted to a code representation before vectorization. Numerous approaches…

Software Engineering · Computer Science 2022-07-18 Marjane Namavar , Noor Nashid , Ali Mesbah

Code embedding is a keystone in the application of machine learning on several Software Engineering (SE) tasks. To effectively support a plethora of SE tasks, the embedding needs to capture program syntax and semantics in a way that is…

Software Engineering · Computer Science 2022-01-24 Wei Ma , Mengjie Zhao , Ezekiel Soremekun , Qiang Hu , Jie Zhang , Mike Papadakis , Maxime Cordy , Xiaofei Xie , Yves Le Traon

Automatic source code analysis in key areas of software engineering, such as code security, can benefit from Machine Learning (ML). However, many standard ML approaches require a numeric representation of data and cannot be applied directly…

Machine Learning · Computer Science 2020-04-08 Rhys Compton , Eibe Frank , Panos Patros , Abigail Koay

The advent of large language models (LLMs) has significantly advanced artificial intelligence (AI) in software engineering (SE), with source code embeddings playing a crucial role in tasks such as source code clone detection and source code…

Software Engineering · Computer Science 2025-06-04 Zixiang Xian , Chenhui Cui , Rubing Huang , Chunrong Fang , Zhenyu Chen

Program representation learning is a fundamental task in software engineering applications. With the availability of "big code" and the development of deep learning techniques, various program representation learning models have been…

Software Engineering · Computer Science 2021-09-17 Siqi Han , DongXia Wang , Wanting Li , Xuesong Lu

Tagging news articles or blog posts with relevant tags from a collection of predefined ones is coined as document tagging in this work. Accurate tagging of articles can benefit several downstream applications such as recommendation and…

Computation and Language · Computer Science 2017-07-18 Sheng Chen , Akshay Soni , Aasish Pappu , Yashar Mehdad

Recent research has achieved impressive results on understanding and improving source code by building up on machine-learning techniques developed for natural languages. A significant advancement in natural-language understanding has come…

Software Engineering · Computer Science 2020-08-19 Aditya Kanade , Petros Maniatis , Gogul Balakrishnan , Kensen Shi

With the recent success of embeddings in natural language processing, research has been conducted into applying similar methods to code analysis. Most works attempt to process the code directly or use a syntactic tree representation,…

Machine Learning · Computer Science 2018-11-30 Tal Ben-Nun , Alice Shoshana Jakobovits , Torsten Hoefler

Recently, Le and Mikolov (2014) proposed doc2vec as an extension to word2vec (Mikolov et al., 2013a) to learn document-level embeddings. Despite promising results in the original paper, others have struggled to reproduce those results. This…

Computation and Language · Computer Science 2016-12-19 Jey Han Lau , Timothy Baldwin

We present an efficient document representation learning framework, Document Vector through Corruption (Doc2VecC). Doc2VecC represents each document as a simple average of word embeddings. It ensures a representation generated as such…

Computation and Language · Computer Science 2017-07-11 Minmin Chen

Deep learning methods, which have found successful applications in fields like image classification and natural language processing, have recently been applied to source code analysis too, due to the enormous amount of freely available…

Software Engineering · Computer Science 2021-11-18 Rocìo Cabrera Lozoya , Arnaud Baumann , Antonino Sabetta , Michele Bezzi

The number of academic papers being published is increasing exponentially in recent years, and recommending adequate citations to assist researchers in writing papers is a non-trivial task. Conventional approaches may not be optimal, as the…

Information Retrieval · Computer Science 2020-01-09 Yang Zhang , Qiang Ma

Existing work on software patches often use features specific to a single task. These works often rely on manually identified features, and human effort is required to identify these features for each task. In this work, we propose CC2Vec,…

Software Engineering · Computer Science 2020-03-13 Thong Hoang , Hong Jin Kang , Julia Lawall , David Lo

In the past couple of decades, significant research efforts are devoted to the prediction of software bugs. However, most existing work in this domain treats all bugs the same, which is not the case in practice. It is important for a defect…

Software Engineering · Computer Science 2023-09-07 Ehsan Mashhadi , Hossein Ahmadvand , Hadi Hemmati
‹ Prev 1 2 3 10 Next ›