Related papers: CodeGen-Test: An Automatic Code Generation Model I…
The task of generating code solutions for a given programming problem can benefit from the use of pre-trained language models such as Codex, which can produce multiple diverse samples. However, a major challenge for this task is to select…
Recent advancements in natural language processing \cite{gpt2} \cite{BERT} have led to near-human performance in multiple natural language tasks. In this paper, we seek to understand whether similar techniques can be applied to a highly…
Evaluation metrics play a vital role in the growth of an area as it defines the standard of distinguishing between good and bad models. In the area of code synthesis, the commonly used evaluation metric is BLEU or perfect accuracy, but they…
Automatic code completion helps improve developers' productivity in their programming tasks. A program contains instructions expressed via code statements, which are considered as the basic units of program execution. In this paper, we…
As an integral part of source code files, code comments help improve program readability and comprehension. However, developers sometimes do not comment on their program code adequately due to the incurred extra efforts, lack of relevant…
Today, AI technology is showing its strengths in almost every industry and walks of life. From text generation, text summarization, chatbots, NLP is being used widely. One such paradigm is automatic code generation. An AI could be…
While programming is one of the most broadly applicable skills in modern society, modern machine learning models still cannot code solutions to basic problems. Despite its importance, there has been surprisingly little work on evaluating…
With the development of code generation techniques, selecting the correct code solution from multiple candidate solutions has become a crucial task. This study proposes AutoTest, a novel technique that combines automated test case…
Evaluating the correctness of code generated by AI is a challenging open problem. In this paper, we propose a fully automated method, named ACCA, to evaluate the correctness of AI-generated code for security purposes. The method uses…
Code generation aims to automatically generate code snippets of specific programming language according to natural language descriptions. The continuous advancements in deep learning, particularly pre-trained models, have empowered the code…
Automatic code generation from natural language descriptions can be highly beneficial during the process of software development. In this work, we propose GAP-Gen, a Guided Automatic Python Code Generation method based on Python syntactic…
A code generation system generates programming language code based on an input natural language description. State-of-the-art approaches rely on neural networks for code generation. However, these code generators suffer from two problems.…
Since the rise of neural natural-language-to-code models (NL->Code) that can generate long expressions and statements rather than a single next-token, one of the major problems has been reliably evaluating their generated output. In this…
Tasks like code generation and semantic parsing require mapping unstructured (or partially structured) inputs to well-formed, executable outputs. We introduce abstract syntax networks, a modeling framework for these problems. The outputs…
Large language models have demonstrated impressive capabilities in generating code, yet they often produce programs with flaws or deviations from intended behavior, limiting their suitability for safety-critical applications. To address…
Source code documentation is an important artifact for efficient software development. Code documentation could greatly benefit from automation since manual documentation is often labouring, resource and time-intensive. In this paper, we…
The size and complexity of software applications is increasing at an accelerating pace. Source code repositories (along with their dependencies) require vast amounts of labor to keep them tested, maintained, and up to date. As the…
In recent years, researchers have created and introduced a significant number of various code generation models. As human evaluation of every new model version is unfeasible, the community adopted automatic evaluation metrics such as BLEU…
We present a benchmark dataset for evaluating method-level code generation task. The benchmark contains a dataset of 175 samples for automated evaluation and a dataset of 161 samples for manual evaluation. We also present a new metric for…
Code review is an effective software quality assurance activity; however, it is labor-intensive and time-consuming. Thus, a number of generation-based automatic code review (ACR) approaches have been proposed recently, which leverage deep…