Related papers: Combining Contexts from Multiple Sources for Docum…

Automatic Code Documentation Generation Using GPT-3

Source code documentation is an important artifact for efficient software development. Code documentation could greatly benefit from automation since manual documentation is often labouring, resource and time-intensive. In this paper, we…

Software Engineering · Computer Science 2022-09-07 Junaed Younus Khan , Gias Uddin

CodeT: Code Generation with Generated Tests

The task of generating code solutions for a given programming problem can benefit from the use of pre-trained language models such as Codex, which can produce multiple diverse samples. However, a major challenge for this task is to select…

Computation and Language · Computer Science 2022-11-24 Bei Chen , Fengji Zhang , Anh Nguyen , Daoguang Zan , Zeqi Lin , Jian-Guang Lou , Weizhu Chen

CodeExp: Explanatory Code Document Generation

Developing models that can automatically generate detailed code explanation can greatly benefit software maintenance and programming education. However, existing code-to-text generation models often produce only high-level summaries of code…

Computation and Language · Computer Science 2022-11-29 Haotian Cui , Chenglong Wang , Junjie Huang , Jeevana Priya Inala , Todd Mytkowicz , Bo Wang , Jianfeng Gao , Nan Duan

Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study

Transformer-based pre-trained models have recently achieved great results in solving many software engineering tasks including automatic code completion which is a staple in a developer's toolkit. While many have striven to improve the…

Computation and Language · Computer Science 2023-04-25 Tim van Dam , Maliheh Izadi , Arie van Deursen

DocPrompting: Generating Code by Retrieving the Docs

Publicly available source-code libraries are continuously growing and changing. This makes it impossible for models of code to keep current with all available APIs by simply training these models on existing code repositories. Thus,…

Computation and Language · Computer Science 2023-02-21 Shuyan Zhou , Uri Alon , Frank F. Xu , Zhiruo Wang , Zhengbao Jiang , Graham Neubig

Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code

Few-shot learning with large-scale, pre-trained language models is a powerful way to answer questions about code, e.g., how to complete a given code example, or even generate code snippets from scratch. The success of these models raises…

Software Engineering · Computer Science 2022-06-14 Patrick Bareiß , Beatriz Souza , Marcelo d'Amorim , Michael Pradel

An Empirical Study of Retrieval-Augmented Code Generation: Challenges and Opportunities

Code generation aims to automatically generate code snippets of specific programming language according to natural language descriptions. The continuous advancements in deep learning, particularly pre-trained models, have empowered the code…

Software Engineering · Computer Science 2025-01-24 Zezhou Yang , Sirong Chen , Cuiyun Gao , Zhenhao Li , Xing Hu , Kui Liu , Xin Xia

SkCoder: A Sketch-based Approach for Automatic Code Generation

Recently, deep learning techniques have shown great success in automatic code generation. Inspired by the code reuse, some researchers propose copy-based approaches that can copy the content from similar code snippets to obtain better…

Software Engineering · Computer Science 2023-09-08 Jia Li , Yongmin Li , Ge Li , Zhi Jin , Yiyang Hao , Xing Hu

A Survey of Automatic Generation of Source Code Comments: Algorithms and Techniques

As an integral part of source code files, code comments help improve program readability and comprehension. However, developers sometimes do not comment on their program code adequately due to the incurred extra efforts, lack of relevant…

Software Engineering · Computer Science 2019-07-31 Xiaotao Song , Hailong Sun , Xu Wang , Jiafei Yan

Generation-Augmented Query Expansion For Code Retrieval

Pre-trained language models have achieved promising success in code retrieval tasks, where a natural language documentation query is given to find the most relevant existing code snippet. However, existing models focus only on optimizing…

Software Engineering · Computer Science 2022-12-22 Dong Li , Yelong Shen , Ruoming Jin , Yi Mao , Kuan Wang , Weizhu Chen

Evaluating Large Language Models Trained on Code

We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we…

Machine Learning · Computer Science 2021-07-15 Mark Chen , Jerry Tworek , Heewoo Jun , Qiming Yuan , Henrique Ponde de Oliveira Pinto , Jared Kaplan , Harri Edwards , Yuri Burda , Nicholas Joseph , Greg Brockman , Alex Ray , Raul Puri , Gretchen Krueger , Michael Petrov , Heidy Khlaaf , Girish Sastry , Pamela Mishkin , Brooke Chan , Scott Gray , Nick Ryder , Mikhail Pavlov , Alethea Power , Lukasz Kaiser , Mohammad Bavarian , Clemens Winter , Philippe Tillet , Felipe Petroski Such , Dave Cummings , Matthias Plappert , Fotios Chantzis , Elizabeth Barnes , Ariel Herbert-Voss , William Hebgen Guss , Alex Nichol , Alex Paino , Nikolas Tezak , Jie Tang , Igor Babuschkin , Suchir Balaji , Shantanu Jain , William Saunders , Christopher Hesse , Andrew N. Carr , Jan Leike , Josh Achiam , Vedant Misra , Evan Morikawa , Alec Radford , Matthew Knight , Miles Brundage , Mira Murati , Katie Mayer , Peter Welinder , Bob McGrew , Dario Amodei , Sam McCandlish , Ilya Sutskever , Wojciech Zaremba

ToolCoder: Teach Code Generation Models to use API search tools

Automatically generating source code from natural language descriptions has been a growing field of research in recent years. However, current large-scale code generation models often encounter difficulties when selecting appropriate APIs…

Software Engineering · Computer Science 2023-09-12 Kechi Zhang , Huangzhao Zhang , Ge Li , Jia Li , Zhuo Li , Zhi Jin

Deep Learning Based Code Generation Methods: Literature Review

This paper focuses on Code Generation task that aims at generating relevant code fragments according to given natural language descriptions. In the process of software development, developers often encounter two scenarios. One is requested…

Software Engineering · Computer Science 2024-04-19 Zezhou Yang , Sirong Chen , Cuiyun Gao , Zhenhao Li , Ge Li , Michael Lyu

Automatic Code Generation using Pre-Trained Language Models

Recent advancements in natural language processing \cite{gpt2} \cite{BERT} have led to near-human performance in multiple natural language tasks. In this paper, we seek to understand whether similar techniques can be applied to a highly…

Computation and Language · Computer Science 2021-02-23 Luis Perez , Lizi Ottens , Sudharshan Viswanathan

Auto-Documenation for Software Development

Software documentation is an essential but labor intensive task that often requires a dedicated team of developers to ensure coverage and accuracy. Good documentation will help shorten the development cycle and improve the overall team…

Software Engineering · Computer Science 2017-01-31 Thomas Zheng , Jeff Shaw , Sergey Kozlov

Automated Test Generation from Program Documentation Encoded in Code Comments

Documenting the functionality of software units with code comments, e.g., Javadoc comments, is a common programmer best-practice in software engineering. This paper introduces a novel test generation technique that exploits the code-comment…

Software Engineering · Computer Science 2025-05-01 Giovanni Denaro , Luca Guglielmo

Context-aware Code Summary Generation

Code summary generation is the task of writing natural language descriptions of a section of source code. Recent advances in Large Language Models (LLMs) and other AI-based technologies have helped make automatic code summarization a…

Software Engineering · Computer Science 2024-08-20 Chia-Yi Su , Aakash Bansal , Yu Huang , Toby Jia-Jun Li , Collin McMillan

Learning and Suggesting Source Code Changes from Version History: A Systematic Review

Context: Software systems are in continuous evolution through source code changes to fixing bugs, adding new functionalities and improving the internal architecture. All these practices are recorded in the version history, which can be…

Software Engineering · Computer Science 2020-01-17 Leandro Ungari Cayres , Bruno Santos de Lima , Rogério Eduardo Garcia

Automatic Generation of Programming Exercises and Code Explanations using Large Language Models

This article explores the natural language generation capabilities of large language models with application to the production of two types of learning resources common in programming courses. Using OpenAI Codex as the large language model,…

Software Engineering · Computer Science 2022-06-28 Sami Sarsa , Paul Denny , Arto Hellas , Juho Leinonen

On the Use of Context in Recommending Exception Handling Code Examples

Studies show that software developers often either misuse exception handling features or use them inefficiently, and such a practice may lead an undergoing software project to a fragile, insecure and non-robust application system. In this…

Software Engineering · Computer Science 2018-07-09 Mohammad Masudur Rahman , Chanchal K. Roy