Related papers: Python Code Generation by Asking Clarification Que…

Automatic Code Generation using Pre-Trained Language Models

Recent advancements in natural language processing \cite{gpt2} \cite{BERT} have led to near-human performance in multiple natural language tasks. In this paper, we seek to understand whether similar techniques can be applied to a highly…

Computation and Language · Computer Science 2021-02-23 Luis Perez, Lizi Ottens, Sudharshan Viswanathan

Generating Clarifying Questions for Query Refinement in Source Code Search

In source code search, a common information-seeking strategy involves providing a short initial query with a broad meaning, and then iteratively refining the query using terms gleaned from the results of subsequent searches. This strategy…

Software Engineering · Computer Science 2022-01-26 Zachary Eberhart, Collin McMillan

A Syntactic Neural Model for General-Purpose Code Generation

We consider the problem of parsing natural language descriptions into source code written in a general-purpose programming language like Python. Existing data-driven methods treat this problem as a language generation task without…

Computation and Language · Computer Science 2017-04-07 Pengcheng Yin, Graham Neubig

CodeExp: Explanatory Code Document Generation

Developing models that can automatically generate detailed code explanation can greatly benefit software maintenance and programming education. However, existing code-to-text generation models often produce only high-level summaries of code…

Computation and Language · Computer Science 2022-11-29 Haotian Cui, Chenglong Wang, Junjie Huang, Jeevana Priya Inala, Todd Mytkowicz, Bo Wang, Jianfeng Gao, Nan Duan

Can Code Language Models Learn Clarification-Seeking Behaviors?

Large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, a gap remains between their output and the problem-solving strategies of human developers. Unlike humans, who spend substantial time…

Software Engineering · Computer Science 2025-09-29 Jie JW Wu, Manav Chaudhary, Davit Abrahamyan, Arhaan Khaku, Anjiang Wei, Fatemeh H. Fard

Toward Code Generation: A Survey and Lessons from Semantic Parsing

With the growth of natural language processing techniques and demand for improved software engineering efficiency, there is an emerging interest in translating intention from human languages to programming languages. In this survey paper,…

Software Engineering · Computer Science 2021-05-20 Celine Lee, Justin Gottschlich, Dan Roth

Understanding Unnatural Questions Improves Reasoning over Text

Complex question answering (CQA) over raw text is a challenging task. A prominent approach to this task is based on the programmer-interpreter framework, where the programmer maps the question into a sequence of reasoning actions which is…

Computation and Language · Computer Science 2020-10-20 Xiao-Yu Guo, Yuan-Fang Li, Gholamreza Haffari

Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning

Code summarization generates brief natural language description given a source code snippet, while code retrieval fetches relevant source code given a natural language query. Since both tasks aim to model the association between natural…

Information Retrieval · Computer Science 2020-02-26 Wei Ye, Rui Xie, Jinglei Zhang, Tianxiang Hu, Xiaoyin Wang, Shikun Zhang

Large Language Models Should Ask Clarifying Questions to Increase Confidence in Generated Code

Large language models (LLMs) have significantly improved the ability to perform tasks in the field of code generation. However, there is still a gap between LLMs being capable coders and being top-tier software engineers. Based on the…

Software Engineering · Computer Science 2024-01-23 Jie JW Wu

NoviCode: Generating Programs from Natural Language Utterances by Novices

Current Text-to-Code models demonstrate impressive capabilities in generating executable code from natural language snippets. However, current studies focus on technical instructions and programmer-oriented language, and it is an open…

Computation and Language · Computer Science 2024-07-17 Asaf Achi Mordechai, Yoav Goldberg, Reut Tsarfaty

Why is constrained neural language generation particularly challenging?

Recent advances in deep neural language models combined with the capacity of large scale datasets have accelerated the development of natural language generation systems that produce fluent and coherent texts (to various degrees of success)…

Computation and Language · Computer Science 2025-04-15 Cristina Garbacea, Qiaozhu Mei

CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning

Program synthesis or code generation aims to generate a program that satisfies a problem specification. Recent approaches using large-scale pretrained language models (LMs) have shown promising results, yet they have some critical…

Machine Learning · Computer Science 2022-11-04 Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi

QURIOUS: Question Generation Pretraining for Text Generation

Recent trends in natural language processing using pretraining have shifted focus towards pretraining and fine-tuning approaches for text generation. Often the focus has been on task-agnostic approaches that generalize the language modeling…

Computation and Language · Computer Science 2020-04-24 Shashi Narayan, Gonçalo Simoes, Ji Ma, Hannah Craighead, Ryan Mcdonald

Automatic question generation based on sentence structure analysis using machine learning approach

Automatic question generation is one of the most challenging tasks of Natural Language Processing. It requires "bidirectional" language processing: firstly, the system has to understand the input text (Natural Language Understanding) and it…

Computation and Language · Computer Science 2022-05-26 Miroslav Blšták, Viera Rozinajová

Generating Comments From Source Code with CCGs

Good comments help developers understand software faster and provide better maintenance. However, comments are often missing, generally inaccurate, or out of date. Many of these problems can be avoided by automatic comment generation. This…

Software Engineering · Computer Science 2018-10-17 Sergey Matskevich, Colin S. Gordon

On the Reliability and Explainability of Language Models for Program Generation

Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and…

Software Engineering · Computer Science 2024-01-09 Yue Liu, Chakkrit Tantithamthavorn, Yonghui Liu, Li Li

CodeGen-Test: An Automatic Code Generation Model Integrating Program Test Information

Automatic code generation is to generate the program code according to the given natural language description. The current mainstream approach uses neural networks to encode natural language descriptions, and output abstract syntax trees…

Software Engineering · Computer Science 2022-02-16 Maosheng Zhong, Gen Liu, Hongwei Li, Jiangling Kuang, Jinshan Zeng, Mingwen Wang

Learning to Explain: Answering Why-Questions via Rephrasing

Providing plausible responses to why questions is a challenging but critical goal for language based human-machine interaction. Explanations are challenging in that they require many different forms of abstract knowledge and reasoning.…

Computation and Language · Computer Science 2019-06-05 Allen Nie, Erin D. Bennett, Noah D. Goodman

From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text

Generating code-switched text is a problem of growing interest, especially given the scarcity of corpora containing large volumes of real code-switched text. In this work, we adapt a state-of-the-art neural machine translation model to…

Computation and Language · Computer Science 2021-07-15 Ishan Tarunesh, Syamantak Kumar, Preethi Jyothi

A Survey on Natural Language Counterfactual Generation

Natural language counterfactual generation aims to minimally modify a given text such that the modified text will be classified into a different class. The generated counterfactuals provide insight into the reasoning behind a model's…

Computation and Language · Computer Science 2024-10-08 Yongjie Wang, Xiaoqi Qiu, Yu Yue, Xu Guo, Zhiwei Zeng, Yuhong Feng, Zhiqi Shen