Related papers: Exploring Language Model's Code Generation Ability…

Eliciting Instruction-tuned Code Language Models' Capabilities to Utilize Auxiliary Function for Code Generation

We study the code generation behavior of instruction-tuned models built on top of code pre-trained language models when they could access an auxiliary function to implement a function. We design several ways to provide auxiliary functions…

Software Engineering · Computer Science 2024-09-24 Seonghyeon Lee , Suyeon Kim , Joonwon Jang , Heejae Chon , Dongha Lee , Hwanjo Yu

Investigating the Performance of Language Models for Completing Code in Functional Programming Languages: a Haskell Case Study

Language model-based code completion models have quickly grown in use, helping thousands of developers write code in many different programming languages. However, research on code completion models typically focuses on imperative languages…

Computation and Language · Computer Science 2024-03-25 Tim van Dam , Frank van der Heijden , Philippe de Bekker , Berend Nieuwschepen , Marc Otten , Maliheh Izadi

Augmented Language Models: a Survey

This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into simpler subtasks while the latter consists in…

Computation and Language · Computer Science 2023-02-16 Grégoire Mialon , Roberto Dessì , Maria Lomeli , Christoforos Nalmpantis , Ram Pasunuru , Roberta Raileanu , Baptiste Rozière , Timo Schick , Jane Dwivedi-Yu , Asli Celikyilmaz , Edouard Grave , Yann LeCun , Thomas Scialom

Code Execution with Pre-trained Language Models

Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code. However, most pre-trained models for code intelligence ignore the execution trace and only rely on source code and…

Programming Languages · Computer Science 2023-05-10 Chenxiao Liu , Shuai Lu , Weizhu Chen , Daxin Jiang , Alexey Svyatkovskiy , Shengyu Fu , Neel Sundaresan , Nan Duan

An Empirical Study of Retrieval-Augmented Code Generation: Challenges and Opportunities

Code generation aims to automatically generate code snippets of specific programming language according to natural language descriptions. The continuous advancements in deep learning, particularly pre-trained models, have empowered the code…

Software Engineering · Computer Science 2025-01-24 Zezhou Yang , Sirong Chen , Cuiyun Gao , Zhenhao Li , Xing Hu , Kui Liu , Xin Xia

Technical Report: Auxiliary Tuning and its Application to Conditional Text Generation

We introduce a simple and efficient method, called Auxiliary Tuning, for adapting a pre-trained Language Model to a novel task; we demonstrate this approach on the task of conditional text generation. Our approach supplements the original…

Computation and Language · Computer Science 2020-07-01 Yoel Zeldes , Dan Padnos , Or Sharir , Barak Peleg

Generation-Augmented Query Expansion For Code Retrieval

Pre-trained language models have achieved promising success in code retrieval tasks, where a natural language documentation query is given to find the most relevant existing code snippet. However, existing models focus only on optimizing…

Software Engineering · Computer Science 2022-12-22 Dong Li , Yelong Shen , Ruoming Jin , Yi Mao , Kuan Wang , Weizhu Chen

Function-constrained Program Synthesis

This work introduces (1) a technique that allows large language models (LLMs) to leverage user-provided code when solving programming tasks and (2) a method to iteratively generate modular sub-functions that can aid future code generation…

Machine Learning · Computer Science 2023-12-05 Patrick Hajali , Ignas Budvytis

GenX: Mastering Code and Test Generation with Execution Feedback

Recent advancements in language modeling have enabled the translation of natural language into code, and the use of execution feedback to improve code generation. However, these methods often rely heavily on pre-existing test cases, which…

Software Engineering · Computer Science 2024-12-19 Nan Wang , Yafei Liu , Chen Chen , Haonan Lu

CodeExp: Explanatory Code Document Generation

Developing models that can automatically generate detailed code explanation can greatly benefit software maintenance and programming education. However, existing code-to-text generation models often produce only high-level summaries of code…

Computation and Language · Computer Science 2022-11-29 Haotian Cui , Chenglong Wang , Junjie Huang , Jeevana Priya Inala , Todd Mytkowicz , Bo Wang , Jianfeng Gao , Nan Duan

Building A Coding Assistant via the Retrieval-Augmented Language Model

Pretrained language models have shown strong effectiveness in code-related tasks, such as code retrieval, code generation, code summarization, and code completion tasks. In this paper, we propose COde assistaNt viA retrieval-augmeNted…

Computation and Language · Computer Science 2024-11-05 Xinze Li , Hanbin Wang , Zhenghao Liu , Shi Yu , Shuo Wang , Yukun Yan , Yukai Fu , Yu Gu , Ge Yu

Auxiliary task demands mask the capabilities of smaller language models

Developmental psychologists have argued about when cognitive capacities such as language understanding or theory of mind emerge. These debates often hinge on the concept of "task demands" -- the auxiliary challenges associated with…

Computation and Language · Computer Science 2024-07-31 Jennifer Hu , Michael C. Frank

Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion

Pretrained code language models have enabled great progress towards program synthesis. However, common approaches only consider in-file local context and thus miss information and constraints imposed by other parts of the codebase and its…

Software Engineering · Computer Science 2023-06-02 Hengzhi Pei , Jinman Zhao , Leonard Lausen , Sheng Zha , George Karypis

On the Reliability and Explainability of Language Models for Program Generation

Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and…

Software Engineering · Computer Science 2024-01-09 Yue Liu , Chakkrit Tantithamthavorn , Yonghui Liu , Li Li

Large Language Models for Code Analysis: Do LLMs Really Do Their Job?

Large language models (LLMs) have demonstrated significant potential in the realm of natural language understanding and programming code processing tasks. Their capacity to comprehend and generate human-like code has spurred research into…

Software Engineering · Computer Science 2024-03-07 Chongzhou Fang , Ning Miao , Shaurya Srivastav , Jialin Liu , Ruoyu Zhang , Ruijie Fang , Asmita , Ryan Tsang , Najmeh Nazari , Han Wang , Houman Homayoun

Large Language Models for Code Generation: A Comprehensive Survey of Challenges, Techniques, Evaluation, and Applications

Large Language Models (LLMs) have demonstrated their remarkable capabilities in numerous fields. This survey focuses on how LLMs empower users, regardless of their technical background, to use human languages to automatically generate…

Software Engineering · Computer Science 2025-04-03 Nam Huynh , Beiyu Lin

CodeT: Code Generation with Generated Tests

The task of generating code solutions for a given programming problem can benefit from the use of pre-trained language models such as Codex, which can produce multiple diverse samples. However, a major challenge for this task is to select…

Computation and Language · Computer Science 2022-11-24 Bei Chen , Fengji Zhang , Anh Nguyen , Daoguang Zan , Zeqi Lin , Jian-Guang Lou , Weizhu Chen

Show Me How It's Done: The Role of Explanations in Fine-Tuning Language Models

Our research demonstrates the significant benefits of using fine-tuning with explanations to enhance the performance of language models. Unlike prompting, which maintains the model's parameters, fine-tuning allows the model to learn and…

Computation and Language · Computer Science 2024-02-13 Mohamad Ballout , Ulf Krumnack , Gunther Heidemann , Kai-Uwe Kuehnberger

Auxiliary task discovery through generate-and-test

In this paper, we explore an approach to auxiliary task discovery in reinforcement learning based on ideas from representation learning. Auxiliary tasks tend to improve data efficiency by forcing the agent to learn auxiliary prediction and…

Machine Learning · Computer Science 2024-07-23 Banafsheh Rafiee , Sina Ghiassian , Jun Jin , Richard Sutton , Jun Luo , Adam White

Reasoning Elicitation in Language Models via Counterfactual Feedback

Despite the increasing effectiveness of language models, their reasoning capabilities remain underdeveloped. In particular, causal reasoning through counterfactual question answering is lacking. This work aims to bridge this gap. We first…

Computation and Language · Computer Science 2025-03-18 Alihan Hüyük , Xinnuo Xu , Jacqueline Maasch , Aditya V. Nori , Javier González