English
Related papers

Related papers: Program Synthesis with Large Language Models

200 papers

Language models for program synthesis are usually trained and evaluated on programming competition datasets (MBPP, APPS). However, these datasets are limited in size and quality, while these language models are extremely data hungry.…

Software Engineering · Computer Science 2025-07-23 Noah van der Vleuten

Recent advancements in large language models (LLMs) have greatly improved code generation, specifically at the function level. For instance, GPT-4o has achieved a 91.0\% pass rate on HumanEval. However, this draws into question the adequacy…

Computation and Language · Computer Science 2025-08-19 Jianbo Dai , Jianqiao Lu , Yunlong Feng , Guangtao Zeng , Rongju Ruan , Ming Cheng , Dong Huang , Haochen Tan , Zhijiang Guo

Driven by the surge in code generation using large language models (LLMs), numerous benchmarks have emerged to evaluate these LLMs capabilities. We conducted a large-scale human evaluation of HumanEval and MBPP, two popular benchmarks for…

Computation and Language · Computer Science 2024-07-08 Ankit Yadav , Himanshu Beniwal , Mayank Singh

Software engineers mainly write code by editing existing programs. In contrast, language models (LMs) autoregressively synthesize programs in a single pass. One explanation for this is the scarcity of sequential edit data. While…

Machine Learning · Computer Science 2025-02-12 Ulyana Piterbarg , Lerrel Pinto , Rob Fergus

Large pre-trained language models such as GPT-3, Codex, and Google's language model are now capable of generating code from natural language specifications of programmer intent. We view these developments with a mixture of optimism and…

Software Engineering · Computer Science 2021-12-07 Naman Jain , Skanda Vaidyanath , Arun Iyer , Nagarajan Natarajan , Suresh Parthasarathy , Sriram Rajamani , Rahul Sharma

While programming is one of the most broadly applicable skills in modern society, modern machine learning models still cannot code solutions to basic problems. Despite its importance, there has been surprisingly little work on evaluating…

This work introduces (1) a technique that allows large language models (LLMs) to leverage user-provided code when solving programming tasks and (2) a method to iteratively generate modular sub-functions that can aid future code generation…

Machine Learning · Computer Science 2023-12-05 Patrick Hajali , Ignas Budvytis

Program synthesis strives to generate a computer program as a solution to a given problem specification, expressed with input-output examples or natural language descriptions. The prevalence of large language models advances the…

Machine Learning · Computer Science 2023-03-01 Erik Nijkamp , Bo Pang , Hiroaki Hayashi , Lifu Tu , Huan Wang , Yingbo Zhou , Silvio Savarese , Caiming Xiong

We present new benchmarks on evaluation code generation models: MBXP and Multilingual HumanEval, and MathQA-X. These datasets cover over 10 programming languages and are generated using a scalable conversion framework that transpiles…

Large language models (LLMs) have shown impressive promise in code generation, yet their progress remains limited by the shortage of large-scale datasets that are both diverse and well-aligned with human reasoning. Most existing resources…

Machine Learning · Computer Science 2025-10-28 Amal Abed , Ivan Lukic , Jörg K. H. Franke , Frank Hutter

This survey reviews how large language models (LLMs) are transforming synthetic training data generation in both natural language and code domains. By producing artificial but task-relevant examples, these models can significantly augment…

Computation and Language · Computer Science 2025-11-21 Mihai Nadas , Laura Diosan , Andreea Tomescu

Large language and multimodal models have shown remarkable success on various benchmarks focused on specific skills such as general-purpose programming, math word problem-solving, and visual question answering. However, it is unclear how…

Artificial Intelligence · Computer Science 2025-10-07 Chao Wen , Jacqueline Staub , Adish Singla

Proof-oriented programs mix computational content with proofs of program correctness. However, the human effort involved in programming and proving is still substantial, despite the use of Satisfiability Modulo Theories (SMT) solvers to…

Programming Languages · Computer Science 2024-09-06 Saikat Chakraborty , Gabriel Ebner , Siddharth Bhat , Sarah Fakhoury , Sakina Fatima , Shuvendu Lahiri , Nikhil Swamy

Large language models show great promise in many domains, including programming. A promise is easy to make but hard to keep, and language models often fail to keep their promises, generating erroneous code. A promising avenue to keep models…

Software Engineering · Computer Science 2024-06-12 Md Rakib Hossain Misu , Cristina V. Lopes , Iris Ma , James Noble

In today's society, we are becoming increasingly dependent on software systems. However, we also constantly witness the negative impacts of buggy software. Program synthesis aims to improve software correctness by automatically generating…

Software Engineering · Computer Science 2023-12-11 Sanyogita Piya , Allison Sullivan

Recently, program synthesis driven by large language models (LLMs) has become increasingly popular. However, program synthesis for machine learning (ML) tasks still poses significant challenges. This paper explores a novel form of program…

Software Engineering · Computer Science 2024-09-10 Jinglue Xu , Jialong Li , Zhen Liu , Nagar Anthel Venkatesh Suryanarayanan , Guoyuan Zhou , Jia Guo , Hitoshi Iba , Kenji Tei

Large Language Models (LLMs) have been gaining increasing attention and demonstrated promising performance across a variety of Software Engineering (SE) tasks, such as Automated Program Repair (APR), code summarization, and code completion.…

Software Engineering · Computer Science 2024-04-18 Quanjun Zhang , Tongke Zhang , Juan Zhai , Chunrong Fang , Bowen Yu , Weisong Sun , Zhenyu Chen

Large Language Models are traditionally finetuned on large instruction datasets. However recent studies suggest that small, high-quality datasets can suffice for general purpose instruction following. This lack of consensus surrounding…

Machine Learning · Computer Science 2023-12-29 Aditi Jha , Sam Havens , Jeremy Dohmann , Alex Trott , Jacob Portes

Program synthesis has been long studied with recent approaches focused on directly using the power of Large Language Models (LLMs) to generate code. Programming benchmarks, with curated synthesis problems and test-cases, are used to measure…

Software Engineering · Computer Science 2023-11-01 Jiawei Liu , Chunqiu Steven Xia , Yuyao Wang , Lingming Zhang

Recent development of large language models (LLMs) for code like CodeX and CodeT5+ demonstrates tremendous promise in achieving code intelligence. Their ability of synthesizing code that completes a program for performing a pre-defined task…

Computation and Language · Computer Science 2023-10-10 Weimin Xiong , Yiwen Guo , Hao Chen
‹ Prev 1 2 3 10 Next ›