Related papers: Guiding Enumerative Program Synthesis with Large L…

Generative Explanations for Program Synthesizers

Despite great advances in program synthesis techniques, they remain algorithmic black boxes. Although they guarantee that when synthesis is successful, the implementation satisfies the specification, they provide no additional information…

Programming Languages · Computer Science 2024-03-07 Amirmohammad Nazari, Souti Chattopadhyay, Swabha Swayamdipta, Mukund Raghothaman

Combining Large Language Models and Gradient-Free Optimization for Automatic Control Policy Synthesis

Large Language models (LLMs) have shown promise as generators of symbolic control policies, producing interpretable program-like representations through iterative search. However, these models are not capable of separating the functional…

Machine Learning · Computer Science 2025-10-02 Carlo Bosio, Matteo Guarrera, Alberto Sangiovanni-Vincentelli, Mark W. Mueller

Can LLMs Perform Synthesis?

How do LLMs compare with symbolic tools on program synthesis tasks? We investigate this question on several synthesis domains: LTL reactive synthesis, syntax-guided synthesis, distributed protocol synthesis, and recursive function…

Programming Languages · Computer Science 2026-03-24 Derek Egolf, Yuhao Zhou, Stavros Tripakis

Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies

Large Language Models (LLMs) have revolutionized the field of Natural Language Processing thanks to their ability to reuse knowledge acquired on massive text corpora on a wide variety of downstream tasks, with minimal (if any) tuning steps.…

Computation and Language · Computer Science 2024-07-12 Flavio Petruzzellis, Alberto Testolin, Alessandro Sperduti

Large Language Models Synergize with Automated Machine Learning

Recently, program synthesis driven by large language models (LLMs) has become increasingly popular. However, program synthesis for machine learning (ML) tasks still poses significant challenges. This paper explores a novel form of program…

Software Engineering · Computer Science 2024-09-10 Jinglue Xu, Jialong Li, Zhen Liu, Nagar Anthel Venkatesh Suryanarayanan, Guoyuan Zhou, Jia Guo, Hitoshi Iba, Kenji Tei

Synthetic Data Generation Using Large Language Models: Advances in Text and Code

This survey reviews how large language models (LLMs) are transforming synthetic training data generation in both natural language and code domains. By producing artificial but task-relevant examples, these models can significantly augment…

Computation and Language · Computer Science 2025-11-21 Mihai Nadas, Laura Diosan, Andreea Tomescu

Efficacy of Synthetic Data as a Benchmark

Large language models (LLMs) have enabled a range of applications in zero-shot and few-shot learning settings, including the generation of synthetic datasets for training and testing. However, to reliably use these synthetic datasets, it is…

Computation and Language · Computer Science 2024-09-19 Gaurav Maheshwari, Dmitry Ivanov, Kevin El Haddad

Can LLMs Reason About Program Semantics? A Comprehensive Evaluation of LLMs on Formal Specification Inference

Large Language Models (LLMs) are increasingly being used to automate programming tasks. Yet, LLMs' capabilities in reasoning about program semantics are still inadequately studied, leaving significant potential for further exploration. This…

Programming Languages · Computer Science 2025-05-30 Thanh Le-Cong, Bach Le, Toby Murray

The Synthetic Imputation Approach: Generating Optimal Synthetic Texts For Underrepresented Categories In Supervised Classification Tasks

Encoder-decoder Large Language Models (LLMs), such as BERT and RoBERTa, require that all categories in an annotation task be sufficiently represented in the training data for optimal performance. However, it is often difficult to find…

Computation and Language · Computer Science 2025-04-22 Joan C. Timoneda

AutoGeTS: Knowledge-based Automated Generation of Text Synthetics for Improving Text Classification

When developing text classification models for real world applications, one major challenge is the difficulty to collect sufficient data for all text classes. In this work, we address this challenge by utilizing large language models (LLMs)…

Computation and Language · Computer Science 2025-08-15 Chenhao Xue, Yuanzhe Jin, Adrian Carrasco-Revilla, Joyraj Chakraborty, Min Chen

Overfitting in Synthesis: Theory and Practice (Extended Version)

In syntax-guided synthesis (SyGuS), a synthesizer's goal is to automatically generate a program belonging to a grammar of possible implementations that meets a logical specification. We investigate a common limitation across…

Programming Languages · Computer Science 2019-06-11 Saswat Padhi, Todd Millstein, Aditya Nori, Rahul Sharma

Grammar Filtering For Syntax-Guided Synthesis

Programming-by-example (PBE) is a synthesis paradigm that allows users to generate functions by simply providing input-output examples. While a promising interaction paradigm, synthesis is still too slow for realtime interaction and more…

Machine Learning · Computer Science 2020-02-10 Kairo Morton, William Hallahan, Elven Shum, Ruzica Piskac, Mark Santolucito

Towards Automated Verification of LLM-Synthesized C Programs

We present SynVer, a novel synthesis and verification framework for C programs, that deploys a Large Language Model (LLM) to search for a candidate program that satisfies the given specification. Our key idea is to impose syntactic and…

Programming Languages · Computer Science 2025-10-21 Prasita Mukherjee, Benjamin Delaware

Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation

Program synthesis has been long studied with recent approaches focused on directly using the power of Large Language Models (LLMs) to generate code. Programming benchmarks, with curated synthesis problems and test-cases, are used to measure…

Software Engineering · Computer Science 2023-11-01 Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, Lingming Zhang

Semantics-Guided Synthesis

This paper develops a new framework for program synthesis, called semantics-guided synthesis (SemGuS), that allows a user to provide both the syntax and the semantics for the constructs in the language. SemGuS accepts a recursively defined…

Programming Languages · Computer Science 2020-11-12 Jinwoo Kim, Qinheping Hu, Loris D'Antoni, Thomas Reps

Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models

Large Language Models (LLMs) perform best with well-crafted prompts, yet prompt engineering remains manual, inconsistent, and inaccessible to non-experts. We introduce Promptomatix, an automatic prompt optimization framework that transforms…

Computation and Language · Computer Science 2025-07-28 Rithesh Murthy, Ming Zhu, Liangwei Yang, Jielin Qiu, Juntao Tan, Shelby Heinecke, Caiming Xiong, Silvio Savarese, Huan Wang

Reliability of Large Language Models for Design Synthesis: An Empirical Study of Variance, Prompt Sensitivity, and Method Scaffolding

Large Language Models (LLMs) are increasingly applied to automate software engineering tasks, including the generation of UML class diagrams from natural language descriptions. While prior work demonstrates that LLMs can produce…

Software Engineering · Computer Science 2026-04-07 Rabia Iftikhar, Andreas Rausch

GPT-4.1 Sets the Standard in Automated Experiment Design Using Novel Python Libraries

Large Language Models (LLMs) have advanced rapidly as tools for automating code generation in scientific research, yet their ability to interpret and use unfamiliar Python APIs for complex computational experiments remains poorly…

Software Engineering · Computer Science 2025-09-17 Nuno Fachada, Daniel Fernandes, Carlos M. Fernandes, Bruno D. Ferreira-Saraiva, João P. Matos-Carvalho

Synthetic Data Generation for Phrase Break Prediction with Large Language Model

Current approaches to phrase break prediction address crucial prosodic aspects of text-to-speech systems but heavily rely on vast human annotations from audio or text, incurring significant manual effort and cost. Inherent variability in…

Computation and Language · Computer Science 2025-07-25 Hoyeon Lee, Sejung Son, Ye-Eun Kang, Jong-Hwan Kim

Arrows of Math Reasoning Data Synthesis for Large Language Models: Diversity, Complexity and Correctness

Enhancing the mathematical reasoning of large language models (LLMs) demands high-quality training data, yet conventional methods face critical challenges in scalability, cost, and data reliability. To address these limitations, we propose…

Computation and Language · Computer Science 2025-08-27 Sirui Chen, Changxin Tian, Binbin Hu, Kunlong Chen, Ziqi Liu, Zhiqiang Zhang, Jun Zhou