Related papers: Evaluating Sequence-to-Sequence Learning Models fo…

Latent Attention For If-Then Program Synthesis

Automatic translation from natural language descriptions into programs is a longstanding challenging problem. In this work, we consider a simple yet important sub-problem: translation from textual descriptions to If-Then programs. We devise…

Computation and Language · Computer Science 2016-11-08 Xinyun Chen, Chang Liu, Richard Shin, Dawn Song, Mingcheng Chen

Learning to Infer Program Sketches

Our goal is to build systems which write code automatically from the kinds of specifications humans can most easily provide, such as examples and natural language instruction. The key idea of this work is that a flexible combination of…

Artificial Intelligence · Computer Science 2019-06-06 Maxwell Nye, Luke Hewitt, Joshua Tenenbaum, Armando Solar-Lezama

Grammatical Sequence Prediction for Real-Time Neural Semantic Parsing

While sequence-to-sequence (seq2seq) models achieve state-of-the-art performance in many natural language processing tasks, they can be too slow for real-time applications. One performance bottleneck is predicting the most likely next token…

Computation and Language · Computer Science 2019-07-26 Chunyang Xiao, Christoph Teichmann, Konstantine Arkoudas

Predictive Synthesis of API-Centric Code

Today's programmers, especially data science practitioners, make heavy use of data-processing libraries (APIs) such as PyTorch, Tensorflow, NumPy, Pandas, and the like. Program synthesizers can provide significant coding assistance to this…

Software Engineering · Computer Science 2022-05-19 Daye Nam, Baishakhi Ray, Seohyun Kim, Xianshan Qu, Satish Chandra

Copy that! Editing Sequences by Copying Spans

Neural sequence-to-sequence models are finding increasing use in editing of documents, for example in correcting a text document or repairing source code. In this paper, we argue that common seq2seq models (with a facility to copy single…

Machine Learning · Computer Science 2020-12-15 Sheena Panthaplackel, Miltiadis Allamanis, Marc Brockschmidt

Training Language Models on Synthetic Edit Sequences Improves Code Synthesis

Software engineers mainly write code by editing existing programs. In contrast, language models (LMs) autoregressively synthesize programs in a single pass. One explanation for this is the scarcity of sequential edit data. While…

Machine Learning · Computer Science 2025-02-12 Ulyana Piterbarg, Lerrel Pinto, Rob Fergus

The What-If Tool: Interactive Probing of Machine Learning Models

A key challenge in developing and deploying Machine Learning (ML) systems is understanding their performance across a wide range of inputs. To address this challenge, we created the What-If Tool, an open-source application that allows…

Machine Learning · Computer Science 2019-10-04 James Wexler, Mahima Pushkarna, Tolga Bolukbasi, Martin Wattenberg, Fernanda Viegas, Jimbo Wilson

Learning Program Synthesis for Integer Sequences from Scratch

We present a self-learning approach for synthesizing programs from integer sequences. Our method relies on a tree search guided by a learned policy. Our system is tested on the On-Line Encyclopedia of Integer Sequences. There, it discovers,…

Artificial Intelligence · Computer Science 2022-11-30 Thibault Gauthier, Josef Urban

Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models

Large language models can perform various reasoning tasks by using chain-of-thought prompting, which guides them to find answers through step-by-step demonstrations. However, the quality of the prompts depends on the demonstrations given to…

Computation and Language · Computer Science 2023-02-02 Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen

Incorporating Copying Mechanism in Sequence-to-Sequence Learning

We address an important problem in sequence-to-sequence (Seq2Seq) learning referred to as copying, in which certain segments in the input sequence are selectively replicated in the output sequence. A similar phenomenon is observable in…

Computation and Language · Computer Science 2016-06-09 Jiatao Gu, Zhengdong Lu, Hang Li, Victor O. K. Li

Toward Trustworthy Neural Program Synthesis

We develop an approach to estimate the probability that a program sampled from a large language model is correct. Given a natural language description of a programming problem, our method samples both candidate programs as well as candidate…

Software Engineering · Computer Science 2023-10-11 Darren Key, Wen-Ding Li, Kevin Ellis

Sequence-to-Sequence Learning on Keywords for Efficient FAQ Retrieval

Frequently-Asked-Question (FAQ) retrieval provides an effective procedure for responding to user's natural language based queries. Such platforms are becoming common in enterprise chatbots, product question answering, and preliminary…

Information Retrieval · Computer Science 2021-08-24 Sourav Dutta, Haytham Assem, Edward Burgin

Synthesis of Parametric Programs using Genetic Programming and Model Checking

Formal methods apply algorithms based on mathematical principles to enhance the reliability of systems. It would only be natural to try to progress from verification, model checking or testing a system against its formal specification into…

Software Engineering · Computer Science 2014-02-28 Gal Katz, Doron Peled

Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models

Prompting, which casts downstream applications as language modeling tasks, has shown to be sample efficient compared to standard fine-tuning with pre-trained models. However, one pitfall of prompting is the need of manually-designed…

Computation and Language · Computer Science 2022-09-21 Zichun Yu, Tianyu Gao, Zhengyan Zhang, Yankai Lin, Zhiyuan Liu, Maosong Sun, Jie Zhou

code2seq: Generating Sequences from Structured Representations of Code

The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval. Sequence-to-sequence (seq2seq) models, adopted from neural machine…

Machine Learning · Computer Science 2019-02-22 Uri Alon, Shaked Brody, Omer Levy, Eran Yahav

How Can We Synthesize High-Quality Pretraining Data? A Systematic Study of Prompt Design, Generator Model, and Source Data

Synthetic data is a standard component in training large language models, yet systematic comparisons across design dimensions, including rephrasing strategy, generator model, and source data, remain absent. We conduct extensive controlled…

Computation and Language · Computer Science 2026-04-16 Joel Niklaus, Atsuki Yamaguchi, Michal Štefánik, Guilherme Penedo, Hynek Kydlíček, Elie Bakouch, Lewis Tunstall, Edward Emanuel Beeching, Thibaud Frere, Colin Raffel, Leandro von Werra, Thomas Wolf

Token-Level Fitting Issues of Seq2seq Models

Sequence-to-sequence (seq2seq) models have been widely used for natural language processing, computer vision, and other deep learning tasks. We find that seq2seq models trained with early-stopping suffer from issues at the token level. In…

Computation and Language · Computer Science 2023-06-23 Guangsheng Bao, Zhiyang Teng, Yue Zhang

Predicting User Actions in Software Processes

This paper describes an approach for user (e.g. SW architect) assisting in software processes. The approach observes the user's action and tries to predict his next step. For this we use approaches in the area of machine learning (sequence…

Software Engineering · Computer Science 2011-10-07 Michael Deynet

Deep Text-to-Speech System with Seq2Seq Model

Recent trends in neural network based text-to-speech/speech synthesis pipelines have employed recurrent Seq2seq architectures that can synthesize realistic sounding speech directly from text characters. These systems however have complex…

Computation and Language · Computer Science 2019-03-19 Gary Wang

On the Reliability and Explainability of Language Models for Program Generation

Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and…

Software Engineering · Computer Science 2024-01-09 Yue Liu, Chakkrit Tantithamthavorn, Yonghui Liu, Li Li