Related papers: JaCoText: A Pretrained Model for Java Code-Text Ge…

A Comprehensive Review of State-of-The-Art Methods for Java Code Generation from Natural Language Text

Java Code Generation consists in generating automatically Java code from a Natural Language Text. This NLP task helps in increasing programmers' productivity by providing them with immediate solutions to the simplest and most repetitive…

Computation and Language · Computer Science · 2023-06-13 · Jessica López Espejel, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane, El Hassane Ettifouri

JavaBERT: Training a transformer-based model for the Java programming language

Code quality is and will be a crucial factor while developing new software code, requiring appropriate tools to ensure functional and reliable code. Machine learning techniques are still rarely used for software engineering tools, missing…

Software Engineering · Computer Science · 2021-10-22 · Nelson Tavares de Sousa, Wilhelm Hasselbring

Text-to-Code Generation with Modality-relative Pre-training

Large pre-trained language models have recently been expanded and applied to programming language tasks with great success, often through further pre-training of a strictly-natural language model--where training sequences typically contain…

Computation and Language · Computer Science · 2024-02-13 · Fenia Christopoulou, Guchun Zhang, Gerasimos Lampouras

EvoText: Enhancing Natural Language Generation Models via Self-Escalation Learning for Up-to-Date Knowledge and Improved Performance

In recent years, pretrained models have been widely used in various fields, including natural language understanding, computer vision, and natural language generation. However, the performance of these language generation models is highly…

Computation and Language · Computer Science · 2023-04-14 · Zhengqing Yuan, Huiwen Xue, Chao Zhang, Yongming Liu

Pretrained Language Models for Text Generation: A Survey

Text generation has become one of the most important yet challenging tasks in natural language processing (NLP). The resurgence of deep learning has greatly advanced this field by neural generation models, especially the paradigm of…

Computation and Language · Computer Science · 2021-05-26 · Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen

Pre-Training a Graph Recurrent Network for Language Representation

Transformer-based pre-trained models have gained much advance in recent years, becoming one of the most important backbones in natural language processing. Recent work shows that the attention mechanism inside Transformer may not be…

Computation and Language · Computer Science · 2022-10-27 · Yile Wang, Linyi Yang, Zhiyang Teng, Ming Zhou, Yue Zhang

Benchmarking Language Models for Code Syntax Understanding

Pre-trained language models have demonstrated impressive performance in both natural language processing and program understanding, which represent the input as a token sequence without explicitly modeling its structure. Some prior works…

Computation and Language · Computer Science · 2022-10-27 · Da Shen, Xinyun Chen, Chenguang Wang, Koushik Sen, Dawn Song

Sentence Bottleneck Autoencoders from Transformer Language Models

Representation learning for text via pretraining a language model on a large corpus has become a standard starting point for building NLP systems. This approach stands in contrast to autoencoders, also trained on raw text, but with the…

Computation and Language · Computer Science · 2021-09-14 · Ivan Montero, Nikolaos Pappas, Noah A. Smith

CoditT5: Pretraining for Source Code and Natural Language Editing

Pretrained language models have been shown to be effective in many software-related generation tasks; however, they are not well-suited for editing tasks as they are not designed to reason about edits. To address this, we propose a novel…

Software Engineering · Computer Science · 2022-09-15 · Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, Milos Gligoric

Exploring and Evaluating Personalized Models for Code Generation

Large Transformer models achieved the state-of-the-art status for Natural Language Understanding tasks and are increasingly becoming the baseline model architecture for modeling source code. Transformers are usually pre-trained on large…

Software Engineering · Computer Science · 2022-09-21 · Andrei Zlotchevski, Dawn Drain, Alexey Svyatkovskiy, Colin Clement, Neel Sundaresan, Michele Tufano

Automating Code-Related Tasks Through Transformers: The Impact of Pre-training

Transformers have gained popularity in the software engineering (SE) literature. These deep learning models are usually pre-trained through a self-supervised objective, meant to provide the model with basic knowledge about a language of…

Software Engineering · Computer Science · 2023-02-09 · Rosalia Tufano, Luca Pascarella, Gabriele Bavota

On the Effectiveness of Transfer Learning for Code Search

The Transformer architecture and transfer learning have marked a quantum leap in natural language processing, improving the state of the art across a range of text-based tasks. This paper examines how these advancements can be applied to…

Software Engineering · Computer Science · 2022-08-29 · Pasquale Salza, Christoph Schwizer, Jian Gu, Harald C. Gall

Controlled Text Generation as Continuous Optimization with Multiple Constraints

As large-scale language model pretraining pushes the state-of-the-art in text generation, recent work has turned to controlling attributes of the text such models generate. While modifying the pretrained models via fine-tuning remains the…

Computation and Language · Computer Science · 2021-08-05 · Sachin Kumar, Eric Malmi, Aliaksei Severyn, Yulia Tsvetkov

From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text

Generating code-switched text is a problem of growing interest, especially given the scarcity of corpora containing large volumes of real code-switched text. In this work, we adapt a state-of-the-art neural machine translation model to…

Computation and Language · Computer Science · 2021-07-15 · Ishan Tarunesh, Syamantak Kumar, Preethi Jyothi

Machine Translation Pre-training for Data-to-Text Generation -- A Case Study in Czech

While there is a large body of research studying deep learning methods for text generation from structured data, almost all of it focuses purely on English. In this paper, we study the effectiveness of machine translation based pre-training…

Computation and Language · Computer Science · 2020-04-07 · Mihir Kale, Scott Roy

Improving Tree-Structured Decoder Training for Code Generation via Mutual Learning

Code generation aims to automatically generate a piece of code given an input natural language utterance. Currently, among dominant models, it is treated as a sequence-to-tree task, where a decoder outputs a sequence of actions…

Artificial Intelligence · Computer Science · 2021-06-01 · Binbin Xie, Jinsong Su, Yubin Ge, Xiang Li, Jianwei Cui, Junfeng Yao, Bin Wang

Code Execution with Pre-trained Language Models

Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code. However, most pre-trained models for code intelligence ignore the execution trace and only rely on source code and…

Programming Languages · Computer Science · 2023-05-10 · Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan, Nan Duan

CTRL: A Conditional Transformer Language Model for Controllable Generation

Large-scale language models show promising text generation capabilities, but users cannot easily control particular aspects of the generated text. We release CTRL, a 1.63 billion-parameter conditional transformer language model, trained to…

Computation and Language · Computer Science · 2019-09-24 · Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong, Richard Socher

NatGen: Generative pre-training by "Naturalizing" source code

Pre-trained Generative Language models (e.g. PLBART, CodeT5, SPT-Code) for source code yielded strong results on several tasks in the past few years, including code generation and translation. These models have adopted varying pre-training…

Programming Languages · Computer Science · 2022-07-07 · Saikat Chakraborty, Toufique Ahmed, Yangruibo Ding, Premkumar Devanbu, Baishakhi Ray

Program Language Translation Using a Grammar-Driven Tree-to-Tree Model

The task of translating between programming languages differs from the challenge of translating natural languages in that programming languages are designed with a far more rigid set of structural and grammatical rules. Previous work has…

Machine Learning · Computer Science · 2018-07-06 · Mehdi Drissi, Olivia Watkins, Aditya Khant, Vivaswat Ojha, Pedro Sandoval, Rakia Segev, Eric Weiner, Robert Keller