Related papers: NatGen: Generative pre-training by "Naturalizing" …

200 papers

Pretrained language models have been shown to be effective in many software-related generation tasks; however, they are not well-suited for editing tasks as they are not designed to reason about edits. To address this, we propose a novel…

Software Engineering · Computer Science 2022-09-15 Jiyang Zhang , Sheena Panthaplackel , Pengyu Nie , Junyi Jessy Li , Milos Gligoric

Pre-trained models for Natural Languages (NL) like BERT and GPT have been recently shown to transfer well to Programming Languages (PL) and largely benefit a broad set of code-related tasks. Despite their success, most current methods…

Computation and Language · Computer Science 2021-09-03 Yue Wang , Weishi Wang , Shafiq Joty , Steven C. H. Hoi

We present CodeBERT, a bimodal pre-trained model for programming language (PL) and natural language (NL). CodeBERT learns general-purpose representations that support downstream NL-PL applications such as natural language code search, code…

Computation and Language · Computer Science 2020-09-21 Zhangyin Feng , Daya Guo , Duyu Tang , Nan Duan , Xiaocheng Feng , Ming Gong , Linjun Shou , Bing Qin , Ting Liu , Daxin Jiang , Ming Zhou

With the great success of pre-trained models, the pretrain-then-finetune paradigm has been widely adopted on downstream tasks for source code understanding. However, compared to costly training a large-scale model from scratch, how to…

Software Engineering · Computer Science 2022-03-16 Deze Wang , Zhouyang Jia , Shanshan Li , Yue Yu , Yun Xiong , Wei Dong , Xiangke Liao

Code generation aims to automatically generate code snippets of specific programming language according to natural language descriptions. The continuous advancements in deep learning, particularly pre-trained models, have empowered the code…

Software Engineering · Computer Science 2025-01-24 Zezhou Yang , Sirong Chen , Cuiyun Gao , Zhenhao Li , Xing Hu , Kui Liu , Xin Xia

Much of software-engineering research relies on the naturalness of code, the fact that code, in small code snippets, is repetitive and can be predicted using statistical language models like n-gram. Although powerful, training such models…

Software Engineering · Computer Science 2022-08-15 Ahmed Khanfir , Matthieu Jimenez , Mike Papadakis , Yves Le Traon

Software is constantly changing, requiring developers to perform several derived tasks in a timely manner, such as writing a description for the intention of the code change, or identifying the defect-prone code changes. Considering that…

Software Engineering · Computer Science 2023-05-19 Bo Lin , Shangwen Wang , Zhongxin Liu , Yepang Liu , Xin Xia , Xiaoguang Mao

Recent advancements in natural language processing \cite{gpt2} \cite{BERT} have led to near-human performance in multiple natural language tasks. In this paper, we seek to understand whether similar techniques can be applied to a highly…

Computation and Language · Computer Science 2021-02-23 Luis Perez , Lizi Ottens , Sudharshan Viswanathan

Recent years have seen the successful application of large pre-trained models to code representation learning, resulting in substantial improvements on many code-related downstream tasks. But there are issues surrounding their application…

Software Engineering · Computer Science 2022-05-26 Changan Niu , Chuanyi Li , Vincent Ng , Jidong Ge , Liguo Huang , Bin Luo

Large pre-trained language models have recently been expanded and applied to programming language tasks with great success, often through further pre-training of a strictly-natural language model--where training sequences typically contain…

Computation and Language · Computer Science 2024-02-13 Fenia Christopoulou , Guchun Zhang , Gerasimos Lampouras

Recent advances in self-supervised learning have dramatically improved the state of the art on a wide variety of tasks. However, research in language model pre-training has mostly focused on natural languages, and it is unclear whether…

Computation and Language · Computer Science 2021-10-29 Baptiste Roziere , Marie-Anne Lachaux , Marc Szafraniec , Guillaume Lample

Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence. However, existing code LLMs have two main limitations in terms of architecture and pretraining tasks. First, they often adopt…

Computation and Language · Computer Science 2023-05-23 Yue Wang , Hung Le , Akhilesh Deepak Gotmare , Nghi D. Q. Bui , Junnan Li , Steven C. H. Hoi

Recently, many pre-trained language models for source code have been proposed to model the context of code and serve as a basis for downstream code intelligence tasks such as code completion, code search, and code summarization. These…

Software Engineering · Computer Science 2022-02-15 Yao Wan , Wei Zhao , Hongyu Zhang , Yulei Sui , Guandong Xu , Hai Jin

Natural language to code generation is an important application area of LLMs and has received wide attention from the community. The majority of relevant studies have exclusively concentrated on increasing the quantity and functional…

Machine Learning · Computer Science 2023-11-28 Naman Jain , Tianjun Zhang , Wei-Lin Chiang , Joseph E. Gonzalez , Koushik Sen , Ion Stoica

Pre-trained models of source code have recently been successfully applied to a wide variety of Software Engineering tasks; they have also seen some practical adoption in practice, e.g. for code completion. Yet, we still know very little…

Software Engineering · Computer Science 2023-12-11 Anjan Karmakar , Romain Robbes

Programming languages are emerging as a challenging and interesting domain for machine learning. A core task, which has received significant attention in recent years, is building generative models of source code. However, to our knowledge,…

Machine Learning · Computer Science 2019-04-08 Rui Zhao , David Bieber , Kevin Swersky , Daniel Tarlow

Software optimization refines programs for resource efficiency while preserving functionality. Traditionally, it is a process done by developers and compilers. This paper introduces a third option, automated optimization at the source code…

Software Engineering · Computer Science 2025-02-04 Zimin Chen , Sen Fang , Martin Monperrus

Deep learning models are widely used for solving challenging code processing tasks, such as code generation or code summarization. Traditionally, a specific model architecture was carefully built to solve a particular code processing task.…

Software Engineering · Computer Science 2022-11-18 Sergey Troshin , Nadezhda Chirkova

Back-translation is widely known for its effectiveness in neural machine translation when there is little to no parallel data. In this approach, a source-to-target model is coupled with a target-to-source model trained in parallel. The…

Computation and Language · Computer Science 2023-02-14 Wasi Uddin Ahmad , Saikat Chakraborty , Baishakhi Ray , Kai-Wei Chang

Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and…

Software Engineering · Computer Science 2024-01-09 Yue Liu , Chakkrit Tantithamthavorn , Yonghui Liu , Li Li