Related papers: Empirical Study of Transformers for Source Code

A Comparative Study on Code Generation with Transformers

In an era of widespread influence of Natural Language Processing (NLP), there have been multiple research efforts to supplant traditional manual coding techniques with automated systems capable of generating solutions autonomously. With…

Computation and Language · Computer Science 2024-12-10 Namrata Das , Rakshya Panta , Neelam Karki , Ruchi Manandhar , Dinesh Baniya Kshatri

A Transformer-based Approach for Source Code Summarization

Generating a readable summary that describes the functionality of a program is known as source code summarization. In this task, learning code representation by modeling the pairwise relationship between code tokens to capture their…

Software Engineering · Computer Science 2020-05-05 Wasi Uddin Ahmad , Saikat Chakraborty , Baishakhi Ray , Kai-Wei Chang

On the Ability and Limitations of Transformers to Recognize Formal Languages

Transformers have supplanted recurrent models in a large number of NLP tasks. However, the differences in their abilities to model different syntactic properties remain largely unknown. Past works suggest that LSTMs generalize very well on…

Computation and Language · Computer Science 2020-10-09 Satwik Bhattamishra , Kabir Ahuja , Navin Goyal

Tree-Transformer: A Transformer-Based Method for Correction of Tree-Structured Data

Many common sequential data sources, such as source code and natural language, have a natural tree-structured representation. These trees can be generated by fitting a sequence to a grammar, yielding a hierarchical ordering of the tokens in…

Machine Learning · Computer Science 2019-08-02 Jacob Harer , Chris Reale , Peter Chin

Large Language Models (LLMs) for Source Code Analysis: applications, models and datasets

Large language models (LLMs) and transformer-based architectures are increasingly utilized for source code analysis. As software systems grow in complexity, integrating LLMs into code analysis workflows becomes essential for enhancing…

Software Engineering · Computer Science 2025-03-25 Hamed Jelodar , Mohammad Meymani , Roozbeh Razavi-Far

Promoting the Knowledge of Source Syntax in Transformer NMT Is Not Needed

The utility of linguistic annotation in neural machine translation seemed to had been established in past papers. The experiments were however limited to recurrent sequence-to-sequence architectures and relatively small data settings. We…

Computation and Language · Computer Science 2019-10-25 Thuong-Hai Pham , Dominik Macháček , Ondřej Bojar

Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees

Pre-trained language models like BERT achieve superior performances in various NLP tasks without explicit consideration of syntactic information. Meanwhile, syntactic information has been proved to be crucial for the success of NLP…

Computation and Language · Computer Science 2021-03-09 Jiangang Bai , Yujing Wang , Yiren Chen , Yaming Yang , Jing Bai , Jing Yu , Yunhai Tong

Understanding Code Semantics: An Evaluation of Transformer Models in Summarization

This paper delves into the intricacies of code summarization using advanced transformer-based language models. Through empirical studies, we evaluate the efficacy of code summarization by altering function and variable names to explore…

Machine Learning · Computer Science 2023-10-30 Debanjan Mondal , Abhilasha Lodha , Ankita Sahoo , Beena Kumari

SmartPaste: Learning to Adapt Source Code

Deep Neural Networks have been shown to succeed at a range of natural language tasks such as machine translation and text summarization. While tasks on source code (ie, formal languages) have been considered recently, most work in this area…

Machine Learning · Computer Science 2017-05-23 Miltiadis Allamanis , Marc Brockschmidt

Code Structure Guided Transformer for Source Code Summarization

Code summaries help developers comprehend programs and reduce their time to infer the program functionalities during software maintenance. Recent efforts resort to deep learning techniques such as sequence-to-sequence models for generating…

Computation and Language · Computer Science 2023-02-08 Shuzheng Gao , Cuiyun Gao , Yulan He , Jichuan Zeng , Lun Yiu Nie , Xin Xia , Michael R. Lyu

On the Effectiveness of Transfer Learning for Code Search

The Transformer architecture and transfer learning have marked a quantum leap in natural language processing, improving the state of the art across a range of text-based tasks. This paper examines how these advancements can be applied to…

Software Engineering · Computer Science 2022-08-29 Pasquale Salza , Christoph Schwizer , Jian Gu , Harald C. Gall

TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing

Program comprehension is a fundamental task in software development and maintenance processes. Software developers often need to understand a large amount of existing code before they can develop new features or fix bugs in existing…

Machine Learning · Computer Science 2019-10-29 Vinoj Jayasundara , Nghi Duy Quoc Bui , Lingxiao Jiang , David Lo

Language Modelling for Source Code with Transformer-XL

It has been found that software, like natural language texts, exhibits "naturalness", which can be captured by statistical language models. In recent years, neural language models have been proposed to represent the naturalness of software…

Machine Learning · Computer Science 2020-08-03 Thomas Dowdell , Hongyu Zhang

Transformer with Tree-order Encoding for Neural Program Generation

While a considerable amount of semantic parsing approaches have employed RNN architectures for code generation tasks, there have been only few attempts to investigate the applicability of Transformers for this task. Including hierarchical…

Computation and Language · Computer Science 2022-06-28 Klaudia-Doris Thellmann , Bernhard Stadler , Ricardo Usbeck , Jens Lehmann

What does Transformer learn about source code?

In the field of source code processing, the transformer-based representation models have shown great powerfulness and have achieved state-of-the-art (SOTA) performance in many tasks. Although the transformer models process the sequential…

Software Engineering · Computer Science 2022-07-19 Kechi Zhang , Ge Li , Zhi Jin

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has…

Machine Learning · Computer Science 2023-09-20 Colin Raffel , Noam Shazeer , Adam Roberts , Katherine Lee , Sharan Narang , Michael Matena , Yanqi Zhou , Wei Li , Peter J. Liu

A Survey on Natural Language Processing for Programming

Natural language processing for programming aims to use NLP techniques to assist programming. It is increasingly prevalent for its effectiveness in improving productivity. Distinct from natural language, a programming language is highly…

Computation and Language · Computer Science 2023-08-08 Qingfu Zhu , Xianzhen Luo , Fang Liu , Cuiyun Gao , Wanxiang Che

Which Features are Learned by CodeBert: An Empirical Study of the BERT-based Source Code Representation Learning

The Bidirectional Encoder Representations from Transformers (BERT) were proposed in the natural language process (NLP) and shows promising results. Recently researchers applied the BERT to source-code representation learning and reported…

Computation and Language · Computer Science 2023-08-14 Lan Zhang , Chen Cao , Zhilong Wang , Peng Liu

On the Embeddings of Variables in Recurrent Neural Networks for Source Code

Source code processing heavily relies on the methods widely used in natural language processing (NLP), but involves specifics that need to be taken into account to achieve higher quality. An example of this specificity is that the semantics…

Software Engineering · Computer Science 2021-04-28 Nadezhda Chirkova

Source Code Summarization with Structural Relative Position Guided Transformer

Source code summarization aims at generating concise and clear natural language descriptions for programming languages. Well-written code summaries are beneficial for programmers to participate in the software development and maintenance…

Computation and Language · Computer Science 2022-02-15 Zi Gong , Cuiyun Gao , Yasheng Wang , Wenchao Gu , Yun Peng , Zenglin Xu