Related papers: What does Transformer learn about source code?

A Transformer-based Approach for Source Code Summarization

Generating a readable summary that describes the functionality of a program is known as source code summarization. In this task, learning code representation by modeling the pairwise relationship between code tokens to capture their…

Software Engineering · Computer Science 2020-05-05 Wasi Uddin Ahmad , Saikat Chakraborty , Baishakhi Ray , Kai-Wei Chang

GN-Transformer: Fusing Sequence and Graph Representation for Improved Code Summarization

As opposed to natural languages, source code understanding is influenced by grammatical relationships between tokens regardless of their identifier name. Graph representations of source code such as Abstract Syntax Tree (AST) can capture…

Machine Learning · Computer Science 2021-11-18 Junyan Cheng , Iordanis Fostiropoulos , Barry Boehm

What Do They Capture? -- A Structural Analysis of Pre-Trained Language Models for Source Code

Recently, many pre-trained language models for source code have been proposed to model the context of code and serve as a basis for downstream code intelligence tasks such as code completion, code search, and code summarization. These…

Software Engineering · Computer Science 2022-02-15 Yao Wan , Wei Zhao , Hongyu Zhang , Yulei Sui , Guandong Xu , Hai Jin

Code Structure Guided Transformer for Source Code Summarization

Code summaries help developers comprehend programs and reduce their time to infer the program functionalities during software maintenance. Recent efforts resort to deep learning techniques such as sequence-to-sequence models for generating…

Computation and Language · Computer Science 2023-02-08 Shuzheng Gao , Cuiyun Gao , Yulan He , Jichuan Zeng , Lun Yiu Nie , Xin Xia , Michael R. Lyu

Empirical Study of Transformers for Source Code

Initially developed for natural language processing (NLP), Transformers are now widely used for source code processing, due to the format similarity between source code and text. In contrast to natural language, source code is strictly…

Machine Learning · Computer Science 2021-06-25 Nadezhda Chirkova , Sergey Troshin

Structure-Aware Transformer for Graph Representation Learning

The Transformer architecture has gained growing attention in graph representation learning recently, as it naturally overcomes several limitations of graph neural networks (GNNs) by avoiding their strict structural inductive biases and…

Machine Learning · Statistics 2022-06-14 Dexiong Chen , Leslie O'Bray , Karsten Borgwardt

Transformers as Graph-to-Graph Models

We argue that Transformers are essentially graph-to-graph models, with sequences just being a special case. Attention weights are functionally equivalent to graph edges. Our Graph-to-Graph Transformer architecture makes this ability…

Computation and Language · Computer Science 2023-10-30 James Henderson , Alireza Mohammadshahi , Andrei C. Coman , Lesly Miculicich

Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks

Transformer networks have seen great success in natural language processing and machine vision, where task objectives such as next word prediction and image classification benefit from nuanced context sensitivity across high-dimensional…

Machine Learning · Computer Science 2022-12-13 Yuxuan Li , James L. McClelland

Scene Graph Parsing by Attention Graph

Scene graph representations, which form a graph of visual object nodes together with their attributes and relations, have proved useful across a variety of vision and language applications. Recent work in the area has used Natural Language…

Computation and Language · Computer Science 2019-09-16 Martin Andrews , Yew Ken Chia , Sam Witteveen

Attention Please: What Transformer Models Really Learn for Process Prediction

Predictive process monitoring aims to support the execution of a process during runtime with various predictions about the further evolution of a process instance. In the last years a plethora of deep learning architectures have been…

Machine Learning · Computer Science 2024-08-15 Martin Käppel , Lars Ackermann , Stefan Jablonski , Simon Härtl

SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations

Learning representations on large-sized graphs is a long-standing challenge due to the inter-dependence nature involved in massive data points. Transformers, as an emerging class of foundation encoders for graph-structured data, have shown…

Machine Learning · Computer Science 2024-08-19 Qitian Wu , Wentao Zhao , Chenxiao Yang , Hengrui Zhang , Fan Nie , Haitian Jiang , Yatao Bian , Junchi Yan

Generic Interpretation Approach for Transformer Models Incorporating Heterogenous Attention Structures

Transformer has significantly propelled the development of artificial intelligence, and certainly the development of agents as well. We categorize attention structures of Transformer into two types based on the source of the input…

Computer Vision and Pattern Recognition · Computer Science 2026-05-28 Yongjin Cui , Xiaohui Fan , Huajun Chen

Graph Convolutions Enrich the Self-Attention in Transformers!

Transformers, renowned for their self-attention mechanism, have achieved state-of-the-art performance across various tasks in natural language processing, computer vision, time-series modeling, etc. However, one of the challenges with deep…

Machine Learning · Computer Science 2024-11-04 Jeongwhan Choi , Hyowon Wi , Jayoung Kim , Yehjin Shin , Kookjin Lee , Nathaniel Trask , Noseong Park

Source Dependency-Aware Transformer with Supervised Self-Attention

Recently, Transformer has achieved the state-of-the-art performance on many machine translation tasks. However, without syntax knowledge explicitly considered in the encoder, incorrect context information that violates the syntax structure…

Computation and Language · Computer Science 2019-09-06 Chengyi Wang , Shuangzhi Wu , Shujie Liu

Source Code Summarization with Structural Relative Position Guided Transformer

Source code summarization aims at generating concise and clear natural language descriptions for programming languages. Well-written code summaries are beneficial for programmers to participate in the software development and maintenance…

Computation and Language · Computer Science 2022-02-15 Zi Gong , Cuiyun Gao , Yasheng Wang , Wenchao Gu , Yun Peng , Zenglin Xu

How Transformers Learn Causal Structure with Gradient Descent

The incredible success of transformers on sequence modeling tasks can be largely attributed to the self-attention mechanism, which allows information to be transferred between different parts of a sequence. Self-attention allows…

Machine Learning · Computer Science 2024-08-14 Eshaan Nichani , Alex Damian , Jason D. Lee

A Multiscale Visualization of Attention in the Transformer Model

The Transformer is a sequence model that forgoes traditional recurrent architectures in favor of a fully attention-based approach. Besides improving performance, an advantage of using attention is that it can also help to interpret a model…

Human-Computer Interaction · Computer Science 2019-06-14 Jesse Vig

What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding

Graph Transformers, which incorporate self-attention and positional encoding, have recently emerged as a powerful architecture for various graph learning tasks. Despite their impressive performance, the complex non-convex interactions…

Machine Learning · Computer Science 2024-06-05 Hongkang Li , Meng Wang , Tengfei Ma , Sijia Liu , Zaixi Zhang , Pin-Yu Chen

Transformers are efficient hierarchical chemical graph learners

Transformers, adapted from natural language processing, are emerging as a leading approach for graph representation learning. Contemporary graph transformers often treat nodes or edges as separate tokens. This approach leads to…

Machine Learning · Computer Science 2023-10-04 Zihan Pengmei , Zimu Li , Chih-chan Tien , Risi Kondor , Aaron R. Dinner

Learning Program Representations with a Tree-Structured Transformer

Learning vector representations for programs is a critical step in applying deep learning techniques for program understanding tasks. Various neural network models are proposed to learn from tree-structured program representations, e.g.,…

Software Engineering · Computer Science 2023-01-10 Wenhan Wang , Kechi Zhang , Ge Li , Shangqing Liu , Anran Li , Zhi Jin , Yang Liu