Related papers: Learning Graph Quantized Tokenizers

Pure Transformers are Powerful Graph Learners

We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice. Given a graph, we simply treat all nodes and edges as independent tokens, augment them with…

Machine Learning · Computer Science · 2022-10-25 · Jinwoo Kim, Tien Dat Nguyen, Seonwoo Min, Sungjun Cho, Moontae Lee, Honglak Lee, Seunghoon Hong

Graph Transformers for Large Graphs

Transformers have recently emerged as powerful neural networks for graph learning, showcasing state-of-the-art performance on several graph property prediction tasks. However, these results have been limited to small-scale graphs, where the…

Machine Learning · Computer Science · 2023-12-19 · Vijay Prakash Dwivedi, Yozen Liu, Anh Tuan Luu, Xavier Bresson, Neil Shah, Tong Zhao

Graph Transformer Networks

Graph neural networks (GNNs) have been widely used in representation learning on graphs and achieved state-of-the-art performance in tasks such as node classification and link prediction. However, most existing GNNs are designed to learn…

Machine Learning · Computer Science · 2020-02-06 · Seongjun Yun, Minbyul Jeong, Raehyun Kim, Jaewoo Kang, Hyunwoo J. Kim

What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding

Graph Transformers, which incorporate self-attention and positional encoding, have recently emerged as a powerful architecture for various graph learning tasks. Despite their impressive performance, the complex non-convex interactions…

Machine Learning · Computer Science · 2024-06-05 · Hongkang Li, Meng Wang, Tengfei Ma, Sijia Liu, Zaixi Zhang, Pin-Yu Chen

Graph Transformer Networks: Learning Meta-path Graphs to Improve GNNs

Graph Neural Networks (GNNs) have been widely applied to various fields due to their powerful representations of graph-structured data. Despite the success of GNNs, most existing GNNs are designed to learn node representations on the fixed…

Machine Learning · Computer Science · 2021-06-14 · Seongjun Yun, Minbyul Jeong, Sungdong Yoo, Seunghun Lee, Sean S. Yi, Raehyun Kim, Jaewoo Kang, Hyunwoo J. Kim

Rethinking Tokenized Graph Transformers for Node Classification

Node tokenized graph Transformers (GTs) have shown promising performance in node classification. The generation of token sequences is the key module in existing tokenized GTs which transforms the input graph into token sequences,…

Machine Learning · Computer Science · 2025-02-13 · Jinsong Chen, Chenyang Li, GaiChao Li, John E. Hopcroft, Kun He

Transformers are Graph Neural Networks

We establish connections between the Transformer architecture, originally introduced for natural language processing, and Graph Neural Networks (GNNs) for representation learning on graphs. We show how Transformers can be viewed as message…

Machine Learning · Computer Science · 2025-06-30 · Chaitanya K. Joshi

GraphFM: A generalist graph transformer that learns transferable representations across diverse domains

Graph neural networks (GNNs) are often trained on individual datasets, requiring specialized models and significant hyperparameter tuning due to the unique structures and features of each dataset. This approach limits the scalability and…

Machine Learning · Computer Science · 2026-02-17 · Divyansha Lachi, Mehdi Azabou, Vinam Arora, Eva Dyer

Graph Tokenization for Bridging Graphs and Transformers

The success of large pretrained Transformers is closely tied to tokenizers, which convert raw input into discrete symbols. Extending these models to graph-structured data remains a significant challenge. In this work, we introduce a graph…

Machine Learning · Computer Science · 2026-03-13 · Zeyuan Guo, Enmao Diao, Cheng Yang, Chuan Shi

A Survey of Graph Transformers: Architectures, Theories and Applications

Graph Transformers (GTs) have demonstrated a strong capability in modeling graph structures by addressing the intrinsic limitations of graph neural networks (GNNs), such as over-smoothing and over-squashing. Recent studies have proposed…

Machine Learning · Computer Science · 2025-02-28 · Chaohao Yuan, Kangfei Zhao, Ercan Engin Kuruoglu, Liang Wang, Tingyang Xu, Wenbing Huang, Deli Zhao, Hong Cheng, Yu Rong

TorchGT: A Holistic System for Large-scale Graph Transformer Training

Graph Transformer is a new architecture that surpasses GNNs in graph learning. While there emerge inspiring algorithm advancements, their practical adoption is still limited, particularly on real-world graphs involving up to millions of…

Distributed, Parallel, and Cluster Computing · Computer Science · 2024-07-22 · Meng Zhang, Jie Sun, Qinghao Hu, Peng Sun, Zeke Wang, Yonggang Wen, Tianwei Zhang

Enhanced Graph Transformer with Serialized Graph Tokens

Transformers have demonstrated success in graph learning, particularly for node-level tasks. However, existing methods encounter an information bottleneck when generating graph-level representations. The prevalent single token paradigm…

Machine Learning · Computer Science · 2026-02-11 · Ruixiang Wang, Yuyang Hong, Shiming Xiang, Chunhong Pan

GQWformer: A Quantum-based Transformer for Graph Representation Learning

Graph Transformers (GTs) have demonstrated significant advantages in graph representation learning through their global attention mechanisms. However, the self-attention mechanism in GTs tends to neglect the inductive biases inherent in…

Machine Learning · Computer Science · 2024-12-04 · Lei Yu, Hongyang Chen, Jingsong Lv, Linyao Yang

Generalizing Graph Transformers Across Diverse Graphs and Tasks via Pre-training

Graph pre-training has been concentrated on graph-level tasks involving small graphs (e.g., molecular graphs) or learning node representations on a fixed graph. Extending graph pre-trained models to web-scale graphs with billions of nodes…

Machine Learning · Computer Science · 2025-11-07 · Yufei He, Zhenyu Hou, Yukuo Cen, Jun Hu, Feng He, Xu Cheng, Jie Tang, Bryan Hooi

Are More Layers Beneficial to Graph Transformers?

Despite that going deep has proven successful in many neural architectures, the existing graph transformers are relatively shallow. In this work, we explore whether more layers are beneficial to graph transformers, and find that current…

Machine Learning · Computer Science · 2023-03-02 · Haiteng Zhao, Shuming Ma, Dongdong Zhang, Zhi-Hong Deng, Furu Wei

A Hierarchical Quantized Tokenization Framework for Task-Adaptive Graph Representation Learning

Foundation models in language and vision benefit from a unified discrete token interface that converts raw inputs into sequences for scalable pre-training and inference. For graphs, an effective tokenizer should yield reusable discrete…

Information Retrieval · Computer Science · 2026-05-28 · Yang Xiang, Li Fan, Chenke Yin, Lutz Oettershagen, Chengtao Ji

Relational Graph Transformer

Relational Deep Learning (RDL) is a promising approach for building state-of-the-art predictive models on multi-table relational data by representing it as a heterogeneous temporal graph. However, commonly used Graph Neural Network models…

Machine Learning · Computer Science · 2026-02-06 · Vijay Prakash Dwivedi, Sri Jaladi, Yangyi Shen, Federico López, Charilaos I. Kanatsoulis, Rishi Puri, Matthias Fey, Jure Leskovec

Plain Transformers Can be Powerful Graph Learners

Transformers have attained outstanding performance across various modalities, owing to their simple but powerful scaled-dot-product (SDP) attention mechanisms. Researchers have attempted to migrate Transformers to graph learning, but most…

Machine Learning · Computer Science · 2026-01-30 · Liheng Ma, Soumyasundar Pal, Yingxue Zhang, Philip H. S. Torr, Mark Coates

Invariant Graph Transformer for Out-of-Distribution Generalization

Graph Transformers (GTs) have demonstrated great effectiveness across various graph analytical tasks. However, the existing GTs focus on training and testing graph data originated from the same distribution, but fail to generalize under…

Machine Learning · Computer Science · 2026-03-16 · Tianyin Liao, Ziwei Zhang, Yufei Sun, Chunyu Hu, Jianxin Li

PatchGT: Transformer over Non-trainable Clusters for Learning Graph Representations

Recently the Transformer structure has shown good performances in graph learning tasks. However, these Transformer models directly work on graph nodes and may have difficulties learning high-level information. Inspired by the vision…

Machine Learning · Computer Science · 2023-04-11 · Han Gao, Xu Han, Jiaoyang Huang, Jian-Xun Wang, Li-Ping Liu