Related papers: Multi-Layer Attention-Based Explainability via Tra…

Towards a Relationship-Aware Transformer for Tabular Data

Deep learning models for tabular data typically do not allow for imposing a graph of external dependencies between samples, which can be useful for accounting for relatedness in tasks such as treatment effect estimation. Graph neural…

Machine Learning · Computer Science 2025-12-09 Andrei V. Konstantinov, Valerii A. Zuev, Lev V. Utkin

Attention-based clustering

Transformers have emerged as a powerful neural network architecture capable of tackling a wide range of learning tasks. In this work, we provide a theoretical analysis of their ability to automatically extract structure from data in an…

Machine Learning · Statistics 2025-10-29 Rodrigo Maulen-Soto, Pierre Marion, Claire Boyer

Relational Attention: Generalizing Transformers for Graph-Structured Tasks

Transformers flexibly operate over sets of real-valued vectors representing task-specific entities and their attributes, where each vector might encode one word-piece token and its position in a sequence, or some piece of information that…

Machine Learning · Computer Science 2023-03-14 Cameron Diao, Ricky Loynd

Towards Mechanistic Interpretability of Graph Transformers via Attention Graphs

We introduce Attention Graphs, a new tool for mechanistic interpretability of Graph Neural Networks (GNNs) and Graph Transformers based on the mathematical equivalence between message passing in GNNs and the self-attention mechanism in…

Machine Learning · Computer Science 2025-02-26 Batu El, Deepro Choudhury, Pietro Liò, Chaitanya K. Joshi

MEGAN: Multi-Explanation Graph Attention Network

We propose a multi-explanation graph attention network (MEGAN). Unlike existing graph explainability methods, our network can produce node and edge attributional explanations along multiple channels, the number of which is independent of…

Machine Learning · Computer Science 2024-02-20 Jonas Teufel, Luca Torresi, Patrick Reiser, Pascal Friederich

Deriving Transformer Architectures as Implicit Multinomial Regression

While attention has been empirically shown to improve model performance, it lacks a rigorous mathematical justification. This short paper establishes a novel connection between attention mechanisms and multinomial regression. Specifically,…

Machine Learning · Computer Science 2025-10-28 Jonas A. Actor, Anthony Gruber, Eric C. Cyr

Transformers as Graph-to-Graph Models

We argue that Transformers are essentially graph-to-graph models, with sequences just being a special case. Attention weights are functionally equivalent to graph edges. Our Graph-to-Graph Transformer architecture makes this ability…

Computation and Language · Computer Science 2023-10-30 James Henderson, Alireza Mohammadshahi, Andrei C. Coman, Lesly Miculicich

KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning

Knowledge graph reasoning plays a vital role in various applications and has garnered considerable attention. Recently, path-based methods have achieved impressive performance. However, they may face limitations stemming from constraints in…

Artificial Intelligence · Computer Science 2024-12-18 Junnan Liu, Qianren Mao, Weifeng Jiang, Jianxin Li

GraphTARIF: Linear Graph Transformer with Augmented Rank and Improved Focus

Linear attention mechanisms have emerged as efficient alternatives to full self-attention in Graph Transformers, offering linear time complexity. However, existing linear attention models often suffer from a significant drop in…

Computer Vision and Pattern Recognition · Computer Science 2026-01-29 Zhaolin Hu, Kun Li, Hehe Fan, Yi Yang

An end-to-end attention-based approach for learning on graphs

There has been a recent surge in transformer-based architectures for learning on graphs, mainly motivated by attention as an effective learning mechanism and the desire to supersede handcrafted operators characteristic of message passing…

Machine Learning · Computer Science 2025-06-10 David Buterez, Jon Paul Janet, Dino Oglic, Pietro Lio

Attention Models in Graphs: A Survey

Graph-structured data arise naturally in many different application domains. By representing data as graphs, we can capture entities (i.e., nodes) as well as their relationships (i.e., edges) with each other. Many useful insights can be…

Artificial Intelligence · Computer Science 2018-07-24 John Boaz Lee, Ryan A. Rossi, Sungchul Kim, Nesreen K. Ahmed, Eunyee Koh

TabNet: Attentive Interpretable Tabular Learning

We propose a novel high-performance and interpretable canonical deep tabular data learning architecture, TabNet. TabNet uses sequential attention to choose which features to reason from at each decision step, enabling interpretability and…

Machine Learning · Computer Science 2020-12-10 Sercan O. Arik, Tomas Pfister

Semantic Interpretation and Validation of Graph Attention-based Explanations for GNN Models

In this work, we propose a methodology for investigating the use of semantic attention to enhance the explainability of Graph Neural Network (GNN)-based models. Graph Deep Learning (GDL) has emerged as a promising field for tasks like scene…

Machine Learning · Computer Science 2023-10-24 Efimia Panagiotaki, Daniele De Martini, Lars Kunze

Boosting gets full Attention for Relational Learning

More often than not in benchmark supervised ML, tabular data is flat, i.e. consists of a single $m \times d$ (rows, columns) file, but cases abound in the real world where observations are described by a set of tables with structural…

Machine Learning · Computer Science 2024-02-26 Mathieu Guillame-Bert, Richard Nock

A Multiscale Visualization of Attention in the Transformer Model

The Transformer is a sequence model that forgoes traditional recurrent architectures in favor of a fully attention-based approach. Besides improving performance, an advantage of using attention is that it can also help to interpret a model…

Human-Computer Interaction · Computer Science 2019-06-14 Jesse Vig

Polynomial-based Self-Attention for Table Representation learning

Structured data, which constitutes a significant portion of existing data types, has been a long-standing research topic in the field of machine learning. Various representation learning methods for tabular data have been proposed, ranging…

Artificial Intelligence · Computer Science 2023-12-19 Jayoung Kim, Yehjin Shin, Jeongwhan Choi, Hyowon Wi, Noseong Park

Multi-branch of Attention Yields Accurate Results for Tabular Data

Tabular data inherently exhibits significant feature heterogeneity, but existing transformer-based methods lack specialized mechanisms to handle this property. To bridge the gap, we propose MAYA, an encoder-decoder transformer-based…

Machine Learning · Computer Science 2025-09-23 Xuechen Li, Yupeng Li, Jian Liu, Xiaolin Jin, Xin Hu

An Attention Matrix for Every Decision: Faithfulness-based Arbitration Among Multiple Attention-Based Interpretations of Transformers in Text Classification

Transformers are widely used in natural language processing, where they consistently achieve state-of-the-art performance. This is mainly due to their attention-based architecture, which allows them to model rich linguistic relations…

Computation and Language · Computer Science 2022-11-29 Nikolaos Mylonas, Ioannis Mollas, Grigorios Tsoumakas

Transformers over Directed Acyclic Graphs

Transformer models have recently gained popularity in graph representation learning as they have the potential to learn complex relationships beyond the ones captured by regular graph neural networks. The main research question is how to…

Machine Learning · Computer Science 2023-10-31 Yuankai Luo, Veronika Thost, Lei Shi

Neural Attention: A Novel Mechanism for Enhanced Expressive Power in Transformer Models

Transformer models typically calculate attention matrices using dot products, which have limitations when capturing nonlinear relationships between embedding vectors. We propose Neural Attention, a technique that replaces dot products with…

Machine Learning · Computer Science 2025-11-10 Andrew DiGiugno, Ausif Mahmood