Related papers: Convolution, attention and structure embedding

Structured Attention Networks

Attention networks have proven to be an effective approach for embedding categorical inference within a deep neural network. However, for many tasks we may want to model richer structural dependencies without abandoning end-to-end training.…

Computation and Language · Computer Science 2017-02-17 Yoon Kim, Carl Denton, Luong Hoang, Alexander M. Rush

Hypergraph Convolution and Hypergraph Attention

Recently, graph neural networks have attracted great attention and achieved prominent performance in various research fields. Most of those algorithms have assumed pairwise relationships of objects of interest. However, in many real…

Machine Learning · Computer Science 2020-10-13 Song Bai, Feihu Zhang, Philip H. S. Torr

Evolving Attention with Residual Convolutions

The Transformer is a ubiquitous model for natural language processing and has attracted wide attention in computer vision. The attention maps are indispensable for a transformer model to encode the dependencies among input tokens. However,…

Machine Learning · Computer Science 2021-02-26 Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

Embeddings and Attention in Predictive Modeling

We explore in depth how categorical data can be processed with embeddings in the context of claim severity modeling. We develop several models that range in complexity from simple neural networks to state-of-the-art attention-based…

Applications · Statistics 2021-04-09 Kevin Kuo, Ronald Richman

Describing Multimedia Content using Attention-based Encoder–Decoder Networks

Whereas deep neural networks were first mostly used for classification tasks, they are rapidly expanding in the realm of structured output problems, where the observed target is composed of multiple random variables that have a rich joint…

Neural and Evolutionary Computing · Computer Science 2016-11-15 Kyunghyun Cho, Aaron Courville, Yoshua Bengio

Attention, please! A survey of Neural Attention Models in Deep Learning

In humans, attention is a core property of all perceptual and cognitive operations. Given our limited ability to process competing sources, attention mechanisms select, modulate, and focus on the information most relevant to behavior. For…

Machine Learning · Computer Science 2021-04-01 Alana de Santana Correia, Esther Luna Colombini

Representational Strengths and Limitations of Transformers

Attention layers, as commonly used in transformers, form the backbone of modern deep learning, yet there is no mathematical description of their benefits and deficiencies as compared with other architectures. In this work we establish both…

Machine Learning · Computer Science 2023-11-17 Clayton Sanford, Daniel Hsu, Matus Telgarsky

Agglomerative Attention

Neural networks using transformer-based architectures have recently demonstrated great power and flexibility in modeling sequences of many types. One of the core components of transformer networks is the attention layer, which allows…

Machine Learning · Computer Science 2019-07-16 Matthew Spellings

An Attention-Based Deep Net for Learning to Rank

In information retrieval, learning to rank constructs a machine-based ranking model which, given a query, sorts the search results by their degree of relevance or importance to the query. Neural networks have been successfully applied to…

Machine Learning · Computer Science 2017-12-12 Baiyang Wang, Diego Klabjan

Layer-stacked Attention for Heterogeneous Network Embedding

The heterogeneous network is a robust data abstraction that can model entities of different types interacting in various ways. Such heterogeneity brings rich semantic information but presents nontrivial challenges in aggregating the…

Machine Learning · Computer Science 2020-09-18 Nhat Tran, Jean Gao

Variational Structured Attention Networks for Deep Visual Representation Learning

Convolutional neural networks have enabled major progress in addressing pixel-level prediction tasks such as semantic segmentation, depth estimation, and surface normal prediction, benefiting from their powerful capabilities in…

Computer Vision and Pattern Recognition · Computer Science 2021-12-16 Guanglei Yang, Paolo Rota, Xavier Alameda-Pineda, Dan Xu, Mingli Ding, Elisa Ricci

Embedding Dynamic Attributed Networks by Modeling the Evolution Processes

Network embedding has recently emerged as a promising technique to embed nodes of a network into low-dimensional vectors. While fairly successful, most existing works focus on the embedding techniques for static networks. But in practice,…

Social and Information Networks · Computer Science 2020-10-28 Zenan Xu, Zijing Ou, Qinliang Su, Jianxing Yu, Xiaojun Quan, Zhenkun Lin

What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis

The Transformer architecture has inarguably revolutionized deep learning, overtaking classical architectures like multi-layer perceptrons (MLPs) and convolutional neural networks (CNNs). At its core, the attention block differs in form and…

Machine Learning · Computer Science 2025-03-18 Weronika Ormaniec, Felix Dangel, Sidak Pal Singh

What can a Single Attention Layer Learn? A Study Through the Random Features Lens

Attention layers, which map a sequence of inputs to a sequence of outputs, are core building blocks of the Transformer architecture, which has achieved significant breakthroughs in modern artificial intelligence. This paper presents a…

Machine Learning · Computer Science 2023-07-24 Hengyu Fu, Tianyu Guo, Yu Bai, Song Mei

Deep Tensor Network

The quadratic complexity of dot-product attention introduced in the Transformer remains a fundamental bottleneck impeding the progress of foundation models toward unbounded context lengths. Addressing this challenge, we introduce the Deep…

Machine Learning · Computer Science 2025-09-03 Yifan Zhang

A Non-Technical Survey on Deep Convolutional Neural Network Architectures

Artificial neural networks have recently shown great results in many disciplines and a variety of applications, including natural language understanding, speech processing, games and image data generation. One particular application in…

Computer Vision and Pattern Recognition · Computer Science 2018-03-07 Felix Altenberger, Claus Lenz

A Neural Network Model of Spatial and Feature-Based Attention

Visual attention is a mechanism closely intertwined with vision and memory. Top-down information influences visual processing through attention. We designed a neural network model inspired by aspects of human visual attention. This model…

Computer Vision and Pattern Recognition · Computer Science 2025-06-09 Ruoyang Hu, Robert A. Jacobs

Attention mechanisms in neural networks

Attention mechanisms represent a fundamental paradigm shift in neural network architectures, enabling models to selectively focus on relevant portions of input sequences through learned weighting functions. This monograph provides a…

Machine Learning · Computer Science 2026-01-08 Hasi Hays

On the Relationship between Self-Attention and Convolutional Layers

Recent trends of incorporating attention mechanisms in vision have led researchers to reconsider the supremacy of convolutional layers as a primary building block. Beyond helping CNNs to handle long-range dependencies, Ramachandran et al.…

Machine Learning · Computer Science 2020-01-13 Jean-Baptiste Cordonnier, Andreas Loukas, Martin Jaggi

A General Survey on Attention Mechanisms in Deep Learning

Attention is an important mechanism that can be employed for a variety of deep learning models across many different domains and tasks. This survey provides an overview of the most important attention mechanisms proposed in the literature.…

Machine Learning · Computer Science 2022-03-29 Gianni Brauwers, Flavius Frasincar