Related papers: Attention improves concentration when learning nod…

An Attention-Based Deep Net for Learning to Rank

In information retrieval, learning to rank constructs a machine-based ranking model which, given a query, sorts the search results by their degree of relevance or importance to the query. Neural networks have been successfully applied to…

Machine Learning · Computer Science 2017-12-12 Baiyang Wang, Diego Klabjan

Attention with Trained Embeddings Provably Selects Important Tokens

Token embeddings play a crucial role in language modeling but, despite this practical relevance, their theoretical understanding remains limited. Our paper addresses the gap by characterizing the structure of embeddings obtained via…

Machine Learning · Computer Science 2025-06-26 Diyuan Wu, Aleksandr Shevchenko, Samet Oymak, Marco Mondelli

FAN: Focused Attention Networks

Attention networks show promise for both vision and language tasks, by emphasizing relationships between constituent elements through weighting functions. Such elements could be regions in an image output by a region proposal network, or…

Machine Learning · Computer Science 2019-10-07 Chu Wang, Babak Samari, Vladimir Kim, Siddhartha Chaudhuri, Kaleem Siddiqi

Alignment Attention by Matching Key and Query Distributions

The neural attention mechanism has been incorporated into deep neural networks to achieve state-of-the-art performance in various domains. Most such models use multi-head self-attention which is appealing for the ability to attend to…

Machine Learning · Computer Science 2021-10-26 Shujian Zhang, Xinjie Fan, Huangjie Zheng, Korawat Tanwisuth, Mingyuan Zhou

Self-attention Presents Low-dimensional Knowledge Graph Embeddings for Link Prediction

A few models have tried to tackle the link prediction problem, also known as knowledge graph completion, by embedding knowledge graphs in comparably lower dimensions. However, the state-of-the-art results are attained at the cost of…

Machine Learning · Computer Science 2022-11-29 Peyman Baghershahi, Reshad Hosseini, Hadi Moradi

Attention-based Ensemble for Deep Metric Learning

Deep metric learning aims to learn an embedding function, modeled as a deep neural network. This embedding function usually places semantically similar images close together and dissimilar images far from each other in the learned embedding space.…

Computer Vision and Pattern Recognition · Computer Science 2018-09-03 Wonsik Kim, Bhavya Goyal, Kunal Chawla, Jungmin Lee, Keunjoo Kwon

Knowledge Graph Embedding using Graph Convolutional Networks with Relation-Aware Attention

Knowledge graph embedding methods learn embeddings of entities and relations in a low dimensional space which can be used for various downstream machine learning tasks such as link prediction and entity matching. Various graph convolutional…

Machine Learning · Computer Science 2021-02-16 Nasrullah Sheikh, Xiao Qin, Berthold Reinwald, Christoph Miksovic, Thomas Gschwind, Paolo Scotton

Research on a hybrid LSTM-CNN-Attention model for text-based web content classification

This study presents a hybrid deep learning architecture that integrates LSTM, CNN, and an Attention mechanism to enhance the classification of web content based on text. Pretrained GloVe embeddings are used to represent words as dense…

Computation and Language · Computer Science 2025-12-29 Mykola Kuz, Ihor Lazarovych, Mykola Kozlenko, Mykola Pikuliak, Andrii Kvasniuk

Revisiting Attention Weights as Explanations from an Information Theoretic Perspective

Attention mechanisms have recently demonstrated impressive performance on a range of NLP tasks, and attention scores are often used as a proxy for model explainability. However, there is a debate on whether attention weights can, in fact,…

Computation and Language · Computer Science 2022-11-16 Bingyang Wen, K. P. Subbalakshmi, Fan Yang

Watch Your Step: Learning Node Embeddings via Graph Attention

Graph embedding methods represent nodes in a continuous vector space, preserving information from the graph (e.g. by sampling random walks). There are many hyper-parameters to these methods (such as random walk length) which have to be…

Machine Learning · Computer Science 2018-12-27 Sami Abu-El-Haija, Bryan Perozzi, Rami Al-Rfou, Alex Alemi

Attention-based Multimodal Feature Representation Model for Micro-video Recommendation

In recommender systems, models mostly use a combination of embedding layers and multilayer feedforward neural networks. The high-dimensional sparse original features are downscaled in the embedding layer and then fed into the fully…

Information Retrieval · Computer Science 2022-05-19 Mohan Hasama, Jing Li

Neural Attention: A Novel Mechanism for Enhanced Expressive Power in Transformer Models

Transformer models typically calculate attention matrices using dot products, which have limitations when capturing nonlinear relationships between embedding vectors. We propose Neural Attention, a technique that replaces dot products with…

Machine Learning · Computer Science 2025-11-10 Andrew DiGiugno, Ausif Mahmood

aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model

As an alternative to question answering methods based on feature engineering, deep learning approaches such as convolutional neural networks (CNNs) and Long Short-Term Memory Models (LSTMs) have recently been proposed for semantic matching…

Information Retrieval · Computer Science 2019-06-04 Liu Yang, Qingyao Ai, Jiafeng Guo, W. Bruce Croft

Link Prediction with Attention Applied on Multiple Knowledge Graph Embedding Models

Predicting missing links between entities in a knowledge graph is a fundamental task to deal with the incompleteness of data on the Web. Knowledge graph embeddings map nodes into a vector space to predict new links, scoring them according…

Artificial Intelligence · Computer Science 2023-02-14 Cosimo Gregucci, Mojtaba Nayyeri, Daniel Hernández, Steffen Staab

Adversarial Context Aware Network Embeddings for Textual Networks

Representation learning of textual networks poses a significant challenge as it involves capturing amalgamated information from two modalities: (i) underlying network structure, and (ii) node textual attributes. For this, most existing…

Computation and Language · Computer Science 2020-11-06 Tony Gracious, Ambedkar Dukkipati

Joint Embedding of Words and Labels for Text Classification

Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences. We propose to view text classification as a label-word joint embedding…

Computation and Language · Computer Science 2018-05-14 Guoyin Wang, Chunyuan Li, Wenlin Wang, Yizhe Zhang, Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin

GraphTARIF: Linear Graph Transformer with Augmented Rank and Improved Focus

Linear attention mechanisms have emerged as efficient alternatives to full self-attention in Graph Transformers, offering linear time complexity. However, existing linear attention models often suffer from a significant drop in…

Computer Vision and Pattern Recognition · Computer Science 2026-01-29 Zhaolin Hu, Kun Li, Hehe Fan, Yi Yang

Towards understanding how attention mechanism works in deep learning

Attention mechanism has been extensively integrated within mainstream neural network architectures, such as Transformers and graph attention networks. Yet, its underlying working principles remain somewhat elusive. What is its essence? Are…

Machine Learning · Computer Science 2024-12-25 Tianyu Ruan, Shihua Zhang

Improving Large-Scale Recommender Systems with Auxiliary Learning

Training large-scale recommendation models under a single global objective implicitly assumes homogeneity across user populations. However, real-world data are composites of heterogeneous cohorts with distinct conditional distributions. As…

Machine Learning · Computer Science 2026-04-23 Mertcan Cokbas, Ziteng Liu, Zeyi Tao, Elder Veliz, Qin Huang, Ellie Wen, Huayu Li, Qiang Jin, Murat Duman, Benjamin Au, Guy Lebanon, Sagar Chordia, Chengkai Zhang

An Attention Mechanism for Answer Selection Using a Combined Global and Local View

We propose a new attention mechanism for neural based question answering, which depends on varying granularities of the input. Previous work focused on augmenting recurrent neural networks with simple attention mechanisms which are a…

Computation and Language · Computer Science 2017-09-21 Yoram Bachrach, Andrej Zukov-Gregoric, Sam Coope, Ed Tovell, Bogdan Maksak, Jose Rodriguez, Conan McMurtie