Related papers: Attention Boosted Sequential Inference Model

Supervising Model Attention with Human Explanations for Robust Natural Language Inference

Natural Language Inference (NLI) models are known to learn from biases and artefacts within their training data, impacting how well they generalise to other unseen datasets. Existing de-biasing approaches focus on preventing the models from…

Computation and Language · Computer Science 2022-05-03 Joe Stacey, Yonatan Belinkov, Marek Rei

Bayes optimal learning of attention-indexed models

We introduce the attention-indexed model (AIM), a theoretical framework for analyzing learning in deep attention layers. Inspired by multi-index models, AIM captures how token-level outputs emerge from layered bilinear interactions over…

Machine Learning · Computer Science 2026-02-03 Fabrizio Boncoraglio, Emanuele Troiani, Vittorio Erba, Lenka Zdeborová

Enhanced LSTM for Natural Language Inference

Reasoning and inference are central to human and artificial intelligence. Modeling inference in human language is very challenging. With the availability of large annotated data (Bowman et al., 2015), it has recently become feasible to…

Computation and Language · Computer Science 2020-03-04 Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang, Diana Inkpen

Learning Natural Language Inference with LSTM

Natural language inference (NLI) is a fundamentally important task in natural language processing that has many applications. The recently released Stanford Natural Language Inference (SNLI) corpus has made it possible to develop and…

Computation and Language · Computer Science 2016-11-11 Shuohang Wang, Jing Jiang

Attention-based model for predicting question relatedness on Stack Overflow

Stack Overflow is one of the most popular Programming Community-based Question Answering (PCQA) websites that has attracted more and more users in recent years. When users raise or inquire questions in Stack Overflow, providing related…

Computation and Language · Computer Science 2021-04-06 Jiayan Pei, Yimin Wu, Zishan Qin, Yao Cong, Jingtao Guan

Ensemble Neural Relation Extraction with Adaptive Boosting

Relation extraction has been widely studied to extract new relational facts from open corpus. Previous relation extraction methods are faced with the problem of wrong labels and noisy data, which substantially decrease the performance of…

Information Retrieval · Computer Science 2018-05-01 Dongdong Yang, Senzhang Wang, Zhoujun Li

Interpreting Recurrent and Attention-Based Neural Models: a Case Study on Natural Language Inference

Deep learning models have achieved remarkable success in natural language inference (NLI) tasks. While these models are widely explored, they are hard to interpret and it is often unclear how and why they actually work. In this paper, we…

Computation and Language · Computer Science 2019-05-21 Reza Ghaeini, Xiaoli Z. Fern, Prasad Tadepalli

A Recurrent Neural Model with Attention for the Recognition of Chinese Implicit Discourse Relations

We introduce an attention-based Bi-LSTM for Chinese implicit discourse relations and demonstrate that modeling argument pairs as a joint sequence can outperform word order-agnostic approaches. Our model benefits from a partial sampling…

Computation and Language · Computer Science 2018-02-01 Samuel Rönnqvist, Niko Schenk, Christian Chiarcos

Attention-Based Models for Text-Dependent Speaker Verification

Attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning due to their ability to summarize relevant information that expands through the entire…

Audio and Speech Processing · Electrical Eng. & Systems 2018-02-02 F A Rezaur Rahman Chowdhury, Quan Wang, Ignacio Lopez Moreno, Li Wan

Knowledge Enhanced Attention for Robust Natural Language Inference

Neural network models have been very successful at achieving high accuracy on natural language inference (NLI) tasks. However, as demonstrated in recent literature, when tested on some simple adversarial examples, most of the models suffer…

Computation and Language · Computer Science 2019-09-04 Alexander Hanbo Li, Abhinav Sethy

Distil-xLSTM: Learning Attention Mechanisms through Recurrent Structures

The current era of Natural Language Processing (NLP) is dominated by Transformer models. However, novel architectures relying on recurrent mechanisms, such as xLSTM and Mamba, have been proposed as alternatives to attention-based models.…

Machine Learning · Computer Science 2025-03-25 Abdoul Majid O. Thiombiano, Brahim Hnich, Ali Ben Mrad, Mohamed Wiem Mkaouer

Attention-based Contextual Language Model Adaptation for Speech Recognition

Language modeling (LM) for automatic speech recognition (ASR) does not usually incorporate utterance level contextual information. For some domains like voice assistants, however, additional context, such as the time at which an utterance…

Computation and Language · Computer Science 2021-06-04 Richard Diehl Martinez, Scott Novotney, Ivan Bulyko, Ariya Rastrow, Andreas Stolcke, Ankur Gandhe

Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models

Attention-based encoder-decoder (AED) models learn an implicit internal language model (ILM) from the training transcriptions. The integration with an external LM trained on much more unpaired text usually leads to better performance. A…

Computation and Language · Computer Science 2021-06-18 Mohammad Zeineldeen, Aleksandr Glushko, Wilfried Michel, Albert Zeyer, Ralf Schlüter, Hermann Ney

Neural Language Modeling With Implicit Cache Pointers

A cache-inspired approach is proposed for neural language models (LMs) to improve long-range dependency and better predict rare words from long contexts. This approach is a simpler alternative to attention-based pointer mechanism that…

Audio and Speech Processing · Electrical Eng. & Systems 2020-09-30 Ke Li, Daniel Povey, Sanjeev Khudanpur

Internal Language Model Estimation Through Explicit Context Vector Learning for Attention-based Encoder-decoder ASR

An end-to-end (E2E) ASR model implicitly learns a prior Internal Language Model (ILM) from the training transcripts. To fuse an external LM using Bayes posterior theory, the log likelihood produced by the ILM has to be accurately estimated…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-03 Yufei Liu, Rao Ma, Haihua Xu, Yi He, Zejun Ma, Weibin Zhang

A Decomposable Attention Model for Natural Language Inference

We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially parallelizable. On the Stanford Natural…

Computation and Language · Computer Science 2016-09-27 Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit

Improving Natural Language Inference with a Pretrained Parser

We introduce a novel approach to incorporate syntax into natural language inference (NLI) models. Our method uses contextual token-level vector representations from a pretrained dependency parser. Like other contextual embedders, our method…

Computation and Language · Computer Science 2019-09-19 Deric Pang, Lucy H. Lin, Noah A. Smith

Iterative Recursive Attention Model for Interpretable Sequence Classification

Natural language processing has greatly benefited from the introduction of the attention mechanism. However, standard attention models are of limited interpretability for tasks that involve a series of inference steps. We describe an…

Computation and Language · Computer Science 2018-09-03 Martin Tutek, Jan Šnajder

Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention

In this paper, we proposed a sentence encoding-based model for recognizing text entailment. In our approach, the encoding of sentence is a two-stage process. Firstly, average pooling was used over word-level bidirectional LSTM (biLSTM) to…

Computation and Language · Computer Science 2016-05-31 Yang Liu, Chengjie Sun, Lei Lin, Xiaolong Wang

Learning to Deceive with Attention-Based Explanations

Attention mechanisms are ubiquitous components in neural architectures applied to natural language processing. In addition to yielding gains in predictive accuracy, attention weights are often claimed to confer interpretability, purportedly…

Computation and Language · Computer Science 2020-04-08 Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton