Related papers: Sequential Attention Module for Natural Language P…
We present SAM, a biologically-plausible selective attention-driven modulation approach to enhance classification models in a continual learning setting. Inspired by neurophysiological evidence that the primary visual cortex does not…
Recent advances in deep neural networks have been developed via architecture search for stronger representational power. In this work, we focus on the effect of attention in general deep neural networks. We propose a simple and effective…
Recently, textual information has been proved to play a positive role in recommendation systems. However, most of the existing methods only focus on representation learning of textual information in ratings, while potential selection bias…
Time series classification(TSC) has always been an important and challenging research task. With the wide application of deep learning, more and more researchers use deep learning models to solve TSC problems. Since time series always…
In multi-task learning (MTL) for visual scene understanding, it is crucial to transfer useful information between multiple tasks with minimal interferences. In this paper, we propose a novel architecture that effectively transfers…
Attention mechanism has gained great success in vision recognition. Many works are devoted to improving the effectiveness of attention mechanism, which finely design the structure of the attention operator. These works need lots of…
Attention modules for Convolutional Neural Networks (CNNs) are an effective method to enhance performance on multiple computer-vision tasks. While existing methods appropriately model channel-, spatial- and self-attention, they primarily…
Combining several embeddings typically improves performance in downstream tasks as different embeddings encode different information. It has been shown that even models using embeddings from transformers still benefit from the inclusion of…
With the success of pre-trained visual-language (VL) models such as CLIP in visual representation tasks, transferring pre-trained models to downstream tasks has become a crucial paradigm. Recently, the prompt tuning paradigm, which draws…
The trade-off between feature representation power and spatial localization accuracy is crucial for the dense classification/semantic segmentation of aerial images. High-level features extracted from the late layers of a neural network are…
The attention mechanism has been proven effective on various visual tasks in recent years. In the semantic segmentation task, the attention mechanism is applied in various methods, including the case of both Convolution Neural Networks…
Current feature matching methods focus on point-level matching, pursuing better representation learning of individual features, but lacking further understanding of the scene. This results in significant performance degradation when…
We propose Convolutional Block Attention Module (CBAM), a simple yet effective attention module for feed-forward convolutional neural networks. Given an intermediate feature map, our module sequentially infers attention maps along two…
We propose a novel perspective of the attention mechanism by reinventing it as a memory architecture for neural networks, namely Neural Attention Memory (NAM). NAM is a memory structure that is both readable and writable via differentiable…
This paper presents a Semantic Attribute Modulation (SAM) for language modeling and style variation. The semantic attribute modulation includes various document attributes, such as titles, authors, and document categories. We consider two…
Transformer-based language models significantly advanced the state-of-the-art in many linguistic tasks. As this revolution continues, the ability to explain model predictions has become a major area of interest for the NLP community. In…
Semantic segmentation of remote sensing images is essential for various applications, including vegetation monitoring, disaster management, and urban planning. Previous studies have demonstrated that the self-attention mechanism (SA) is an…
While the Self-Attention mechanism in the Transformer model has proven to be effective in many domains, we observe that it is less effective in more diverse settings (e.g. multimodality) due to the varying granularity of each token and the…
Experiments probing natural language processing by both humans and LLMs suggest that the meaning of a semantic expression is indeterminate prior to the act of interpretation rather than being specifiable simply as the sum of its parts (i.e.…
Majority of the recent approaches for text-independent speaker recognition apply attention or similar techniques for aggregation of frame-level feature descriptors generated by a deep neural network (DNN) front-end. In this paper, we…