Related papers: Adaptive Embedding Gate for Attention-Based Scene …

Gated Attentive-Autoencoder for Content-Aware Recommendation

The rapid growth of Internet services and mobile devices provides an excellent opportunity to satisfy the strong demand for personalized item or product recommendation. However, with the tremendous increase of users and items,…

Information Retrieval · Computer Science 2018-12-10 Chen Ma, Peng Kang, Bin Wu, Qinglong Wang, Xue Liu

SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition

Scene text recognition is a hot research topic in computer vision. Recently, many recognition methods based on the encoder-decoder framework have been proposed, and they can handle scene text with perspective distortion and curved shapes.…

Computer Vision and Pattern Recognition · Computer Science 2020-05-25 Zhi Qiao, Yu Zhou, Dongbao Yang, Yucan Zhou, Weiping Wang

AASeg: Attention Aware Network for Real Time Semantic Segmentation

Semantic segmentation is a fundamental task in computer vision that involves dense pixel-wise classification for scene understanding. Despite significant progress, achieving high accuracy while maintaining real-time performance remains a…

Computer Vision and Pattern Recognition · Computer Science 2025-07-08 Abhinav Sagar

Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition

The attention-based encoder-decoder framework is widely used in the scene text recognition task. However, for the current state-of-the-art (SOTA) methods, there is room for improvement in terms of the efficient usage of local visual and global…

Computer Vision and Pattern Recognition · Computer Science 2021-11-16 Mengmeng Cui, Wei Wang, Jinjin Zhang, Liang Wang

EAD: An EEG Adapter for Automated Classification

While electroencephalography (EEG) has been a popular modality for neural decoding, it often involves task specific acquisition of the EEG data. This poses challenges for the development of a unified pipeline to learn embeddings for various…

Computer Vision and Pattern Recognition · Computer Science 2025-05-30 Pushapdeep Singh, Jyoti Nigam, Medicherla Vamsi Krishna, Arnav Bhavsar, Aditya Nigam

AMDET: Attention based Multiple Dimensions EEG Transformer for Emotion Recognition

Affective computing is an important branch of artificial intelligence, and with the rapid development of brain computer interface technology, emotion recognition based on EEG signals has received broad attention. It is still a great…

Signal Processing · Electrical Eng. & Systems 2022-12-26 Yongling Xu, Yang Du, Jing Zou, Tianying Zhou, Lushan Xiao, Li Liu, Pengcheng

Attention-based gated scaling adaptative acoustic model for ctc-based speech recognition

In this paper, we propose a novel adaptive technique that uses an attention-based gated scaling (AGS) scheme to improve deep feature learning for connectionist temporal classification (CTC) acoustic modeling. In AGS, the outputs of each…

Audio and Speech Processing · Electrical Eng. & Systems 2020-01-01 Fenglin Ding, Wu Guo, Lirong Dai, Jun Du

Parallel Scale-wise Attention Network for Effective Scene Text Recognition

The paper proposes a new text recognition network for scene-text images. Many state-of-the-art methods employ the attention mechanism either in the text encoder or decoder for the text alignment. Although the encoder-based attention yields…

Computer Vision and Pattern Recognition · Computer Science 2021-04-27 Usman Sajid, Michael Chow, Jin Zhang, Taejoon Kim, Guanghui Wang

Hybrid Attention-based Encoder-decoder Model for Efficient Language Model Adaptation

The attention-based encoder-decoder (AED) speech recognition model has been widely successful in recent years. However, the joint optimization of acoustic model and language model in end-to-end manner has created challenges for text…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-17 Shaoshi Ling, Guoli Ye, Rui Zhao, Yifan Gong

Adaptively Aligned Image Captioning via Adaptive Attention Time

Recent neural models for image captioning usually employ an encoder-decoder framework with an attention mechanism. However, the attention mechanism in such a framework aligns one single (attended) image feature vector to one caption word,…

Computer Vision and Pattern Recognition · Computer Science 2020-01-07 Lun Huang, Wenmin Wang, Yaxian Xia, Jie Chen

TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance

Scene text recognition (STR) is an important bridge between images and text, attracting abundant research attention. While convolutional neural networks (CNNs) have achieved remarkable progress in this task, most of the existing works need…

Computer Vision and Pattern Recognition · Computer Science 2021-11-17 Yue Tao, Zhiwei Jia, Runze Ma, Shugong Xu

MANGO: A Mask Attention Guided One-Stage Scene Text Spotter

Recently end-to-end scene text spotting has become a popular research topic due to its advantages of global optimization and high maintainability in real applications. Most methods attempt to develop various region of interest (RoI)…

Computer Vision and Pattern Recognition · Computer Science 2021-10-26 Liang Qiao, Ying Chen, Zhanzhan Cheng, Yunlu Xu, Yi Niu, Shiliang Pu, Fei Wu

Scene Text Recognition with Temporal Convolutional Encoder

Texts from scene images typically consist of several characters and exhibit a characteristic sequence structure. Existing methods capture the structure with the sequence-to-sequence models by an encoder to have the visual representations…

Computer Vision and Pattern Recognition · Computer Science 2020-02-18 Xiangcheng Du, Tianlong Ma, Yingbin Zheng, Hao Ye, Xingjiao Wu, Liang He

Self-Attention Enhanced Selective Gate with Entity-Aware Embedding for Distantly Supervised Relation Extraction

Distantly supervised relation extraction intrinsically suffers from noisy labels due to the strong assumption of distant supervision. Most prior works adopt a selective attention mechanism over sentences in a bag to denoise from wrongly…

Computation and Language · Computer Science 2019-11-28 Yang Li, Guodong Long, Tao Shen, Tianyi Zhou, Lina Yao, Huan Huo, Jing Jiang

DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting

Recent end-to-end scene text spotters have achieved great improvement in recognizing arbitrary-shaped text instances. Common approaches for text spotting use region of interest pooling or segmentation masks to restrict features to single…

Computer Vision and Pattern Recognition · Computer Science 2022-03-11 Seonghyeon Kim, Seung Shin, Yoonsik Kim, Han-Cheol Cho, Taeho Kil, Jaeheung Surh, Seunghyun Park, Bado Lee, Youngmin Baek

Do You Need Text Rectification? Soft Attention Mask Embedding for Rectification-Free Scene Text Spotting

End-to-end scene text spotting, which unifies text detection and recognition within a single framework, has witnessed remarkable progress driven by deep learning advances. However, most existing approaches still suffer from incomplete mask…

Computer Vision and Pattern Recognition · Computer Science 2026-05-19 Antonio Colombo, Giovanni Bianchi

2D Attentional Irregular Scene Text Recognizer

Irregular scene text, which has a complex layout in 2D space, is challenging for most previous scene text recognizers. Recently, some irregular scene text recognizers either rectify the irregular text to regular text image with approximate 1D…

Computer Vision and Pattern Recognition · Computer Science 2019-06-14 Pengyuan Lyu, Zhicheng Yang, Xinhang Leng, Xiaojun Wu, Ruiyu Li, Xiaoyong Shen

RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition

The attention-based encoder-decoder framework has recently achieved impressive results for scene text recognition, and many variants have emerged with improvements in recognition quality. However, it performs poorly on contextless texts…

Computer Vision and Pattern Recognition · Computer Science 2020-07-20 Xiaoyu Yue, Zhanghui Kuang, Chenhao Lin, Hongbin Sun, Wayne Zhang

Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers

Modern systems for automatic speech recognition, including the RNN-Transducer and Attention-based Encoder-Decoder (AED), are designed so that the encoder is not required to alter the time-position of information from the audio sequence into…

Sound · Computer Science 2025-02-11 Adam Stooke, Rohit Prabhavalkar, Khe Chai Sim, Pedro Moreno Mengibar

End-to-end Speech Recognition with Adaptive Computation Steps

In this paper, we present the Adaptive Computation Steps (ACS) algorithm, which enables end-to-end speech recognition models to dynamically decide how many frames should be processed to predict a linguistic output. The model that applies ACS…

Audio and Speech Processing · Electrical Eng. & Systems 2018-09-27 Mohan Li, Min Liu, Masanori Hattori