Related papers: Representation and Correlation Enhanced Encoder-De…

SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition

Scene text recognition is a hot research topic in computer vision. Recently, many recognition methods based on the encoder-decoder framework have been proposed, and they can handle scene texts of perspective distortion and curve shape.…

Computer Vision and Pattern Recognition · Computer Science 2020-05-25 Zhi Qiao, Yu Zhou, Dongbao Yang, Yucan Zhou, Weiping Wang

RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition

The attention-based encoder-decoder framework has recently achieved impressive results for scene text recognition, and many variants have emerged with improvements in recognition quality. However, it performs poorly on contextless texts…

Computer Vision and Pattern Recognition · Computer Science 2020-07-20 Xiaoyu Yue, Zhanghui Kuang, Chenhao Lin, Hongbin Sun, Wayne Zhang

Scene Text Recognition with Temporal Convolutional Encoder

Texts from scene images typically consist of several characters and exhibit a characteristic sequence structure. Existing methods capture the structure with the sequence-to-sequence models by an encoder to have the visual representations…

Computer Vision and Pattern Recognition · Computer Science 2020-02-18 Xiangcheng Du, Tianlong Ma, Yingbin Zheng, Hao Ye, Xingjiao Wu, Liang He

Parallel Scale-wise Attention Network for Effective Scene Text Recognition

The paper proposes a new text recognition network for scene-text images. Many state-of-the-art methods employ the attention mechanism either in the text encoder or decoder for the text alignment. Although the encoder-based attention yields…

Computer Vision and Pattern Recognition · Computer Science 2021-04-27 Usman Sajid, Michael Chow, Jin Zhang, Taejoon Kim, Guanghui Wang

Attention-based Feature Decomposition-Reconstruction Network for Scene Text Detection

Recently, scene text detection has been a challenging task. Texts with arbitrary shape or large aspect ratio are usually hard to detect. Previous segmentation-based methods can describe curve text more accurately but suffer from over…

Computer Vision and Pattern Recognition · Computer Science 2021-11-30 Qi Zhao, Yufei Wang, Shuchang Lyu, Lijiang Chen

Evaluating Sequence-to-Sequence Models for Handwritten Text Recognition

Encoder-decoder models have become an effective approach for sequence learning tasks like machine translation, image captioning and speech recognition, but have yet to show competitive results for handwritten text recognition. To this end,…

Computer Vision and Pattern Recognition · Computer Science 2019-07-16 Johannes Michael, Roger Labahn, Tobias Grüning, Jochen Zöllner

Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks

Transducer and Attention based Encoder-Decoder (AED) are two widely used frameworks for speech-to-text tasks. They are designed for different purposes and each has its own benefits and drawbacks for speech-to-text tasks. In order to…

Computation and Language · Computer Science 2023-05-08 Yun Tang, Anna Y. Sun, Hirofumi Inaguma, Xinyue Chen, Ning Dong, Xutai Ma, Paden D. Tomasello, Juan Pino

Primitive Representation Learning for Scene Text Recognition

Scene text recognition is a challenging task due to diverse variations of text instances in natural scene images. Conventional methods based on CNN-RNN-CTC or encoder-decoder with attention mechanism may not fully investigate stable and…

Computer Vision and Pattern Recognition · Computer Science 2021-05-11 Ruijie Yan, Liangrui Peng, Shanyu Xiao, Gang Yao

Towards Robust Real-Time Scene Text Detection: From Semantic to Instance Representation Learning

Due to the flexible representation of arbitrary-shaped scene text and simple pipeline, bottom-up segmentation-based methods begin to be mainstream in real-time scene text detection. Despite great progress, these methods show deficiencies in…

Computer Vision and Pattern Recognition · Computer Science 2023-08-15 Xugong Qin, Pengyuan Lyu, Chengquan Zhang, Yu Zhou, Kun Yao, Peng Zhang, Hailun Lin, Weiping Wang

NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition

Scene text recognition has attracted a great many researches due to its importance to various applications. Existing methods mainly adopt recurrence or convolution based networks. Though have obtained good performance, these methods still…

Computer Vision and Pattern Recognition · Computer Science 2019-10-11 Fenfen Sheng, Zhineng Chen, Bo Xu

Decoder Fusion RNN: Context and Interaction Aware Decoders for Trajectory Prediction

Forecasting the future behavior of all traffic agents in the vicinity is a key task to achieve safe and reliable autonomous driving systems. It is a challenging problem as agents adjust their behavior depending on their intentions, the…

Robotics · Computer Science 2021-12-30 Edoardo Mello Rella, Jan-Nico Zaech, Alexander Liniger, Luc Van Gool

Efficient and Accurate Scene Text Recognition with Cascaded-Transformers

In recent years, vision transformers with text decoder have demonstrated remarkable performance on Scene Text Recognition (STR) due to their ability to capture long-range dependencies and contextual relationships with high learning…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Savas Ozkan, Andrea Maracani, Hyowon Kim, Sijun Cho, Eunchung Noh, Jeongwon Min, Jung Min Cho, Mete Ozay

Translate-to-Recognize Networks for RGB-D Scene Recognition

Cross-modal transfer is helpful to enhance modality-specific discriminative power for scene recognition. To this end, this paper presents a unified framework to integrate the tasks of cross-modal translation and modality-specific…

Computer Vision and Pattern Recognition · Computer Science 2019-04-30 Dapeng Du, Limin Wang, Huiling Wang, Kai Zhao, Gangshan Wu

Decoupling Visual-Semantic Feature Learning for Robust Scene Text Recognition

Semantic information has been proved effective in scene text recognition. Most existing methods tend to couple both visual and semantic information in an attention-based decoder. As a result, the learning of semantic features is prone to…

Computer Vision and Pattern Recognition · Computer Science 2021-11-25 Changxu Cheng, Bohan Li, Qi Zheng, Yongpan Wang, Wenyu Liu

IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition

Nowadays, scene text recognition has attracted more and more attention due to its diverse applications. Most state-of-the-art methods adopt an encoder-decoder framework with the attention mechanism, autoregressively generating text from…

Computer Vision and Pattern Recognition · Computer Science 2025-04-01 Xiaomeng Yang, Zhi Qiao, Yu Zhou

Adaptive Embedding Gate for Attention-Based Scene Text Recognition

Scene text recognition has attracted particular research interest because it is a very challenging problem and has various applications. The most cutting-edge methods are attentional encoder-decoder frameworks that learn the alignment…

Computer Vision and Pattern Recognition · Computer Science 2019-08-27 Xiaoxue Chen, Tianwei Wang, Yuanzhi Zhu, Lianwen Jin, Canjie Luo

Context Perception Parallel Decoder for Scene Text Recognition

Scene text recognition (STR) methods have struggled to attain high accuracy and fast inference speed. Autoregressive (AR)-based models implement the recognition in a character-by-character manner, showing superiority in accuracy but with…

Computer Vision and Pattern Recognition · Computer Science 2023-10-10 Yongkun Du, Zhineng Chen, Caiyan Jia, Xiaoting Yin, Chenxia Li, Yuning Du, Yu-Gang Jiang

A Holistic Representation Guided Attention Network for Scene Text Recognition

Reading irregular scene text of arbitrary shape in natural images is still a challenging problem, despite the progress made recently. Many existing approaches incorporate sophisticated network structures to handle various shapes, use extra…

Computer Vision and Pattern Recognition · Computer Science 2021-03-31 Lu Yang, Fan Dang, Peng Wang, Hui Li, Zhen Li, Yanning Zhang

SceneEncoder: Scene-Aware Semantic Segmentation of Point Clouds with A Learnable Scene Descriptor

Besides local features, global information plays an essential role in semantic segmentation, while recent works usually fail to explicitly extract the meaningful global information and make full use of it. In this paper, we propose a…

Computer Vision and Pattern Recognition · Computer Science 2020-01-27 Jiachen Xu, Jingyu Gong, Jie Zhou, Xin Tan, Yuan Xie, Lizhuang Ma

DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting

Recent end-to-end scene text spotters have achieved great improvement in recognizing arbitrary-shaped text instances. Common approaches for text spotting use region of interest pooling or segmentation masks to restrict features to single…

Computer Vision and Pattern Recognition · Computer Science 2022-03-11 Seonghyeon Kim, Seung Shin, Yoonsik Kim, Han-Cheol Cho, Taeho Kil, Jaeheung Surh, Seunghyun Park, Bado Lee, Youngmin Baek