Related papers: SEED: Semantics Enhanced Encoder-Decoder Framework…

Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition

The attention-based encoder-decoder framework is widely used in the scene text recognition task. However, for the current state-of-the-art (SOTA) methods, there is room for improvement in terms of the efficient usage of local visual and global…

Computer Vision and Pattern Recognition · Computer Science 2021-11-16 Mengmeng Cui, Wei Wang, Jinjin Zhang, Liang Wang

DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting

Recent end-to-end scene text spotters have achieved great improvement in recognizing arbitrary-shaped text instances. Common approaches for text spotting use region of interest pooling or segmentation masks to restrict features to single…

Computer Vision and Pattern Recognition · Computer Science 2022-03-11 Seonghyeon Kim, Seung Shin, Yoonsik Kim, Han-Cheol Cho, Taeho Kil, Jaeheung Surh, Seunghyun Park, Bado Lee, Youngmin Baek

Focus-Enhanced Scene Text Recognition with Deformable Convolutions

Recently, scene text recognition methods based on deep learning have sprung up in the computer vision area. The existing methods have achieved strong performance, but the recognition of irregular text is still challenging due to the various shapes…

Computer Vision and Pattern Recognition · Computer Science 2022-05-06 Linjie Deng, Yanxiang Gong, Xinchen Lu, Xin Yi, Zheng Ma, Mei Xie

Towards Robust Real-Time Scene Text Detection: From Semantic to Instance Representation Learning

Due to the flexible representation of arbitrary-shaped scene text and simple pipeline, bottom-up segmentation-based methods begin to be mainstream in real-time scene text detection. Despite great progress, these methods show deficiencies in…

Computer Vision and Pattern Recognition · Computer Science 2023-08-15 Xugong Qin, Pengyuan Lyu, Chengquan Zhang, Yu Zhou, Kun Yao, Peng Zhang, Hailun Lin, Weiping Wang

Scene Text Recognition with Temporal Convolutional Encoder

Texts from scene images typically consist of several characters and exhibit a characteristic sequence structure. Existing methods capture the structure with the sequence-to-sequence models by an encoder to have the visual representations…

Computer Vision and Pattern Recognition · Computer Science 2020-02-18 Xiangcheng Du, Tianlong Ma, Yingbin Zheng, Hao Ye, Xingjiao Wu, Liang He

Efficient scene text image super-resolution with semantic guidance

Scene text image super-resolution has significantly improved the accuracy of scene text recognition. However, many existing methods emphasize performance over efficiency and ignore the practical need for lightweight solutions in deployment…

Computer Vision and Pattern Recognition · Computer Science 2024-03-21 LeoWu TomyEnrique, Xiangcheng Du, Kangliang Liu, Han Yuan, Zhao Zhou, Cheng Jin

Text in the Dark: Extremely Low-Light Text Image Enhancement

Extremely low-light text images are common in natural scenes, making scene text detection and recognition challenging. One solution is to enhance these images using low-light image enhancement methods before text extraction. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-04-23 Che-Tsung Lin, Chun Chet Ng, Zhi Qin Tan, Wan Jun Nah, Xinyu Wang, Jie Long Kew, Pohao Hsu, Shang Hong Lai, Chee Seng Chan, Christopher Zach

Text-Attentional Convolutional Neural Networks for Scene Text Detection

Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature computed globally from a whole image component (patch), where the cluttered…

Computer Vision and Pattern Recognition · Computer Science 2016-05-04 Tong He, Weilin Huang, Yu Qiao, Jian Yao

FASTER: A Font-Agnostic Scene Text Editing and Rendering Framework

Scene Text Editing (STE) is a challenging research problem that primarily aims to modify existing text in an image while preserving the background and the font style of the original text. Despite its utility in numerous real-world…

Computer Vision and Pattern Recognition · Computer Science 2024-11-06 Alloy Das, Sanket Biswas, Prasun Roy, Subhankar Ghosh, Umapada Pal, Michael Blumenstein, Josep Lladós, Saumik Bhattacharya

IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition

Nowadays, scene text recognition has attracted more and more attention due to its diverse applications. Most state-of-the-art methods adopt an encoder-decoder framework with the attention mechanism, autoregressively generating text from…

Computer Vision and Pattern Recognition · Computer Science 2025-04-01 Xiaomeng Yang, Zhi Qiao, Yu Zhou

SCATTER: Selective Context Attentional Scene Text Recognizer

Scene Text Recognition (STR), the task of recognizing text against complex image backgrounds, is an active area of research. Current state-of-the-art (SOTA) methods still struggle to recognize text written in arbitrary shapes. In this…

Computer Vision and Pattern Recognition · Computer Science 2020-03-26 Ron Litman, Oron Anschel, Shahar Tsiper, Roee Litman, Shai Mazor, R. Manmatha

Efficient and Accurate Scene Text Recognition with Cascaded-Transformers

In recent years, vision transformers with text decoder have demonstrated remarkable performance on Scene Text Recognition (STR) due to their ability to capture long-range dependencies and contextual relationships with high learning…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Savas Ozkan, Andrea Maracani, Hyowon Kim, Sijun Cho, Eunchung Noh, Jeongwon Min, Jung Min Cho, Mete Ozay

SceneEncoder: Scene-Aware Semantic Segmentation of Point Clouds with A Learnable Scene Descriptor

Besides local features, global information plays an essential role in semantic segmentation, while recent works usually fail to explicitly extract the meaningful global information and make full use of it. In this paper, we propose a…

Computer Vision and Pattern Recognition · Computer Science 2020-01-27 Jiachen Xu, Jingyu Gong, Jie Zhou, Xin Tan, Yuan Xie, Lizhuang Ma

RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition

The attention-based encoder-decoder framework has recently achieved impressive results for scene text recognition, and many variants have emerged with improvements in recognition quality. However, it performs poorly on contextless texts…

Computer Vision and Pattern Recognition · Computer Science 2020-07-20 Xiaoyu Yue, Zhanghui Kuang, Chenhao Lin, Hongbin Sun, Wayne Zhang

Parallel Scale-wise Attention Network for Effective Scene Text Recognition

The paper proposes a new text recognition network for scene-text images. Many state-of-the-art methods employ the attention mechanism either in the text encoder or decoder for the text alignment. Although the encoder-based attention yields…

Computer Vision and Pattern Recognition · Computer Science 2021-04-27 Usman Sajid, Michael Chow, Jin Zhang, Taejoon Kim, Guanghui Wang

TEACH: Text Encoding as Curriculum Hints for Scene Text Recognition

Scene Text Recognition (STR) remains a challenging task due to complex visual appearances and limited semantic priors. We propose TEACH, a novel training paradigm that injects ground-truth text into the model as auxiliary input and…

Computer Vision and Pattern Recognition · Computer Science 2025-08-05 Xiahan Yang, Hui Zheng

Scene Text Recognition with Semantics

Scene Text Recognition (STR) models have achieved high performance in recent years on benchmark datasets where text images are presented with minimal noise. Traditional STR recognition pipelines take a cropped image as sole input and…

Computer Vision and Pattern Recognition · Computer Science 2022-10-21 Joshua Cesare Placidi, Yishu Miao, Zixu Wang, Lucia Specia

Deep Residual Text Detection Network for Scene Text

Scene text detection is a challenging problem in computer vision. In this paper, we propose a novel text detection network based on prevalent object detection frameworks. In order to obtain stronger semantic features, we adopt ResNet as…

Computer Vision and Pattern Recognition · Computer Science 2017-11-15 Xiangyu Zhu, Yingying Jiang, Shuli Yang, Xiaobing Wang, Wei Li, Pei Fu, Hua Wang, Zhenbo Luo

SEE: Towards Semi-Supervised End-to-End Scene Text Recognition

Detecting and recognizing text in natural scene images is a challenging, yet not completely solved task. In recent years several new systems that try to solve at least one of the two sub-tasks (text detection and text recognition) have been…

Computer Vision and Pattern Recognition · Computer Science 2017-12-18 Christian Bartz, Haojin Yang, Christoph Meinel

Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages

This paper presents a novel training method for end-to-end scene text recognition. End-to-end scene text recognition offers high recognition accuracy, especially when using the encoder-decoder model based on Transformer. To train a highly…

Computer Vision and Pattern Recognition · Computer Science 2021-11-25 Shota Orihashi, Yoshihiro Yamazaki, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Ryo Masumura