Related papers: Geometric Perception based Efficient Text Recognit…

Scene Text Recognition with Semantics

Scene Text Recognition (STR) models have achieved high performance in recent years on benchmark datasets where text images are presented with minimal noise. Traditional STR recognition pipelines take a cropped image as sole input and…

Computer Vision and Pattern Recognition · Computer Science 2022-10-21 Joshua Cesare Placidi, Yishu Miao, Zixu Wang, Lucia Specia

Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition

Existing Scene Text Recognition (STR) methods typically use a language model to optimize the joint probability of the 1D character sequence predicted by a visual recognition (VR) model, which ignores the 2D spatial context of visual…

Computer Vision and Pattern Recognition · Computer Science 2021-12-28 Yue He, Chen Chen, Jing Zhang, Juhua Liu, Fengxiang He, Chaoyue Wang, Bo Du

Deep Neural Network for Semantic-based Text Recognition in Images

State-of-the-art text spotting systems typically aim to detect isolated words or word-by-word text in images of natural scenes and ignore the semantic coherence within a region of text. However, when interpreted together, seemingly isolated…

Computer Vision and Pattern Recognition · Computer Science 2019-12-11 Yi Zheng, Qitong Wang, Margrit Betke

STN-OCR: A single Neural Network for Text Detection and Text Recognition

Detecting and recognizing text in natural scene images is a challenging, yet not completely solved task. In recent years several new systems that try to solve at least one of the two sub-tasks (text detection and text recognition) have…

Computer Vision and Pattern Recognition · Computer Science 2017-07-28 Christian Bartz, Haojin Yang, Christoph Meinel

AutoSTR: Efficient Backbone Search for Scene Text Recognition

Scene text recognition (STR) is very challenging due to the diversity of text instances and the complexity of scenes. The community has paid increasing attention to boost the performance by improving the pre-processing image module, like…

Computer Vision and Pattern Recognition · Computer Science 2020-07-17 Hui Zhang, Quanming Yao, Mingkun Yang, Yongchao Xu, Xiang Bai

Multi-Granularity Prediction for Scene Text Recognition

Scene text recognition (STR) has been an active research topic in computer vision for years. To tackle this challenging problem, numerous innovative methods have been successively proposed and incorporating linguistic knowledge into STR…

Computer Vision and Pattern Recognition · Computer Science 2022-10-18 Peng Wang, Cheng Da, Cong Yao

Instruction-Guided Scene Text Recognition

Multi-modal models have shown appealing performance in visual recognition tasks, as free-form text-guided training evokes the ability to understand fine-grained visual content. However, current models cannot be trivially applied to scene…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Yongkun Du, Zhineng Chen, Yuchen Su, Caiyan Jia, Yu-Gang Jiang

Vision Transformer for Fast and Efficient Scene Text Recognition

Scene text recognition (STR) enables computers to read text in natural scenes such as object labels, road signs and instructions. STR helps machines perform informed decisions such as what object to pick, which direction to go, and what is…

Computer Vision and Pattern Recognition · Computer Science 2021-05-19 Rowel Atienza

On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention

Scene text recognition (STR) is the task of recognizing character sequences in natural scenes. While there have been great advances in STR methods, current methods still fail to recognize texts in arbitrary shapes, such as heavily curved or…

Computer Vision and Pattern Recognition · Computer Science 2019-10-11 Junyeop Lee, Sungrae Park, Jeonghun Baek, Seong Joon Oh, Seonghyeon Kim, Hwalsuk Lee

STR-GQN: Scene Representation and Rendering for Unknown Cameras Based on Spatial Transformation Routing

Geometry-aware modules are widely applied in recent deep learning architectures for scene representation and rendering. However, these modules require intrinsic camera information that might not be obtained accurately. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2021-08-09 Wen-Cheng Chen, Min-Chun Hu, Chu-Song Chen

Revisiting Scene Text Recognition: A Data Perspective

This paper aims to re-assess scene text recognition (STR) from a data-oriented perspective. We begin by revisiting the six commonly used benchmarks in STR and observe a trend of performance saturation, whereby only 2.91% of the benchmark…

Computer Vision and Pattern Recognition · Computer Science 2023-07-20 Qing Jiang, Jiapeng Wang, Dezhi Peng, Chongyu Liu, Lianwen Jin

TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

Scene text recognition (STR) suffers from challenges of either less realistic synthetic training data or the difficulty of collecting sufficient high-quality real-world data, limiting the effectiveness of trained models. Meanwhile, despite…

Computer Vision and Pattern Recognition · Computer Science 2025-09-11 Xingsong Ye, Yongkun Du, Yunbo Tao, Zhineng Chen

Revisiting Classification Perspective on Scene Text Recognition

The prevalent perspectives of scene text recognition are from sequence to sequence (seq2seq) and segmentation. Nevertheless, the former is composed of many components which makes implementation and deployment complicated, while the latter…

Computer Vision and Pattern Recognition · Computer Science 2021-06-15 Hongxiang Cai, Jun Sun, Yichao Xiong

Scene Text Detection and Recognition "in light of" Challenging Environmental Conditions using Aria Glasses Egocentric Vision Cameras

In an era where wearable technology is reshaping applications, Scene Text Detection and Recognition (STDR) becomes a straightforward choice through the lens of egocentric vision. Leveraging Meta's Project Aria smart glasses, this paper…

Computer Vision and Pattern Recognition · Computer Science 2025-07-23 Joseph De Mathia, Carlos Francisco Moreno-García

SCATTER: Selective Context Attentional Scene Text Recognizer

Scene Text Recognition (STR), the task of recognizing text against complex image backgrounds, is an active area of research. Current state-of-the-art (SOTA) methods still struggle to recognize text written in arbitrary shapes. In this…

Computer Vision and Pattern Recognition · Computer Science 2020-03-26 Ron Litman, Oron Anschel, Shahar Tsiper, Roee Litman, Shai Mazor, R. Manmatha

Training Protocol Matters: Towards Accurate Scene Text Recognition via Training Protocol Searching

The development of scene text recognition (STR) in the era of deep learning has been mainly focused on novel architectures of STR models. However, training protocol (i.e., settings of the hyper-parameters involved in the training of STR…

Computer Vision and Pattern Recognition · Computer Science 2022-03-18 Xiaojie Chu, Yongtao Wang, Chunhua Shen, Jingdong Chen, Wei Chu

Global and Local Texture Randomization for Synthetic-to-Real Semantic Segmentation

Semantic segmentation is a crucial image understanding task, where each pixel of an image is categorized into a corresponding label. Since the pixel-wise labeling for ground-truth is tedious and labor intensive, in practical applications, many…

Computer Vision and Pattern Recognition · Computer Science 2021-08-09 Duo Peng, Yinjie Lei, Lingqiao Liu, Pingping Zhang, Jun Liu

DiffusionSTR: Diffusion Model for Scene Text Recognition

This paper presents Diffusion Model for Scene Text Recognition (DiffusionSTR), an end-to-end text recognition framework using diffusion models for recognizing text in the wild. While existing studies have viewed the scene text recognition…

Computer Vision and Pattern Recognition · Computer Science 2023-06-30 Masato Fujitake

STR-Cert: Robustness Certification for Deep Text Recognition on Deep Learning Pipelines and Vision Transformers

Robustness certification, which aims to formally certify the predictions of neural networks against adversarial inputs, has become an important tool for safety-critical applications. Despite considerable progress, existing…

Computer Vision and Pattern Recognition · Computer Science 2024-01-12 Daqian Shao, Lukas Fesser, Marta Kwiatkowska

A Feasible Framework for Arbitrary-Shaped Scene Text Recognition

Deep learning based methods have achieved surprising progress in Scene Text Recognition (STR), one of the classic problems in computer vision. In this paper, we propose a feasible framework for multi-lingual arbitrary-shaped STR, including…

Computer Vision and Pattern Recognition · Computer Science 2019-12-13 Jinjin Zhang, Wei Wang, Di Huang, Qingjie Liu, Yunhong Wang