Related papers: Data Augmentation for Scene Text Recognition

Revisiting Scene Text Recognition: A Data Perspective

This paper aims to re-assess scene text recognition (STR) from a data-oriented perspective. We begin by revisiting the six commonly used benchmarks in STR and observe a trend of performance saturation, whereby only 2.91% of the benchmark…

Computer Vision and Pattern Recognition · Computer Science 2023-07-20 Qing Jiang, Jiapeng Wang, Dezhi Peng, Chongyu Liu, Lianwen Jin

An Effective Data Augmentation Method by Asking Questions about Scene Text Images

Scene text recognition (STR) and handwritten text recognition (HTR) face significant challenges in accurately transcribing textual content from images into machine-readable formats. Conventional OCR models often predict transcriptions…

Computer Vision and Pattern Recognition · Computer Science 2026-03-05 Xu Yao, Lei Kang

TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

Scene text recognition (STR) suffers from challenges of either less realistic synthetic training data or the difficulty of collecting sufficient high-quality real-world data, limiting the effectiveness of trained models. Meanwhile, despite…

Computer Vision and Pattern Recognition · Computer Science 2025-09-11 Xingsong Ye, Yongkun Du, Yunbo Tao, Zhineng Chen

Pushing the Performance Limit of Scene Text Recognizer without Human Annotation

Scene text recognition (STR) attracts much attention over the years because of its wide application. Most methods train STR model in a fully supervised manner which requires large amounts of labeled data. Although synthetic data contributes…

Computer Vision and Pattern Recognition · Computer Science 2022-05-24 Caiyuan Zheng, Hui Li, Seon-Min Rhee, Seungju Han, Jae-Joon Han, Peng Wang

What Is Wrong with Synthetic Data for Scene Text Recognition? A Strong Synthetic Engine with Diverse Simulations and Self-Evolution

Large-scale and categorical-balanced text data is essential for training effective Scene Text Recognition (STR) models, which is hard to achieve when collecting real data. Synthetic data offers a cost-effective and perfectly labeled…

Computer Vision and Pattern Recognition · Computer Science 2026-03-20 Xingsong Ye, Yongkun Du, JiaXin Zhang, Chen Li, Jing Lyu, Zhineng Chen

What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis

Many new proposals for scene text recognition (STR) models have been introduced in recent years. While each claim to have pushed the boundary of the technology, a holistic and fair comparison has been largely missing in the field due to the…

Computer Vision and Pattern Recognition · Computer Science 2019-12-19 Jeonghun Baek, Geewook Kim, Junyeop Lee, Sungrae Park, Dongyoon Han, Sangdoo Yun, Seong Joon Oh, Hwalsuk Lee

Vision Transformer for Fast and Efficient Scene Text Recognition

Scene text recognition (STR) enables computers to read text in natural scenes such as object labels, road signs and instructions. STR helps machines perform informed decisions such as what object to pick, which direction to go, and what is…

Computer Vision and Pattern Recognition · Computer Science 2021-05-19 Rowel Atienza

Scene Text Recognition with Semantics

Scene Text Recognition (STR) models have achieved high performance in recent years on benchmark datasets where text images are presented with minimal noise. Traditional STR recognition pipelines take a cropped image as sole input and…

Computer Vision and Pattern Recognition · Computer Science 2022-10-21 Joshua Cesare Placidi, Yishu Miao, Zixu Wang, Lucia Specia

JSTR: Judgment Improves Scene Text Recognition

In this paper, we present a method for enhancing the accuracy of scene text recognition tasks by judging whether the image and text match each other. While previous studies focused on generating the recognition results from input images,…

Computer Vision and Pattern Recognition · Computer Science 2024-04-10 Masato Fujitake

What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels

Scene text recognition (STR) task has a common practice: All state-of-the-art STR models are trained on large synthetic data. In contrast to this practice, training STR models only on fewer real labels (STR with fewer labels) is important…

Computer Vision and Pattern Recognition · Computer Science 2021-06-08 Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa

AutoSTR: Efficient Backbone Search for Scene Text Recognition

Scene text recognition (STR) is very challenging due to the diversity of text instances and the complexity of scenes. The community has paid increasing attention to boost the performance by improving the pre-processing image module, like…

Computer Vision and Pattern Recognition · Computer Science 2020-07-17 Hui Zhang, Quanming Yao, Mingkun Yang, Yongchao Xu, Xiang Bai

EventSTR: A Benchmark Dataset and Baselines for Event Stream based Scene Text Recognition

Mainstream Scene Text Recognition (STR) algorithms are developed based on RGB cameras which are sensitive to challenging factors such as low illumination, motion blur, and cluttered backgrounds. In this paper, we propose to recognize the…

Computer Vision and Pattern Recognition · Computer Science 2025-02-14 Xiao Wang, Jingtao Jiang, Dong Li, Futian Wang, Lin Zhu, Yaowei Wang, Yongyong Tian, Jin Tang

Stroke-Based Scene Text Erasing Using Synthetic Data for Training

Scene text erasing, which replaces text regions with reasonable content in natural images, has drawn significant attention in the computer vision community in recent years. There are two potential subtasks in scene text erasing: text…

Computer Vision and Pattern Recognition · Computer Science 2021-12-06 Zhengmi Tang, Tomo Miyazaki, Yoshihiro Sugaya, Shinichiro Omachi

RoboAug: One Annotation to Hundreds of Scenes via Region-Contrastive Data Augmentation for Robotic Manipulation

Enhancing the generalization capability of robotic learning to enable robots to operate effectively in diverse, unseen scenes is a fundamental and challenging problem. Existing approaches often depend on pretraining with large-scale data…

Robotics · Computer Science 2026-02-17 Xinhua Wang, Kun Wu, Zhen Zhao, Hu Cao, Yinuo Zhao, Zhiyuan Xu, Meng Li, Shichao Fan, Di Wu, Yixue Zhang, Ning Liu, Zhengping Che, Jian Tang

Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing

Existing scene text recognition (STR) methods struggle to recognize challenging texts, especially for artistic and severely distorted characters. The limitation lies in the insufficient exploration of character morphologies, including the…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Yadong Qu, Yuxin Wang, Bangbang Zhou, Zixiao Wang, Hongtao Xie, Yongdong Zhang

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

Handwritten text and scene text suffer from various shapes and distorted patterns. Thus training a robust recognition model requires a large amount of data to cover diversity as much as possible. In contrast to data collection and…

Computer Vision and Pattern Recognition · Computer Science 2020-03-17 Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Yongpan Wang

Feature-to-Image Data Augmentation: Improving Model Feature Extraction with Cluster-Guided Synthetic Samples

One of the growing trends in machine learning is the use of data generation techniques, since the performance of machine learning models is dependent on the quantity of the training dataset. However, in many real-world applications,…

Artificial Intelligence · Computer Science 2025-04-25 Yasaman Haghbin, Hadi Moradi, Reshad Hosseini

STA: Self-controlled Text Augmentation for Improving Text Classifications

Despite recent advancements in Machine Learning, many tasks still involve working in low-data regimes which can make solving natural language problems difficult. Recently, a number of text augmentation techniques have emerged in the field…

Computation and Language · Computer Science 2023-02-27 Congcong Wang, Gonzalo Fiz Pontiveros, Steven Derby, Tri Kurniawan Wijaya

AudRandAug: Random Image Augmentations for Audio Classification

Data augmentation has proven to be effective in training neural networks. Recently, a method called RandAug was proposed, randomly selecting data augmentation techniques from a predefined search space. RandAug has demonstrated significant…

Sound · Computer Science 2023-09-12 Teerath Kumar, Muhammad Turab, Alessandra Mileo, Malika Bendechache, Takfarinas Saber

RandAugment: Practical automated data augmentation with a reduced search space

Recent work has shown that data augmentation has the potential to significantly improve the generalization of deep learning models. Recently, automated augmentation strategies have led to state-of-the-art results in image classification and…

Computer Vision and Pattern Recognition · Computer Science 2019-11-15 Ekin D. Cubuk, Barret Zoph, Jonathon Shlens, Quoc V. Le