Related papers: Generating Text Sequence Images for Recognition

Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation

We propose a new paradigm to automatically generate training data with accurate labels at scale using the text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation…

Computer Vision and Pattern Recognition · Computer Science 2023-09-13 Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

Adversarial Text-to-Image Synthesis: A Review

With the advent of generative adversarial networks, synthesizing images from textual descriptions has recently become an active research area. It is a flexible and intuitive way for conditional image generation with significant progress in…

Computer Vision and Pattern Recognition · Computer Science 2021-10-07 Stanislav Frolov, Tobias Hinz, Federico Raue, Jörn Hees, Andreas Dengel

Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition

In this work we present a framework for the recognition of natural scene text. Our framework does not require any human-labelled data, and performs word recognition on the whole image holistically, departing from the character based…

Computer Vision and Pattern Recognition · Computer Science 2014-12-10 Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman

Semantic Image Synthesis via Adversarial Learning

In this paper, we propose a way of synthesizing realistic images directly with natural language description, which has many useful applications, e.g. intelligent image manipulation. We attempt to accomplish such synthesis: given a source…

Computer Vision and Pattern Recognition · Computer Science 2017-07-24 Hao Dong, Simiao Yu, Chao Wu, Yike Guo

Style Generation: Image Synthesis based on Coarsely Matched Texts

Previous text-to-image synthesis algorithms typically use explicit textual instructions to generate/manipulate images accurately, but they have difficulty adapting to guidance in the form of coarsely matched texts. In this work, we attempt…

Computer Vision and Pattern Recognition · Computer Science 2023-09-12 Mengyao Cui, Zhe Zhu, Shao-Ping Lu, Yulu Yang

Text-to-Image Synthesis Based on Machine Generated Captions

Text to Image Synthesis refers to the process of automatic generation of a photo-realistic image starting from a given text and is revolutionizing many real-world applications. In order to perform such process it is necessary to exploit…

Machine Learning · Computer Science 2019-10-10 Marco Menardi, Alex Falcon, Saida S. Mohamed, Lorenzo Seidenari, Giuseppe Serra, Alberto Del Bimbo, Carlo Tasso

Photographic Text-to-Image Synthesis with a Hierarchically-nested Adversarial Network

This paper presents a novel method to deal with the challenging task of generating photographic images conditioned on semantic image descriptions. Our method introduces accompanying hierarchical-nested adversarial objectives inside the…

Computer Vision and Pattern Recognition · Computer Science 2018-04-10 Zizhao Zhang, Yuanpu Xie, Lin Yang

LAFITE: Towards Language-Free Training for Text-to-Image Generation

One of the major challenges in training text-to-image generation models is the need of a large number of high-quality image-text pairs. While image samples are often easily accessible, the associated text descriptions typically require…

Computer Vision and Pattern Recognition · Computer Science 2022-03-25 Yufan Zhou, Ruiyi Zhang, Changyou Chen, Chunyuan Li, Chris Tensmeyer, Tong Yu, Jiuxiang Gu, Jinhui Xu, Tong Sun

TiVGAN: Text to Image to Video Generation with Step-by-Step Evolutionary Generator

Advances in technology have led to the development of methods that can create desired visual multimedia. In particular, image generation using deep learning has been extensively studied across diverse fields. In comparison, video…

Computer Vision and Pattern Recognition · Computer Science 2021-06-29 Doyeon Kim, Donggyu Joo, Junmo Kim

Training-free Composite Scene Generation for Layout-to-Image Synthesis

Recent breakthroughs in text-to-image diffusion models have significantly advanced the generation of high-fidelity, photo-realistic images from textual descriptions. Yet, these models often struggle with interpreting spatial arrangements…

Computer Vision and Pattern Recognition · Computer Science 2024-07-19 Jiaqi Liu, Tao Huang, Chang Xu

Learning to Read by Spelling: Towards Unsupervised Text Recognition

This work presents a method for visual text recognition without using any paired supervisory data. We formulate the text recognition task as one of aligning the conditional distribution of strings predicted from given text images, with…

Computer Vision and Pattern Recognition · Computer Science 2018-12-11 Ankush Gupta, Andrea Vedaldi, Andrew Zisserman

Weakly Supervised Scene Text Generation for Low-resource Languages

A large number of annotated training images is crucial for training successful scene text recognition models. However, collecting sufficient datasets can be a labor-intensive and costly process, particularly for low-resource languages. To…

Computer Vision and Pattern Recognition · Computer Science 2023-06-28 Yangchen Xie, Xinyuan Chen, Hongjian Zhan, Palaiahankote Shivakum, Bing Yin, Cong Liu, Yue Lu

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

Image-based sequence recognition has been a long-standing research topic in computer vision. In this paper, we investigate the problem of scene text recognition, which is among the most important and challenging tasks in image-based…

Computer Vision and Pattern Recognition · Computer Science 2015-07-22 Baoguang Shi, Xiang Bai, Cong Yao

Augmented Conditioning Is Enough For Effective Training Image Generation

Image generation abilities of text-to-image diffusion models have significantly advanced, yielding highly photo-realistic images from descriptive text and increasing the viability of leveraging synthetic images to train computer vision…

Computer Vision and Pattern Recognition · Computer Science 2025-02-10 Jiahui Chen, Amy Zhang, Adriana Romero-Soriano

Efficient Neural Architecture for Text-to-Image Synthesis

Text-to-image synthesis is the task of generating images from text descriptions. Image generation, by itself, is a challenging task. When we combine image generation and text, we bring complexity to a new level: we need to combine data from…

Machine Learning · Computer Science 2020-04-27 Douglas M. Souza, Jônatas Wehrmann, Duncan D. Ruiz

Language Generation with Recurrent Generative Adversarial Networks without Pre-training

Generative Adversarial Networks (GANs) have shown great promise recently in image generation. Training GANs for language generation has proven to be more difficult, because of the non-differentiable nature of generating text with recurrent…

Computation and Language · Computer Science 2017-12-22 Ofir Press, Amir Bar, Ben Bogin, Jonathan Berant, Lior Wolf

CPGAN: Full-Spectrum Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis

Typical methods for text-to-image synthesis seek to design effective generative architecture to model the text-to-image mapping directly. It is fairly arduous due to the cross-modality translation. In this paper we circumvent this problem…

Computer Vision and Pattern Recognition · Computer Science 2020-07-14 Jiadong Liang, Wenjie Pei, Feng Lu

Adversarial Generation of Handwritten Text Images Conditioned on Sequences

State-of-the-art offline handwriting text recognition systems tend to use neural networks and therefore require a large amount of annotated data to be trained. In order to partially satisfy this requirement, we propose a system based on…

Computer Vision and Pattern Recognition · Computer Science 2020-11-12 Eloi Alonso, Bastien Moysset, Ronaldo Messina

Text + Sketch: Image Compression at Ultra Low Rates

Recent advances in text-to-image generative models provide the ability to generate high-quality images from short text descriptions. These foundation models, when pre-trained on billion-scale datasets, are effective for various downstream…

Machine Learning · Computer Science 2023-07-06 Eric Lei, Yiğit Berkay Uslu, Hamed Hassani, Shirin Saeedi Bidokhti

Unsupervised Image-to-Image Translation with Generative Adversarial Networks

It's useful to automatically transform an image from its original form to some synthetic form (style, partial contents, etc.), while keeping the original structure or semantics. We define this requirement as the "image-to-image translation"…

Computer Vision and Pattern Recognition · Computer Science 2017-01-11 Hao Dong, Paarth Neekhara, Chao Wu, Yike Guo