Related papers: Unsupervised Image Captioning

200 papers

Understanding images without explicit supervision has become an important problem in computer vision. In this paper, we address image captioning by generating language descriptions of scenes without learning from annotated pairs of images…

Computer Vision and Pattern Recognition · Computer Science 2019-08-27 Iro Laina, Christian Rupprecht, Nassir Navab

Image captioning is a longstanding problem in the field of computer vision and natural language processing. To date, researchers have produced impressive state-of-the-art performance in the age of deep learning. Most of these…

Computer Vision and Pattern Recognition · Computer Science 2022-07-20 Zihang Meng, David Yang, Xuefei Cao, Ashish Shah, Ser-Nam Lim

Image captioning has emerged as an interesting research field in recent years due to its broad application scenarios. The traditional paradigm of image captioning relies on paired image-caption datasets to train the model in a supervised…

Computation and Language · Computer Science 2022-02-08 Jiahui Gao, Yi Zhou, Philip L. H. Yu, Shafiq Joty, Jiuxiang Gu

State-of-the-art approaches for image captioning require supervised training data consisting of captions with paired image data. These methods are typically unable to use unsupervised data such as textual data with no corresponding images,…

Computer Vision and Pattern Recognition · Computer Science 2017-06-27 Wenhu Chen, Aurelien Lucchi, Thomas Hofmann

Image captioning, a fundamental task in vision-language understanding, seeks to generate accurate natural language descriptions for provided images. Current image captioning approaches heavily rely on high-quality image-caption pairs, which…

Computer Vision and Pattern Recognition · Computer Science 2023-11-03 Chuanyang Jin

Constructing an organized dataset comprised of a large number of images and several captions for each image is a laborious task, which requires vast human effort. On the other hand, collecting a large number of images and sentences…

Computer Vision and Pattern Recognition · Computer Science 2019-11-22 Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, In So Kweon

Most current image captioning models rely heavily on paired image-caption datasets. However, obtaining large-scale image-caption paired data is labor-intensive and time-consuming. In this paper, we present a scene graph-based approach for…

Computer Vision and Pattern Recognition · Computer Science 2019-08-20 Jiuxiang Gu, Shafiq Joty, Jianfei Cai, Handong Zhao, Xu Yang, Gang Wang

Image caption generation is a long-standing and challenging problem at the intersection of computer vision and natural language processing. A number of recently proposed approaches utilize a fully supervised object recognition model within…

Computer Vision and Pattern Recognition · Computer Science 2019-08-02 Berkan Demirel, Ramazan Gokberk Cinbis, Nazli Ikizler-Cinbis

We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models. Constructing a large-scale labeled image captioning dataset is an expensive task in terms of labor, time, and cost. In…

Computer Vision and Pattern Recognition · Computer Science 2023-01-27 Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, In So Kweon

Image captioning models are becoming increasingly successful at describing the content of images in restricted domains. However, if these models are to function in the wild - for example, as assistants for people with impaired vision - a…

Computer Vision and Pattern Recognition · Computer Science 2018-11-29 Peter Anderson, Stephen Gould, Mark Johnson

Automatically describing an image in a natural-language sentence, for example in English, is a very challenging task. It requires expertise in both image processing and natural language processing. This paper discusses…

Computer Vision and Pattern Recognition · Computer Science 2018-10-03 Parth Shah, Vishvajit Bakarola, Supriya Pati

Unsupervised image captioning is a challenging task that aims at generating captions without the supervision of image-sentence pairs, but only with images and sentences drawn from different sources and object labels detected from the…

Computation and Language · Computer Science 2021-06-02 Ukyo Honda, Yoshitaka Ushiku, Atsushi Hashimoto, Taro Watanabe, Yuji Matsumoto

Image captioning has so far been explored mostly in English, as most available datasets are in this language. However, the application of image captioning should not be restricted by language. Only a few studies have been conducted for image…

Computation and Language · Computer Science 2017-08-16 Weiyu Lan, Xirong Li, Jianfeng Dong

When automatically generating a sentence description for an image or video, it often remains unclear how well the generated caption is grounded, that is, whether the model uses the correct image regions to output particular words, or if the…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Chih-Yao Ma, Yannis Kalantidis, Ghassan AlRegib, Peter Vajda, Marcus Rohrbach, Zsolt Kira

While deep-learning models have been shown to perform well on image-to-text datasets, it is difficult to use them in practice for captioning images. This is because captions traditionally tend to be context-dependent and offer complementary…

Machine Learning · Computer Science 2023-06-07 Shinjini Ghosh, Sagnik Anupam

Recently, policy-gradient methods for reinforcement learning have been used to train deep end-to-end systems on natural language processing tasks. Moreover, with the complexity of understanding image content and…

Computer Vision and Pattern Recognition · Computer Science 2018-09-14 Haichao Shi, Peng Li, Bo Wang, Zhenyu Wang

Generating image descriptions in different languages is essential to satisfy users worldwide. However, it is prohibitively expensive to collect a large-scale paired image-caption dataset for every target language, which is critical for…

Computer Vision and Pattern Recognition · Computer Science 2019-08-16 Yuqing Song, Shizhe Chen, Yida Zhao, Qin Jin

Inspired by how the human brain employs a higher number of neural pathways when describing a highly focused subject, we show that deep attentive models used for the main vision-language task of image captioning could be extended to achieve…

Computer Vision and Pattern Recognition · Computer Science 2021-09-01 Zanyar Zohourianshahzadi, Jugal K. Kalita

Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision communities. In this paper, we present a novel image captioning architecture to better explore semantics…

Computer Vision and Pattern Recognition · Computer Science 2020-06-23 Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu

While recent deep neural network models have achieved promising results on the image captioning task, they rely largely on the availability of corpora with paired image and sentence captions to describe objects in context. In this work, we…

Computer Vision and Pattern Recognition · Computer Science 2016-04-29 Lisa Anne Hendricks, Subhashini Venugopalan, Marcus Rohrbach, Raymond Mooney, Kate Saenko, Trevor Darrell