Related papers: Object-Centric Unsupervised Image Captioning

200 papers

Deep neural networks have achieved great successes on the image captioning task. However, most of the existing models depend heavily on paired image-sentence datasets, which are very expensive to acquire. In this paper, we make the first…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Yang Feng, Lin Ma, Wei Liu, Jiebo Luo

Image captioning, a fundamental task in vision-language understanding, seeks to generate accurate natural language descriptions for provided images. Current image captioning approaches heavily rely on high-quality image-caption pairs, which…

Computer Vision and Pattern Recognition · Computer Science 2023-11-03 Chuanyang Jin

Understanding images without explicit supervision has become an important problem in computer vision. In this paper, we address image captioning by generating language descriptions of scenes without learning from annotated pairs of images…

Computer Vision and Pattern Recognition · Computer Science 2019-08-27 Iro Laina, Christian Rupprecht, Nassir Navab

Image caption generation is a long standing and challenging problem at the intersection of computer vision and natural language processing. A number of recently proposed approaches utilize a fully supervised object recognition model within…

Computer Vision and Pattern Recognition · Computer Science 2019-08-02 Berkan Demirel, Ramazan Gokberk Cinbis, Nazli Ikizler-Cinbis

Image captioning has emerged as an interesting research field in recent years due to its broad application scenarios. The traditional paradigm of image captioning relies on paired image-caption datasets to train the model in a supervised…

Computation and Language · Computer Science 2022-02-08 Jiahui Gao, Yi Zhou, Philip L. H. Yu, Shafiq Joty, Jiuxiang Gu

Image captioning is a task in the field of Artificial Intelligence that merges computer vision and natural language processing. It is responsible for generating legends that describe images, and has various applications like…

Computer Vision and Pattern Recognition · Computer Science 2021-07-29 Ahmed Elhagry, Karima Kadaoui

Image captioning is a multimodal task involving computer vision and natural language processing, where the goal is to learn a mapping from the image to its natural language description. In general, the mapping function is learned from a…

Computer Vision and Pattern Recognition · Computer Science 2018-07-19 Jiuxiang Gu, Shafiq Joty, Jianfei Cai, Gang Wang

Image captioning models are becoming increasingly successful at describing the content of images in restricted domains. However, if these models are to function in the wild - for example, as assistants for people with impaired vision - a…

Computer Vision and Pattern Recognition · Computer Science 2018-11-29 Peter Anderson, Stephen Gould, Mark Johnson

The goal of unpaired image captioning (UIC) is to describe images without using image-caption pairs in the training phase. Although challenging, we expect the task can be accomplished by leveraging a training set of images aligned with…

Computer Vision and Pattern Recognition · Computer Science 2022-03-08 Peipei Zhu, Xiao Wang, Yong Luo, Zhenglong Sun, Wei-Shi Zheng, Yaowei Wang, Changwen Chen

Unsupervised image captioning is a challenging task that aims at generating captions without the supervision of image-sentence pairs, but only with images and sentences drawn from different sources and object labels detected from the…

Computation and Language · Computer Science 2021-06-02 Ukyo Honda, Yoshitaka Ushiku, Atsushi Hashimoto, Taro Watanabe, Yuji Matsumoto

Image captioning is a research area of immense importance, aiming to generate natural language descriptions for visual content in the form of still images. The advent of deep learning and more recently vision-language pre-training…

Computer Vision and Pattern Recognition · Computer Science 2023-08-29 Taraneh Ghandi, Hamidreza Pourreza, Hamidreza Mahyar

Object detection is a fundamental task in computer vision, requiring large annotated datasets that are difficult to collect, as annotators need to label objects and their bounding boxes. Thus, it is a significant challenge to use cheaper…

Computer Vision and Pattern Recognition · Computer Science 2020-10-01 Achiya Jerbi, Roei Herzig, Jonathan Berant, Gal Chechik, Amir Globerson

Learning with complete or partial supervision is powerful but relies on ever-growing human annotation efforts. As a way to mitigate this serious problem, as well as to serve specific applications, unsupervised learning has emerged as an…

Computer Vision and Pattern Recognition · Computer Science 2019-04-08 Huy V. Vo, Francis Bach, Minsu Cho, Kai Han, Yann LeCun, Patrick Perez, Jean Ponce

This paper discusses and demonstrates the outcomes from our experimentation on Image Captioning. Image captioning is a much more involved task than image recognition or classification, because of the additional challenge of recognizing the…

Computer Vision and Pattern Recognition · Computer Science 2018-05-24 Vikram Mullachery, Vishal Motwani

Generating a description of an image is called image captioning. Image captioning requires recognizing the important objects, their attributes and their relationships in an image. It also needs to generate syntactically and semantically…

Computer Vision and Pattern Recognition · Computer Science 2018-10-16 Md. Zakir Hossain, Ferdous Sohel, Mohd Fairuz Shiratuddin, Hamid Laga

We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models. Constructing a large-scale labeled image captioning dataset is an expensive task in terms of labor, time, and cost. In…

Computer Vision and Pattern Recognition · Computer Science 2023-01-27 Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, In So Kweon

Constructing an organized dataset comprised of a large number of images and several captions for each image is a laborious task, which requires vast human effort. On the other hand, collecting a large number of images and sentences…

Computer Vision and Pattern Recognition · Computer Science 2019-11-22 Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, In So Kweon

Generating image descriptions in different languages is essential to satisfy users worldwide. However, it is prohibitively expensive to collect a large-scale paired image-caption dataset for every target language, which is critical for…

Computer Vision and Pattern Recognition · Computer Science 2019-08-16 Yuqing Song, Shizhe Chen, Yida Zhao, Qin Jin

State-of-the-art approaches for image captioning require supervised training data consisting of captions with paired image data. These methods are typically unable to use unsupervised data such as textual data with no corresponding images,…

Computer Vision and Pattern Recognition · Computer Science 2017-06-27 Wenhu Chen, Aurelien Lucchi, Thomas Hofmann

Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision community. In this paper, we present a novel image captioning architecture to better explore semantics…

Computer Vision and Pattern Recognition · Computer Science 2020-06-23 Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu