Related papers: Image Captioning with Object Detection and Localiz…

200 papers

We address the problem of jointly learning vision and language to understand the object in a fine-grained manner. The key idea of our approach is the use of object descriptions to provide the detailed understanding of an object. Based on…

Computer Vision and Pattern Recognition · Computer Science 2018-03-19 Anh Nguyen, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis

Image Captioning, or the automatic generation of descriptions for images, is one of the core problems in Computer Vision and has seen considerable progress using Deep Learning Techniques. We propose to use Inception-ResNet Convolutional…

Computer Vision and Pattern Recognition · Computer Science 2021-02-23 Sulabh Katiyar, Samir Kumar Borgohain

Image captioning often requires a large set of training image-sentence pairs. In practice, however, acquiring sufficient training pairs is always expensive, making the recent captioning models limited in their ability to describe objects…

Computer Vision and Pattern Recognition · Computer Science 2017-08-18 Ting Yao, Yingwei Pan, Yehao Li, Tao Mei

Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images. We describe how we can train this model in a deterministic manner using…

Machine Learning · Computer Science 2016-04-20 Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio

Automatically generating a natural language description of an image has attracted interest recently both because of its importance in practical applications and because it connects two major artificial intelligence fields: computer vision…

Computer Vision and Pattern Recognition · Computer Science 2016-03-15 Quanzeng You, Hailin Jin, Zhaowen Wang, Chen Fang, Jiebo Luo

Automatically generating the descriptions of an image, i.e., image captioning, is an important and fundamental topic in artificial intelligence, which bridges the gap between computer vision and natural language processing. Based on the…

Computer Vision and Pattern Recognition · Computer Science 2019-01-14 Shiyang Yan, Yuan Xie, Fangyu Wu, Jeremy S. Smith, Wenjin Lu, Bailing Zhang

Automatically creating a description of an image in a natural language such as English is a very challenging task. It requires expertise in both image processing and natural language processing. This paper discusses…

Computer Vision and Pattern Recognition · Computer Science 2018-10-03 Parth Shah, Vishvajit Bakarola, Supriya Pati

Visual attention plays an important role in understanding images and has proven effective in generating natural language descriptions of images. On the other hand, recent studies show that language associated with an image can steer…

Computer Vision and Pattern Recognition · Computer Science 2016-12-13 Jonghwan Mun, Minsu Cho, Bohyung Han

This paper presents a novel approach for automatically generating image descriptions: visual detectors, language models, and multimodal similarity models learnt directly from a dataset of image captions. We use multiple instance learning to…

Computer Vision and Pattern Recognition · Computer Science 2016-02-22 Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollár, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John C. Platt, C. Lawrence Zitnick, Geoffrey Zweig

In this work we formulate the problem of image captioning as a multimodal translation task. Analogous to machine translation, we present a sequence-to-sequence recurrent neural network (RNN) model for image caption generation. Different…

Computer Vision and Pattern Recognition · Computer Science 2017-08-11 Chang Liu, Fuchun Sun, Changhu Wang, Feng Wang, Alan Yuille

It is widely believed that modeling relationships between objects would be helpful for representing and eventually describing an image. Nevertheless, there has not been evidence in support of this idea for image description generation.…

Computer Vision and Pattern Recognition · Computer Science 2018-09-20 Ting Yao, Yingwei Pan, Yehao Li, Tao Mei

A picture is worth a thousand words. Not until recently, however, have we seen some success stories in the understanding of visual scenes: a model that is able to detect/name objects, describe their attributes, and recognize their…

Computation and Language · Computer Science 2017-10-27 Ying Hua Tan, Chee Seng Chan

We present a model that generates natural language descriptions of images and their regions. Our approach leverages datasets of images and their sentence descriptions to learn about the inter-modal correspondences between language and…

Computer Vision and Pattern Recognition · Computer Science 2015-04-15 Andrej Karpathy, Li Fei-Fei

Image captioning is the process of automatically generating a description of an image in natural language. Image captioning is one of the significant challenges in image understanding since it requires not only recognizing salient objects…

Computer Vision and Pattern Recognition · Computer Science 2022-07-26 Ghadah Alabduljabbar, Hafida Benhidour, Said Kerrache

Generating natural language descriptions for images is a challenging task. The traditional way is to use a convolutional neural network (CNN) to extract image features, followed by a recurrent neural network (RNN) to generate sentences. In…

Computer Vision and Pattern Recognition · Computer Science 2016-02-08 Shijian Tang, Song Han

Automatically generating a human-like description for a given image is a potential research topic in artificial intelligence, which has attracted a great deal of attention recently. Most of the existing attention methods explore the mapping…

Computer Vision and Pattern Recognition · Computer Science 2020-11-03 Feicheng Huang, Zhixin Li, Haiyang Wei, Canlong Zhang, Huifang Ma

In this paper, we propose an end-to-end CNN-LSTM model for generating descriptions for sequential images with a local-object attention mechanism. To generate coherent descriptions, we capture global semantic context using a multi-layer…

Computation and Language · Computer Science 2020-12-03 Jing Su, Chenghua Lin, Mian Zhou, Qingyun Dai, Haoyu Lv

Automatically describing an image with a natural language has been an emerging challenge in both fields of computer vision and natural language processing. In this paper, we present Long Short-Term Memory with Attributes (LSTM-A) - a novel…

Computer Vision and Pattern Recognition · Computer Science 2016-11-08 Ting Yao, Yingwei Pan, Yehao Li, Zhaofan Qiu, Tao Mei

With the huge expansion of the internet and trillions of gigabytes of data generated every single day, the development of various tools has become mandatory in order to maintain system adaptability to rapid changes. One of these…

Computer Vision and Pattern Recognition · Computer Science 2020-09-08 Borneel Bikash Phukan, Amiya Ranjan Panda

In this paper we explore the bi-directional mapping between images and their sentence-based descriptions. We propose learning this mapping using a recurrent neural network. Unlike previous approaches that map both sentences and images to a…

Computer Vision and Pattern Recognition · Computer Science 2014-11-21 Xinlei Chen, C. Lawrence Zitnick