Related papers: Image Captioning based on Deep Reinforcement Learn…

Deep Reinforcement Learning-based Image Captioning with Embedding Reward

Image captioning is a challenging problem owing to the complexity in understanding the image content and diverse ways of describing it in natural language. Recent advances in deep neural networks have substantially improved the performance…

Computer Vision and Pattern Recognition · Computer Science 2017-04-14 Zhou Ren , Xiaoyu Wang , Ning Zhang , Xutao Lv , Li-Jia Li

Image Captioning using Deep Neural Architectures

Automatically creating the description of an image using any natural languages sentence like English is a very challenging task. It requires expertise of both image processing as well as natural language processing. This paper discuss about…

Computer Vision and Pattern Recognition · Computer Science 2018-10-03 Parth Shah , Vishvajit Bakarola , Supriya Pati

Image Captioning using Deep Stacked LSTMs, Contextual Word Embeddings and Data Augmentation

Image Captioning, or the automatic generation of descriptions for images, is one of the core problems in Computer Vision and has seen considerable progress using Deep Learning Techniques. We propose to use Inception-ResNet Convolutional…

Computer Vision and Pattern Recognition · Computer Science 2021-02-23 Sulabh Katiyar , Samir Kumar Borgohain

A Comprehensive Survey of Deep Learning for Image Captioning

Generating a description of an image is called image captioning. Image captioning requires to recognize the important objects, their attributes and their relationships in an image. It also needs to generate syntactically and semantically…

Computer Vision and Pattern Recognition · Computer Science 2018-10-16 Md. Zakir Hossain , Ferdous Sohel , Mohd Fairuz Shiratuddin , Hamid Laga

Reflective Decoding Network for Image Captioning

State-of-the-art image captioning methods mostly focus on improving visual features, less attention has been paid to utilizing the inherent properties of language to boost captioning performance. In this paper, we show that vocabulary…

Computer Vision and Pattern Recognition · Computer Science 2019-09-02 Lei Ke , Wenjie Pei , Ruiyu Li , Xiaoyong Shen , Yu-Wing Tai

Deep Learning Approaches on Image Captioning: A Review

Image captioning is a research area of immense importance, aiming to generate natural language descriptions for visual content in the form of still images. The advent of deep learning and more recently vision-language pre-training…

Computer Vision and Pattern Recognition · Computer Science 2023-08-29 Taraneh Ghandi , Hamidreza Pourreza , Hamidreza Mahyar

An Efficient Technique for Image Captioning using Deep Neural Network

With the huge expansion of internet and trillions of gigabytes of data generated every single day, the needs for the development of various tools has become mandatory in order to maintain system adaptability to rapid changes. One of these…

Computer Vision and Pattern Recognition · Computer Science 2020-09-08 Borneel Bikash Phukan , Amiya Ranjan Panda

A Thorough Review on Recent Deep Learning Methodologies for Image Captioning

Image Captioning is a task that combines computer vision and natural language processing, where it aims to generate descriptive legends for images. It is a two-fold process relying on accurate image understanding and correct language…

Computer Vision and Pattern Recognition · Computer Science 2021-07-29 Ahmed Elhagry , Karima Kadaoui

Multi-modal reward for visual relationships-based image captioning

Deep neural networks have achieved promising results in automatic image captioning due to their effective representation learning and context-based content generation capabilities. As a prominent type of deep features used in many of the…

Computer Vision and Pattern Recognition · Computer Science 2023-03-22 Ali Abedi , Hossein Karshenas , Peyman Adibi

Actor-Critic Sequence Training for Image Captioning

Generating natural language descriptions of images is an important capability for a robot or other visual-intelligence driven AI agent that may need to communicate with human users about what it is seeing. Such image captioning methods are…

Computer Vision and Pattern Recognition · Computer Science 2017-11-29 Li Zhang , Flood Sung , Feng Liu , Tao Xiang , Shaogang Gong , Yongxin Yang , Timothy M. Hospedales

Enhancing Image Captioning with Neural Models

This research explores the realm of neural image captioning using deep learning models. The study investigates the performance of different neural architecture configurations, focusing on the inject architecture, and proposes a novel…

Computer Vision and Pattern Recognition · Computer Science 2023-12-04 Pooja Bhatnagar , Sai Mrunaal , Sachin Kamnure

Improving Image Captioning with Better Use of Captions

Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision community. In this paper, we present a novel image captioning architecture to better explore semantics…

Computer Vision and Pattern Recognition · Computer Science 2020-06-23 Zhan Shi , Xu Zhou , Xipeng Qiu , Xiaodan Zhu

Image Captioning based on Feature Refinement and Reflective Decoding

Image captioning is the process of automatically generating a description of an image in natural language. Image captioning is one of the significant challenges in image understanding since it requires not only recognizing salient objects…

Computer Vision and Pattern Recognition · Computer Science 2022-07-26 Ghadah Alabduljabbar , Hafida Benhidour , Said Kerrache

Towards Retrieval-Augmented Architectures for Image Captioning

The objective of image captioning models is to bridge the gap between the visual and linguistic modalities by generating natural language descriptions that accurately reflect the content of input images. In recent years, researchers have…

Computer Vision and Pattern Recognition · Computer Science 2024-05-24 Sara Sarto , Marcella Cornia , Lorenzo Baraldi , Alessandro Nicolosi , Rita Cucchiara

Image Captioning and Visual Question Answering Based on Attributes and External Knowledge

Much recent progress in Vision-to-Language problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). This approach does not explicitly represent high-level semantic…

Computer Vision and Pattern Recognition · Computer Science 2016-12-19 Qi Wu , Chunhua Shen , Anton van den Hengel , Peng Wang , Anthony Dick

Image Captioning based on Deep Learning Methods: A Survey

Image captioning is a challenging task and attracting more and more attention in the field of Artificial Intelligence, and which can be applied to efficient image retrieval, intelligent blind guidance and human-computer interaction, etc. In…

Computer Vision and Pattern Recognition · Computer Science 2019-05-21 Yiyu Wang , Jungang Xu , Yingfei Sun , Ben He

Retrieval-Augmented Transformer for Image Captioning

Image captioning models aim at connecting Vision and Language by providing natural language descriptions of input images. In the past few years, the task has been tackled by learning parametric models and proposing visual feature extraction…

Computer Vision and Pattern Recognition · Computer Science 2022-08-23 Sara Sarto , Marcella Cornia , Lorenzo Baraldi , Rita Cucchiara

A Weighted Multi-Criteria Decision Making Approach for Image Captioning

Image captioning aims at automatically generating descriptions of an image in natural language. This is a challenging problem in the field of artificial intelligence that has recently received significant attention in the computer vision…

Computer Vision and Pattern Recognition · Computer Science 2019-04-02 Hassan Maleki Galandouz , Mohsen Ebrahimi Moghaddam , Mehrnoush Shamsfard

SuperCaptioning: Image Captioning Using Two-dimensional Word Embedding

Language and vision are processed as two different modal in current work for image captioning. However, recent work on Super Characters method shows the effectiveness of two-dimensional word embedding, which converts text classification…

Computation and Language · Computer Science 2019-06-05 Baohua Sun , Lin Yang , Michael Lin , Charles Young , Patrick Dong , Wenhan Zhang , Jason Dong

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are…

Computer Vision and Pattern Recognition · Computer Science 2015-06-12 Junhua Mao , Wei Xu , Yi Yang , Jiang Wang , Zhiheng Huang , Alan Yuille