Related papers: Hyperparameter Analysis for Image Captioning

200 papers

Automatic image captioning, a multifaceted task bridging computer vision and natural language processing, aims to generate descriptive textual content from visual input. While Convolutional Neural Networks (CNNs) and Long Short-Term Memory…

Computer Vision and Pattern Recognition · Computer Science 2025-09-29 Amanuel Tafese Dufera

This project aims to create an automated image captioning system that generates natural language descriptions for input images by integrating techniques from computer vision and natural language processing. We employ various different…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Joshua Adrian Cahyono, Jeremy Nathan Jusuf

Image captioning by the encoder-decoder framework has shown tremendous advancement in the last decade where CNN is mainly used as encoder and LSTM is used as a decoder. Despite such an impressive achievement in terms of accuracy in simple…

Computer Vision and Pattern Recognition · Computer Science 2023-01-09 Rana Adnan Ahmad, Muhammad Azhar, Hina Sattar

In a globalized world at the present epoch of generative intelligence, most of the manual labour tasks are automated with increased efficiency. This can support businesses to save time and money. A crucial component of generative…

Computer Vision and Pattern Recognition · Computer Science 2023-03-07 Pranav Dandwate, Chaitanya Shahane, Vandana Jagtap, Shridevi C. Karande

Modern Neural Networks are eminent in achieving state of the art performance on tasks under Computer Vision, Natural Language Processing and related verticals. However, they are notorious for their voracious memory and compute appetite…

Computer Vision and Pattern Recognition · Computer Science 2020-12-18 Harshit Rampal, Aman Mohanty

The aim of image captioning is to generate textual description of a given image. Though seemingly an easy task for humans, it is challenging for machines as it requires the ability to comprehend the image (computer vision) and consequently…

Computer Vision and Pattern Recognition · Computer Science 2020-11-12 Anubhav Shrimal, Tanmoy Chakraborty

Image captioning is a challenging task that combines the field of computer vision and natural language processing. A variety of approaches have been proposed to achieve the goal of automatically describing an image, and recurrent neural…

Computer Vision and Pattern Recognition · Computer Science 2018-05-24 Qingzhong Wang, Antoni B. Chan

Automated image captioning is one of the applications of Deep Learning which involves fusion of work done in computer vision and natural language processing, and it is typically performed using Encoder-Decoder architectures. In this…

Computer Vision and Pattern Recognition · Computer Science 2021-05-25 Aditya Bhattacharya, Eshwar Shamanna Girishekar, Padmakar Anil Deshpande

Automatic captioning of images is a task that combines the challenges of image analysis and text generation. One important aspect in captioning is the notion of attention: How to decide what to describe and in which order. Inspired by the…

Computer Vision and Pattern Recognition · Computer Science 2020-10-06 Sen He, Wentong Liao, Hamed R. Tavakoli, Michael Yang, Bodo Rosenhahn, Nicolas Pugeault

In today's world, image processing plays a crucial role across various fields, from scientific research to industrial applications. But one particularly exciting application is image captioning. The potential impact of effective image…

Computer Vision and Pattern Recognition · Computer Science 2024-04-30 Md Alif Rahman Ridoy, M Mahmud Hasan, Shovon Bhowmick

Image captioning creates informative text from an input image by creating a relationship between the words and the actual content of an image. Recently, deep learning models that utilize transformers have been the most successful in…

Computer Vision and Pattern Recognition · Computer Science 2025-01-28 Israa Al Badarneh, Bassam Hammo, Omar Al-Kadi

Image captioning is a technology that produces text-based descriptions for an image. Deep learning-based solutions built on top of feature recognition may very well serve the purpose. But as with any other machine learning solution, the…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Rishi Kesav Mohan, Sanjay Sureshkumar, Vignesh Sivasubramaniam

Change captioning aims to describe the semantic change between a pair of similar images in natural language. It is more challenging than general image captioning, because it requires capturing fine-grained change information while being…

Computer Vision and Pattern Recognition · Computer Science 2023-03-07 Yunbin Tu, Liang Li, Li Su, Ke Lu, Qingming Huang

This paper describes our winning entry in the ImageCLEF 2015 image sentence generation task. We improve Google's CNN-LSTM model by introducing concept-based sentence reranking, a data-driven approach which exploits the large amounts of…

Computer Vision and Pattern Recognition · Computer Science 2016-05-04 Xirong Li, Qin Jin

CNN-LSTM based architectures have played an important role in image captioning, but limited by the training efficiency and expression ability, researchers began to explore the CNN-Transformer based models and achieved great success.…

Computer Vision and Pattern Recognition · Computer Science 2022-03-30 Yiyu Wang, Jungang Xu, Yingfei Sun

In this paper, we consider the image captioning task from a new sequence-to-sequence prediction perspective and propose CaPtion TransformeR (CPTR) which takes the sequentialized raw images as the input to Transformer. Compared to the…

Computer Vision and Pattern Recognition · Computer Science 2021-01-29 Wei Liu, Sihan Chen, Longteng Guo, Xinxin Zhu, Jing Liu

The Convolutional Neural Network (CNN) has been the dominant image feature extractor in computer vision for years. However, it fails to get the relationship between images/objects and their hierarchical interactions which can be helpful for…

Computer Vision and Pattern Recognition · Computer Science 2019-12-05 Zheng-cong Fei

Language and vision are processed as two different modalities in current work on image captioning. However, recent work on the Super Characters method shows the effectiveness of two-dimensional word embedding, which converts text classification…

Computation and Language · Computer Science 2019-06-05 Baohua Sun, Lin Yang, Michael Lin, Charles Young, Patrick Dong, Wenhan Zhang, Jason Dong

Sentiment analysis of online user generated content is important for many social media analytics tasks. Researchers have largely relied on textual sentiment analysis to develop systems to predict political elections, measure economic…

Computer Vision and Pattern Recognition · Computer Science 2015-09-22 Quanzeng You, Jiebo Luo, Hailin Jin, Jianchao Yang

Image Captioning (IC) has achieved astonishing developments by incorporating various techniques into the CNN-RNN encoder-decoder architecture. However, since CNN and RNN do not share the basic network component, such a heterogeneous…

Computer Vision and Pattern Recognition · Computer Science 2022-04-18 Yang Xu, Li Li, Haiyang Xu, Songfang Huang, Fei Huang, Jianfei Cai