Related papers: Pragmatic Issue-Sensitive Image Captioning

Pragmatically Informative Image Captioning with Character-Level Inference

We combine a neural image captioner with a Rational Speech Acts (RSA) model to make a system that is pragmatically informative: its objective is to produce captions that are not merely true but also distinguish their inputs from similar…

Computation and Language · Computer Science 2018-05-11 Reuben Cohn-Gordon, Noah Goodman, Christopher Potts

Improving Image Captioning with Better Use of Captions

Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision community. In this paper, we present a novel image captioning architecture to better explore semantics…

Computer Vision and Pattern Recognition · Computer Science 2020-06-23 Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu

Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights

Contextualized Image Captioning (CIC) evolves traditional image captioning into a more complex domain, necessitating the ability for multimodal reasoning. It aims to generate image captions given specific contextual information. This paper…

Computer Vision and Pattern Recognition · Computer Science 2024-07-17 Shunqi Mao, Chaoyi Zhang, Hang Su, Hwanjun Song, Igor Shalyminov, Weidong Cai

CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation

Controllable Image Captioning (CIC) aims at generating natural language descriptions for an image, conditioned on information provided by end users, e.g., regions, entities or events of interest. However, available image-language datasets…

Computer Vision and Pattern Recognition · Computer Science 2024-07-18 Kalliopi Basioti, Mohamed A. Abdelsalam, Federico Fancellu, Vladimir Pavlovic, Afsaneh Fazly

Prompt-based Learning for Unpaired Image Captioning

Unpaired Image Captioning (UIC) has been developed to learn image descriptions from unaligned vision-language sample pairs. Existing works usually tackle this task using adversarial learning and visual concept reward based on reinforcement…

Computer Vision and Pattern Recognition · Computer Science 2022-11-21 Peipei Zhu, Xiao Wang, Lin Zhu, Zhenglong Sun, Weishi Zheng, Yaowei Wang, Changwen Chen

Towards Automatic Satellite Images Captions Generation Using Large Language Models

Automatic image captioning is a promising technique for conveying visual information using natural language. It can benefit various tasks in satellite remote sensing, such as environmental monitoring, resource management, disaster…

Computer Vision and Pattern Recognition · Computer Science 2023-10-18 Yingxu He, Qiqi Sun

AGIC: Attention-Guided Image Captioning to Improve Caption Relevance

Despite significant progress in image captioning, generating accurate and descriptive captions remains a long-standing challenge. In this study, we propose Attention-Guided Image Captioning (AGIC), which amplifies salient visual regions…

Computer Vision and Pattern Recognition · Computer Science 2025-08-12 L. D. M. S. Sai Teja, Ashok Urlana, Pruthwik Mishra

Image Captioning at Will: A Versatile Scheme for Effectively Injecting Sentiments into Image Descriptions

Automatic image captioning has recently approached human-level performance due to the latest advances in computer vision and natural language understanding. However, most of the current models can only generate plain factual descriptions…

Computer Vision and Pattern Recognition · Computer Science 2018-01-31 Quanzeng You, Hailin Jin, Jiebo Luo

Efficient Modeling of Future Context for Image Captioning

Existing approaches to image captioning usually generate the sentence word-by-word from left to right, with the constraint of conditioned on local context including the given image and history generated words. There have been many studies…

Computer Vision and Pattern Recognition · Computer Science 2022-10-19 Zhengcong Fei, Junshi Huang, Xiaoming Wei, Xiaolin Wei

Putting Humans in the Image Captioning Loop

Image Captioning (IC) models can highly benefit from human feedback in the training process, especially in cases where data is limited. We present work-in-progress on adapting an IC system to integrate human feedback, with the goal to make…

Computation and Language · Computer Science 2023-06-07 Aliki Anagnostopoulou, Mareike Hartmann, Daniel Sonntag

SPICE: Semantic Propositional Image Caption Evaluation

There is considerable interest in the task of automatically generating image captions. However, evaluation is challenging. Existing automatic evaluation metrics are primarily sensitive to n-gram overlap, which is neither necessary nor…

Computer Vision and Pattern Recognition · Computer Science 2016-08-01 Peter Anderson, Basura Fernando, Mark Johnson, Stephen Gould

SubICap: Towards Subword-informed Image Captioning

Existing Image Captioning (IC) systems model words as atomic units in captions and are unable to exploit the structural information in the words. This makes representation of rare words very difficult and out-of-vocabulary words impossible.…

Computation and Language · Computer Science 2020-12-25 Naeha Sharif, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah

Image Captioning with Clause-Focused Metrics in a Multi-Modal Setting for Marketing

Automatically generating descriptive captions for images is a well-researched area in computer vision. However, existing evaluation approaches focus on measuring the similarity between two sentences disregarding fine-grained semantics of…

Computer Vision and Pattern Recognition · Computer Science 2019-08-07 Philipp Harzig, Dan Zecha, Rainer Lienhart, Carolin Kaiser, René Schallner

Face-Cap: Image Captioning using Facial Expression Analysis

Image captioning is the process of generating a natural language description of an image. Most current image captioning models, however, do not take into account the emotional aspect of an image, which is very relevant to activities and…

Computer Vision and Pattern Recognition · Computer Science 2019-01-28 Omid Mohamad Nezami, Mark Dras, Peter Anderson, Len Hamey

Human-like Controllable Image Captioning with Verb-specific Semantic Roles

Controllable Image Captioning (CIC) -- generating image descriptions following designated control signals -- has received unprecedented attention over the last few years. To emulate the human ability in controlling caption generation,…

Computer Vision and Pattern Recognition · Computer Science 2021-03-24 Long Chen, Zhihong Jiang, Jun Xiao, Wei Liu

Image Captioning

This paper discusses and demonstrates the outcomes from our experimentation on Image Captioning. Image captioning is a much more involved task than image recognition or classification, because of the additional challenge of recognizing the…

Computer Vision and Pattern Recognition · Computer Science 2018-05-24 Vikram Mullachery, Vishal Motwani

Cognitive resilience: Unraveling the proficiency of image-captioning models to interpret masked visual content

This study explores the ability of Image Captioning (IC) models to decode masked visual content sourced from diverse datasets. Our findings reveal the IC model's capability to generate captions from masked images, closely resembling the…

Computer Vision and Pattern Recognition · Computer Science 2024-03-26 Zhicheng Du, Zhaotian Xie, Huazhang Ying, Likun Zhang, Peiwu Qin

A Comprehensive Analysis of Real-World Image Captioning and Scene Identification

Image captioning is a computer vision task that involves generating natural language descriptions for images. This method has numerous applications in various domains, including image retrieval systems, medicine, and various industries.…

Computer Vision and Pattern Recognition · Computer Science 2023-08-08 Sai Suprabhanu Nallapaneni, Subrahmanyam Konakanchi

Aesthetic Image Captioning From Weakly-Labelled Photographs

Aesthetic image captioning (AIC) refers to the multi-modal task of generating critical textual feedbacks for photographs. While in natural image captioning (NIC), deep models are trained in an end-to-end manner using large curated datasets…

Computer Vision and Pattern Recognition · Computer Science 2019-08-30 Koustav Ghosal, Aakanksha Rana, Aljosa Smolic

COMIC: Towards A Compact Image Captioning Model with Attention

Recent works in image captioning have shown very promising raw performance. However, we realize that most of these encoder-decoder style networks with attention do not scale naturally to large vocabulary size, making them difficult to be…

Computer Vision and Pattern Recognition · Computer Science 2019-06-13 Jia Huei Tan, Chee Seng Chan, Joon Huang Chuah