Related papers: Explicit Image Caption Editing

200 papers

Explicit Caption Editing (ECE) -- refining reference image captions through a sequence of explicit edit operations (e.g., KEEP, DELETE) -- has attracted significant attention due to its explainable and human-like nature. After training with…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-07 · Zhen Wang, Xinyun Jiang, Jun Xiao, Tao Chen, Long Chen

This paper presents a new metric called TIGEr for the automatic evaluation of image captioning systems. Popular metrics, such as BLEU and CIDEr, are based solely on text matching between reference captions and machine-generated captions,…

Computation and Language · Computer Science · 2019-09-06 · Ming Jiang, Qiuyuan Huang, Lei Zhang, Xin Wang, Pengchuan Zhang, Zhe Gan, Jana Diesner, Jianfeng Gao

Image captioning is the process of automatically generating a description of an image in natural language. Image captioning is one of the significant challenges in image understanding since it requires not only recognizing salient objects…

Computer Vision and Pattern Recognition · Computer Science · 2022-07-26 · Ghadah Alabduljabbar, Hafida Benhidour, Said Kerrache

Distinctive Image Captioning (DIC) -- generating distinctive captions that describe the unique details of a target image -- has received considerable attention over the last few years. A recent DIC work proposes to generate distinctive…

Computer Vision and Pattern Recognition · Computer Science · 2022-07-25 · Yangjun Mao, Long Chen, Zhihong Jiang, Dong Zhang, Zhimeng Zhang, Jian Shao, Jun Xiao

Most image captioning frameworks generate captions directly from images, learning a mapping from visual features to natural language. However, editing existing captions can be easier than generating new ones from scratch. Intuitively, when…

Computer Vision and Pattern Recognition · Computer Science · 2020-03-09 · Fawaz Sammani, Luke Melas-Kyriazi

Image captioning creates informative text from an input image by creating a relationship between the words and the actual content of an image. Recently, deep learning models that utilize transformers have been the most successful in…

Computer Vision and Pattern Recognition · Computer Science · 2025-01-28 · Israa Al Badarneh, Bassam Hammo, Omar Al-Kadi

Image captioning is the process of generating a natural language description of an image. Most current image captioning models, however, do not take into account the emotional aspect of an image, which is very relevant to activities and…

Computer Vision and Pattern Recognition · Computer Science · 2019-01-28 · Omid Mohamad Nezami, Mark Dras, Peter Anderson, Len Hamey

Generating textual descriptions for images has been an attractive problem for computer vision and natural language processing researchers in recent years. Dozens of models based on deep learning have been proposed to solve this problem.…

Computer Vision and Pattern Recognition · Computer Science · 2019-07-01 · Ahmad Asadi, Reza Safabakhsh

Image captioning, which aims to automatically generate textual sentences for an image, is one of the most challenging tasks in AI. Recent methods for image captioning follow the encoder-decoder framework that transforms the sequence of salient…

Computer Vision and Pattern Recognition · Computer Science · 2021-05-07 · Zeliang Song, Xiaofei Zhou

Effectively aligning with human judgment when evaluating machine-generated image captions represents a complex yet intriguing challenge. Existing evaluation metrics like CIDEr or CLIP-Score fall short in this regard as they do not take into…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-31 · Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara

Distinctive Image Captioning (DIC) -- generating distinctive captions that describe the unique details of a target image -- has received considerable attention over the last few years. A recent DIC method proposes to generate distinctive…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-27 · Yangjun Mao, Jun Xiao, Dong Zhang, Meng Cao, Jian Shao, Yueting Zhuang, Long Chen

Image caption rating is becoming increasingly important because computer-generated captions are used extensively for descriptive annotation. However, rating the accuracy of captions in describing images is time-consuming and subjective in…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-26 · Kezia Minni, Qiang Zhang, Monoshiz Mahbub Khan, Zhe Yu

Recent image captioning models are achieving impressive results based on popular metrics, i.e., BLEU, CIDEr, and SPICE. However, focusing on the most popular metrics that only consider the overlap between the generated captions and human…

Computer Vision and Pattern Recognition · Computer Science · 2022-04-11 · Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan

Automatically generating a human-like description for a given image is a promising research topic in artificial intelligence that has attracted a great deal of attention recently. Most of the existing attention methods explore the mapping…

Computer Vision and Pattern Recognition · Computer Science · 2020-11-03 · Feicheng Huang, Zhixin Li, Haiyang Wei, Canlong Zhang, Huifang Ma

When generating images from prompts that include specific entities, the model must retain as much entity-specific knowledge as possible. However, the number of entities is almost countless, and new entities emerge; memorizing all of them…

Image captioning is conventionally formulated as the task of generating captions for images that match the distribution of reference image-caption pairs. However, reference captions in standard captioning datasets are short and may not…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-01 · Simon Kornblith, Lala Li, Zirui Wang, Thao Nguyen

In scenarios where language models must incorporate new information efficiently without extensive retraining, traditional fine-tuning methods are prone to overfitting, degraded generalization, and unnatural language generation. To address…

Computation and Language · Computer Science · 2025-04-01 · Siyuan Qi, Bangcheng Yang, Kailin Jiang, Xiaobo Wang, Jiaqi Li, Yifan Zhong, Yaodong Yang, Zilong Zheng

Image captioning systems have recently improved dramatically, but they still tend to produce captions that are insensitive to the communicative goals that captions should meet. To address this, we propose Issue-Sensitive Image Captioning…

Computation and Language · Computer Science · 2020-10-07 · Allen Nie, Reuben Cohn-Gordon, Christopher Potts

Fashion-image editing represents a challenging computer vision task, where the goal is to incorporate selected apparel into a given input image. Most existing techniques, known as Virtual Try-On methods, deal with this task by first…

Computer Vision and Pattern Recognition · Computer Science · 2023-01-06 · Martin Pernuš, Clinton Fookes, Vitomir Štruc, Simon Dobrišek

Fine-tuning image captioning models with hand-crafted rewards like the CIDEr metric has been a classical strategy for promoting caption quality at the sequence level. This approach, however, is known to limit descriptiveness and semantic…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-02 · Nicholas Moratelli, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara