Related papers: Interactive Machine Learning for Image Captioning

Putting Humans in the Image Captioning Loop

Image Captioning (IC) models can highly benefit from human feedback in the training process, especially in cases where data is limited. We present work-in-progress on adapting an IC system to integrate human feedback, with the goal to make…

Computation and Language · Computer Science 2023-06-07 Aliki Anagnostopoulou, Mareike Hartmann, Daniel Sonntag

Towards Adaptable and Interactive Image Captioning with Data Augmentation and Episodic Memory

Interactive machine learning (IML) is a beneficial learning paradigm in cases of limited data availability, as human feedback is incrementally integrated into the training process. In this paper, we present an IML pipeline for image…

Computation and Language · Computer Science 2024-08-09 Aliki Anagnostopoulou, Mareike Hartmann, Daniel Sonntag

Image Captioning using Deep Neural Architectures

Automatically creating a description of an image in a natural language such as English is a very challenging task. It requires expertise in both image processing and natural language processing. This paper discusses…

Computer Vision and Pattern Recognition · Computer Science 2018-10-03 Parth Shah, Vishvajit Bakarola, Supriya Pati

Improving Image Captioning by Mimicking Human Reformulation Feedback at Inference-time

Incorporating automatically predicted human feedback into the process of training generative models has attracted substantial recent interest, while feedback at inference time has received less attention. The typical feedback at training…

Computer Vision and Pattern Recognition · Computer Science 2025-01-09 Uri Berger, Omri Abend, Lea Frermann, Gabriel Stanovsky

Teaching Machines to Describe Images via Natural Language Feedback

Robots will eventually be part of every household. It is thus critical to enable algorithms to learn from and be guided by non-expert users. In this paper, we bring a human in the loop, and enable a human teacher to give feedback to a…

Computation and Language · Computer Science 2017-06-06 Huan Ling, Sanja Fidler

A Comprehensive Survey of Deep Learning for Image Captioning

Generating a description of an image is called image captioning. Image captioning requires recognizing the important objects, their attributes, and their relationships in an image. It also needs to generate syntactically and semantically…

Computer Vision and Pattern Recognition · Computer Science 2018-10-16 Md. Zakir Hossain, Ferdous Sohel, Mohd Fairuz Shiratuddin, Hamid Laga

Image Captioning

This paper discusses and demonstrates the outcomes from our experimentation on Image Captioning. Image captioning is a much more involved task than image recognition or classification, because of the additional challenge of recognizing the…

Computer Vision and Pattern Recognition · Computer Science 2018-05-24 Vikram Mullachery, Vishal Motwani

Image Captioning based on Deep Learning Methods: A Survey

Image captioning is a challenging task that is attracting more and more attention in the field of Artificial Intelligence and can be applied to efficient image retrieval, intelligent blind guidance, human-computer interaction, etc. In…

Computer Vision and Pattern Recognition · Computer Science 2019-05-21 Yiyu Wang, Jungang Xu, Yingfei Sun, Ben He

An Efficient Technique for Image Captioning using Deep Neural Network

With the huge expansion of the internet and trillions of gigabytes of data generated every single day, the development of various tools has become necessary in order to maintain system adaptability to rapid changes. One of these…

Computer Vision and Pattern Recognition · Computer Science 2020-09-08 Borneel Bikash Phukan, Amiya Ranjan Panda

Learning to Caption Images through a Lifetime by Asking Questions

In order to bring artificial agents into our lives, we will need to go beyond supervised learning on closed datasets to having the ability to continuously expand knowledge. Inspired by a student learning in a classroom, we present an agent…

Computer Vision and Pattern Recognition · Computer Science 2019-03-22 Kevin Shen, Amlan Kar, Sanja Fidler

Multimodal Image Captioning for Marketing Analysis

Automatically captioning images with natural language sentences is an important research topic. State of the art models are able to produce human-like sentences. These models typically describe the depicted scene as a whole and do not…

Computer Vision and Pattern Recognition · Computer Science 2019-08-07 Philipp Harzig, Stephan Brehm, Rainer Lienhart, Carolin Kaiser, René Schallner

Boost Image Captioning with Knowledge Reasoning

Automatically generating a human-like description for a given image is a promising research topic in artificial intelligence that has attracted a great deal of attention recently. Most of the existing attention methods explore the mapping…

Computer Vision and Pattern Recognition · Computer Science 2020-11-03 Feicheng Huang, Zhixin Li, Haiyang Wei, Canlong Zhang, Huifang Ma

Show, Translate and Tell

Humans have an incredible ability to process and understand information from multiple sources such as images, video, text, and speech. Recent success of deep neural networks has enabled us to develop algorithms which give machines the…

Computer Vision and Pattern Recognition · Computer Science 2019-03-18 Dheeraj Peri, Shagan Sah, Raymond Ptucha

Improving Image Captioning with Better Use of Captions

Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision community. In this paper, we present a novel image captioning architecture to better explore semantics…

Computer Vision and Pattern Recognition · Computer Science 2020-06-23 Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu

Generating Diverse and Meaningful Captions

Image Captioning is a task that requires models to acquire a multi-modal understanding of the world and to express this understanding in natural language text. While the state-of-the-art for this task has rapidly improved in terms of n-gram…

Computer Vision and Pattern Recognition · Computer Science 2018-12-20 Annika Lindh, Robert J. Ross, Abhijit Mahalunkar, Giancarlo Salton, John D. Kelleher

Image Captioning based on Deep Reinforcement Learning

Recently it has been shown that policy-gradient methods for reinforcement learning can be used to train deep end-to-end systems on natural language processing tasks. What's more, with the complexity of understanding image content and…

Computer Vision and Pattern Recognition · Computer Science 2018-09-14 Haichao Shi, Peng Li, Bo Wang, Zhenyu Wang

Deep Learning Approaches on Image Captioning: A Review

Image captioning is a research area of immense importance, aiming to generate natural language descriptions for visual content in the form of still images. The advent of deep learning and more recently vision-language pre-training…

Computer Vision and Pattern Recognition · Computer Science 2023-08-29 Taraneh Ghandi, Hamidreza Pourreza, Hamidreza Mahyar

Self-Supervised Image Captioning with CLIP

Image captioning, a fundamental task in vision-language understanding, seeks to generate accurate natural language descriptions for provided images. Current image captioning approaches heavily rely on high-quality image-caption pairs, which…

Computer Vision and Pattern Recognition · Computer Science 2023-11-03 Chuanyang Jin

Actor-Critic Sequence Training for Image Captioning

Generating natural language descriptions of images is an important capability for a robot or other visual-intelligence driven AI agent that may need to communicate with human users about what it is seeing. Such image captioning methods are…

Computer Vision and Pattern Recognition · Computer Science 2017-11-29 Li Zhang, Flood Sung, Feng Liu, Tao Xiang, Shaogang Gong, Yongxin Yang, Timothy M. Hospedales

Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback

Research on generative models to produce human-aligned / human-preferred outputs has seen significant recent contributions. Between text and image-generative models, we narrowed our focus to text-based generative models, particularly to…

Computer Vision and Pattern Recognition · Computer Science 2024-03-12 Adarsh N L, Arun P, Aravindh N L