English
Related papers

Related papers: Deep Bayesian Active Learning for Multiple Correct…

200 papers

We present an empirical study of active learning for Visual Question Answering, where a deep VQA model selects informative question-image pairs from a pool and queries an oracle for answers to maximally improve its performance under a…

Computer Vision and Pattern Recognition · Computer Science 2017-11-07 Xiao Lin , Devi Parikh

The Visual Question Answering (VQA) task combines challenges for processing data with both Visual and Linguistic processing, to answer basic `common sense' questions about given images. Given an image and a question in natural language, the…

Computer Vision and Pattern Recognition · Computer Science 2020-12-24 Yash Srivastava , Vaishnav Murali , Shiv Ram Dubey , Snehasis Mukherjee

The predominant approach to Visual Question Answering (VQA) demands that the model represents within its weights all of the information required to answer any question about any image. Learning this information from any real training set…

Computer Vision and Pattern Recognition · Computer Science 2017-11-23 Damien Teney , Anton van den Hengel

We propose a novel probabilistic model for visual question answering (Visual QA). The key idea is to infer two sets of embeddings: one for the image and the question jointly and the other for the answers. The learning objective is to learn…

Computer Vision and Pattern Recognition · Computer Science 2018-06-12 Hexiang Hu , Wei-Lun Chao , Fei Sha

3D Visual Question Answering (3D VQA) is crucial for enabling models to perceive the physical world and perform spatial reasoning. In 3D VQA, the free-form nature of answers often leads to improper annotations that can confuse or mislead…

Computer Vision and Pattern Recognition · Computer Science 2025-08-19 Shengli Zhou , Yang Liu , Feng Zheng

The field of visual question answering (VQA) has recently seen a surge in research focused on providing explanations for predicted answers. However, current systems mostly rely on separate models to predict answers and generate…

Computation and Language · Computer Science 2023-02-14 Chenxi Whitehouse , Tillman Weyde , Pranava Madhyastha

Visual question answering as recently proposed multimodal learning task has enjoyed wide attention from the deep learning community. Lately, the focus was on developing new representation fusion methods and attention mechanisms to achieve…

Computer Vision and Pattern Recognition · Computer Science 2017-08-03 Ilija Ilievski , Jiashi Feng

In visual question answering (VQA), an algorithm must answer text-based questions about images. While multiple datasets for VQA have been created since late 2014, they all have flaws in both their content and the way algorithms are…

Computer Vision and Pattern Recognition · Computer Science 2017-09-15 Kushal Kafle , Christopher Kanan

Visual question answering (VQA) refers to the problem where, given an image and a natural language question about the image, a correct natural language answer has to be generated. A VQA model has to demonstrate both the visual understanding…

Computer Vision and Pattern Recognition · Computer Science 2024-11-19 Raihan Kabir , Naznin Haque , Md Saiful Islam , Marium-E-Jannat

Despite remarkable progress in recent years, Vision Language Models (VLMs) remain prone to overconfidence and hallucinations on tasks such as Visual Question Answering (VQA) and Visual Reasoning. Bayesian methods can potentially improve…

Computer Vision and Pattern Recognition · Computer Science 2026-04-23 Tobias Jan Wieczorek , Nathalie Daun , Mohammad Emtiyaz Khan , Marcus Rohrbach

Visual question answering (or VQA) is a new and exciting problem that combines natural language processing and computer vision techniques. We present a survey of the various datasets and models that have been used to tackle this task. The…

Computation and Language · Computer Science 2017-05-12 Akshay Kumar Gupta

Visual question answering (VQA) is an interesting learning setting for evaluating the abilities and shortcomings of current systems for image understanding. Many of the recently proposed VQA systems include attention or memory mechanisms…

Computer Vision and Pattern Recognition · Computer Science 2016-11-24 Allan Jabri , Armand Joulin , Laurens van der Maaten

This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset. Previous approaches to Visual Question Answering (VQA) have mainly used generic image features from networks trained on the…

Computer Vision and Pattern Recognition · Computer Science 2016-08-12 Tatiana Tommasi , Arun Mallya , Bryan Plummer , Svetlana Lazebnik , Alexander C. Berg , Tamara L. Berg

One of the most intriguing features of the Visual Question Answering (VQA) challenge is the unpredictability of the questions. Extracting the information required to answer them demands a variety of image operations from detection and…

Computer Vision and Pattern Recognition · Computer Science 2016-12-19 Peng Wang , Qi Wu , Chunhua Shen , Anton van den Hengel

Visual question answering (VQA) is a task that combines both the techniques of computer vision and natural language processing. It requires models to answer a text-based question according to the information contained in a visual. In recent…

Computer Vision and Pattern Recognition · Computer Science 2021-05-04 Yeyun Zou , Qiyu Xie

Visual question answering requires a deep understanding of both images and natural language. However, most methods mainly focus on visual concept; such as the relationships between various objects. The limited use of object categories…

Computer Vision and Pattern Recognition · Computer Science 2021-01-25 Jung-Jun Kim , Dong-Gyu Lee , Jialin Wu , Hong-Gyu Jung , Seong-Whan Lee

Recently, a number of deep-learning based models have been proposed for the task of Visual Question Answering (VQA). The performance of most models is clustered around 60-70%. In this paper we propose systematic methods to analyze the…

Computation and Language · Computer Science 2016-10-05 Aishwarya Agrawal , Dhruv Batra , Devi Parikh

Visual Question Answering (VQA) is a challenging task that has received increasing attention from both the computer vision and the natural language processing communities. Given an image and a question in natural language, it requires…

Computer Vision and Pattern Recognition · Computer Science 2016-07-21 Qi Wu , Damien Teney , Peng Wang , Chunhua Shen , Anthony Dick , Anton van den Hengel

Machine learning has advanced dramatically, narrowing the accuracy gap to humans in multimodal tasks like visual question answering (VQA). However, while humans can say "I don't know" when they are uncertain (i.e., abstain from answering a…

Computer Vision and Pattern Recognition · Computer Science 2022-10-21 Spencer Whitehead , Suzanne Petryk , Vedaad Shakib , Joseph Gonzalez , Trevor Darrell , Anna Rohrbach , Marcus Rohrbach

Visual Question Answering (VQA) has attracted much attention since it offers insight into the relationships between the multi-modal analysis of images and natural language. Most of the current algorithms are incapable of answering…

Computer Vision and Pattern Recognition · Computer Science 2017-12-05 Guohao Li , Hang Su , Wenwu Zhu
‹ Prev 1 2 3 10 Next ›