Related papers: VOGUE: Answer Verbalization through Multi-Task Lea…

An Answer Verbalization Dataset for Conversational Question Answerings over Knowledge Graphs

We introduce a new dataset for conversational question answering over Knowledge Graphs (KGs) with verbalized answers. Question answering over KGs is currently focused on answer generation for single-turn questions (KGQA) or multiple-tun…

Computation and Language · Computer Science 2022-08-16 Endri Kacupaj, Kuldeep Singh, Maria Maleshkova, Jens Lehmann

Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering

We present Answer-Me, a task-aware multi-task framework which unifies a variety of question answering tasks, such as, visual question answering, visual entailment, visual reasoning. In contrast to previous works using contrastive or…

Computer Vision and Pattern Recognition · Computer Science 2022-12-02 AJ Piergiovanni, Wei Li, Weicheng Kuo, Mohammad Saffar, Fred Bertsch, Anelia Angelova

Variational Reasoning for Question Answering with Knowledge Graph

Knowledge graph (KG) is known to be helpful for the task of question answering (QA), since it provides well-structured relational information between entities, and allows one to further infer indirect facts. However, it is challenging to…

Machine Learning · Computer Science 2017-11-29 Yuyu Zhang, Hanjun Dai, Zornitsa Kozareva, Alexander J. Smola, Le Song

VANiLLa : Verbalized Answers in Natural Language at Large Scale

In the last years, there have been significant developments in the area of Question Answering over Knowledge Graphs (KGQA). Despite all the notable advancements, current KGQA datasets only provide the answers as the direct output result of…

Computation and Language · Computer Science 2021-05-25 Debanjali Biswas, Mohnish Dubey, Md Rashad Al Hasan Rony, Jens Lehmann

Beyond VQA: Generating Multi-word Answer and Rationale to Visual Questions

Visual Question Answering is a multi-modal task that aims to measure high-level visual understanding. Contemporary VQA models are restrictive in the sense that answers are obtained via classification over a limited vocabulary (in the case…

Computer Vision and Pattern Recognition · Computer Science 2021-06-18 Radhika Dua, Sai Srinivas Kancheti, Vineeth N Balasubramanian

Multimodal Inverse Cloze Task for Knowledge-based Visual Question Answering

We present a new pre-training method, Multimodal Inverse Cloze Task, for Knowledge-based Visual Question Answering about named Entities (KVQAE). KVQAE is a recently introduced task that consists in answering questions about named entities…

Computation and Language · Computer Science 2023-01-12 Paul Lerner, Olivier Ferret, Camille Guinaudeau

Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering

Knowledge-based Visual Question Answering (KVQA) requires external knowledge beyond the visible content to answer questions about an image. This ability is challenging but indispensable to achieve general VQA. One limitation of existing…

Artificial Intelligence · Computer Science 2020-11-04 Jing Yu, Zihao Zhu, Yujing Wang, Weifeng Zhang, Yue Hu, Jianlong Tan

Temporal-Aware Heterogeneous Graph Reasoning with Multi-View Fusion for Temporal Question Answering

Question Answering over Temporal Knowledge Graphs (TKGQA) has attracted growing interest for handling time-sensitive queries. However, existing methods still struggle with: 1) weak incorporation of temporal constraints in question…

Computation and Language · Computer Science 2026-02-24 Wuzhenghong Wen, Bowen Zhou, Jinwen Huang, Xianjie Wu, Yuwei Sun, Su Pan, Liang Li, Jianting Liu

A survey on VQA_Datasets and Approaches

Visual question answering (VQA) is a task that combines both the techniques of computer vision and natural language processing. It requires models to answer a text-based question according to the information contained in a visual. In recent…

Computer Vision and Pattern Recognition · Computer Science 2021-05-04 Yeyun Zou, Qiyu Xie

LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection

Visual question answering (VQA) often requires an understanding of visual concepts and language semantics, which relies on external knowledge. Most existing methods exploit pre-trained language models or/and unstructured text, but the…

Computer Vision and Pattern Recognition · Computer Science 2022-11-29 Zhuo Chen, Yufeng Huang, Jiaoyan Chen, Yuxia Geng, Yin Fang, Jeff Pan, Ningyu Zhang, Wen Zhang

Relation-Aware Question Answering for Heterogeneous Knowledge Graphs

Multi-hop Knowledge Base Question Answering (KBQA) aims to find the answer entity in a knowledge graph (KG), which requires multiple steps of reasoning. Existing retrieval-based approaches solve this task by concentrating on the specific…

Computation and Language · Computer Science 2023-12-20 Haowei Du, Quzhe Huang, Chen Li, Chen Zhang, Yang Li, Dongyan Zhao

Learning to Search: A Decision-Based Agent for Knowledge-Based Visual Question Answering

Knowledge-based visual question answering (KB-VQA) requires vision-language models to understand images and use external knowledge, especially for rare entities and long-tail facts. Most existing retrieval-augmented generation (RAG) methods…

Computer Vision and Pattern Recognition · Computer Science 2026-04-10 Zhuohong Chen, Zhenxian Wu, Yunyao Yu, Hangrui Xu, Zirui Liao, Zhifang Liu, Xiangwen Deng, Pen Jiao, Haoqian Wang

Modality-Aware Integration with Large Language Models for Knowledge-based Visual Question Answering

Knowledge-based visual question answering (KVQA) has been extensively studied to answer visual questions with external knowledge, e.g., knowledge graphs (KGs). While several attempts have been proposed to leverage large language models…

Computer Vision and Pattern Recognition · Computer Science 2024-03-05 Junnan Dong, Qinggang Zhang, Huachi Zhou, Daochen Zha, Pai Zheng, Xiao Huang

Multi-Task Learning with Multi-View Attention for Answer Selection and Knowledge Base Question Answering

Answer selection and knowledge base question answering (KBQA) are two important tasks of question answering (QA) systems. Existing methods solve these two tasks separately, which requires a large amount of repetitive work and neglects the…

Computation and Language · Computer Science 2018-12-07 Yang Deng, Yuexiang Xie, Yaliang Li, Min Yang, Nan Du, Wei Fan, Kai Lei, Ying Shen

Generative Visual Question Answering

Multi-modal tasks involving vision and language in deep learning continue to rise in popularity and are leading to the development of newer models that can generalize beyond the extent of their training data. The current models lack…

Computer Vision and Pattern Recognition · Computer Science 2023-07-21 Ethan Shen, Scotty Singh, Bhavesh Kumar

TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation

Text-VQA aims at answering questions that require understanding the textual cues in an image. Despite the great progress of existing Text-VQA methods, their performance suffers from insufficient human-labeled question-answer (QA) pairs.…

Computer Vision and Pattern Recognition · Computer Science 2022-10-11 Jun Wang, Mingfei Gao, Yuqian Hu, Ramprasaath R. Selvaraju, Chetan Ramaiah, Ran Xu, Joseph F. JaJa, Larry S. Davis

Multi-task Voice Activated Framework using Self-supervised Learning

Self-supervised learning methods such as wav2vec 2.0 have shown promising results in learning speech representations from unlabelled and untranscribed speech data that are useful for speech recognition. Since these representations are…

Audio and Speech Processing · Electrical Eng. & Systems 2022-03-22 Shehzeen Hussain, Van Nguyen, Shuhua Zhang, Erik Visser

UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph

Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question on a large-scale Knowledge Graph (KG). To cope with the…

Computation and Language · Computer Science 2023-03-02 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

Survey of Recent Advances in Visual Question Answering

Visual Question Answering (VQA) presents a unique challenge as it requires the ability to understand and encode the multi-modal inputs - in terms of image processing and natural language processing. The algorithm further needs to learn how…

Computer Vision and Pattern Recognition · Computer Science 2017-09-26 Supriya Pandhre, Shagun Sodhani

Multi-Task Learning for Conversational Question Answering over a Large-Scale Knowledge Base

We consider the problem of conversational question answering over a large-scale knowledge base. To handle huge entity vocabulary of a large-scale knowledge base, recent neural semantic parsing based approaches usually decompose the task…

Computation and Language · Computer Science 2019-10-14 Tao Shen, Xiubo Geng, Tao Qin, Daya Guo, Duyu Tang, Nan Duan, Guodong Long, Daxin Jiang