Related papers: Knowledge Generation for Zero-shot Knowledge-based…

200 papers

Zero-shot visual question answering (ZS-VQA), an emerging critical research area, intends to answer visual questions without providing training samples. Existing research in ZS-VQA has proposed to leverage knowledge graphs or large language…

Computer Vision and Pattern Recognition · Computer Science 2025-01-23 Qian Tao, Xiaoyang Fan, Yong Xu, Xingquan Zhu, Yufei Tang

Incorporating external knowledge to Visual Question Answering (VQA) has become a vital practical need. Existing methods mostly adopt pipeline approaches with different components for knowledge matching and extraction, feature learning,…

Artificial Intelligence · Computer Science 2021-10-19 Zhuo Chen, Jiaoyan Chen, Yuxia Geng, Jeff Z. Pan, Zonggang Yuan, Huajun Chen

Knowledge-Based Visual Question Answering (KB-VQA) methods focus on tasks that demand reasoning with information extending beyond the explicit content depicted in the image. Early methods relied on explicit knowledge bases to provide this…

Computation and Language · Computer Science 2025-05-27 Mohammad Mahdi Moradi, Sudhir Mudur

Zero-shot Visual Question Answering (VQA) is a prominent vision-language task that examines both the visual and textual understanding capability of systems in the absence of training data. Recently, by converting the images into captions,…

Computer Vision and Pattern Recognition · Computer Science 2023-11-16 Yunshi Lan, Xiang Li, Xin Liu, Yang Li, Wei Qin, Weining Qian

Visual Question Generation (VQG) is a task to generate questions from images. When humans ask questions about an image, their goal is often to acquire some new knowledge. However, existing studies on VQG have mainly addressed question…

Computer Vision and Pattern Recognition · Computer Science 2022-03-16 Kohei Uehara, Tatsuya Harada

Knowledge-based Visual Question Answering (KVQA) requires both image and world knowledge to answer questions. Current methods first retrieve knowledge from the image and external knowledge base with the original complex question, then…

Computer Vision and Pattern Recognition · Computer Science 2024-07-23 Wenbin An, Feng Tian, Jiahao Nie, Wenkai Shi, Haonan Lin, Yan Chen, QianYing Wang, Yaqiang Wu, Guang Dai, Ping Chen

Large Language Models (LLMs) are capable of performing zero-shot closed-book question answering tasks, based on their internal knowledge stored in parameters during pre-training. However, such internalized knowledge might be insufficient…

Computation and Language · Computer Science 2023-06-08 Jinheon Baek, Alham Fikri Aji, Amir Saffari

Knowledge-based Vision Question Answering (KB-VQA) extends general Vision Question Answering (VQA) by not only requiring the understanding of visual and textual inputs but also an extensive range of knowledge, enabling significant advancements…

Computer Vision and Pattern Recognition · Computer Science 2025-04-25 Jiaqi Deng, Zonghan Wu, Huan Huo, Guandong Xu

Knowledge-based visual question answering (KVQA) task aims to answer questions that require additional external knowledge as well as an understanding of images and questions. Recent studies on KVQA inject an external knowledge in a…

Computer Vision and Pattern Recognition · Computer Science 2022-07-28 Jinyeong Chae, Jihie Kim

This paper is on the problem of Knowledge-Based Visual Question Answering (KB-VQA). Recent works have emphasized the significance of incorporating both explicit (through external databases) and implicit (through LLMs) knowledge to answer…

Computer Vision and Pattern Recognition · Computer Science 2023-10-25 Alexandros Xenos, Themos Stafylakis, Ioannis Patras, Georgios Tzimiropoulos

Visual question answering (VQA) often requires an understanding of visual concepts and language semantics, which relies on external knowledge. Most existing methods exploit pre-trained language models or/and unstructured text, but the…

Computer Vision and Pattern Recognition · Computer Science 2022-11-29 Zhuo Chen, Yufeng Huang, Jiaoyan Chen, Yuxia Geng, Yin Fang, Jeff Pan, Ningyu Zhang, Wen Zhang

We analyze knowledge-based visual question answering, for which given a question, the models need to ground it into the visual modality and retrieve the relevant knowledge from a given large knowledge base (KB) to be able to answer. Our…

Artificial Intelligence · Computer Science 2024-04-17 Elham J. Barezi, Parisa Kordjamshidi

Visual question answering (VQA) is a multidisciplinary research problem that is pursued through practices of natural language processing and computer vision. Visual question answering automatically answers natural language questions according…

Computer Vision and Pattern Recognition · Computer Science 2024-09-01 Param Ahir, Hiteishi Diwanji

Knowledge-based visual question answering (KB-VQA) is a challenging task, which requires the model to leverage external knowledge for comprehending and answering questions grounded in visual content. Recent studies retrieve the knowledge…

Computer Vision and Pattern Recognition · Computer Science 2024-03-18 Dongze Hao, Jian Jia, Longteng Guo, Qunbo Wang, Te Yang, Yan Li, Yanhua Cheng, Bo Wang, Quan Chen, Han Li, Jing Liu

Large-scale pre-trained models (PTMs) show great zero-shot capabilities. In this paper, we study how to leverage them for zero-shot visual question answering (VQA). Our approach is motivated by a few observations. First, VQA questions often…

Computer Vision and Pattern Recognition · Computer Science 2024-01-25 Rui Cao, Jing Jiang

With the rapid development of remote sensing image archives, asking questions about images has become an effective way of gathering specific information or performing image retrieval. However, automatically generated image-based questions…

Computer Vision and Pattern Recognition · Computer Science 2026-02-24 Siran Li, Li Mi, Javiera Castillo-Navarro, Devis Tuia

Part of the appeal of Visual Question Answering (VQA) is its promise to answer new questions about previously unseen images. Most current methods demand training questions that illustrate every possible concept, and will therefore never…

Computer Vision and Pattern Recognition · Computer Science 2016-11-22 Damien Teney, Anton van den Hengel

We present a neural model for question generation from knowledge base triples in a "Zero-Shot" setup, that is generating questions for triples containing predicates, subject types or object types that were not seen at training time. Our…

Computation and Language · Computer Science 2018-02-21 Hady Elsahar, Christophe Gravier, Frederique Laforest

Recent developments in pre-trained neural language modeling have led to leaps in accuracy on commonsense question-answering benchmarks. However, there is increasing concern that models overfit to specific tasks, without learning to utilize…

Computation and Language · Computer Science 2020-12-16 Kaixin Ma, Filip Ilievski, Jonathan Francis, Yonatan Bisk, Eric Nyberg, Alessandro Oltramari

An ability to learn about new objects from a small amount of visual data and produce convincing linguistic justification about the presence/absence of certain concepts (that collectively compose the object) in novel scenarios is an…

Computer Vision and Pattern Recognition · Computer Science 2024-10-18 Shailaja Keyur Sampat, Maitreya Patel, Yezhou Yang, Chitta Baral