Related papers: VisKoP: Visual Knowledge oriented Programming for …

200 papers

Knowledge-based Vision Question Answering (KB-VQA) extends general Vision Question Answering (VQA) by requiring not only the understanding of visual and textual inputs but also an extensive range of knowledge, enabling significant advancements…

Computer Vision and Pattern Recognition · Computer Science 2025-04-25 Jiaqi Deng, Zonghan Wu, Huan Huo, Guandong Xu

This paper is on the problem of Knowledge-Based Visual Question Answering (KB-VQA). Recent works have emphasized the significance of incorporating both explicit (through external databases) and implicit (through LLMs) knowledge to answer…

Computer Vision and Pattern Recognition · Computer Science 2023-10-25 Alexandros Xenos, Themos Stafylakis, Ioannis Patras, Georgios Tzimiropoulos

This study explores the realm of knowledge base question answering (KBQA). KBQA is considered a challenging task, particularly in parsing intricate questions into executable logical forms. Traditional semantic parsing (SP)-based methods…

Computation and Language · Computer Science 2025-03-13 Guanming Xiong, Junwei Bao, Wen Zhao

Knowledge-based visual question answering (KB-VQA) requires vision-language models to understand images and use external knowledge, especially for rare entities and long-tail facts. Most existing retrieval-augmented generation (RAG) methods…

Computer Vision and Pattern Recognition · Computer Science 2026-04-10 Zhuohong Chen, Zhenxian Wu, Yunyao Yu, Hangrui Xu, Zirui Liao, Zhifang Liu, Xiangwen Deng, Pen Jiao, Haoqian Wang

Complex question answering over knowledge bases (Complex KBQA) is challenging because it requires various compositional reasoning capabilities, such as multi-hop inference, attribute comparison, and set operations. Existing benchmarks have some…

Computation and Language · Computer Science 2022-06-24 Shulin Cao, Jiaxin Shi, Liangming Pan, Lunyiu Nie, Yutong Xiang, Lei Hou, Juanzi Li, Bin He, Hanwang Zhang

We analyze knowledge-based visual question answering, for which given a question, the models need to ground it into the visual modality and retrieve the relevant knowledge from a given large knowledge base (KB) to be able to answer. Our…

Artificial Intelligence · Computer Science 2024-04-17 Elham J. Barezi, Parisa Kordjamshidi

In the realm of multimodal tasks, Visual Question Answering (VQA) plays a crucial role by addressing natural language questions grounded in visual content. Knowledge-Based Visual Question Answering (KBVQA) advances this concept by adding…

Computation and Language · Computer Science 2024-06-17 Manas Jhalani, Annervaz K M, Pushpak Bhattacharyya

The multimodal task of Visual Question Answering (VQA), encompassing elements of Computer Vision (CV) and Natural Language Processing (NLP), aims to generate answers to questions on any visual input. Over time, the scope of VQA has expanded…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey

Visual Programming (VP) has emerged as a powerful framework for Visual Question Answering (VQA). By generating and executing bespoke code for each question, these methods demonstrate impressive compositional and reasoning capabilities,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-11 Jiaxin Ge, Sanjay Subramanian, Baifeng Shi, Roei Herzig, Trevor Darrell

We study the Knowledge-Based visual question-answering problem, for which given a question, the models need to ground it into the visual modality to find the answer. Although many recent works use question-dependent captioners to verbalize…

Artificial Intelligence · Computer Science 2024-06-28 Elham J. Barezi, Parisa Kordjamshidi

Visual question answering (VQA) often requires an understanding of visual concepts and language semantics, which relies on external knowledge. Most existing methods exploit pre-trained language models or/and unstructured text, but the…

Computer Vision and Pattern Recognition · Computer Science 2022-11-29 Zhuo Chen, Yufeng Huang, Jiaoyan Chen, Yuxia Geng, Yin Fang, Jeff Pan, Ningyu Zhang, Wen Zhang

Visual Question Answering (VQA) systems are tasked with answering natural language questions corresponding to a presented image. Traditional VQA datasets typically contain questions related to the spatial information of objects, object…

Computation and Language · Computer Science 2020-06-05 Goonmeet Bajaj, Bortik Bandyopadhyay, Daniel Schmidt, Pranav Maneriker, Christopher Myers, Srinivasan Parthasarathy

Knowledge-based visual question answering (KB-VQA) requires visual language models (VLMs) to integrate visual understanding with external knowledge retrieval. Although retrieval-augmented generation (RAG) achieves significant advances in…

Computer Vision and Pattern Recognition · Computer Science 2025-10-21 Yuyang Hong, Jiaqi Gu, Qi Yang, Lubin Fan, Yue Wu, Ying Wang, Kun Ding, Shiming Xiang, Jieping Ye

Visual question answering (VQA) has traditionally been treated as a single-step task where each question receives the same amount of effort, unlike natural human question-answering strategies. We explore a question decomposition strategy…

Computer Vision and Pattern Recognition · Computer Science 2023-10-27 Zaid Khan, Vijay Kumar BG, Samuel Schulter, Manmohan Chandraker, Yun Fu

Knowledge-based Visual Question Answering (KVQA) requires external knowledge beyond the visible content to answer questions about an image. This ability is challenging but indispensable to achieve general VQA. One limitation of existing…

Artificial Intelligence · Computer Science 2020-11-04 Jing Yu, Zihao Zhu, Yujing Wang, Weifeng Zhang, Yue Hu, Jianlong Tan

Knowledge-Based Visual Question Answering (KBVQA) is a bi-modal task requiring external world knowledge in order to correctly answer a text question and associated image. Recent single modality text work has shown knowledge injection into…

Computation and Language · Computer Science 2022-05-30 Diego Garcia-Olano, Yasumasa Onoe, Joydeep Ghosh

Knowledge base question answering (KBQA) is a critical yet challenging task due to the vast number of entities within knowledge bases and the diversity of natural language questions posed by users. Unfortunately, the performance of most…

Computation and Language · Computer Science 2024-01-29 Zhenyu Li, Sunqi Fan, Yu Gu, Xiuxing Li, Zhichao Duan, Bowen Dong, Ning Liu, Jianyong Wang

Visual question answering (VQA) is a multidisciplinary research problem that is pursued through practices of natural language processing and computer vision. Visual question answering automatically answers natural language questions according…

Computer Vision and Pattern Recognition · Computer Science 2024-09-01 Param Ahir, Hiteishi Diwanji

Knowledge-based question answering (KBQA) is a key task in NLP research, and also an approach to access the web data and knowledge, which requires exploiting knowledge graphs (KGs) for reasoning. In the literature, one promising solution…

Computation and Language · Computer Science 2024-03-18 Xin Lin, Tianhuang Su, Zhenya Huang, Shangzi Xue, Haifeng Liu, Enhong Chen

Visual Question Answering (VQA) is a challenging task that has received increasing attention from both the computer vision and the natural language processing communities. Given an image and a question in natural language, it requires…

Computer Vision and Pattern Recognition · Computer Science 2016-07-21 Qi Wu, Damien Teney, Peng Wang, Chunhua Shen, Anthony Dick, Anton van den Hengel