
Related papers: Exploring Sparse Spatial Relation in Graph Inferen…

200 papers

One of the key issues of Visual Question Answering (VQA) is to reason with semantic clues in the visual content under the guidance of the question; how to model relational semantics remains a great challenge. To fully capture…

Multimedia · Computer Science 2019-08-22 Zhuoqian Yang, Zengchang Qin, Jing Yu, Yue Hu

Over the past few years, significant progress has been made in deep convolutional neural network (CNN)-based image recognition. This is mainly due to the strong ability of such networks in mining discriminative object pose and parts…

Computer Vision and Pattern Recognition · Computer Science 2022-10-05 Asish Bera, Zachary Wharton, Yonghuai Liu, Nik Bessis, Ardhendu Behera

Visual dialog is a task of answering a sequence of questions grounded in an image using the previous dialog history as context. In this paper, we study how to address two fundamental challenges for this task: (1) reasoning over underlying…

Computer Vision and Pattern Recognition · Computer Science 2021-09-01 Gi-Cheon Kang, Junseok Park, Hwaran Lee, Byoung-Tak Zhang, Jin-Hwa Kim

Scene graph generation (SGG) is to detect object pairs with their relations in an image. Existing SGG approaches often use multi-stage pipelines to decompose this task into object detection, relation graph construction, and dense or…

Computer Vision and Pattern Recognition · Computer Science 2022-03-29 Yao Teng, Limin Wang

In order to answer semantically-complicated questions about an image, a Visual Question Answering (VQA) model needs to fully understand the visual scene in the image, especially the interactive dynamics between different objects. We propose…

Computer Vision and Pattern Recognition · Computer Science 2019-10-11 Linjie Li, Zhe Gan, Yu Cheng, Jingjing Liu

Previous studies such as VizWiz find that Visual Question Answering (VQA) systems that can read and reason about text in images are useful in application areas such as assisting visually-impaired people. TextVQA is a VQA dataset geared…

Computer Vision and Pattern Recognition · Computer Science 2021-11-12 Michael Yang, Aditya Anantharaman, Zachary Kitowski, Derik Clive Robert

The main challenge in video question answering (VideoQA) is to capture and understand the complex spatial and temporal relations between objects based on given questions. Existing graph-based methods for VideoQA usually ignore keywords in…

Computer Vision and Pattern Recognition · Computer Science 2023-07-26 Yi Cheng, Hehe Fan, Dongyun Lin, Ying Sun, Mohan Kankanhalli, Joo-Hwee Lim

Text-based Visual Question Answering (TextVQA) aims to produce correct answers for given questions about the images with multiple scene texts. In most cases, the texts naturally attach to the surface of the objects. Therefore, spatial…

Computer Vision and Pattern Recognition · Computer Science 2023-06-16 Hao Li, Jinfa Huang, Peng Jin, Guoli Song, Qi Wu, Jie Chen

Recently, graph neural networks (GNNs) have been widely used for document classification. However, most existing methods are based on static word co-occurrence graphs without sentence-level information, which poses three challenges: (1) word…

Computation and Language · Computer Science 2022-03-22 Yinhua Piao, Sangseon Lee, Dohoon Lee, Sun Kim

Visual commonsense reasoning (VCR) is a challenging multi-modal task, which requires high-level cognition and commonsense reasoning ability about the real world. In recent years, large-scale pre-training approaches have been developed and…

Computer Vision and Pattern Recognition · Computer Science 2023-11-10 Cheng Yang, Rui Xu, Ye Guo, Peixiang Huang, Yiru Chen, Wenkui Ding, Zhongyuan Wang, Hong Zhou

Outstanding achievements of graph neural networks for spatiotemporal time series analysis show that relational constraints introduce an effective inductive bias into neural forecasting architectures. Often, however, the relational…

Machine Learning · Computer Science 2023-08-03 Andrea Cini, Daniele Zambon, Cesare Alippi

For a given video-based Human-Object Interaction scene, modeling the spatio-temporal relationship between humans and objects is an important cue for understanding the contextual information presented in the video. With the effective…

Computer Vision and Pattern Recognition · Computer Science 2021-08-20 Ning Wang, Guangming Zhu, Liang Zhang, Peiyi Shen, Hongsheng Li, Cong Hua

The paper addresses challenges in storing and retrieving sequences in contexts like anomaly detection, behavior prediction, and genetic information analysis. Associative Knowledge Graphs (AKGs) offer a promising approach by leveraging…

Artificial Intelligence · Computer Science 2025-09-11 Przemysław Stokłosa, Janusz A. Starzyk, Paweł Raif, Adrian Horzyk, Marcin Kowalik

Comprehensive visual understanding requires detection frameworks that can effectively learn and utilize object interactions while analyzing objects individually. This is the main objective of the Human-Object Interaction (HOI) detection task.…

Computer Vision and Pattern Recognition · Computer Science 2020-03-13 Oytun Ulutan, A S M Iftekhar, B. S. Manjunath

Textual cues are essential for everyday tasks like buying groceries and using public transport. To develop this assistive technology, we study the TextVQA task, i.e., reasoning about text in images to answer a question. Existing approaches…

Computer Vision and Pattern Recognition · Computer Science 2020-12-24 Yash Kant, Dhruv Batra, Peter Anderson, Alex Schwing, Devi Parikh, Jiasen Lu, Harsh Agrawal

Modeling visual question answering (VQA) through scene graphs can significantly improve reasoning accuracy and interpretability. However, existing models answer poorly for complex reasoning questions with attributes or relations, which…

Computer Vision and Pattern Recognition · Computer Science 2022-05-10 Hao Li, Xu Li, Belhal Karimi, Jie Chen, Mingming Sun

Spatio-temporal forecasting of future values of spatially correlated time series is important across many cyber-physical systems (CPS). Recent studies offer evidence that the use of graph neural networks to capture latent correlations…

Machine Learning · Computer Science 2023-12-29 Minbo Ma, Jilin Hu, Christian S. Jensen, Fei Teng, Peng Han, Zhiqiang Xu, Tianrui Li

With urbanization and rapid population growth, traffic congestion has become an increasingly critical concern. Intelligent transportation systems heavily rely on real-time and precise prediction algorithms…

Artificial Intelligence · Computer Science 2025-01-03 Zihao Jing

Virtual sensing techniques allow for inferring signals at new unmonitored locations by exploiting spatio-temporal measurements coming from physical sensors at different locations. However, as the sensor coverage becomes sparse due to costs…

Machine Learning · Computer Science 2024-02-21 Giovanni De Felice, Andrea Cini, Daniele Zambon, Vladimir V. Gusev, Cesare Alippi

Scene Graph Generation (SGG) aims to extract entities, predicates and their semantic structure from images, enabling deep understanding of visual content, with many applications such as visual reasoning and image retrieval. Nevertheless,…

Computer Vision and Pattern Recognition · Computer Science 2020-04-02 Alireza Zareian, Svebor Karaman, Shih-Fu Chang