Related papers: Constructing a Visual Relationship Authenticity Da…

Visual Relationship Detection with Language prior and Softmax

Visual relationship detection is an intermediate image understanding task that detects two objects and classifies a predicate that explains the relationship between two objects in an image. The three components are linguistically and…

Computer Vision and Pattern Recognition · Computer Science 2019-04-17 Jaewon Jung, Jongyoul Park

Expressing Visual Relationships via Language

Describing images with text is a fundamental problem in vision-language research. Current studies in this domain mostly focus on single image captioning. However, in various real applications (e.g., image editing, difference interpretation,…

Computation and Language · Computer Science 2019-06-20 Hao Tan, Franck Dernoncourt, Zhe Lin, Trung Bui, Mohit Bansal

Improving Visual Relationship Detection using Semantic Modeling of Scene Descriptions

Structured scene descriptions of images are useful for the automatic processing and querying of large image databases. We show how the combination of a semantic and a visual statistical model can improve on the task of mapping images to…

Computation and Language · Computer Science 2018-09-10 Stephan Baier, Yunpu Ma, Volker Tresp

Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation

Understanding visual relationships involves identifying the subject, the object, and a predicate relating them. We leverage the strong correlations between the predicate and the (subj,obj) pair (both semantically and spatially) to predict…

Computer Vision and Pattern Recognition · Computer Science 2017-08-04 Ruichi Yu, Ang Li, Vlad I. Morariu, Larry S. Davis

The Missing Link: Finding label relations across datasets

Computer vision is driven by the many datasets available for training or evaluating novel methods. However, each dataset has a different set of class labels, visual definition of classes, images following a specific distribution, annotation…

Computer Vision and Pattern Recognition · Computer Science 2022-08-10 Jasper Uijlings, Thomas Mensink, Vittorio Ferrari

Visual Relationship Detection with Relative Location Mining

Visual relationship detection, as a challenging task used to find and distinguish the interactions between object pairs in one image, has received much attention recently. In this work, we propose a novel visual relationship detection…

Computer Vision and Pattern Recognition · Computer Science 2019-11-05 Hao Zhou, Chongyang Zhang, Chuanping Hu

Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Distillation

A thorough comprehension of image content demands a complex grasp of the interactions that may occur in the natural world. One of the key issues is to describe the visual relationships between objects. When dealing with real world data,…

Computer Vision and Pattern Recognition · Computer Science 2018-05-29 François Plesse, Alexandru Ginsca, Bertrand Delezoide, Françoise Prêteux

STUPD: A Synthetic Dataset for Spatial and Temporal Relation Reasoning

Understanding relations between objects is crucial for understanding the semantics of a visual scene. It is also an essential step in order to bridge visual and language models. However, current state-of-the-art computer vision models still…

Computer Vision and Pattern Recognition · Computer Science 2025-02-28 Palaash Agrawal, Haidi Azaman, Cheston Tan

Visual Relationship Detection using Scene Graphs: A Survey

Understanding a scene by decoding the visual relationships depicted in an image has been a long studied problem. While the recent advances in deep learning and the usage of deep neural networks have achieved near human accuracy on many…

Computer Vision and Pattern Recognition · Computer Science 2020-05-19 Aniket Agarwal, Ayush Mangal, Vipul

Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues

This paper presents a framework for localization or grounding of phrases in images using a large collection of linguistic and visual cues. We model the appearance, size, and position of entity bounding boxes, adjectives that contain…

Computer Vision and Pattern Recognition · Computer Science 2017-08-10 Bryan A. Plummer, Arun Mallya, Christopher M. Cervantes, Julia Hockenmaier, Svetlana Lazebnik

Visual Relationship Detection with Visual-Linguistic Knowledge from Multimodal Representations

Visual relationship detection aims to reason over relationships among salient objects in images, which has drawn increasing attention over the past few years. Inspired by human reasoning mechanisms, it is believed that external visual…

Computer Vision and Pattern Recognition · Computer Science 2021-04-06 Meng-Jiun Chiou, Roger Zimmermann, Jiashi Feng

A Comprehensive Survey on Visual Question Answering Datasets and Algorithms

Visual question answering (VQA) refers to the problem where, given an image and a natural language question about the image, a correct natural language answer has to be generated. A VQA model has to demonstrate both the visual understanding…

Computer Vision and Pattern Recognition · Computer Science 2024-11-19 Raihan Kabir, Naznin Haque, Md Saiful Islam, Marium-E-Jannat

Interpreting Context of Images using Scene Graphs

Understanding a visual scene incorporates objects, relationships, and context. Traditional methods working on an image mostly focus on object detection and fail to capture the relationship between the objects. Relationships can give rich…

Computer Vision and Pattern Recognition · Computer Science 2019-12-03 Himangi Mittal, Ajith Abraham, Anuja Arora

VCD: A Dataset for Visual Commonsense Discovery in Images

Visual commonsense plays a vital role in understanding and reasoning about the visual world. While commonsense knowledge bases like ConceptNet provide structured collections of general facts, they lack visually grounded representations.…

Computer Vision and Pattern Recognition · Computer Science 2025-06-06 Xiangqing Shen, Fanfan Wang, Siwei Wu, Rui Xia

Can Transformers Capture Spatial Relations between Objects?

Spatial relationships between objects represent key scene information for humans to understand and interact with the world. To study the capability of current computer vision systems to recognize physically grounded spatial relations, we…

Computer Vision and Pattern Recognition · Computer Science 2024-03-04 Chuan Wen, Dinesh Jayaraman, Yang Gao

Learning to Compose Visual Relations

The visual world around us can be described as a structured set of objects and their associated relations. An image of a room may be conjured given only the description of the underlying objects and their associated relations. While there…

Computer Vision and Pattern Recognition · Computer Science 2021-11-18 Nan Liu, Shuang Li, Yilun Du, Joshua B. Tenenbaum, Antonio Torralba

Detecting Visual Relationships with Deep Relational Networks

Relationships among objects play a crucial role in image understanding. Despite the great success of deep learning techniques in recognizing individual objects, reasoning about the relationships among objects remains a challenging task.…

Computer Vision and Pattern Recognition · Computer Science 2017-04-13 Bo Dai, Yuqi Zhang, Dahua Lin

Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection

Visual relationship detection aims to identify objects and their relationships in images. Prior methods approach this task by adding separate relationship modules or decoders to existing object detection architectures. This separation…

Computer Vision and Pattern Recognition · Computer Science 2024-07-22 Tim Salzmann, Markus Ryll, Alex Bewley, Matthias Minderer

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

Despite progress in perceptual tasks such as image classification, computers still perform poorly on cognitive tasks such as image description and question answering. Cognition is core to tasks that involve not just recognizing, but…

Computer Vision and Pattern Recognition · Computer Science 2016-02-25 Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A. Shamma, Michael S. Bernstein, Fei-Fei Li

VrR-VG: Refocusing Visually-Relevant Relationships

Relationships encode the interactions among individual instances, and play a critical role in deep visual scene understanding. Suffering from the high predictability with non-visual information, existing methods tend to fit the statistical…

Computer Vision and Pattern Recognition · Computer Science 2019-08-27 Yuanzhi Liang, Yalong Bai, Wei Zhang, Xueming Qian, Li Zhu, Tao Mei