Related papers: Knowledge-augmented Few-shot Visual Relation Detec…

RelVAE: Generative Pretraining for few-shot Visual Relationship Detection

Visual relations are complex, multimodal concepts that play an important role in the way humans perceive the world. As a result of their complexity, high-quality, diverse and large scale datasets for visual relations are still absent. In an…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-29 · Sotiris Karapiperis, Markos Diomataris, Vassilis Pitsikalis

Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents

Key-value relations are prevalent in Visually-Rich Documents (VRDs), often depicted in distinct spatial regions accompanied by specific color and font styles. These non-textual cues serve as important indicators that greatly enhance human…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-26 · Hao Wang, Tang Li, Chenhui Chu, Nengjun Zhu, Rui Wang, Pinpin Zhu

Natural Language Guided Visual Relationship Detection

Reasoning about the relationships between object pairs in images is a crucial task for holistic scene understanding. Most of the existing works treat this task as a pure visual classification task: each type of relationship or phrase is…

Computer Vision and Pattern Recognition · Computer Science · 2017-11-22 · Wentong Liao, Lin Shuai, Bodo Rosenhahn, Michael Ying Yang

Generalized Visual Relation Detection with Diffusion Models

Visual relation detection (VRD) aims to identify relationships (or interactions) between object pairs in an image. Although recent VRD models have achieved impressive performance, they are all restricted to pre-defined relation categories,…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-17 · Kaifeng Gao, Siqi Chen, Hanwang Zhang, Jun Xiao, Yueting Zhuang, Qianru Sun

Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation

Understanding visual relationships involves identifying the subject, the object, and a predicate relating them. We leverage the strong correlations between the predicate and the (subj,obj) pair (both semantically and spatially) to predict…

Computer Vision and Pattern Recognition · Computer Science · 2017-08-04 · Ruichi Yu, Ang Li, Vlad I. Morariu, Larry S. Davis

VReBERT: A Simple and Flexible Transformer for Visual Relationship Detection

Visual Relationship Detection (VRD) impels a computer vision model to 'see' beyond an individual object instance and 'understand' how different objects in a scene are related. The traditional way of VRD is first to detect objects in an…

Computer Vision and Pattern Recognition · Computer Science · 2022-06-22 · Yu Cui, Moshiur Farazi

Visual Relationship Detection with Low Rank Non-Negative Tensor Decomposition

We address the problem of Visual Relationship Detection (VRD) which aims to describe the relationships between pairs of objects in the form of triplets of (subject, predicate, object). We observe that given a pair of bounding box proposals,…

Computer Vision and Pattern Recognition · Computer Science · 2019-11-25 · Mohammed Haroon Dupty, Zhen Zhang, Wee Sun Lee

Leveraging Auxiliary Text for Deep Recognition of Unseen Visual Relationships

One of the most difficult tasks in scene understanding is recognizing interactions between objects in an image. This task is often called visual relationship detection (VRD). We consider the question of whether, given auxiliary textual data…

Computer Vision and Pattern Recognition · Computer Science · 2019-10-29 · Gal Sadeh Kenigsfield, Ran El-Yaniv

Visual Relationship Detection with Visual-Linguistic Knowledge from Multimodal Representations

Visual relationship detection aims to reason over relationships among salient objects in images, which has drawn increasing attention over the past few years. Inspired by human reasoning mechanisms, it is believed that external visual…

Computer Vision and Pattern Recognition · Computer Science · 2021-04-06 · Meng-Jiun Chiou, Roger Zimmermann, Jiashi Feng

ART: Adaptive Relation Tuning for Generalized Relation Prediction

Visual relation detection (VRD) is the task of identifying the relationships between objects in a scene. VRD models trained solely on relation detection data struggle to generalize beyond the relations on which they are trained. While…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-11 · Gopika Sudhakaran, Hikaru Shindo, Patrick Schramowski, Simone Schaub-Meyer, Kristian Kersting, Stefan Roth

2.5D Visual Relationship Detection

Visual 2.5D perception involves understanding the semantics and geometry of a scene through reasoning about object relationships with respect to the viewer in an environment. However, existing works in visual recognition primarily focus on…

Computer Vision and Pattern Recognition · Computer Science · 2021-04-27 · Yu-Chuan Su, Soravit Changpinyo, Xiangning Chen, Sathish Thoppay, Cho-Jui Hsieh, Lior Shapira, Radu Soricut, Hartwig Adam, Matthew Brown, Ming-Hsuan Yang, Boqing Gong

Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Distillation

A thorough comprehension of image content demands a complex grasp of the interactions that may occur in the natural world. One of the key issues is to describe the visual relationships between objects. When dealing with real world data,…

Computer Vision and Pattern Recognition · Computer Science · 2018-05-29 · François Plesse, Alexandru Ginsca, Bertrand Delezoide, Françoise Prêteux

Few-shot Visual Relationship Co-localization

In this paper, given a small bag of images, each containing a common but latent predicate, we are interested in localizing visual subject-object pairs connected via the common predicate in each of the images. We refer to this novel problem…

Computer Vision and Pattern Recognition · Computer Science · 2021-08-27 · Revant Teotia, Vaibhav Mishra, Mayank Maheshwari, Anand Mishra

VrR-VG: Refocusing Visually-Relevant Relationships

Relationships encode the interactions among individual instances, and play a critical role in deep visual scene understanding. Suffering from the high predictability with non-visual information, existing methods tend to fit the statistical…

Computer Vision and Pattern Recognition · Computer Science · 2019-08-27 · Yuanzhi Liang, Yalong Bai, Wei Zhang, Xueming Qian, Li Zhu, Tao Mei

Verbalized Representation Learning for Interpretable Few-Shot Generalization

Humans recognize objects after observing only a few examples, a remarkable capability enabled by their inherent language understanding of the real-world environment. Developing verbalized and interpretable representation can significantly…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-08 · Cheng-Fu Yang, Da Yin, Wenbo Hu, Heng Ji, Nanyun Peng, Bolei Zhou, Kai-Wei Chang

Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection

Despite progress in visual perception tasks such as image classification and detection, computers still struggle to understand the interdependency of objects in the scene as a whole, e.g., relations between objects or their attributes.…

Computer Vision and Pattern Recognition · Computer Science · 2017-03-10 · Xiaodan Liang, Lisa Lee, Eric P. Xing

Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

Few-shot object detection is an imperative and long-lasting problem due to the inherent long-tail distribution of real-world data. Its performance is largely affected by the data scarcity of novel classes. But the semantic relation between…

Computer Vision and Pattern Recognition · Computer Science · 2021-03-23 · Chenchen Zhu, Fangyi Chen, Uzair Ahmed, Zhiqiang Shen, Marios Savvides

Deeply Supervised Multimodal Attentional Translation Embeddings for Visual Relationship Detection

Detecting visual relationships, i.e. <Subject, Predicate, Object> triplets, is a challenging Scene Understanding task approached in the past via linguistic priors or spatial information in a single feature branch. We introduce a new deeply…

Computer Vision and Pattern Recognition · Computer Science · 2019-02-18 · Nikolaos Gkanatsios, Vassilis Pitsikalis, Petros Koutras, Athanasia Zlatintsi, Petros Maragos

VRM: Knowledge Distillation via Virtual Relation Matching

Knowledge distillation (KD) aims to transfer the knowledge of a more capable yet cumbersome teacher model to a lightweight student model. In recent years, relation-based KD methods have fallen behind, as their instance-matching counterparts…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-01 · Weijia Zhang, Fei Xie, Weidong Cai, Chao Ma

Document-Level In-Context Few-Shot Relation Extraction via Pre-Trained Language Models

Document-level relation extraction aims at inferring structured human knowledge from textual documents. State-of-the-art methods for this task use pre-trained language models (LMs) via fine-tuning, yet fine-tuning is computationally…

Computation and Language · Computer Science · 2024-10-03 · Yilmazcan Ozyurt, Stefan Feuerriegel, Ce Zhang