Related papers: RAID: A Relation-Augmented Image Descriptor

Region-Based Image Retrieval Revisited

Region-based image retrieval (RBIR) technique is revisited. In early attempts at RBIR in the late 90s, researchers found many ways to specify region-based queries and spatial relationships; however, the way to characterize the regions, such…

Multimedia · Computer Science 2017-09-27 Ryota Hinami, Yusuke Matsui, Shin'ichi Satoh

Detecting Visual Relationships with Deep Relational Networks

Relationships among objects play a crucial role in image understanding. Despite the great success of deep learning techniques in recognizing individual objects, reasoning about the relationships among objects remains a challenging task.…

Computer Vision and Pattern Recognition · Computer Science 2017-04-13 Bo Dai, Yuqi Zhang, Dahua Lin

RAID: Retrieval-Augmented Anomaly Detection

Unsupervised Anomaly Detection (UAD) aims to identify abnormal regions by establishing correspondences between test images and normal templates. Existing methods primarily rely on image reconstruction or template retrieval but face a…

Computer Vision and Pattern Recognition · Computer Science 2026-02-24 Mingxiu Cai, Zhe Zhang, Gaochang Wu, Tianyou Chai, Xiatian Zhu

Generalized Visual Relation Detection with Diffusion Models

Visual relation detection (VRD) aims to identify relationships (or interactions) between object pairs in an image. Although recent VRD models have achieved impressive performance, they are all restricted to pre-defined relation categories,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-17 Kaifeng Gao, Siqi Chen, Hanwang Zhang, Jun Xiao, Yueting Zhuang, Qianru Sun

RelationRS: Relationship Representation Network for Object Detection in Aerial Images

Object detection is a basic and important task in the field of aerial image processing and has gained much attention in computer vision. However, previous aerial image object detection approaches have insufficient use of scene semantic…

Computer Vision and Pattern Recognition · Computer Science 2021-10-14 Zhiming Liu, Xuefei Zhang, Chongyang Liu, Hao Wang, Chao Sun, Bin Li, Weifeng Sun, Pu Huang, Qingjun Li, Yu Liu, Haipeng Kuang, Jihong Xiu

Spatial Reasoning for Few-Shot Object Detection

Although modern object detectors rely heavily on a significant amount of training data, humans can easily detect novel objects using a few training examples. The mechanism of the human visual system is to interpret spatial relationships…

Computer Vision and Pattern Recognition · Computer Science 2022-11-03 Geonuk Kim, Hong-Gyu Jung, Seong-Whan Lee

Exploring Explicit and Implicit Visual Relationships for Image Captioning

Image captioning is one of the most challenging tasks in AI, which aims to automatically generate textual sentences for an image. Recent methods for image captioning follow encoder-decoder framework that transforms the sequence of salient…

Computer Vision and Pattern Recognition · Computer Science 2021-05-07 Zeliang Song, Xiaofei Zhou

Multimodal Relation Extraction with Cross-Modal Retrieval and Synthesis

Multimodal relation extraction (MRE) is the task of identifying the semantic relationships between two entities based on the context of the sentence-image pair. Existing retrieval-augmented approaches mainly focused on modeling the…

Computation and Language · Computer Science 2023-05-26 Xuming Hu, Zhijiang Guo, Zhiyang Teng, Irwin King, Philip S. Yu

RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors

AI-generated images have reached a quality level at which humans are incapable of reliably distinguishing them from real images. To counteract the inherent risk of fraud and disinformation, the detection of AI-generated images is a pressing…

Computer Vision and Pattern Recognition · Computer Science 2025-06-10 Hicham Eddoubi, Jonas Ricker, Federico Cocchi, Lorenzo Baraldi, Angelo Sotgiu, Maura Pintor, Marcella Cornia, Lorenzo Baraldi, Asja Fischer, Rita Cucchiara, Battista Biggio

High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification

Occluded person re-identification (ReID) aims to match occluded person images to holistic ones across disjoint cameras. In this paper, we propose a novel framework by learning high-order relation and topology information for discriminative…

Computer Vision and Pattern Recognition · Computer Science 2020-04-03 Guan'an Wang, Shuo Yang, Huanyu Liu, Zhicheng Wang, Yang Yang, Shuliang Wang, Gang Yu, Erjin Zhou, Jian Sun

Cross-Modal Adaptive Dual Association for Text-to-Image Person Retrieval

Text-to-image person re-identification (ReID) aims to retrieve images of a person based on a given textual description. The key challenge is to learn the relations between detailed information from visual and textual modalities. Existing…

Computer Vision and Pattern Recognition · Computer Science 2023-12-05 Dixuan Lin, Yixing Peng, Jingke Meng, Wei-Shi Zheng

RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis

Clinical diagnosis is a highly specialized discipline requiring both domain expertise and strict adherence to rigorous guidelines. While current AI-driven medical research predominantly focuses on knowledge graphs or natural text…

Machine Learning · Computer Science 2025-12-12 Haolin Li, Tianjie Dai, Zhe Chen, Siyuan Du, Jiangchao Yao, Ya Zhang, Yanfeng Wang

An Exploratory Study on Abstract Images and Visual Representations Learned from Them

Imagine living in a world composed solely of primitive shapes: could you still recognise familiar objects? Recent studies have shown that abstract images, constructed from primitive shapes, can indeed convey visual semantic information to deep…

Computer Vision and Pattern Recognition · Computer Science 2025-09-18 Haotian Li, Jianbo Jiao

RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection

The rapid advances in generative AI models have empowered the creation of highly realistic images with arbitrary content, raising concerns about potential misuse and harm, such as Deepfakes. Current research focuses on training detectors…

Computer Vision and Pattern Recognition · Computer Science 2024-05-31 Zhiyuan He, Pin-Yu Chen, Tsung-Yi Ho

Dual Relation Alignment for Composed Image Retrieval

Composed image retrieval, a task involving the search for a target image using a reference image and a complementary text as the query, has witnessed significant advancements owing to the progress made in cross-modal modeling. Unlike the…

Computer Vision and Pattern Recognition · Computer Science 2024-02-01 Xintong Jiang, Yaxiong Wang, Yujiao Wu, Meng Wang, Xueming Qian

Visual Relationship Detection with Relative Location Mining

Visual relationship detection, as a challenging task used to find and distinguish the interactions between object pairs in one image, has received much attention recently. In this work, we propose a novel visual relationship detection…

Computer Vision and Pattern Recognition · Computer Science 2019-11-05 Hao Zhou, Chongyang Zhang, Chuanping Hu

RAQ: Relationship-Aware Graph Querying in Large Networks

The phenomenal growth of graph data from a wide variety of real-world applications has rendered graph querying to be a problem of paramount importance. Traditional techniques use structural as well as node similarities to find matches of a…

Databases · Computer Science 2021-05-14 Jithin Vachery, Akhil Arora, Sayan Ranu, Arnab Bhattacharya

Towards Reliable Identification of Diffusion-based Image Manipulations

Changing facial expressions, gestures, or background details may dramatically alter the meaning conveyed by an image. Notably, recent advances in diffusion models greatly improve the quality of image manipulation while also opening the door…

Computer Vision and Pattern Recognition · Computer Science 2025-06-13 Alex Costanzino, Woody Bayliss, Juil Sock, Marc Gorriz Blanch, Danijela Horak, Ivan Laptev, Philip Torr, Fabio Pizzati

Enhancing Retrieval in QA Systems with Derived Feature Association

Retrieval augmented generation (RAG) has become the standard in long context question answering (QA) systems. However, typical implementations of RAG rely on a rather naive retrieval mechanism, in which texts whose embeddings are most…

Computation and Language · Computer Science 2024-10-08 Keyush Shah, Abhishek Goyal, Isaac Wasserman

Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval

Text-to-image person retrieval aims to identify the target person based on a given textual description query. The primary challenge is to learn the mapping of visual and textual modalities into a common latent space. Prior works have…

Computer Vision and Pattern Recognition · Computer Science 2023-03-23 Ding Jiang, Mang Ye