Related papers: Fixed-size Objects Encoding for Visual Relationshi…

Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models

Pretrained vision-language models, such as CLIP, have demonstrated strong generalization capabilities, making them promising tools in the realm of zero-shot visual recognition. Visual relation detection (VRD) is a typical task that…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-29 · Lin Li, Jun Xiao, Guikun Chen, Jian Shao, Yueting Zhuang, Long Chen

Video Relationship Detection Using Mixture of Experts

Machine comprehension of visual information from images and videos by neural networks faces two primary challenges. Firstly, there exists a computational and inference gap in connecting vision and language, making it difficult to accurately…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-08 · Ala Shaabana, Zahra Gharaee, Paul Fieguth

Generalized Visual Relation Detection with Diffusion Models

Visual relation detection (VRD) aims to identify relationships (or interactions) between object pairs in an image. Although recent VRD models have achieved impressive performance, they are all restricted to pre-defined relation categories,…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-17 · Kaifeng Gao, Siqi Chen, Hanwang Zhang, Jun Xiao, Yueting Zhuang, Qianru Sun

Visual Relationship Detection with Low Rank Non-Negative Tensor Decomposition

We address the problem of Visual Relationship Detection (VRD) which aims to describe the relationships between pairs of objects in the form of triplets of (subject, predicate, object). We observe that given a pair of bounding box proposals,…

Computer Vision and Pattern Recognition · Computer Science · 2019-11-25 · Mohammed Haroon Dupty, Zhen Zhang, Wee Sun Lee

Few-shot Visual Relationship Co-localization

In this paper, given a small bag of images, each containing a common but latent predicate, we are interested in localizing visual subject-object pairs connected via the common predicate in each of the images. We refer to this novel problem…

Computer Vision and Pattern Recognition · Computer Science · 2021-08-27 · Revant Teotia, Vaibhav Mishra, Mayank Maheshwari, Anand Mishra

A Fixed-Size Encoding Method for Variable-Length Sequences with its Application to Neural Network Language Models

In this paper, we propose the new fixed-size ordinally-forgetting encoding (FOFE) method, which can almost uniquely encode any variable-length sequence of words into a fixed-size representation. FOFE can model the word order in a sequence…

Neural and Evolutionary Computing · Computer Science · 2015-06-17 · Shiliang Zhang, Hui Jiang, Mingbin Xu, Junfeng Hou, Lirong Dai

Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection

Visual relationship detection aims to identify objects and their relationships in images. Prior methods approach this task by adding separate relationship modules or decoders to existing object detection architectures. This separation…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-22 · Tim Salzmann, Markus Ryll, Alex Bewley, Matthias Minderer

Fixed-Size Ordinally Forgetting Encoding Based Word Sense Disambiguation

In this paper, we present our method of using fixed-size ordinally forgetting encoding (FOFE) to solve the word sense disambiguation (WSD) problem. FOFE enables us to encode variable-length sequence of words into a theoretically unique…

Computation and Language · Computer Science · 2019-02-28 · Xi Zhu, Mingbin Xu, Hui Jiang

Perception Encoder: The best visual embeddings are not at the output of the network

We introduce Perception Encoder (PE), a state-of-the-art vision encoder for image and video understanding trained via simple vision-language learning. Traditionally, vision encoders have relied on a variety of pretraining objectives, each…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-30 · Daniel Bolya, Po-Yao Huang, Peize Sun, Jang Hyun Cho, Andrea Madotto, Chen Wei, Tengyu Ma, Jiale Zhi, Jathushan Rajasegaran, Hanoona Rasheed, Junke Wang, Marco Monteiro, Hu Xu, Shiyu Dong, Nikhila Ravi, Daniel Li, Piotr Dollár, Christoph Feichtenhofer

Foveated image processing for faster object detection and recognition in embedded systems using deep convolutional neural networks

Object detection and recognition algorithms using deep convolutional neural networks (CNNs) tend to be computationally intensive to implement. This presents a particular challenge for embedded systems, such as mobile robots, where the…

Computer Vision and Pattern Recognition · Computer Science · 2019-08-27 · Uziel Jaramillo-Avila, Sean R. Anderson

EviRCOD: Evidence-Guided Probabilistic Decoding for Referring Camouflaged Object Detection

Referring Camouflaged Object Detection (Ref-COD) focuses on segmenting specific camouflaged targets in a query image using category-aligned references. Despite recent advances, existing methods struggle with reference-target semantic…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-14 · Ye Wang, Kai Huang, Sumin Shen, Chenyang Ma

VVC Extension Scheme for Object Detection Using Contrast Reduction

In recent years, video analysis using Artificial Intelligence (AI) has been widely used, due to the remarkable development of image recognition technology using deep learning. In 2019, the Moving Picture Experts Group (MPEG) has started…

Computer Vision and Pattern Recognition · Computer Science · 2023-05-31 · Takahiro Shindo, Taiju Watanabe, Kein Yamada, Hiroshi Watanabe

Accuracy Improvement of Object Detection in VVC Coded Video Using YOLO-v7 Features

With advances in image recognition technology based on deep learning, automatic video analysis by Artificial Intelligence is becoming more widespread. As the amount of video used for image recognition increases, efficient compression…

Computer Vision and Pattern Recognition · Computer Science · 2023-04-04 · Takahiro Shindo, Taiju Watanabe, Kein Yamada, Hiroshi Watanabe

Unified Visual Relationship Detection with Vision and Language Models

This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets. Merging labels spanning different datasets could be challenging due to inconsistent taxonomies. The issue…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-22 · Long Zhao, Liangzhe Yuan, Boqing Gong, Yin Cui, Florian Schroff, Ming-Hsuan Yang, Hartwig Adam, Ting Liu

Visual Relationship Detection with Relative Location Mining

Visual relationship detection, as a challenging task used to find and distinguish the interactions between object pairs in one image, has received much attention recently. In this work, we propose a novel visual relationship detection…

Computer Vision and Pattern Recognition · Computer Science · 2019-11-05 · Hao Zhou, Chongyang Zhang, Chuanping Hu

Visual Translation Embedding Network for Visual Relation Detection

Visual relations, such as "person ride bike" and "bike next to car", offer a comprehensive scene understanding of an image, and have already shown their great utility in connecting computer vision and natural language. However, due to the…

Computer Vision and Pattern Recognition · Computer Science · 2017-02-28 · Hanwang Zhang, Zawlin Kyaw, Shih-Fu Chang, Tat-Seng Chua

Fourier descriptors based on the structure of the human primary visual cortex with applications to object recognition

In this paper we propose a supervised object recognition method using new global features and inspired by the model of the human primary visual cortex V1 as the semidiscrete roto-translation group $SE(2,N) = \mathbb Z_N\rtimes \mathbb R^2$.…

Computer Vision and Pattern Recognition · Computer Science · 2019-02-14 · Amine Bohi, Dario Prandi, Vincente Guis, Frédéric Bouchara, Jean-Paul Gauthier

Object Detection Through Exploration With A Foveated Visual Field

We present a foveated object detector (FOD) as a biologically-inspired alternative to the sliding window (SW) approach which is the dominant method of search in computer vision object detection. Similar to the human visual system, the FOD…

Computer Vision and Pattern Recognition · Computer Science · 2017-11-07 · Emre Akbas, Miguel P. Eckstein

Pack and Detect: Fast Object Detection in Videos Using Region-of-Interest Packing

Object detection in videos is an important task in computer vision for various applications such as object tracking, video summarization and video search. Although great progress has been made in improving the accuracy of object detection…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-18 · Athindran Ramesh Kumar, Balaraman Ravindran, Anand Raghunathan

CubifAE-3D: Monocular Camera Space Cubification for Auto-Encoder based 3D Object Detection

We introduce a method for 3D object detection using a single monocular image. Starting from a synthetic dataset, we pre-train an RGB-to-Depth Auto-Encoder (AE). The embedding learnt from this AE is then used to train a 3D Object Detector…

Computer Vision and Pattern Recognition · Computer Science · 2021-01-27 · Shubham Shrivastava, Punarjay Chakravarty