Related papers

Related papers: Fixed-size Objects Encoding for Visual Relationshi…

200 papers

Pretrained vision-language models, such as CLIP, have demonstrated strong generalization capabilities, making them promising tools in the realm of zero-shot visual recognition. Visual relation detection (VRD) is a typical task that…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Lin Li, Jun Xiao, Guikun Chen, Jian Shao, Yueting Zhuang, Long Chen

Machine comprehension of visual information from images and videos by neural networks faces two primary challenges. Firstly, there exists a computational and inference gap in connecting vision and language, making it difficult to accurately…

Computer Vision and Pattern Recognition · Computer Science 2024-03-08 Ala Shaabana, Zahra Gharaee, Paul Fieguth

Visual relation detection (VRD) aims to identify relationships (or interactions) between object pairs in an image. Although recent VRD models have achieved impressive performance, they are all restricted to pre-defined relation categories,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-17 Kaifeng Gao, Siqi Chen, Hanwang Zhang, Jun Xiao, Yueting Zhuang, Qianru Sun

We address the problem of Visual Relationship Detection (VRD) which aims to describe the relationships between pairs of objects in the form of triplets of (subject, predicate, object). We observe that given a pair of bounding box proposals,…

Computer Vision and Pattern Recognition · Computer Science 2019-11-25 Mohammed Haroon Dupty, Zhen Zhang, Wee Sun Lee

In this paper, given a small bag of images, each containing a common but latent predicate, we are interested in localizing visual subject-object pairs connected via the common predicate in each of the images. We refer to this novel problem…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Revant Teotia, Vaibhav Mishra, Mayank Maheshwari, Anand Mishra

In this paper, we propose the new fixed-size ordinally-forgetting encoding (FOFE) method, which can almost uniquely encode any variable-length sequence of words into a fixed-size representation. FOFE can model the word order in a sequence…

Neural and Evolutionary Computing · Computer Science 2015-06-17 Shiliang Zhang, Hui Jiang, Mingbin Xu, Junfeng Hou, Lirong Dai

Visual relationship detection aims to identify objects and their relationships in images. Prior methods approach this task by adding separate relationship modules or decoders to existing object detection architectures. This separation…

Computer Vision and Pattern Recognition · Computer Science 2024-07-22 Tim Salzmann, Markus Ryll, Alex Bewley, Matthias Minderer

In this paper, we present our method of using fixed-size ordinally forgetting encoding (FOFE) to solve the word sense disambiguation (WSD) problem. FOFE enables us to encode a variable-length sequence of words into a theoretically unique…

Computation and Language · Computer Science 2019-02-28 Xi Zhu, Mingbin Xu, Hui Jiang

We introduce Perception Encoder (PE), a state-of-the-art vision encoder for image and video understanding trained via simple vision-language learning. Traditionally, vision encoders have relied on a variety of pretraining objectives, each…

Object detection and recognition algorithms using deep convolutional neural networks (CNNs) tend to be computationally intensive to implement. This presents a particular challenge for embedded systems, such as mobile robots, where the…

Computer Vision and Pattern Recognition · Computer Science 2019-08-27 Uziel Jaramillo-Avila, Sean R. Anderson

Referring Camouflaged Object Detection (Ref-COD) focuses on segmenting specific camouflaged targets in a query image using category-aligned references. Despite recent advances, existing methods struggle with reference-target semantic…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Ye Wang, Kai Huang, Sumin Shen, Chenyang Ma

In recent years, video analysis using Artificial Intelligence (AI) has become widely used, owing to the remarkable development of image recognition technology based on deep learning. In 2019, the Moving Picture Experts Group (MPEG) started…

Computer Vision and Pattern Recognition · Computer Science 2023-05-31 Takahiro Shindo, Taiju Watanabe, Kein Yamada, Hiroshi Watanabe

With advances in image recognition technology based on deep learning, automatic video analysis by Artificial Intelligence is becoming more widespread. As the amount of video used for image recognition increases, efficient compression…

Computer Vision and Pattern Recognition · Computer Science 2023-04-04 Takahiro Shindo, Taiju Watanabe, Kein Yamada, Hiroshi Watanabe

This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets. Merging labels spanning different datasets could be challenging due to inconsistent taxonomies. The issue…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Long Zhao, Liangzhe Yuan, Boqing Gong, Yin Cui, Florian Schroff, Ming-Hsuan Yang, Hartwig Adam, Ting Liu

Visual relationship detection, as a challenging task used to find and distinguish the interactions between object pairs in one image, has received much attention recently. In this work, we propose a novel visual relationship detection…

Computer Vision and Pattern Recognition · Computer Science 2019-11-05 Hao Zhou, Chongyang Zhang, Chuanping Hu

Visual relations, such as "person ride bike" and "bike next to car", offer a comprehensive scene understanding of an image, and have already shown their great utility in connecting computer vision and natural language. However, due to the…

Computer Vision and Pattern Recognition · Computer Science 2017-02-28 Hanwang Zhang, Zawlin Kyaw, Shih-Fu Chang, Tat-Seng Chua

In this paper we propose a supervised object recognition method using new global features and inspired by the model of the human primary visual cortex V1 as the semidiscrete roto-translation group $SE(2,N) = \mathbb Z_N\rtimes \mathbb R^2$.…

Computer Vision and Pattern Recognition · Computer Science 2019-02-14 Amine Bohi, Dario Prandi, Vincente Guis, Frédéric Bouchara, Jean-Paul Gauthier

We present a foveated object detector (FOD) as a biologically-inspired alternative to the sliding window (SW) approach which is the dominant method of search in computer vision object detection. Similar to the human visual system, the FOD…

Computer Vision and Pattern Recognition · Computer Science 2017-11-07 Emre Akbas, Miguel P. Eckstein

Object detection in videos is an important task in computer vision for various applications such as object tracking, video summarization and video search. Although great progress has been made in improving the accuracy of object detection…

Computer Vision and Pattern Recognition · Computer Science 2024-07-18 Athindran Ramesh Kumar, Balaraman Ravindran, Anand Raghunathan

We introduce a method for 3D object detection using a single monocular image. Starting from a synthetic dataset, we pre-train an RGB-to-Depth Auto-Encoder (AE). The embedding learnt from this AE is then used to train a 3D Object Detector…

Computer Vision and Pattern Recognition · Computer Science 2021-01-27 Shubham Shrivastava, Punarjay Chakravarty