Related papers: Mid-level Representation for Visual Recognition

Mid-level Deep Pattern Mining

Mid-level visual element discovery aims to find clusters of image patches that are both representative and discriminative. In this work, we study this problem from the perspective of pattern mining while relying on the recently popularized…

Computer Vision and Pattern Recognition · Computer Science · 2016-11-17 · Yao Li, Lingqiao Liu, Chunhua Shen, Anton van den Hengel

Unsupervised Discovery of Mid-Level Discriminative Patches

The goal of this paper is to discover a set of discriminative patches which can serve as a fully unsupervised mid-level visual representation. The desired patches need to satisfy two requirements: 1) to be representative, they need to occur…

Computer Vision and Pattern Recognition · Computer Science · 2012-08-21 · Saurabh Singh, Abhinav Gupta, Alexei A. Efros

Mining Mid-level Visual Patterns with Deep CNN Activations

The purpose of mid-level visual element discovery is to find clusters of image patches that are both representative and discriminative. Here we study this problem from the perspective of pattern mining while relying on the recently…

Computer Vision and Pattern Recognition · Computer Science · 2016-05-31 · Yao Li, Lingqiao Liu, Chunhua Shen, Anton van den Hengel

Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Visuomotor Policies

How much does having visual priors about the world (e.g. the fact that the world is 3D) assist in learning to perform downstream motor tasks (e.g. delivering a package)? We study this question by integrating a generic perceptual skill set…

Computer Vision and Pattern Recognition · Computer Science · 2019-04-23 · Alexander Sax, Bradley Emi, Amir R. Zamir, Leonidas Guibas, Silvio Savarese, Jitendra Malik

Learning Structured Representations of Visual Scenes

As the intermediate-level representations bridging the two levels, structured representations of visual scenes, such as visual relationships between pairwise objects, have been shown to not only benefit compositional models in learning to…

Computer Vision and Pattern Recognition · Computer Science · 2022-07-12 · Meng-Jiun Chiou

Visualizing the Emergence of Intermediate Visual Patterns in DNNs

This paper proposes a method to visualize the discrimination power of intermediate-layer visual patterns encoded by a DNN. Specifically, we visualize (1) how the DNN gradually learns regional visual patterns in each intermediate layer…

Computer Vision and Pattern Recognition · Computer Science · 2021-11-08 · Mingjie Li, Shaobo Wang, Quanshi Zhang

Part-guided Relational Transformers for Fine-grained Visual Recognition

Fine-grained visual recognition is to classify objects with visually similar appearances into subcategories, which has made great progress with the development of deep CNNs. However, handling subtle differences between different…

Computer Vision and Pattern Recognition · Computer Science · 2022-12-29 · Yifan Zhao, Jia Li, Xiaowu Chen, Yonghong Tian

Deep Patch Learning for Weakly Supervised Object Classification and Discovery

Patch-level image representation is very important for object classification and detection, since it is robust to spatial transformation, scale variation, and cluttered background. Many existing methods usually require fine-grained…

Computer Vision and Pattern Recognition · Computer Science · 2017-05-09 · Peng Tang, Xinggang Wang, Zilong Huang, Xiang Bai, Wenyu Liu

HindSight: A Graph-Based Vision Model Architecture For Representing Part-Whole Hierarchies

This paper presents a model architecture for encoding the representations of part-whole hierarchies in images in the form of a graph. The idea is to divide the image into patches of different levels and then treat all of these patches as nodes…

Computer Vision and Pattern Recognition · Computer Science · 2021-04-09 · Muhammad AbdurRafae

Deep Learning Multi-View Representation for Face Recognition

Various factors, such as identities, views (poses), and illuminations, are coupled in face images. Disentangling the identity and view representations is a major challenge in face recognition. Existing face recognition systems either use…

Computer Vision and Pattern Recognition · Computer Science · 2014-06-27 · Zhenyao Zhu, Ping Luo, Xiaogang Wang, Xiaoou Tang

Learning Deep Representations for Semantic Image Parsing: a Comprehensive Overview

Semantic image parsing, which refers to the process of decomposing images into semantic regions and constructing the structure representation of the input, has recently aroused widespread interest in the field of computer vision. The recent…

Computer Vision and Pattern Recognition · Computer Science · 2018-10-11 · Lili Huang, Jiefeng Peng, Ruimao Zhang, Guanbin Li, Liang Lin

The Cross-Depiction Problem: Computer Vision Algorithms for Recognising Objects in Artwork and in Photographs

The cross-depiction problem is that of recognising visual objects regardless of whether they are photographed, painted, drawn, etc. It is a potentially significant yet under-researched problem. Emulating the remarkable human ability to…

Computer Vision and Pattern Recognition · Computer Science · 2015-05-04 · Hongping Cai, Qi Wu, Tadeo Corradi, Peter Hall

Leveraging Mid-Level Deep Representations For Predicting Face Attributes in the Wild

Predicting facial attributes from faces in the wild is very challenging due to pose and lighting variations in the real world. The key to this problem is to build proper feature representations to cope with these unfavourable conditions.…

Computer Vision and Pattern Recognition · Computer Science · 2016-06-22 · Yang Zhong, Josephine Sullivan, Haibo Li

Unsupervised Part Discovery via Dual Representation Alignment

Object parts serve as crucial intermediate representations in various downstream tasks, but part-level representation learning still has not received as much attention as other vision tasks. Previous research has established that Vision…

Computer Vision and Pattern Recognition · Computer Science · 2024-08-16 · Jiahao Xia, Wenjian Huang, Min Xu, Jianguo Zhang, Haimin Zhang, Ziyu Sheng, Dong Xu

Object Pose Estimation using Mid-level Visual Representations

This work proposes a novel pose estimation model for object categories that can be effectively transferred to previously unseen environments. The deep convolutional network models (CNN) for pose estimation are typically trained and…

Computer Vision and Pattern Recognition · Computer Science · 2022-03-04 · Negar Nejatishahidin, Pooya Fayyazsanavi, Jana Kosecka

Object Recognition by Using Multi-level Feature Point Extraction

In this paper, we present a novel approach for object recognition in real-time by employing multilevel feature analysis and demonstrate the practicality of adapting feature extraction into a Naive Bayesian classification framework that…

Computer Vision and Pattern Recognition · Computer Science · 2017-10-31 · Yang Cheng, Timeo Dubois

Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics

Object recognition and motion understanding are key components of perception that complement each other. While self-supervised learning methods have shown promise in their ability to learn from unlabeled data, they have primarily focused on…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-08 · Christopher Hoang, Mengye Ren

Robust Policies via Mid-Level Visual Representations: An Experimental Study in Manipulation and Navigation

Vision-based robotics often separates the control loop into one module for perception and a separate module for control. It is possible to train the whole system end-to-end (e.g. with deep RL), but doing it "from scratch" comes with a high…

Robotics · Computer Science · 2020-11-16 · Bryan Chen, Alexander Sax, Gene Lewis, Iro Armeni, Silvio Savarese, Amir Zamir, Jitendra Malik, Lerrel Pinto

Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition

With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important. In this paper, we address the problem of video scene recognition, whose goal is to learn a…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-20 · Xuzheng Yu, Chen Jiang, Wei Zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu

Unsupervised Visual Representation Learning by Context Prediction

This work explores the use of spatial context as a source of free and plentiful supervisory signal for training a rich visual representation. Given only a large, unlabeled image collection, we extract random pairs of patches from each image…

Computer Vision and Pattern Recognition · Computer Science · 2016-01-19 · Carl Doersch, Abhinav Gupta, Alexei A. Efros