Related papers: Discriminative Video Representation Learning Using…

Video Representation Learning Using Discriminative Pooling

Popular deep models for action recognition in videos generate independent predictions for short clips, which are then pooled heuristically to assign an action label to the full video segment. As not all frames may characterize the…

Computer Vision and Pattern Recognition · Computer Science 2018-04-02 Jue Wang, Anoop Cherian, Fatih Porikli, Stephen Gould

Action Representation Using Classifier Decision Boundaries

Most popular deep learning based models for action recognition are designed to generate separate predictions within their short temporal windows, which are often aggregated by heuristic means to assign an action label to the full video…

Computer Vision and Pattern Recognition · Computer Science 2017-04-07 Jue Wang, Anoop Cherian, Fatih Porikli, Stephen Gould

AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos

We propose a novel method for temporally pooling frames in a video for the task of human action recognition. The method is motivated by the observation that there are only a small number of frames which, together, contain sufficient…

Computer Vision and Pattern Recognition · Computer Science 2017-06-27 Amlan Kar, Nishant Rai, Karan Sikka, Gaurav Sharma

Discriminatively Learned Hierarchical Rank Pooling Networks

In this work, we present novel temporal encoding methods for action and activity classification by extending the unsupervised rank pooling temporal encoding method in two ways. First, we present "discriminative rank pooling" in which the…

Computer Vision and Pattern Recognition · Computer Science 2017-05-31 Basura Fernando, Stephen Gould

Contrastive Video Representation Learning via Adversarial Perturbations

Adversarial perturbations are noise-like patterns that can subtly change the data while causing an otherwise accurate classifier to fail. In this paper, we propose to use such perturbations within a novel contrastive learning setup to build…

Computer Vision and Pattern Recognition · Computer Science 2020-04-17 Jue Wang, Anoop Cherian

Second-order Temporal Pooling for Action Recognition

Deep learning models for video-based action recognition usually generate features for short clips (consisting of a few frames); such clip-level features are aggregated to video-level representations by computing statistics on these…

Computer Vision and Pattern Recognition · Computer Science 2018-08-08 Anoop Cherian, Stephen Gould

Unsupervised object segmentation in video by efficient selection of highly probable positive features

We address an essential problem in computer vision, that of unsupervised object segmentation in video, where a main object of interest in a video sequence should be automatically separated from its background. An efficient solution to this…

Computer Vision and Pattern Recognition · Computer Science 2017-04-20 Emanuela Haller, Marius Leordeanu

Multipartite Pooling for Deep Convolutional Neural Networks

We propose a novel pooling strategy that learns how to adaptively rank deep convolutional features for selecting more informative representations. To this end, we exploit discriminative analysis to project the features onto a space spanned…

Machine Learning · Computer Science 2017-10-23 Arash Shahriari, Fatih Porikli

Channel Max Pooling Layer for Fine-Grained Vehicle Classification

Deep convolutional networks have recently shown excellent performance on Fine-Grained Vehicle Classification. Based on these existing works, we consider that the back-propagation algorithm does not focus on extracting less discriminative…

Computer Vision and Pattern Recognition · Computer Science 2020-01-28 Zhanyu Ma, Dongliang Chang, Xiaoxu Li

Action Recognition with Trajectory-Pooled Deep-Convolutional Descriptors

Visual features are of vital importance for human action understanding in videos. This paper presents a new video representation, called trajectory-pooled deep-convolutional descriptor (TDD), which shares the merits of both hand-crafted…

Computer Vision and Pattern Recognition · Computer Science 2016-11-17 Limin Wang, Yu Qiao, Xiaoou Tang

Collaborative Layer-wise Discriminative Learning in Deep Neural Networks

Intermediate features at different layers of a deep neural network are known to be discriminative for visual patterns of different complexities. However, most existing works ignore such cross-layer heterogeneities when classifying samples…

Computer Vision and Pattern Recognition · Computer Science 2016-07-20 Xiaojie Jin, Yunpeng Chen, Jian Dong, Jiashi Feng, Shuicheng Yan

Rank Pooling for Action Recognition

We propose a function-based temporal pooling method that captures the latent structure of the video sequence data - e.g. how frame-level features evolve over time in a video. We show how the parameters of a function that has been fit to the…

Computer Vision and Pattern Recognition · Computer Science 2016-05-17 Basura Fernando, Efstratios Gavves, Jose Oramas, Amir Ghodrati, Tinne Tuytelaars

Higher-order Pooling of CNN Features via Kernel Linearization for Action Recognition

Most successful deep learning algorithms for action recognition extend models designed for image-based tasks such as object recognition to video. Such extensions are typically trained for actions on single video frames or very short clips,…

Computer Vision and Pattern Recognition · Computer Science 2017-01-20 Anoop Cherian, Piotr Koniusz, Stephen Gould

Generalized Rank Pooling for Activity Recognition

Most popular deep models for action recognition split video sequences into short sub-sequences consisting of a few frames; frame-based features are then pooled for recognizing the activity. Usually, this pooling step discards the temporal…

Computer Vision and Pattern Recognition · Computer Science 2017-07-25 Anoop Cherian, Basura Fernando, Mehrtash Harandi, Stephen Gould

Action Recognition with Dynamic Image Networks

We introduce the concept of "dynamic image", a novel compact representation of videos useful for video analysis, particularly in combination with convolutional neural networks (CNNs). A dynamic image encodes temporal data such as RGB or…

Computer Vision and Pattern Recognition · Computer Science 2017-08-22 Hakan Bilen, Basura Fernando, Efstratios Gavves, Andrea Vedaldi

Features in Concert: Discriminative Feature Selection meets Unsupervised Clustering

Feature selection is an essential problem in computer vision, important for category learning and recognition. Along with the rapid development of a wide variety of visual features and classifiers, there is a growing need for efficient…

Computer Vision and Pattern Recognition · Computer Science 2014-12-01 Marius Leordeanu, Alexandra Radu, Rahul Sukthankar

Disentangling Motion, Foreground and Background Features in Videos

This paper introduces an unsupervised framework to extract semantically rich features for video representation. Inspired by how the human visual system groups objects based on motion cues, we propose a deep convolutional neural network that…

Computer Vision and Pattern Recognition · Computer Science 2017-07-18 Xunyu Lin, Victor Campos, Xavier Giro-i-Nieto, Jordi Torres, Cristian Canton Ferrer

Learning Deep Features for Discriminative Localization

In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network to have remarkable localization ability despite being trained on image-level labels.…

Computer Vision and Pattern Recognition · Computer Science 2015-12-15 Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba

A novel learning-based frame pooling method for Event Detection

Detecting complex events in a large video collection crawled from video websites is a challenging task. When directly applying good image-based feature representations, e.g., HOG, SIFT, to videos, we have to face the problem of how to pool…

Computer Vision and Pattern Recognition · Computer Science 2016-08-22 Lan Wang, Chenqiang Gao, Jiang Liu, Deyu Meng

Pose-Selective Max Pooling for Measuring Similarity

In this paper, we deal with two challenges for measuring the similarity of the subject identities in practical video-based face recognition - the variation of the head pose in uncontrolled environments and the computational expense of…

Computer Vision and Pattern Recognition · Computer Science 2017-02-03 Xiang Xiang, Trac D. Tran