Related papers: Object Level Deep Feature Pooling for Compact Imag…
With the impressive capability to capture visual content, deep convolutional neural networks (CNN) have demon- strated promising performance in various vision-based ap- plications, such as classification, recognition, and objec- t…
Recently, image representation built upon Convolutional Neural Network (CNN) has been shown to provide effective descriptors for image search, outperforming pre-CNN features as short-vector representations. Yet such models are not…
Deep convolutional neural networks (CNN) have shown their promise as a universal representation for recognition. However, global CNN activations lack geometric invariance, which limits their robustness for classification and matching of…
Deep Convolutional Neural Networks (CNNs) have demonstrated excellent performance in image classification, but still show room for improvement in object-detection tasks with many categories, in particular for cluttered scenes and occlusion.…
In this paper we evaluate the quality of the activation layers of a convolutional neural network (CNN) for the gen- eration of object proposals. We generate hypotheses in a sliding-window fashion over different activation layers and show…
Recent studies have shown that a Deep Convolutional Neural Network (DCNN) pretrained on a large image dataset can be used as a universal image descriptor, and that doing so leads to impressive performance for a variety of image…
A number of recent studies have shown that a Deep Convolutional Neural Network (DCNN) pretrained on a large dataset can be adopted as a universal image description which leads to astounding performance in many visual classification tasks.…
Convolutional Neural Networks (CNNs) require large image corpora to be trained on classification tasks. The variation in image resolutions, sizes of objects and patterns depicted, and image scales, hampers CNN training and performance,…
Image search can be tackled using deep features from pre-trained Convolutional Neural Networks (CNN). The feature map from the last convolutional layer of a CNN encodes descriptive information from which a discriminative global descriptor…
Deep Convolutional Neural Networks (DCNNs) commonly use generic `max-pooling' (MP) layers to extract deformation-invariant features, but we argue in favor of a more refined treatment. First, we introduce epitomic convolution as a building…
Convolutional neural networks use pooling and other downscaling operations to maintain translational invariance for detection of features, but in their architecture they do not explicitly maintain a representation of the locations of the…
Feature extraction with convolutional neural networks (CNNs) is a popular method to represent images for machine learning tasks. These representations seek to capture global image content, and ideally should be independent of geometric…
In this work, we address the problem of improvement of robustness of feature representations learned using convolutional neural networks (CNNs) to image deformation. We argue that higher moment statistics of feature distributions could be…
This paper investigates how working of Convolutional Neural Network (CNN) can be explained through visualization in the context of machine perception of autonomous vehicles. We visualize what type of features are extracted in different…
In the task of Object Recognition, there exists a dichotomy between the categorization of objects and estimating object pose, where the former necessitates a view-invariant representation, while the latter requires a representation capable…
We propose a novel video object segmentation algorithm based on pixel-level matching using Convolutional Neural Networks (CNN). Our network aims to distinguish the target area from the background on the basis of the pixel-level similarity…
The recent advances brought by deep learning allowed to improve the performance in image retrieval tasks. Through the many convolutional layers, available in a Convolutional Neural Network (CNN), it is possible to obtain a hierarchy of…
Recognizing objects and scenes are two challenging but essential tasks in image understanding. In particular, the use of RGB-D sensors in handling these tasks has emerged as an important area of focus for better visual understanding.…
Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance on many visual recognition tasks. However, the combination of convolution and pooling operations only shows invariance to small local location changes in…
Deep convolutional neural networks (CNN) have revolutionized various fields of vision research and have seen unprecedented adoption for multiple tasks such as classification, detection, captioning, etc. However, they offer little…