Related papers: ContextLocNet: Context-Aware Deep Network Models f…
Modern deep neural network based object detection methods typically classify candidate proposals using their interior features. However, global and local surrounding contexts that are believed to be valuable for object detection are not…
Object localization is an important computer vision problem with a variety of applications. The lack of large scale object-level annotations and the relative abundance of image-level labels makes a compelling case for weak supervision in…
We explore the power of spatial context as a self-supervisory signal for learning visual representations. In particular, we propose spatial context networks that learn to predict a representation of one image patch from another image patch,…
Weakly supervised learning of object detection is an important problem in image understanding that still does not have a satisfactory solution. In this paper, we address this problem by exploiting the power of deep convolutional neural…
Images captured under low-light conditions manifest poor visibility, lack contrast and color vividness. Compared to conventional approaches, deep convolutional neural networks (CNNs) perform well in enhancing images. However, being solely…
In this work, we propose the combined usage of low- and high-level blocks of convolutional neural networks (CNNs) for improving object recognition. While recent research focused on either propagating the context from all layers, e.g.…
Weakly supervised learning has emerged as a compelling tool for object detection by reducing the need for strong supervision during training. However, major challenges remain: (1) differentiation of object instances can be ambiguous; (2)…
A natural way to improve the detection of objects is to consider the contextual constraints imposed by the detection of additional objects in a given scene. In this work, we exploit the spatial relations between objects in order to improve…
Object detection is one of the most active areas in computer vision, which has made significant improvement in recent years. Current state-of-the-art object detection methods mostly adhere to the framework of regions with convolutional…
Performing data augmentation for learning deep neural networks is well known to be important for training visual recognition systems. By artificially increasing the number of training examples, it helps reducing overfitting and improves…
Weakly Supervised Object Localization (WSOL) methodsusually rely on fully convolutional networks in order to ob-tain class activation maps(CAMs) of targeted labels. How-ever, these networks always highlight the most discriminativeparts to…
Performing data augmentation for learning deep neural networks is known to be important for training visual recognition systems. By artificially increasing the number of training examples, it helps reducing overfitting and improves…
We propose a novel attention model that can accurately attends to target objects of various scales and shapes in images. The model is trained to gradually suppress irrelevant regions in an input image via a progressive attentive process…
We present a novel detection method using a deep convolutional neural network (CNN), named AttentionNet. We cast an object detection problem as an iterative classification problem, which is the most suitable form of a CNN. AttentionNet…
Patch-level image representation is very important for object classification and detection, since it is robust to spatial transformation, scale variation, and cluttered background. Many existing methods usually require fine-grained…
We propose an adversarial contextual model for detecting moving objects in images. A deep neural network is trained to predict the optical flow in a region using information from everywhere else but that region (context), while another…
We address the problem of weakly supervised object localization where only image-level annotations are available for training object detectors. Numerous methods have been proposed to tackle this problem through mining object proposals.…
Weakly-supervised learning approaches have gained significant attention due to their ability to reduce the effort required for human annotations in training neural networks. This paper investigates a framework for weakly-supervised object…
Visual grounding, which aims to build a correspondence between visual objects and their language entities, plays a key role in cross-modal scene understanding. One promising and scalable strategy for learning visual grounding is to utilize…
Image-level weakly supervised semantic segmentation is a challenging task that has been deeply studied in recent years. Most of the common solutions exploit class activation map (CAM) to locate object regions. However, such response maps…