Related papers: Mid-level Elements for Object Detection
An "elephant in the room" for most current object detection and localization methods is the lack of explicit modelling of partial visibility due to occlusion by other objects or truncation by the image boundary. Based on a sliding window…
Object detection has made impressive progress in recent years with the help of deep learning. However, state-of-the-art algorithms are both computation- and memory-intensive. Though many lightweight networks have been developed for a trade-off…
In this paper we present a hierarchical method to discover mid-level elements with the objective of modeling visual compatibility between objects. At the base-level, our method identifies patterns of CNN activations with the aim of modeling…
Detecting objects becomes difficult when we need to deal with large shape deformation, occlusion and low resolution. We propose a novel approach to i) handle large deformations and partial occlusions in animals (as examples of highly…
Visual recognition is one of the fundamental challenges in AI, where the goal is to understand the semantics of visual data. Employing mid-level representations, in particular, has shifted the paradigm in visual recognition. The mid-level…
Object detection is a critical part of visual scene understanding. The representation of the object in the detection task has important implications for the efficiency and feasibility of annotation, robustness to occlusion, pose, lighting,…
Connecting multiple machine learning models into a pipeline is effective for handling complex problems. By breaking down the problem into steps, each tackled by a specific component model of the pipeline, the overall solution can be made…
Learning to localize objects with minimal supervision is an important problem in computer vision, since large fully annotated datasets are extremely costly to obtain. In this paper, we propose a new method that achieves this goal with only…
Part-based representation has been proven to be effective for a variety of visual applications. However, automatic discovery of discriminative parts without object/part-level annotations is challenging. This paper proposes a discriminative…
Most object detection methods operate by applying a binary classifier to sub-windows of an image, followed by a non-maximum suppression step where detections on overlapping sub-windows are removed. Since the number of possible sub-windows…
Human-object interaction (HOI) detection plays a crucial role in human-centric scene understanding and serves as a fundamental building block for many vision tasks. One generalizable and scalable strategy for HOI detection is to use weak…
Discovering object-centric representations from images can significantly enhance the robustness, sample efficiency and generalizability of vision models. Works on images with multi-part objects typically follow an implicit object…
Low-cost autonomous agents, including autonomous driving vehicles, chiefly adopt monocular 3D object detection to perceive the surrounding environment. This paper studies 3D intermediate representation methods which generate intermediate 3D…
We propose HOI Transformer to tackle human-object interaction (HOI) detection in an end-to-end manner. Current approaches either decouple the HOI task into separate stages of object detection and interaction classification or introduce…
Applying object detection algorithms in varied environments has many limitations. In particular, detecting small objects remains challenging because they have low resolution and carry limited information. We propose an object detection method…
Most of the current boundary detection systems rely exclusively on low-level features, such as color and texture. However, perception studies suggest that humans employ object-level reasoning when judging if a particular pixel is a…
In this work, we present a novel and effective framework to facilitate object detection with instance-level segmentation information that is supervised only by bounding box annotations. Starting from the joint object detection and…
Mid-level visual element discovery aims to find clusters of image patches that are both representative and discriminative. In this work, we study this problem from the perspective of pattern mining while relying on the recently popularized…
Currently, state-of-the-art image classification algorithms outperform the best available object detector by a large margin in terms of average precision. We therefore propose a simple yet principled approach that allows us to leverage…
In existing works that learn representations for object detection, the relationship between a candidate window and the ground truth bounding box of an object is simplified by thresholding their overlap. This paper shows information loss in…