Related papers: Multiple Object Recognition with Visual Attention

Attentional Network for Visual Object Detection

We propose augmenting deep neural networks with an attention mechanism for the visual object detection task. When perceiving a scene, humans have the capability of multiple fixation points, each attending to scene content at different…

Computer Vision and Pattern Recognition · Computer Science 2017-02-07 Kota Hara, Ming-Yu Liu, Oncel Tuzel, Amir-massoud Farahmand

Recurrent Attention Models with Object-centric Capsule Representation for Multi-object Recognition

The visual system processes a scene using a sequence of selective glimpses, each driven by spatial and object-based attention. These glimpses reflect what is relevant to the ongoing task and are selected through recurrent processing and…

Computer Vision and Pattern Recognition · Computer Science 2021-10-12 Hossein Adeli, Seoyoung Ahn, Gregory Zelinsky

Deep Attentive Tracking via Reciprocative Learning

Visual attention, derived from cognitive neuroscience, facilitates human perception on the most pertinent subset of the sensory data. Recently, significant efforts have been made to exploit attention schemes to advance computer vision…

Computer Vision and Pattern Recognition · Computer Science 2018-10-16 Shi Pu, Yibing Song, Chao Ma, Honggang Zhang, Ming-Hsuan Yang

Object Based Attention Through Internal Gating

Object-based attention is a key component of the visual system, relevant for perception, learning, and memory. Neurons tuned to features of attended objects tend to be more active than those associated with non-attended objects. There is a…

Neurons and Cognition · Quantitative Biology 2021-06-09 Jordan Lei, Ari S. Benjamin, Konrad P. Kording

Recurrent 3D Attentional Networks for End-to-End Active Object Recognition

Active vision is inherently attention-driven: the agent actively selects views to attend to in order to achieve the vision task quickly while improving its internal representation of the scene being observed. Inspired by the recent success of…

Computer Vision and Pattern Recognition · Computer Science 2022-01-12 Min Liu, Yifei Shi, Lintao Zheng, Kai Xu, Hui Huang, Dinesh Manocha

Semantic Aware Attention Based Deep Object Co-segmentation

Object co-segmentation is the task of segmenting the same objects from multiple images. In this paper, we propose the Attention Based Object Co-Segmentation for object co-segmentation that utilizes a novel attention mechanism in the…

Computer Vision and Pattern Recognition · Computer Science 2018-10-17 Hong Chen, Yifei Huang, Hideki Nakayama

Where to Focus: Deep Attention-based Spatially Recurrent Bilinear Networks for Fine-Grained Visual Recognition

Fine-grained visual recognition typically depends on modeling subtle differences among object parts. However, these parts often exhibit dramatic visual variations such as occlusions, viewpoints, and spatial transformations, making it hard to…

Computer Vision and Pattern Recognition · Computer Science 2017-09-19 Lin Wu, Yang Wang

Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition

Attention-based learning for fine-grained image recognition remains a challenging task, where most of the existing methods treat each object part in isolation, while neglecting the correlations among them. In addition, the multi-stage or…

Computer Vision and Pattern Recognition · Computer Science 2018-06-15 Ming Sun, Yuchen Yuan, Feng Zhou, Errui Ding

Food Image Classification and Segmentation with Attention-based Multiple Instance Learning

The demand for accurate food quantification has increased in recent years, driven by the needs of applications in dietary monitoring. At the same time, computer vision approaches have exhibited great potential in automating tasks within…

Computer Vision and Pattern Recognition · Computer Science 2023-08-23 Valasia Vlachopoulou, Ioannis Sarafis, Alexandros Papadopoulos

Moving object detection from multi-depth images with an attention-enhanced CNN

One of the greatest challenges for detecting moving objects in the solar system from wide-field survey data is determining whether a signal indicates a true object or is due to some other source, like noise. Object verification has relied…

Computer Vision and Pattern Recognition · Computer Science 2025-12-08 Masato Shibukawa, Fumi Yoshida, Toshifumi Yanagisawa, Takashi Ito, Hirohisa Kurosaki, Makoto Yoshikawa, Kohki Kamiya, Ji-an Jiang, Wesley Fraser, JJ Kavelaars, Susan Benecchi, Anne Verbiscer, Akira Hatakeyama, Hosei O, Naoya Ozaki

Enriched Deep Recurrent Visual Attention Model for Multiple Object Recognition

We design an Enriched Deep Recurrent Visual Attention Model (EDRAM) - an improved attention-based architecture for multiple object recognition. The proposed model is a fully differentiable unit that can be optimized end-to-end by using…

Computer Vision and Pattern Recognition · Computer Science 2017-06-13 Artsiom Ablavatski, Shijian Lu, Jianfei Cai

SpotNet: Self-Attention Multi-Task Network for Object Detection

Humans are very good at directing their visual attention toward relevant areas when they search for different types of objects. For instance, when we search for cars, we will look at the streets, not at the top of buildings. The motivation…

Computer Vision and Pattern Recognition · Computer Science 2020-06-12 Hughes Perreault, Guillaume-Alexandre Bilodeau, Nicolas Saunier, Maguelonne Héritier

Recurrent Models of Visual Attention

Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels. We present a novel recurrent neural network model that is capable of…

Machine Learning · Computer Science 2014-06-25 Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu

Deep Models for Multi-View 3D Object Recognition: A Review

Human decision-making often relies on visual information from multiple perspectives or views. In contrast, machine learning-based object recognition utilizes information from a single image of the object. However, the information conveyed…

Computer Vision and Pattern Recognition · Computer Science 2025-10-01 Mona Alzahrani, Muhammad Usman, Salma Kammoun, Saeed Anwar, Tarek Helmy

Multisource Region Attention Network for Fine-Grained Object Recognition in Remote Sensing Imagery

Fine-grained object recognition concerns the identification of the type of an object among a large number of closely related sub-categories. Multisource data analysis, which aims to leverage the complementary spectral, spatial, and…

Computer Vision and Pattern Recognition · Computer Science 2019-01-23 Gencer Sumbul, Ramazan Gokberk Cinbis, Selim Aksoy

The Application of Two-level Attention Models in Deep Convolutional Neural Network for Fine-grained Image Classification

Fine-grained classification is challenging because categories can only be discriminated by subtle and local differences. Variances in the pose, scale or rotation usually make the problem more difficult. Most fine-grained classification…

Computer Vision and Pattern Recognition · Computer Science 2014-11-25 Tianjun Xiao, Yichong Xu, Kuiyuan Yang, Jiaxing Zhang, Yuxin Peng, Zheng Zhang

Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations

With the aim of promoting and understanding the multilingual version of image search, we leverage visual object detection and propose a model with diverse multi-head attention to learn grounded multilingual multimodal representations.…

Computation and Language · Computer Science 2019-10-02 Po-Yao Huang, Xiaojun Chang, Alexander Hauptmann

Visual Attention driven by Convolutional Features

The understanding of where humans look in a scene is a problem of great interest in visual perception and computer vision. When eye-tracking devices are not a viable option, models of human attention can be used to predict fixations. In…

Computer Vision and Pattern Recognition · Computer Science 2018-07-30 Dario Zanca, Marco Gori

Unsupervised Multi-object Segmentation Using Attention and Soft-argmax

We introduce a new architecture for unsupervised object-centric representation learning and multi-object detection and segmentation, which uses a translation-equivariant attention mechanism to predict the coordinates of the objects present…

Computer Vision and Pattern Recognition · Computer Science 2022-09-01 Bruno Sauvalle, Arnaud de La Fortelle

Progressive Attention Networks for Visual Attribute Prediction

We propose a novel attention model that can accurately attend to target objects of various scales and shapes in images. The model is trained to gradually suppress irrelevant regions in an input image via a progressive attentive process…

Computer Vision and Pattern Recognition · Computer Science 2018-08-08 Paul Hongsuck Seo, Zhe Lin, Scott Cohen, Xiaohui Shen, Bohyung Han