Related papers: Peripheral Vision Transformer

PerceptNet: A Human Visual System Inspired Neural Network for Estimating Perceptual Distance

Traditionally, the vision community has devised algorithms to estimate the distance between an original image and images that have been subject to perturbations. Inspiration was usually taken from the human visual perceptual system and how…

Machine Learning · Computer Science 2020-11-18 Alexander Hepburn, Valero Laparra, Jesús Malo, Ryan McConville, Raul Santos-Rodriguez

Unified Perceptual Parsing for Scene Understanding

Humans recognize the visual world at multiple levels: we effortlessly categorize scenes and detect objects inside, while also identifying the textures and surfaces of the objects along with their different compositional parts. In this…

Computer Vision and Pattern Recognition · Computer Science 2018-07-27 Tete Xiao, Yingcheng Liu, Bolei Zhou, Yuning Jiang, Jian Sun

Learning Hierarchical Image Segmentation For Recognition and By Recognition

Large vision and language models learned directly through image-text associations often lack detailed visual substantiation, whereas image segmentation tasks are treated separately from recognition, supervisedly learned without…

Computer Vision and Pattern Recognition · Computer Science 2024-05-06 Tsung-Wei Ke, Sangwoo Mo, Stella X. Yu

Efficient Dataflow Modeling of Peripheral Encoding in the Human Visual System

Computer graphics seeks to deliver compelling images, generated within a computing budget, targeted at a specific display device, and ultimately viewed by an individual user. The foveated nature of human vision offers an opportunity to…

Graphics · Computer Science 2021-07-27 Rachel Brown, Vasha DuTell, Bruce Walter, Ruth Rosenholtz, Peter Shirley, Morgan McGuire, David Luebke

Perceiver: General Perception with Iterative Attention

Biological systems perceive the world by simultaneously processing high-dimensional inputs from modalities as diverse as vision, audition, touch, proprioception, etc. The perception models used in deep learning on the other hand are…

Computer Vision and Pattern Recognition · Computer Science 2021-06-24 Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, Joao Carreira

Unifying Visual Perception by Dispersible Points Learning

We present a conceptually simple, flexible, and universal visual perception head for variant visual tasks, e.g., classification, object detection, instance segmentation and pose estimation, and different frameworks, such as one-stage or…

Computer Vision and Pattern Recognition · Computer Science 2022-09-13 Jianming Liang, Guanglu Song, Biao Leng, Yu Liu

Understanding Transformer-based Vision Models through Inversion

Understanding the mechanisms underlying deep neural networks remains a fundamental challenge in machine learning and computer vision. One promising, yet only preliminarily explored approach, is feature inversion, which attempts to…

Computer Vision and Pattern Recognition · Computer Science 2025-08-15 Jan Rathjens, Shirin Reyhanian, David Kappel, Laurenz Wiskott

Unsupervised Network Pretraining via Encoding Human Design

Over the years, computer vision researchers have spent an immense amount of effort on designing image features for the visual object recognition task. We propose to incorporate this valuable experience to guide the task of training deep…

Computer Vision and Pattern Recognition · Computer Science 2016-11-15 Ming-Yu Liu, Arun Mallya, Oncel C. Tuzel, Xi Chen

Central and peripheral vision for scene recognition: A neurocomputational modeling exploration

What are the roles of central and peripheral vision in human scene recognition? Larson and Loschky (2009) showed that peripheral vision contributes more than central vision in obtaining maximum scene recognition accuracy. However, central…

Neurons and Cognition · Quantitative Biology 2017-05-03 Panqu Wang, Garrison W. Cottrell

Multi-Object Representation Learning with Iterative Variational Inference

Human perception is structured around objects which form the basis for our higher-level cognition and impressive systematic generalization abilities. Yet most work on representation learning focuses on feature learning without even…

Machine Learning · Computer Science 2020-07-29 Klaus Greff, Raphaël Lopez Kaufman, Rishabh Kabra, Nick Watters, Chris Burgess, Daniel Zoran, Loic Matthey, Matthew Botvinick, Alexander Lerchner

Human-Understandable Decision Making for Visual Recognition

The widespread use of deep neural networks has achieved substantial success in many tasks. However, there still exists a huge gap between the operating mechanism of deep learning models and human-understandable decision making, so that…

Artificial Intelligence · Computer Science 2021-03-08 Xiaowei Zhou, Jie Yin, Ivor Tsang, Chen Wang

Capturing the objects of vision with neural networks

Human visual perception carves a scene at its physical joints, decomposing the world into objects, which are selectively attended, tracked, and predicted as we engage our surroundings. Object representations emancipate perception from the…

Neurons and Cognition · Quantitative Biology 2021-09-09 Benjamin Peters, Nikolaus Kriegeskorte

Part-based Face Recognition with Vision Transformers

Holistic methods using CNNs and margin-based losses have dominated research on face recognition. In this work, we depart from this setting in two ways: (a) we employ the Vision Transformer as an architecture for training a very strong…

Computer Vision and Pattern Recognition · Computer Science 2022-12-02 Zhonglin Sun, Georgios Tzimiropoulos

A Survey on Visual Transformer

Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism. Thanks to its strong representation capabilities, researchers are looking at ways to…

Computer Vision and Pattern Recognition · Computer Science 2023-07-11 Kai Han, Yunhe Wang, Hanting Chen, Xinghao Chen, Jianyuan Guo, Zhenhua Liu, Yehui Tang, An Xiao, Chunjing Xu, Yixing Xu, Zhaohui Yang, Yiman Zhang, Dacheng Tao

Vision Transformer with Progressive Sampling

Transformers with powerful global relation modeling abilities have been introduced to fundamental computer vision tasks recently. As a typical example, the Vision Transformer (ViT) directly applies a pure transformer architecture on image…

Computer Vision and Pattern Recognition · Computer Science 2021-08-05 Xiaoyu Yue, Shuyang Sun, Zhanghui Kuang, Meng Wei, Philip Torr, Wayne Zhang, Dahua Lin

Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features

This work presents a novel method of exploring human brain-visual representations, with a view towards replicating these processes in machines. The core idea is to learn plausible computational and biological representations by correlating…

Computer Vision and Pattern Recognition · Computer Science 2020-04-21 Simone Palazzo, Concetto Spampinato, Isaak Kavasidis, Daniela Giordano, Joseph Schmidt, Mubarak Shah

Semantic Segmentation Enhanced Transformer Model for Human Attention Prediction

Saliency Prediction aims to predict the attention distribution of human eyes given an RGB image. Most of the recent state-of-the-art methods are based on deep image feature representations from traditional CNNs. However, the traditional…

Computer Vision and Pattern Recognition · Computer Science 2023-01-27 Shuo Zhang

Deep Psychovisual Image Representations

Psychovisual models suggest human vision decouples low-level feature extraction from higher cognition by first forming intermediate abstractions. In contrast, deep learning-based vision models routinely extract and aggregate features using…

Computer Vision and Pattern Recognition · Computer Science 2026-05-29 Wendi Ma, Aryaman Sharma, Wei Dai, Shekhar S. Chandra

RegionViT: Regional-to-Local Attention for Vision Transformers

Vision transformer (ViT) has recently shown its strong capability in achieving comparable results to convolutional neural networks (CNNs) on image classification. However, vanilla ViT simply inherits the same architecture from the natural…

Computer Vision and Pattern Recognition · Computer Science 2022-04-01 Chun-Fu Chen, Rameswar Panda, Quanfu Fan

Recurrence is required to capture the representational dynamics of the human visual system

The human visual system is an intricate network of brain regions that enables us to recognize the world around us. Despite its abundant lateral and feedback connections, object processing is commonly viewed and studied as a feedforward…

Neurons and Cognition · Quantitative Biology 2019-10-09 Tim C Kietzmann, Courtney J Spoerer, Lynn Sörensen, Radoslaw M Cichy, Olaf Hauk, Nikolaus Kriegeskorte