Related papers: Patchwork: A Patch-wise Attention Network for Effi…

Adaptive Focus for Efficient Video Recognition

In this paper, we explore the spatial redundancy in video recognition with the aim to improve the computational efficiency. It is observed that the most informative region in each frame of a video is usually a small image patch, which…

Computer Vision and Pattern Recognition · Computer Science · 2021-08-19 · Yulin Wang, Zhaoxi Chen, Haojun Jiang, Shiji Song, Yizeng Han, Gao Huang

A novel attention-based network for fast salient object detection

In current salient object detection networks, the most popular approach is to use a U-shape structure. However, the massive number of parameters leads to more consumption of computing and storage resources which are not feasible to deploy on…

Computer Vision and Pattern Recognition · Computer Science · 2021-12-21 · Bin Zhang, Yang Wu, Xiaojing Zhang, Ming Ma

Real-time Semantic Segmentation with Fast Attention

In deep CNN based models for semantic segmentation, high accuracy relies on rich spatial context (large receptive fields) and fine spatial details (high resolution), both of which incur high computational costs. In this paper, we propose a…

Computer Vision and Pattern Recognition · Computer Science · 2020-07-13 · Ping Hu, Federico Perazzi, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Kate Saenko, Stan Sclaroff

CA3D: Convolutional-Attentional 3D Nets for Efficient Video Activity Recognition on the Edge

In this paper, we introduce a deep learning solution for video activity recognition that leverages an innovative combination of convolutional layers with a linear-complexity attention mechanism. Moreover, we introduce a novel quantization…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-27 · Gabriele Lagani, Fabrizio Falchi, Claudio Gennaro, Giuseppe Amato

Density-aware global-local attention network for point cloud segmentation

3D point cloud segmentation has a wide range of applications in areas such as autonomous driving, augmented reality, virtual reality and digital twins. The point cloud data collected in real scenes often contain small objects and categories…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-18 · Chade Li, Pengju Zhang, Jiaming Zhang, Yihong Wu

PatchNet -- Short-range Template Matching for Efficient Video Processing

Object recognition is a fundamental problem in many video processing tasks, accurately locating seen objects at low computation cost paves the way for on-device video recognition. We propose PatchNet, an efficient convolutional neural…

Computer Vision and Pattern Recognition · Computer Science · 2021-03-15 · Huizi Mao, Sibo Zhu, Song Han, William J. Dally

BubbleNets: Learning to Select the Guidance Frame in Video Object Segmentation by Deep Sorting Frames

Semi-supervised video object segmentation has made significant progress on real and challenging videos in recent years. The current paradigm for segmentation methods and benchmark datasets is to segment objects in video provided a single…

Computer Vision and Pattern Recognition · Computer Science · 2020-11-25 · Brent A. Griffin, Jason J. Corso

Video Object Segmentation using Space-Time Memory Networks

We propose a novel solution for semi-supervised video object segmentation. By the nature of the problem, available cues (e.g. video frame(s) with object masks) become richer with the intermediate predictions. However, the existing methods…

Computer Vision and Pattern Recognition · Computer Science · 2019-08-13 · Seoung Wug Oh, Joon-Young Lee, Ning Xu, Seon Joo Kim

Patch Pruning Strategy Based on Robust Statistical Measures of Attention Weight Diversity in Vision Transformers

Multi-head self-attention is a distinctive feature extraction mechanism of vision transformers that computes pairwise relationships among all input patches, contributing significantly to their high performance. However, it is known to incur…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-28 · Yuki Igaue, Hiroaki Aizawa

OVSNet: Towards One-Pass Real-Time Video Object Segmentation

Video object segmentation aims at accurately segmenting the target object regions across consecutive frames. It is technically challenging for coping with complicated factors (e.g., shape deformations, occlusion and out of the lens). Recent…

Computer Vision and Pattern Recognition · Computer Science · 2019-07-03 · Peng Sun, Peiwen Lin, Guangliang Cheng, Jianping Shi, Jiawan Zhang, Xi Li

Dynamic Network Quantization for Efficient Video Inference

Deep convolutional networks have recently achieved great success in video recognition, yet their practical realization remains a challenge due to the large amount of computational resources required to achieve robust recognition. Motivated…

Computer Vision and Pattern Recognition · Computer Science · 2021-08-25 · Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Aude Oliva, Rogerio Feris, Kate Saenko

Progressive Attention Networks for Visual Attribute Prediction

We propose a novel attention model that can accurately attend to target objects of various scales and shapes in images. The model is trained to gradually suppress irrelevant regions in an input image via a progressive attentive process…

Computer Vision and Pattern Recognition · Computer Science · 2018-08-08 · Paul Hongsuck Seo, Zhe Lin, Scott Cohen, Xiaohui Shen, Bohyung Han

Local Memory Attention for Fast Video Semantic Segmentation

We propose a novel neural network module that transforms an existing single-frame semantic segmentation model into a video semantic segmentation pipeline. In contrast to prior works, we strive towards a simple, fast, and general module that…

Computer Vision and Pattern Recognition · Computer Science · 2021-09-28 · Matthieu Paul, Martin Danelljan, Luc Van Gool, Radu Timofte

A System-Level Solution for Low-Power Object Detection

Object detection has made impressive progress in recent years with the help of deep learning. However, state-of-the-art algorithms are both computation and memory intensive. Though many lightweight networks are developed for a trade-off…

Computer Vision and Pattern Recognition · Computer Science · 2019-10-22 · Fanrong Li, Zitao Mo, Peisong Wang, Zejian Liu, Jiayun Zhang, Gang Li, Qinghao Hu, Xiangyu He, Cong Leng, Yang Zhang, Jian Cheng

AttentionNet: Aggregating Weak Directions for Accurate Object Detection

We present a novel detection method using a deep convolutional neural network (CNN), named AttentionNet. We cast an object detection problem as an iterative classification problem, which is the most suitable form of a CNN. AttentionNet…

Computer Vision and Pattern Recognition · Computer Science · 2015-09-29 · Donggeun Yoo, Sunggyun Park, Joon-Young Lee, Anthony S. Paek, In So Kweon

Adaptive Memory Management for Video Object Segmentation

Matching-based networks have achieved state-of-the-art performance for video object segmentation (VOS) tasks by storing every-k frames in an external memory bank for future inference. Storing the intermediate frames' predictions provides…

Computer Vision and Pattern Recognition · Computer Science · 2022-04-15 · Ali Pourganjalikhan, Charalambos Poullis

RANet: Ranking Attention Network for Fast Video Object Segmentation

Although online learning (OL) techniques have boosted the performance of semi-supervised video object segmentation (VOS) methods, the huge time costs of OL greatly restrict their practicality. Matching based and propagation based methods run…

Computer Vision and Pattern Recognition · Computer Science · 2020-05-29 · Ziqin Wang, Jun Xu, Li Liu, Fan Zhu, Ling Shao

SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer

High-resolution images enable neural networks to learn richer visual representations. However, this improved performance comes at the cost of growing computational complexity, hindering their usage in latency-sensitive applications. As not…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-31 · Xuanyao Chen, Zhijian Liu, Haotian Tang, Li Yi, Hang Zhao, Song Han

Low-Latency Video Semantic Segmentation

Recent years have seen remarkable progress in semantic segmentation. Yet, it remains a challenging task to apply segmentation techniques to video-based applications. Specifically, the high throughput of video streams, the sheer cost of…

Computer Vision and Pattern Recognition · Computer Science · 2018-04-03 · Yule Li, Jianping Shi, Dahua Lin

Flash Window Attention: speedup the attention computation for Swin Transformer

To address the high resolution of image pixels, the Swin Transformer introduces window attention. This mechanism divides an image into non-overlapping windows and restricts attention computation to within each window, significantly…

Computer Vision and Pattern Recognition · Computer Science · 2025-01-15 · Zhendong Zhang