Related papers: Localized Feature Aggregation Module for Semantic …

Learning Local Features with Context Aggregation for Visual Localization

Keypoint detection and description is fundamental yet important in many vision applications. Most existing methods use detect-then-describe or detect-and-describe strategy to learn local features without considering their context…

Computer Vision and Pattern Recognition · Computer Science 2020-06-02 Siyu Hong, Kunhong Li, Yongcong Zhang, Zhiheng Fu, Mengyi Liu, Yulan Guo

Unifying Feature and Cost Aggregation with Transformers for Semantic and Visual Correspondence

This paper introduces a Transformer-based integrative feature and cost aggregation network designed for dense matching tasks. In the context of dense matching, many works benefit from one of two forms of aggregation: feature aggregation,…

Computer Vision and Pattern Recognition · Computer Science 2024-04-23 Sunghwan Hong, Seokju Cho, Seungryong Kim, Stephen Lin

Local Memory Attention for Fast Video Semantic Segmentation

We propose a novel neural network module that transforms an existing single-frame semantic segmentation model into a video semantic segmentation pipeline. In contrast to prior works, we strive towards a simple, fast, and general module that…

Computer Vision and Pattern Recognition · Computer Science 2021-09-28 Matthieu Paul, Martin Danelljan, Luc Van Gool, Radu Timofte

LoFLAT: Local Feature Matching using Focused Linear Attention Transformer

Local feature matching is an essential technique in image matching and plays a critical role in a wide range of vision-based applications. However, existing Transformer-based detector-free local feature matching methods encounter challenges…

Computer Vision and Pattern Recognition · Computer Science 2024-10-31 Naijian Cao, Renjie He, Yuchao Dai, Mingyi He

Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation

Referring image segmentation segments an image from a language expression. With the aim of producing high-quality masks, existing methods often adopt iterative learning approaches that rely on RNNs or stacked attention layers to refine…

Computer Vision and Pattern Recognition · Computer Science 2023-03-14 Zhao Yang, Jiaqi Wang, Yansong Tang, Kai Chen, Hengshuang Zhao, Philip H. S. Torr

Learning Spatial and Spatio-Temporal Pixel Aggregations for Image and Video Denoising

Existing denoising methods typically restore clear results by aggregating pixels from the noisy input. Instead of relying on hand-crafted aggregation schemes, we propose to explicitly learn this process with deep neural networks. We present…

Computer Vision and Pattern Recognition · Computer Science 2021-02-03 Xiangyu Xu, Muchen Li, Wenxiu Sun, Ming-Hsuan Yang

Learning to Downsample for Segmentation of Ultra-High Resolution Images

Many computer vision systems require low-cost segmentation algorithms based on deep learning, either because of the enormous size of input images or limited computational budget. Common solutions uniformly downsample the input images to…

Computer Vision and Pattern Recognition · Computer Science 2022-08-19 Chen Jin, Ryutaro Tanno, Thomy Mertzanidou, Eleftheria Panagiotaki, Daniel C. Alexander

Feature Sharing Cooperative Network for Semantic Segmentation

In recent years, deep neural networks have achieved high accuracy in the field of image recognition. Inspired by human learning methods, we propose a semantic segmentation method using cooperative learning which shares the information…

Computer Vision and Pattern Recognition · Computer Science 2021-01-21 Ryota Ikedo, Kazuhiro Hotta

Segmentation-Aware Convolutional Networks Using Local Attention Masks

We introduce an approach to integrate segmentation information within a convolutional neural network (CNN). This counter-acts the tendency of CNNs to smooth information across regions and increases their spatial precision. To obtain…

Computer Vision and Pattern Recognition · Computer Science 2017-08-16 Adam W. Harley, Konstantinos G. Derpanis, Iasonas Kokkinos

Pixel-Semantic Revise of Position Learning A One-Stage Object Detector with A Shared Encoder-Decoder

Recently, many methods have been proposed for object detection. They cannot detect objects by semantic features, adaptively. In this work, according to channel and spatial attention mechanisms, we mainly analyze that different methods…

Computer Vision and Pattern Recognition · Computer Science 2020-09-30 Qian Li, Nan Guo, Xiaochun Ye, Dongrui Fan, Zhimin Tang

Dynamic Local Feature Aggregation for Learning on Point Clouds

Existing point cloud learning methods aggregate features from neighbouring points relying on constructing graph in the spatial domain, which results in feature update for each point based on spatially-fixed neighbours throughout layers. In…

Computer Vision and Pattern Recognition · Computer Science 2023-01-10 Zihao Li, Pan Gao, Hui Yuan, Ran Wei

Learning Where to Focus for Efficient Video Object Detection

Transferring existing image-based detectors to the video is non-trivial since the quality of frames is always deteriorated by part occlusion, rare pose, and motion blur. Previous approaches exploit to propagate and aggregate features across…

Computer Vision and Pattern Recognition · Computer Science 2020-07-17 Zhengkai Jiang, Yu Liu, Ceyuan Yang, Jihao Liu, Peng Gao, Qian Zhang, Shiming Xiang, Chunhong Pan

Adapting a Segmentation Foundation Model for Medical Image Classification

Recent advancements in foundation models, such as the Segment Anything Model (SAM), have shown strong performance in various vision tasks, particularly image segmentation, due to their impressive zero-shot segmentation capabilities.…

Computer Vision and Pattern Recognition · Computer Science 2025-05-12 Pengfei Gu, Haoteng Tang, Islam A. Ebeid, Jose A. Nunez, Fabian Vazquez, Diego Adame, Marcus Zhan, Huimin Li, Bin Fu, Danny Z. Chen

Local-Global Attention: An Adaptive Mechanism for Multi-Scale Feature Integration

In recent years, attention mechanisms have significantly enhanced the performance of object detection by focusing on key feature information. However, prevalent methods still encounter difficulties in effectively balancing local and global…

Computer Vision and Pattern Recognition · Computer Science 2024-11-15 Yifan Shao

Aggregating Deep Convolutional Features for Image Retrieval

Several recent works have shown that image descriptors produced by deep convolutional neural networks provide state-of-the-art performance for image classification and retrieval problems. It has also been shown that the activations from the…

Computer Vision and Pattern Recognition · Computer Science 2015-10-27 Artem Babenko, Victor Lempitsky

Localization Distillation for Dense Object Detection

Knowledge distillation (KD) has witnessed its powerful capability in learning compact models in object detection. Previous KD methods for object detection mostly focus on imitating deep features within the imitation regions instead of…

Computer Vision and Pattern Recognition · Computer Science 2022-04-01 Zhaohui Zheng, Rongguang Ye, Ping Wang, Dongwei Ren, Wangmeng Zuo, Qibin Hou, Ming-Ming Cheng

Learning Token-based Representation for Image Retrieval

In image retrieval, deep local features learned in a data-driven manner have been demonstrated effective to improve retrieval performance. To realize efficient retrieval on large image database, some approaches quantize deep local features…

Image and Video Processing · Electrical Eng. & Systems 2021-12-14 Hui Wu, Min Wang, Wengang Zhou, Yang Hu, Houqiang Li

Multi-Scale Context Aggregation by Dilated Convolutions

State-of-the-art models for semantic segmentation are based on adaptations of convolutional networks that had originally been designed for image classification. However, dense prediction and image classification are structurally different.…

Computer Vision and Pattern Recognition · Computer Science 2016-05-03 Fisher Yu, Vladlen Koltun

Deep feature transfer between localization and segmentation tasks

In this paper, we propose a new pre-training scheme for U-net based image segmentation. We first train the encoding arm as a localization network to predict the center of the target, before extending it into a U-net architecture for…

Computer Vision and Pattern Recognition · Computer Science 2018-11-13 Szu-Yeu Hu, Andrew Beers, Ken Chang, Kathi Höbel, J. Peter Campbell, Deniz Erdogumus, Stratis Ioannidis, Jennifer Dy, Michael F. Chiang, Jayashree Kalpathy-Cramer, James M. Brown

Learning Semantic-Aligned Feature Representation for Text-based Person Search

Text-based person search aims to retrieve images of a certain pedestrian by a textual description. The key challenge of this task is to eliminate the inter-modality gap and achieve the feature alignment across modalities. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2021-12-14 Shiping Li, Min Cao, Min Zhang